From 6887a16fa13c9b29f8447c85bf36341c3b5b8917 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 24 Jan 2022 17:49:32 -0500 Subject: [PATCH 01/11] start add general code design doc. --- doc/src/Developer.rst | 1 + doc/src/Developer_cxx_vs_c_style.rst | 73 ++++++++++++++++++++++++++++ 2 files changed, 74 insertions(+) create mode 100644 doc/src/Developer_cxx_vs_c_style.rst diff --git a/doc/src/Developer.rst b/doc/src/Developer.rst index fd4a44a8a0..4d82f93625 100644 --- a/doc/src/Developer.rst +++ b/doc/src/Developer.rst @@ -11,6 +11,7 @@ of time and requests from the LAMMPS user community. :maxdepth: 1 Developer_org + Developer_cxx_vs_c_style Developer_parallel Developer_flow Developer_write diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst new file mode 100644 index 0000000000..868b183ec0 --- /dev/null +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -0,0 +1,73 @@ +Code design +----------- + +This section discusses some of the code design choices in LAMMPS and +overall strategy in order to assist developers to write new code that +will fit well with the remaining code. Please see the section on +:doc:`Requirements for contributed code ` for more +specific recommendations and guidelines. Here the focus is on overall +strategy and discussion of some relevant C++ programming language +constructs. + +Historically, the basic design philosophy of the LAMMPS C++ code was +that of a "C with classes" style. This was motivated by the desire to +make it easier to modify LAMMPS for people without significant training +in C++ programming and by trying to use data structures and code constructs +that somewhat resemble the previous implementation(s) in Fortran. 
+A contributing factor for this choice also was that at the time the +implementation of C++ compilers was not always very mature and some of +the advanced features contained bugs or were not functioning exactly +as the standard required; plus there was some disagreement between +compiler vendors about how to interpret the C++ standard documents. + +However, C++ compilers have advanced a lot since then and with the +transition to requiring the C++11 standard in 2020 as the minimum C++ language +standard for LAMMPS, the decision was made to also replace some of the +C-style constructs with equivalent C++ functionality, either from the +C++ standard library or as custom classes or function, in order to +improve readability of the code and to increase code reuse through +abstraction of commonly used functionality. + + +Object oriented code +^^^^^^^^^^^^^^^^^^^^ + +LAMMPS is designed to be an object oriented code, that is each simulation +is represented by an instance of the LAMMPS class. When running in parallel, +of course, each MPI process will create such an instance. This can be seen +in the ``main.cpp`` file where the core steps of running a LAMMPS simulation +are the following 3 lines of code: + +.. code-block:: C++ + + LAMMPS *lammps = new LAMMPS(argc, argv, lammps_comm); + lammps->input->file(); + delete lammps; + +The first line creates a LAMMPS class instance and passes the command line arguments +and the global communicator to its constructor. The second line tells the LAMMPS +instance to process the input (either from standard input or the provided input file) +until the end. And the third line deletes that instance again. The remainder of +the main.cpp file are for error handling, MPI configuration and other special features. + +In the constructor of the LAMMPS class instance the basic LAMMPS class hierachy +is created as shown in :ref:`class-topology`. 
While processing the input further +class instances are created, or deleted, or replaced and specific member functions +of specific classes are called to trigger actions like creating atoms, computing +forces, computing properties, propagating the system, or writing output. + + +Inheritance and Compositing +=========================== + +Polymorphism +============ + + +I/O and output formatting +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Memory management +^^^^^^^^^^^^^^^^^ + + From 1c7e1faeff8ecb214f30754c64cf38708fe29179 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Fri, 11 Feb 2022 11:59:27 -0500 Subject: [PATCH 02/11] add sections on inheritance, compositing, polymorphism --- doc/src/Developer_cxx_vs_c_style.rst | 164 ++++++++++++++++++-- doc/utils/sphinx-config/false_positives.txt | 1 + 2 files changed, 150 insertions(+), 15 deletions(-) diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst index 868b183ec0..0b0526e7f9 100644 --- a/doc/src/Developer_cxx_vs_c_style.rst +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -32,11 +32,11 @@ abstraction of commonly used functionality. Object oriented code ^^^^^^^^^^^^^^^^^^^^ -LAMMPS is designed to be an object oriented code, that is each simulation -is represented by an instance of the LAMMPS class. When running in parallel, -of course, each MPI process will create such an instance. This can be seen -in the ``main.cpp`` file where the core steps of running a LAMMPS simulation -are the following 3 lines of code: +LAMMPS is designed to be an object oriented code, that is each +simulation is represented by an instance of the LAMMPS class. When +running in parallel, of course, each MPI process will create such an +instance. This can be seen in the ``main.cpp`` file where the core +steps of running a LAMMPS simulation are the following 3 lines of code: .. 
code-block:: C++ @@ -44,30 +44,164 @@ are the following 3 lines of code: lammps->input->file(); delete lammps; -The first line creates a LAMMPS class instance and passes the command line arguments -and the global communicator to its constructor. The second line tells the LAMMPS -instance to process the input (either from standard input or the provided input file) -until the end. And the third line deletes that instance again. The remainder of -the main.cpp file are for error handling, MPI configuration and other special features. +The first line creates a LAMMPS class instance and passes the command +line arguments and the global communicator to its constructor. The +second line tells the LAMMPS instance to process the input (either from +standard input or the provided input file) until the end. And the third +line deletes that instance again. The remainder of the main.cpp file +are for error handling, MPI configuration and other special features. -In the constructor of the LAMMPS class instance the basic LAMMPS class hierachy + +In the constructor of the LAMMPS class instance the basic LAMMPS class hierarchy is created as shown in :ref:`class-topology`. While processing the input further class instances are created, or deleted, or replaced and specific member functions of specific classes are called to trigger actions like creating atoms, computing forces, computing properties, propagating the system, or writing output. - -Inheritance and Compositing +Compositing and Inheritance =========================== +LAMMPS makes extensive use of the object oriented programming (OOP) +principles of *compositing* and *inheritance*. Classes like the +``LAMMPS`` class are a **composite** containing pointers to instances of +other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``, +``Modify``, and so on. 
Each of these classes implements certain +functionality by storing and manipulating data related to the simulation +and providing member functions that trigger certain actions. Some of +those classes, like ``Force``, are again composites containing instances +of classes describing the force interactions, or, like ``Modify``, contain +and call fixes and computes. In most cases there is only one instance +of those member classes allowed, but in a few cases there can also be +multiple instances and the parent class maintains a list of the +pointers to the instantiated classes. + +Changing behavior or adjusting how LAMMPS handles the simulation is +implemented via **inheritance** where different variants of the +functionality are realized by creating *derived* classes that can share +common functionality in their base class and provide a consistent +interface where the derived classes replace (dummy or pure) functions in +the base class. The higher level classes can then call those methods of +the instantiated classes without having to know which specific derived +class variant was instantiated. In the LAMMPS documentation those +derived classes are usually referred to as "styles", e.g. pair styles, +fix styles, atom styles and so on. + +This is the origin of the flexibility of LAMMPS. It makes it possible, for +example, to compute forces for very different non-bonded potential +functions by having different pair styles (implemented as different +classes derived from the ``Pair`` class) where the evaluation of the +potential function is confined to the implementation of the individual +classes. Whenever a new :doc:`pair_style` or :doc:`bond_style` or +:doc:`comm_style` or similar command is processed in the LAMMPS input, +any existing class instance is deleted and a new instance is created in +its place. 
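The composite and inheritance pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not LAMMPS code: the classes ``PairSketch``, ``PairLJSketch``, ``PairHarmonicSketch``, and ``ForceSketch`` as well as the helper ``demo_energy()`` are hypothetical stand-ins, and the use of ``std::unique_ptr`` is a simplification chosen here to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical base class standing in for the common interface of a "style".
class PairSketch {
 public:
  virtual ~PairSketch() = default;
  // Pure virtual function: every derived class must provide its own version.
  virtual double energy(double r) const = 0;
};

// One derived "style": 12-6 Lennard-Jones in reduced units (epsilon = sigma = 1).
class PairLJSketch : public PairSketch {
 public:
  double energy(double r) const override {
    const double r6inv = 1.0 / (r * r * r * r * r * r);
    return 4.0 * (r6inv * r6inv - r6inv);
  }
};

// Another derived "style": harmonic spring with k = 1 and r0 = 1.
class PairHarmonicSketch : public PairSketch {
 public:
  double energy(double r) const override { return 0.5 * (r - 1.0) * (r - 1.0); }
};

// Composite: holds at most one pair style and replaces it on request.
class ForceSketch {
 public:
  // Returns false for an unknown keyword and leaves the current style untouched.
  bool create_pair(const std::string &style) {
    if (style == "lj")
      pair.reset(new PairLJSketch());        // any previous instance is deleted
    else if (style == "harmonic")
      pair.reset(new PairHarmonicSketch());  // any previous instance is deleted
    else
      return false;
    return true;
  }
  std::unique_ptr<PairSketch> pair;
};

// The caller only uses the base class interface and does not need to know
// which derived class was instantiated.
double demo_energy(const std::string &style, double r) {
  ForceSketch force;
  const bool ok = force.create_pair(style);
  assert(ok);
  return force.pair->energy(r);
}
```

A call such as ``demo_energy("harmonic", 3.0)`` goes through the base class pointer and ends up in ``PairHarmonicSketch::energy()``. Selecting a different keyword releases the previous instance before the new one takes its place, which mirrors the delete and re-create step described in the text.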
+ +Further code sharing is possible by creating derived classes from the +derived classes (for instance to implement an accelerated version of a +pair style) where then only a subset of the methods are replaced with +the accelerated versions. + Polymorphism ============ +Polymorphism and dynamic dispatch are another OOP feature that play an +important part of how LAMMPS selects which code to execute. In a nutshell, +this is a mechanism where the decision of which member function to call +from a class is determined at runtime and not when the code is compiled. +To enable it, the function has to be declared as ``virtual`` and all +corresponding functions in derived classes should be using the ``override`` +property. Below is a brief example. + +.. code-block:: c++ + + class Base { + public: + virtual ~Base() = default; + void call(); + void normal(); + virtual void poly(); + }; + + void Base::call() { + normal(); + poly(); + } + + class Derived : public Base { + public: + ~Derived() override = default; + void normal(); + void poly() override; + }; + + // [....] + + Base *base1 = new Base(); + Base *base2 = new Derived(); + + base1->call(); + base2->call(); + +The difference in behavior of the ``normal()`` and the ``poly()`` member +functions is in which of the two member functions is called when +executing `base1->call()` and `base2->call()`. Without polymorphism, a +function within the base class will call only member functions within +the same scope, that is ``Base::call()`` will always call +``Base::normal()``. But for the `base2->call()` the call for the +virtual member function will be dispatched to ``Derived::poly()`` +instead. This mechanism allows to always call functions within the +scope of the class type that was used to create the class instance, even +if they are assigned to a pointer using the type of a base class. 
+Thanks to dynamic dispatch, LAMMPS can even use styles that are loaded +at runtime from a shared object file with the :doc:`plugin command `. + +A special case of virtual functions are so-called pure functions. These +are virtual functions that are initialized to 0 in the class declaration +(see example below). + +.. code-block:: c++ + + class Base { + public: + virtual void pure() = 0; + }; + +This has the effect that it will no longer be possible to create an instance +of the base class and that derived classes **must** implement these classes. +Many of the functions listed with the various styles in the section :doc:`Modify` +are such pure functions. The motivation for this is to define the interface +or API of functions. + +However, there are downsides to this. For example, calls virtual functions +from within a constructor, will not be in the scope of the derived class and thus +it is good practice to either avoid calling them or to provide an explicit scope like +in ``Base::poly()``. Furthermore, any destructors in classes containing +virtual functions should be declared virtual, too, so they are processed +in the expected order before types are removed from dynamic dispatch. + +.. admonition:: Important Notes + + In order to be able to detect incompatibilities and to avoid unexpected + behavior already at compile time, it is crucial that all member functions + that are intended to replace a virtual or pure function use the ``override`` + property keyword. For the same reason it should be avoided to use overloads + or default arguments for virtual functions. + +Style Factories +=============== + +In order to create class instances of the different styles, LAMMPS often +uses a programming pattern called `Factory`. Those are functions that create +an instance of a specific derived class, say ``PairLJCut`` and return a pointer +to the type of the common base class of that style, ``Pair`` in this case. 
+To associate the factory function with the style keyword, an ``std::map`` +class is used in which function pointers are indexed by their keyword +(for example "lj/cut" for ``PairLJCut`` and "morse" ``PairMorse``). +A couple of typedefs help to keep the code readable and a template function +is used to implement the actual factory functions for the individual classes. I/O and output formatting ^^^^^^^^^^^^^^^^^^^^^^^^^ Memory management ^^^^^^^^^^^^^^^^^ - - diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index 1d4c27822b..935a31069f 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -3408,6 +3408,7 @@ typeI typeJ typeN typeargs +typedefs Tz Tzou ub From 1ab5b9d7fd4baa3891006a571231f495c15df1f3 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Sun, 13 Feb 2022 15:25:51 -0500 Subject: [PATCH 03/11] re-sort list of false poisitives alphabetically with "sort" --- doc/utils/sphinx-config/false_positives.txt | 228 ++++++++++---------- 1 file changed, 114 insertions(+), 114 deletions(-) diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index 935a31069f..b133bbb1f6 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -52,8 +52,8 @@ aij aimd airebo Aj -ajs ajaramil +ajs akohlmey Aktulga al @@ -119,10 +119,10 @@ Appl Apu arallel arccos -arge Archlinux arcsin arg +arge args argv arrhenius @@ -149,9 +149,9 @@ atc AtC ATC athermal +athomps atime atimestep -athomps atm atomeye atomfile @@ -196,7 +196,6 @@ Bagi Bagnold Baig Bajaj -Bkappa Bal balancer Balankura @@ -215,8 +214,8 @@ barostatting Barostatting Barrat Barros -Bartelt Bartels +Bartelt barycenter barye Bashford @@ -258,7 +257,6 @@ bigint Bij bilayer bilayers -biquadratic binsize binstyle binutils @@ -267,6 +265,7 @@ biomolecule Biomolecules Biophys Biosym +biquadratic bisectioning bispectrum 
Bispectrum @@ -277,6 +276,7 @@ bitrate bitrates Bitzek Bjerrum +Bkappa Blaise blanchedalmond blocksize @@ -315,14 +315,14 @@ Botu Bouguet Bourne boxcolor -boxlo boxhi -boxxlo +boxlo boxxhi -boxylo +boxxlo boxyhi -boxzlo +boxylo boxzhi +boxzlo bp bpclermont bpls @@ -422,13 +422,14 @@ Chaudhuri checkbox checkmark checkqeq +checksum chemistries Chemnitz Cheng Chenoweth chiral -ChiralIDs chiralIDs +ChiralIDs chirality Cho ChooseOffset @@ -499,12 +500,12 @@ cond conda Conda Condens -Connor conf config configfile configurational conformational +Connor ConstMatrix Contrib cooperativity @@ -561,14 +562,14 @@ cstring cstyle csvr ctrl -Ctypes ctypes +Ctypes cuda Cuda CUDA +cuFFT CuH Cui -cuFFT Cummins Curk Cusentino @@ -630,11 +631,11 @@ de dE De deallocated -decorrelation debye Debye Decius decompositions +decorrelation decrementing deeppink deepskyblue @@ -643,8 +644,8 @@ defn deformable del delaystep -DeleteIDs deleteIDs +DeleteIDs delflag Dellago delocalization @@ -704,8 +705,8 @@ Dihedrals dihydride Dij dimdim -dimensioned dimensionality +dimensioned dimgray dipolar dir @@ -730,10 +731,10 @@ dmg dmi dnf DNi -Dobson Dobnikar -Dodds +Dobson docenv +Dodds dodgerblue dof doi @@ -741,10 +742,10 @@ Donadio Donev dotc Doty +downarrow doxygen doxygenclass doxygenfunction -downarrow Doye Doyl dpd @@ -813,8 +814,8 @@ eco ecoul ecp Ecut -EdgeIDs edgeIDs +EdgeIDs edihed edim edip @@ -826,8 +827,8 @@ ee Eebt ees eFF -efield effm +efield eflag eflux eg @@ -836,10 +837,10 @@ ehex eHEX Ei eigen +eigendecomposition eigensolve eigensolver eigensolvers -eigendecomposition eigenvalue eigenvalues eigenvector @@ -914,15 +915,15 @@ equilibrated equilibrates equilibrating equilibration -Equilibria equilibria +Equilibria equilization equipartitioning -Ercolessi -Erdmann eradius erate erc +Ercolessi +Erdmann erf erfc Erhart @@ -932,10 +933,10 @@ erotate errno Ertas ervel -Espanol -Eshelby eshelby +Eshelby eskm +Espanol esph estretch esu @@ -950,20 +951,20 @@ etol etot etotal etube -Eulerian 
eulerian +Eulerian eulerimplicit Europhys ev eV +eval +evals evalue Evanseck evdwl -evector evec evecs -eval -evals +evector Everaers Evgeny evirials @@ -992,13 +993,13 @@ fbMC Fc fcc fcm -Fd fd +Fd fdotr fdt +fe Fehlberg Fellinger -fe femtosecond femtoseconds fene @@ -1033,11 +1034,14 @@ filename Filename filenames Filenames -Fily fileper filesystem +filesystems +Fily Fincham Finchham +fingerprintconstants +fingerprintsperelement Finnis Fiorin fixID @@ -1076,8 +1080,8 @@ forestgreen formatarg formulae Forschungszentrum -Fortran fortran +Fortran Fosado fourier fp @@ -1099,6 +1103,7 @@ fstyle fsw ftm ftol +fuer fugacity Fumi func @@ -1106,7 +1111,6 @@ funcs functionalities functionals funroll -fuer fx fy fz @@ -1145,8 +1149,8 @@ Germann Germano gerolf Gerolf -getrusage Gershgorin +getrusage getter gettimeofday gewald @@ -1223,8 +1227,8 @@ gsmooth gstyle GTL Gubbins -Guericke Guenole +Guericke gui Gumbsch Gunsteren @@ -1300,7 +1304,6 @@ histogrammed histogramming hma hmaktulga -hplanck hoc Hochbruck Hofling @@ -1317,6 +1320,7 @@ howto Howto Hoy Hoyt +hplanck Hs hstyle html @@ -1347,8 +1351,8 @@ hyperspherical hysteretic hz IAP -Ibanez iatom +Ibanez ibar ibm icc @@ -1439,8 +1443,8 @@ ipi ipp Ippolito IPv -IPython ipython +IPython Isele isenthalpic ish @@ -1456,8 +1460,8 @@ isotropically isovolume Isralewitz iter -iters iteratively +iters Ith Itsets itype @@ -1600,8 +1604,8 @@ KMP kmu Knizhnik knl -Kofke kofke +Kofke Kohlmeyer Kohn kokkos @@ -1653,15 +1657,15 @@ Lackmann Ladd lagrangian lambdai -lamda LambdaLanczos +lamda lammps Lammps LAMMPS lammpsplot lammpsplugin -Lampis Lamoureux +Lampis Lanczos Lande Landron @@ -1674,8 +1678,8 @@ larentzos Larentzos Laroche lars -LATBOLTZ latboltz +LATBOLTZ latencies Latour latourr @@ -1830,13 +1834,13 @@ Lyulin lz lzma Maaravi -MACHDYN machdyn +MACHDYN Mackay Mackrodt +MacOS Macromolecules macroparticle -MacOS Madura Magda Magdeburg @@ -1920,8 +1924,8 @@ mc McLachlan md mdf -MDI mdi +MDI mdpd mDPD meam @@ -1945,8 +1949,8 @@ Mei 
Melchor Meloni Melrose -Mem mem +Mem memalign MEMALIGN membered @@ -1960,10 +1964,10 @@ Merz meshless meso mesocnt -MESODPD mesodpd -MESONT +MESODPD mesont +MESONT mesoparticle mesoscale mesoscopic @@ -1998,8 +2002,8 @@ Militzer Minary mincap Mindlin -minhbonds mingw +minhbonds minima minimizations minimizer @@ -2098,6 +2102,7 @@ Muccioli mui Mukherjee Mulders +Müller mult multi multibody @@ -2126,7 +2131,6 @@ muVT mux muy muz -Müller mv mV Mvapich @@ -2146,9 +2150,9 @@ nabla Nagaosa Nakano nall +namedtuple namespace namespaces -namedtuple nan NaN Nandor @@ -2164,8 +2168,8 @@ nanometer nanometers nanoparticle nanoparticles -Nanotube nanotube +Nanotube nanotubes Narulkar nasa @@ -2201,8 +2205,8 @@ ncount nd ndescriptors ndihedrals -ndihedraltypes Ndihedraltype +ndihedraltypes Ndirango ndof Ndof @@ -2214,9 +2218,9 @@ Neel Neelov Negre nelem -nelems Nelement Nelements +nelems nemd netcdf netstat @@ -2250,8 +2254,8 @@ Nicklas Niklasson Nikolskiy nimpropers -nimpropertypes Nimpropertype +nimpropertypes Ninteger NiO Nissila @@ -2265,8 +2269,8 @@ nktv nl nlayers nlen -Nlines nlines +Nlines nlo nlocal Nlocal @@ -2274,16 +2278,16 @@ Nlog nlp nm Nm -Nmax nmax +Nmax nmc -Nmin nmin +Nmin Nmols nn nnodes -Nocedal nO +Nocedal nocite nocoeff nodeless @@ -2336,11 +2340,11 @@ Nrho Nroff nrow nrun +ns Ns Nsample Nskip Nspecies -ns nsq Nstart nstats @@ -2349,9 +2353,9 @@ Nsteplast Nstop nsub Nswap +nt Nt Ntable -nt ntheta nthreads ntimestep @@ -2394,11 +2398,11 @@ ocl octahedral octants Ohara +O'Hearn ohenrich ok Okabe Okamoto -O'Hearn O'Keefe OKeefe oldlace @@ -2456,8 +2460,8 @@ overdamped overlayed Ovito oxdna -oxrna oxDNA +oxrna oxRNA packings padua @@ -2506,7 +2510,6 @@ pc pchain Pchain pcmoves -pmcmoves Pdamp pdb pdf @@ -2565,13 +2568,16 @@ Pieniazek Pieter pIm pimd -pIp Piola +pIp Pisarev Pishevar Pitera pj pjintve +pKa +pKb +pKs planeforce Plathe Plimpton @@ -2580,10 +2586,8 @@ ploop PloS plt plumedfile -pKa -pKb -pKs pmb +pmcmoves Pmolrotate Pmoltrans pN @@ -2605,8 +2609,8 @@ 
polydisperse polydispersity polyelectrolyte polyhedra -polymorphism Polym +polymorphism popen Popov popstore @@ -2622,11 +2626,12 @@ Potapkin potin Pourtois powderblue +PowerShell ppn pppm -prd Prakash Praprotnik +prd pre Pre prec @@ -2643,8 +2648,8 @@ Priya proc Proc procs -Prony progguide +Prony ps Ps pscreen @@ -2675,8 +2680,8 @@ px Px pxx Pxx -Pxy pxy +Pxy pxz py Py @@ -2693,13 +2698,13 @@ Pyy pyz pz Pz -Pzz pzz +Pzz qbmsst qcore qdist -qE qe +qE qeff qelectron qeq @@ -2775,15 +2780,15 @@ RDideal rdx reacter Readline -realTypeMap -real_t README +real_t realtime +realTypeMap reamin reax -REAXFF -ReaxFF reaxff +ReaxFF +REAXFF rebo recurse recursing @@ -2811,8 +2816,8 @@ Rensselaer reparameterizing repo representable -Reproducibility reproducibility +Reproducibility repuls reqid rescale @@ -2934,10 +2939,10 @@ rxd rxnave rxnsum ry -rz Ryckaert Rycroft Rydbergs +rz Rz Sabry saddlebrown @@ -2970,9 +2975,9 @@ Schimansky Schiotz Schlitter Schmid -Schratt Schoen Schotte +Schratt Schulten Schunk Schuring @@ -3027,8 +3032,8 @@ Shiga Shinoda Shiomi shlib -SHM shm +SHM shockvel shrinkexceed Shugaev @@ -3147,10 +3152,10 @@ stepwise Stesmans Stillinger stk -Stockmayer -Stoddard stochastically stochasticity +Stockmayer +Stoddard stoichiometric stoichiometry Stokesian @@ -3210,8 +3215,8 @@ Swiler Swinburne Swol Swope -Sx sx +Sx sy Sy symplectic @@ -3220,8 +3225,8 @@ sys sysdim Syst systemd -Sz sz +Sz Tabbernor tabinner Tadmor @@ -3268,9 +3273,9 @@ tfmc tfMC tgnpt tgnvt +th Thakkar Thaokar -th thb thei Theodorou @@ -3328,11 +3333,11 @@ Tmin tmp tN Tobias +Toennies Tohoku tokenizer tokyo tol -Toennies tomic toolchain topologies @@ -3404,11 +3409,11 @@ twojmax Tx txt Tyagi +typeargs +typedefs typeI typeJ typeN -typeargs -typedefs Tz Tzou ub @@ -3426,8 +3431,8 @@ uk ul ulb Uleft -uloop Ulomek +uloop ulsph Ultrafast uMech @@ -3585,10 +3590,10 @@ vzcm vzi Waals Wadley -Waroquier wallstyle walltime Waltham +Waroquier wavepacket wB Wbody @@ -3606,12 +3611,12 @@ whitesmoke whitespace 
Wi Wicaksono -Widom widom +Widom Wijk Wikipedia -Wildcard wildcard +Wildcard wildcards Winkler Wirnsberger @@ -3625,12 +3630,12 @@ Worley Wriggers Wuppertal Wurtzite -Wysocki www wx Wx wy Wy +Wysocki wz Wz xa @@ -3678,10 +3683,10 @@ xyz xz xzhou yaff -yaml -Yanxon YAFF Yamada +yaml +Yanxon Yaser Yazdani Ybar @@ -3712,14 +3717,15 @@ Yuya yx yy yz +Zagaceta Zannoni Zavattieri zbl ZBL Zc zcm -Zeeman zeeman +Zeeman Zemer Zepeda zflag @@ -3733,29 +3739,23 @@ zi Zi ziegenhain Ziegenhain +zincblende Zj zlim zlo +Zm zmax zmin zmq zN zs zst +Zstandard +zstd +Zstd zsu zu zx zy Zybin zz -Zm -PowerShell -filesystems -fingerprintconstants -fingerprintsperelement -Zagaceta -zincblende -Zstandard -Zstd -zstd -checksum From 810717bfe53597dc0e5eac1986c7f8f14b6f9588 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Sun, 13 Feb 2022 15:49:50 -0500 Subject: [PATCH 04/11] discuss stdio vs iostreams and fmtlib --- doc/src/Developer_cxx_vs_c_style.rst | 60 ++++++++++++++++++--- doc/utils/sphinx-config/false_positives.txt | 2 + 2 files changed, 54 insertions(+), 8 deletions(-) diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst index 0b0526e7f9..e4f04335a4 100644 --- a/doc/src/Developer_cxx_vs_c_style.rst +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -51,7 +51,6 @@ standard input or the provided input file) until the end. And the third line deletes that instance again. The remainder of the main.cpp file are for error handling, MPI configuration and other special features. - In the constructor of the LAMMPS class instance the basic LAMMPS class hierarchy is created as shown in :ref:`class-topology`. While processing the input further class instances are created, or deleted, or replaced and specific member functions @@ -151,9 +150,10 @@ the same scope, that is ``Base::call()`` will always call virtual member function will be dispatched to ``Derived::poly()`` instead. 
This mechanism allows to always call functions within the scope of the class type that was used to create the class instance, even -if they are assigned to a pointer using the type of a base class. -Thanks to dynamic dispatch, LAMMPS can even use styles that are loaded -at runtime from a shared object file with the :doc:`plugin command `. +if they are assigned to a pointer using the type of a base class. This +is the desired behavior, and thanks to dynamic dispatch, LAMMPS can even +use styles that are loaded at runtime from a shared object file with the +:doc:`plugin command `. A special case of virtual functions are so-called pure functions. These are virtual functions that are initialized to 0 in the class declaration @@ -170,9 +170,10 @@ This has the effect that it will no longer be possible to create an instance of the base class and that derived classes **must** implement these classes. Many of the functions listed with the various styles in the section :doc:`Modify` are such pure functions. The motivation for this is to define the interface -or API of functions. +or API of functions but defer the implementation of those functionality to +the derived classes. -However, there are downsides to this. For example, calls virtual functions +However, there are downsides to this. For example, calls to virtual functions from within a constructor, will not be in the scope of the derived class and thus it is good practice to either avoid calling them or to provide an explicit scope like in ``Base::poly()``. Furthermore, any destructors in classes containing @@ -185,7 +186,9 @@ in the expected order before types are removed from dynamic dispatch. behavior already at compile time, it is crucial that all member functions that are intended to replace a virtual or pure function use the ``override`` property keyword. For the same reason it should be avoided to use overloads - or default arguments for virtual functions. 
+ or default arguments for virtual functions as they lead to confusion over + which function is supposed to override which and which arguments need to be + declared. Style Factories =============== @@ -198,10 +201,51 @@ To associate the factory function with the style keyword, an ``std::map`` class is used in which function pointers are indexed by their keyword (for example "lj/cut" for ``PairLJCut`` and "morse" ``PairMorse``). A couple of typedefs help to keep the code readable and a template function -is used to implement the actual factory functions for the individual classes. +is used to implement the actual factory functions for the individual classes. I/O and output formatting ^^^^^^^^^^^^^^^^^^^^^^^^^ +C-style stdio versus C++ style iostreams +======================================== + +LAMMPS chooses to use the "stdio" library of the standard C library for +reading from and writing to files and console instead of "iostreams" that were +introduced with C++. This is mainly motivated by the better performance, +better control over formatting, and less effort to achieve specific formatting. + +Since mixing "stdio" and "iostreams" can lead to unexpected behavior, using +the latter is strongly discouraged. Also, output to the screen should not +use the predefined ``stdout`` FILE pointer, but rather the ``screen`` and +``logfile`` FILE pointers managed by the LAMMPS class. Furthermore, output +should only be done by MPI rank 0 (``comm->me == 0``) and output that is +sent to both ``screen`` and ``logfile`` should use the +:cpp:func:`utils::logmesg() convenience function `. + +Formatting with the {fmt} library +=================================== + +The LAMMPS source code includes a copy of the `{fmt} library +`_ which is preferred over formatting with the +"printf()" family of functions. The primary reason is that it allows a +typesafe default format for any type of supported data. 
This is +particularly useful for formatting integers of a given size (32-bit or +64-bit) which may require different format strings depending on compile +time settings or compilers/operating systems. Furthermore, {fmt} gives +better performance, has more functionality, a familiar formatting syntax +that has similarities to ``format()`` in Python, and provides a facility +that can be used to integrate format strings and a variable number of +arguments into custom functions in a much simpler way than the varargs +mechanism of the C library. Finally, {fmt} has been included in the +C++20 language standard, so changes to adopt it are future-proof. + +Formatted strings are most commonly created by calling the +``fmt::format()`` function which will return a string as a ``std::string`` +class instance. In contrast to the ``%`` placeholder in ``printf()``, +the {fmt} library uses ``{}`` to embed format descriptors. In the +simplest case, no additional characters are needed as {fmt} will choose +the default format based on the data type of the argument. 
+ + Memory management +^^^^^^^^^^^^^^^^^ diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index b133bbb1f6..944f8409da 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -3414,6 +3414,7 @@ typedefs typeI typeJ typeN +typesafe Tz Tzou ub @@ -3497,6 +3498,7 @@ Valuev Vanden Vandenbrande Vanduyfhuys +varargs varavg Varshalovich Varshney From 12f746046f67594391a846629f208691793bd4e1 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 08:45:55 -0500 Subject: [PATCH 05/11] finalize {fmt} lib info --- doc/src/Developer_cxx_vs_c_style.rst | 63 +++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 2 deletions(-) diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst index e4f04335a4..a40d231318 100644 --- a/doc/src/Developer_cxx_vs_c_style.rst +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -239,12 +239,71 @@ arguments into custom functions in a much simpler way than the varargs mechanism of the C library. Finally, {fmt} has been included in the C++20 language standard, so changes to adopt it are future-proof. -Formatted strings are most commonly created by calling the +Formatted strings are frequently created by calling the ``fmt::format()`` function which will return a string as a ``std::string`` class instance. In contrast to the ``%`` placeholder in ``printf()``, the {fmt} library uses ``{}`` to embed format descriptors. In the simplest case, no additional characters are needed as {fmt} will choose -the default format based on the data type of the argument. +the default format based on the data type of the argument. Alternatively, +the ``fmt::print()`` function may be used instead of ``printf()`` or +``fprintf()``. In addition, several LAMMPS output functions that +originally accepted a single string as argument have been overloaded to +accept a format string with optional arguments as well (e.g. 
+``Error::all()``, ``Error::one()``, ``utils::logmesg()``). + +Summary of the {fmt} format syntax +================================== + +The syntax of the format string is "{[<argument id>][:<format spec>]}", +where either the argument id or the format spec (separated by a colon +':') is optional. The argument id is usually a number starting from 0 +that is the index to the arguments following the format string. By +default these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.). The most +common case for using argument id would be to use the same argument in +multiple places in the format string without having to provide it as an +argument multiple times. In LAMMPS the argument id is rarely used. + +More common is the use of the format specifier, which starts with a +colon. This may optionally be followed by a fill character (default is +' '). If provided, the fill character **must** be followed by an +alignment character ('<', '^', '>' for left, centered, or right +alignment (default)). The alignment character may be used without a fill +character. The next important format parameter would be the minimum +width, which may be followed by a dot '.' and a precision for floating +point numbers. The final character in the format string would be an +indicator for the "presentation", i.e. 'd' for decimal presentation of +integers, 'x' for hexadecimal, 'o' for octal, 'c' for character +etc. This mostly follows the "printf()" scheme but without requiring an +additional length parameter to distinguish between different integer +widths. The {fmt} library will detect those and adapt the formatting +accordingly. For floating point numbers there are, correspondingly, 'g' +for generic presentation, 'e' for exponential presentation, and 'f' for +fixed point presentation. + +Thus "{:8}" would represent an argument of *any* type using at least 8 +characters; "{:<8}" would do this as left aligned, "{:^8}" as centered, +"{:>8}" as right aligned. 
If a specific presentation is selected, the +argument type must be compatible or else the {fmt} formatting code will +throw an exception. Some format string examples are given below: + +.. code-block:: C + + auto mesg = fmt::format(" CPU time: {:4d}:{:02d}:{:02d}\n", cpuh, cpum, cpus); + mesg = fmt::format("{:<8s}| {:<10.5g} | {:<10.5g} | {:<10.5g} |{:6.1f} |{:6.2f}\n", + label, time_min, time, time_max, time_sq, tmp); + utils::logmesg(lmp,"{:>6} = max # of 1-2 neighbors\n",maxall); + utils::logmesg(lmp,"Lattice spacing in x,y,z = {:.8} {:.8} {:.8}\n", + xlattice,ylattice,zlattice); + +A special feature of the {fmt} library is that format parameters like +the width or the precision may be also provided as arguments. In that +case a nested format is used where a pair of curly braces (with an +optional argument id) "{}" are used instead of the value, for example +"{:{}d}" will consume two integer arguments, the first will be the value +shown and the second the minimum width. + +For more details and examples, please consult the `{fmt} syntax +documentation `_ website. Memory management From 1a6b627fa0034afb00fc71279dc88569b1dcba66 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 11:54:37 -0500 Subject: [PATCH 06/11] add section about memory allocations --- doc/src/Developer_cxx_vs_c_style.rst | 72 +++++++++++++++++++++++++--- 1 file changed, 66 insertions(+), 6 deletions(-) diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst index a40d231318..993a6aa6a5 100644 --- a/doc/src/Developer_cxx_vs_c_style.rst +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -5,9 +5,11 @@ This section discusses some of the code design choices in LAMMPS and overall strategy in order to assist developers to write new code that will fit well with the remaining code. Please see the section on :doc:`Requirements for contributed code ` for more -specific recommendations and guidelines. 
Here the focus is on overall -strategy and discussion of some relevant C++ programming language -constructs. +specific recommendations and guidelines. While that section is +organized more in the form of a checklist for code contributors, the +focus here is on overall code design strategy, choices made between +possible alternatives, and discussion of some relevant C++ programming +language constructs. Historically, the basic design philosophy of the LAMMPS C++ code was that of a "C with classes" style. The was motivated by the desire to @@ -28,6 +30,17 @@ C++ standard library or as custom classes or function, in order to improve readability of the code and to increase code reuse through abstraction of commonly used functionality. +.. note:: + + Please note that as of spring 2022 there is still a sizable chunk of + legacy code in LAMMPS that has not yet been refactored to reflect these + style conventions in full. LAMMPS has a large code base and many + different contributors and there also is a hierarchy of precedence + in which the code is adapted. The code in the + ``src`` folder has the highest priority, followed by code in packages in order of their popularity + and complexity (simpler code is adapted sooner), followed by code + in the ``lib`` folder. Source code that is downloaded during compilation + is not subject to the conventions discussed here. Object oriented code ^^^^^^^^^^^^^^^^^^^^ @@ -210,9 +223,9 @@ C-style stdio versus C++ style iostreams ======================================== LAMMPS chooses to use the "stdio" library of the standard C library for -reading from and writing to files and console instead of "iostreams" that were -introduced with C++. This is mainly motivated by the better performance, -better control over formatting, and less effort to achieve specific formatting. +reading from and writing to files and console instead of C++ +"iostreams".
This is mainly motivated by the better performance, better +control over formatting, and less effort to achieve specific formatting. Since mixing "stdio" and "iostreams" can lead to unexpected behavior using the latter is strongly discouraged. Also output to the screen should not @@ -222,6 +235,11 @@ should only be done by MPI rank 0 (``comm->me == 0``) and output that is send to both ``screen`` and ``logfile`` should use the :cpp:func:`utils::logmesg() convenience function `. +We also discourage the use of stringstreams as the bundled {fmt} library +and the customized tokenizer classes can provide the same functionality +in a cleaner way with better performance. This will also help to retain +a consistent programming style despite the many different contributors. + Formatting with the {fmt} library =================================== @@ -295,6 +313,15 @@ throw an exception. Some format string examples are given below: utils::logmesg(lmp,"Lattice spacing in x,y,z = {:.8} {:.8} {:.8}\n", xlattice,ylattice,zlattice); +which will create the following output lines: + +.. parsed-literal:: + + CPU time: 0:02:16 + Pair | 2.0133 | 2.0133 | 2.0133 | 0.0 | 84.21 + 4 = max # of 1-2 neighbors + Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962 + A special feature of the {fmt} library is that format parameters like the width or the precision may be also provided as arguments. In that case a nested format is used where a pair of curly braces (with an @@ -308,3 +335,36 @@ documentation `_ website. Memory management ^^^^^^^^^^^^^^^^^ + +Dynamic allocation of data and objects should be done with either the +C++ commands "new" and "delete/delete[]" or using member functions of +the ``Memory`` class, most commonly, ``Memory::create()``, +``Memory::grow()``, and ``Memory::destroy()``. The use of ``malloc()``, +``calloc()``, ``realloc()`` and ``free()`` directly is strongly +discouraged.
To simplify adapting legacy code into the LAMMPS code base +the member functions ``Memory::smalloc()``, ``Memory::srealloc()``, and +``Memory::sfree()`` are available. + +Using those custom memory allocation functions is motivated by the +following considerations: + +- memory allocation failures on *any* MPI rank during a parallel run will trigger + an immediate abort of the entire parallel calculation instead of stalling it +- a failing "new" will trigger an exception which is also captured by LAMMPS and + triggers a global abort +- allocation of multi-dimensional arrays will be done in a C compatible fashion + but so that the storage of the actual data is stored in one large consecutive block + and thus when MPI communication is needed, only this storage needs to be + communicated (similar to Fortran arrays) +- the "destroy()" and "sfree()" functions may safely be called on NULL pointers +- the "destroy()" functions will nullify the pointer variables making + "use after free" errors easy to detect +- it is possible to use a large than default memory alignment (not on all operating + systems, since the allocated storage pointers must be compatible with ``free()`` + for technical reasons) + +In the practical implementation of code this means that any pointer variables +that are class members should be initialized to a ``nullptr`` value in their +respective constructors. That way it would be safe to call ``Memory::destroy()`` +or ``delete[]`` on them before *any* allocation outside the constructor. +This helps to prevent memory leaks. 
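The constructor convention described above can be sketched in plain C++ without any LAMMPS headers. This is only an illustration under stated assumptions: ``destroy_sketch()`` and the ``Example`` class are hypothetical names that mimic the behavior described for ``Memory::destroy()`` (safe on null pointers, pointer reset afterwards); they are not the LAMMPS implementation.

```cpp
#include <cstddef>

// Hypothetical stand-in mimicking the behavior described for
// Memory::destroy(): safe on a null pointer, resets the pointer afterwards.
template <typename T> void destroy_sketch(T *&array)
{
  delete[] array;     // delete[] on a null pointer is a no-op in C++
  array = nullptr;    // nullify so that accidental reuse is easy to detect
}

// Hypothetical class following the convention from the text:
// pointer members start out as nullptr in the constructor.
class Example {
 public:
  Example() : values(nullptr), nvalues(0) {}
  ~Example() { destroy_sketch(values); }

  // copying would lead to a double free, so it is disabled in this sketch
  Example(const Example &) = delete;
  Example &operator=(const Example &) = delete;

  // may be called repeatedly; any previous allocation is released first,
  // which is safe even before the first allocation because of the nullptr
  void allocate(std::size_t n)
  {
    destroy_sketch(values);
    values = new double[n]();    // value-initialized, i.e. all zeros
    nvalues = n;
  }

  double *values;
  std::size_t nvalues;
};
```

Because ``values`` starts as ``nullptr``, ``allocate()`` can unconditionally release the old storage first, so repeated calls do not leak memory.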
From fbf95c2cbc530b72df6c43aadf7afeb483335cd9 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 11:54:50 -0500 Subject: [PATCH 07/11] add notes about file parsing --- doc/src/Developer_notes.rst | 51 +++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+) diff --git a/doc/src/Developer_notes.rst b/doc/src/Developer_notes.rst index ab2e3826f2..23344de61b 100644 --- a/doc/src/Developer_notes.rst +++ b/doc/src/Developer_notes.rst @@ -7,6 +7,57 @@ typically document what a variable stores, what a small section of code does, or what a function does and its input/outputs. The topics on this page are intended to document code functionality at a higher level. +Reading and parsing of text and text files +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is frequently required for a class in LAMMPS to read in additional +data from a file, most commonly potential parameters from a potential +file for manybody potentials. LAMMPS provides several custom classes +and convenience functions to simplify the process. This offers the +following benefits: + +- better code reuse and fewer lines of code needed to implement reading + and parsing data from a file +- better detection of format errors, incompatible data, and better error messages +- exit with an error message instead of silently converting only part of the + text to a number or returning a 0 on unrecognized text and thus reading incorrect values +- re-entrant code through avoiding global static variables (as used by ``strtok()``) +- transparent support for translating unsupported UTF-8 characters to their ASCII equivalents + (the text to value conversion functions **only** accept ASCII characters) + +In most cases (e.g. potential files) the same data is needed on all MPI +ranks. Then it is best to do the reading and parsing only on MPI rank +0, and communicate the data later with one or more ``MPI_Bcast()`` +calls. 
For reading generic text and potential parameter files the +custom classes :cpp:class:`TextFileReader ` +and :cpp:class:`PotentialFileReader ` +are available. Those classes allow reading the file as individual lines +for which they can return a tokenizer class (see below) for parsing the +line, or they can return blocks of numbers as a vector directly. The +documentation on `File reader classes `_ contains +an example for a typical case. + +When reading per-atom data, the data in the file usually needs to include +an atom ID so it can be associated with a particular atom. In that case +the data can be read in multi-line chunks and broadcast to all MPI ranks +with :cpp:func:`utils::read_lines_from_file() +`. Those chunks are then +split into lines, parsed, and applied only to atoms the MPI rank +"owns". + +For splitting a string (incrementally) into words and optionally +converting those to numbers, the :cpp:class:`Tokenizer +` and :cpp:class:`ValueTokenizer +` can be used. Those provide a superset +of the functionality of ``strtok()`` from the C-library and the latter +also includes conversion to different types. Any errors while processing +the string in those classes will result in an exception, which can +be caught and the error processed as needed. Unlike C-library functions +like ``atoi()``, ``atof()``, ``strtol()``, or ``strtod()`` the +conversion to numbers first checks of the string is a valid number +and thus will not silently return an unexpected or incorrect value.
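The difference between a silent partial conversion and a checked conversion can be illustrated with the C++ standard library alone. The following is a minimal sketch, not the ``ValueTokenizer`` implementation: ``lenient_int()`` and ``strict_int()`` are hypothetical helper names, and ``std::invalid_argument`` merely stands in for the exception classes used by LAMMPS.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// C library behavior for comparison: atoi() stops at the first character
// that is not part of an integer and returns what it has converted so far.
int lenient_int(const std::string &word)
{
  return std::atoi(word.c_str());
}

// Hypothetical helper (not part of LAMMPS): convert a word to an int and
// throw unless the *entire* text is a valid integer.
int strict_int(const std::string &word)
{
  std::size_t pos = 0;
  int value = 0;
  try {
    value = std::stoi(word, &pos);    // pos = number of characters consumed
  } catch (const std::exception &) {
    pos = std::string::npos;          // nothing could be converted at all
  }
  if (pos != word.size())
    throw std::invalid_argument("Not a valid integer number: '" + word + "'");
  return value;
}
```

With these helpers, ``lenient_int("12.5")`` quietly yields a truncated value, while ``strict_int("12.5")`` throws, so a typo in an input file is reported instead of being turned into a wrong number.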
+ + Fix contributions to instantaneous energy, virial, and cumulative energy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ From 1a436c5aa9fd550d1c8bef281e6accc8efab12f8 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 11:55:04 -0500 Subject: [PATCH 08/11] fix some broken links --- doc/src/Developer_utils.rst | 63 ++++++++++++++++++++----------------- 1 file changed, 35 insertions(+), 28 deletions(-) diff --git a/doc/src/Developer_utils.rst b/doc/src/Developer_utils.rst index 7172f81eb7..39ac9c716b 100644 --- a/doc/src/Developer_utils.rst +++ b/doc/src/Developer_utils.rst @@ -21,18 +21,21 @@ In that case, the functions will stop with an error message, indicating the name of the problematic file, if possible unless the *error* argument is a NULL pointer. -The :cpp:func:`fgets_trunc` function will work similar for ``fgets()`` -but it will read in a whole line (i.e. until the end of line or end -of file), but store only as many characters as will fit into the buffer -including a final newline character and the terminating NULL byte. -If the line in the file is longer it will thus be truncated in the buffer. -This function is used by :cpp:func:`read_lines_from_file` to read individual -lines but make certain they follow the size constraints. +The :cpp:func:`utils::fgets_trunc() ` +function will work similar for ``fgets()`` but it will read in a whole +line (i.e. until the end of line or end of file), but store only as many +characters as will fit into the buffer including a final newline +character and the terminating NULL byte. If the line in the file is +longer it will thus be truncated in the buffer. This function is used +by :cpp:func:`utils::read_lines_from_file() +` to read individual lines but +make certain they follow the size constraints. -The :cpp:func:`read_lines_from_file` function will read the requested -number of lines of a maximum length into a buffer and will return 0 -if successful or 1 if not. 
It also guarantees that all lines are -terminated with a newline character and the entire buffer with a +The :cpp:func:`utils::read_lines_from_file() +` function will read the +requested number of lines of a maximum length into a buffer and will +return 0 if successful or 1 if not. It also guarantees that all lines +are terminated with a newline character and the entire buffer with a NULL character. ---------- @@ -62,7 +65,7 @@ silently returning the result of a partial conversion or zero in cases where the string is not a valid number. This behavior improves detecting typos or issues when processing input files. -Similarly the :cpp:func:`logical() ` function +Similarly the :cpp:func:`utils::logical() ` function will convert a string into a boolean and will only accept certain words. The *do_abort* flag should be set to ``true`` in case this function @@ -70,8 +73,8 @@ is called only on a single MPI rank, as that will then trigger the a call to ``Error::one()`` for errors instead of ``Error::all()`` and avoids a "hanging" calculation when run in parallel. -Please also see :cpp:func:`is_integer() ` -and :cpp:func:`is_double() ` for testing +Please also see :cpp:func:`utils::is_integer() ` +and :cpp:func:`utils::is_double() ` for testing strings for compliance without conversion. ---------- @@ -393,21 +396,26 @@ A typical code segment would look like this: ---------- +.. file-reader-classes: + File reader classes ------------------- The purpose of the file reader classes is to simplify the recurring task of reading and parsing files. They can use the -:cpp:class:`LAMMPS_NS::ValueTokenizer` class to process the read in -text. The :cpp:class:`LAMMPS_NS::TextFileReader` is a more general -version while :cpp:class:`LAMMPS_NS::PotentialFileReader` is specialized -to implement the behavior expected for looking up and reading/parsing -files with potential parameters in LAMMPS. 
The potential file reader -class requires a LAMMPS instance, requires to be run on MPI rank 0 only, -will use the :cpp:func:`LAMMPS_NS::utils::get_potential_file_path` -function to look up and open the file, and will call the -:cpp:class:`LAMMPS_NS::Error` class in case of failures to read or to -convert numbers, so that LAMMPS will be aborted. +:cpp:class:`ValueTokenizer ` class to process +the read in text. The :cpp:class:`TextFileReader +` is a more general version while +:cpp:class:`PotentialFileReader ` is +specialized to implement the behavior expected for looking up and +reading/parsing files with potential parameters in LAMMPS. The +potential file reader class requires a LAMMPS instance, requires to be +run on MPI rank 0 only, will use the +:cpp:func:`utils::get_potential_file_path +` function to look up and +open the file, and will call the :cpp:class:`LAMMPS_NS::Error` class in +case of failures to read or to convert numbers, so that LAMMPS will be +aborted. .. code-block:: C++ :caption: Use of PotentialFileReader class in pair style coul/streitz @@ -482,10 +490,10 @@ provided, as that is used to determine whether a new page of memory must be used. The :cpp:class:`MyPage ` class offers two ways to -reserve a chunk: 1) with :cpp:func:`get() ` the -chunk size needs to be known in advance, 2) with :cpp:func:`vget() +reserve a chunk: 1) with :cpp:func:`MyPage::get() ` the +chunk size needs to be known in advance, 2) with :cpp:func:`MyPage::vget() ` a pointer to the next chunk is returned, but -its size is registered later with :cpp:func:`vgot() +its size is registered later with :cpp:func:`MyPage::vgot() `. .. code-block:: C++ @@ -588,4 +596,3 @@ the communication buffers. .. 
doxygenunion:: LAMMPS_NS::ubuf :project: progguide - From 37cd4ba2ea14e194ea74fe2706f176abe0ce2855 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 11:55:09 -0500 Subject: [PATCH 09/11] spelling --- doc/utils/sphinx-config/false_positives.txt | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index 944f8409da..dfff998cf5 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -3015,6 +3015,7 @@ Setmask setpoint setvel sfftw +sfree Sg Shan Shanno @@ -3174,6 +3175,7 @@ Streiz strerror strided strietz +stringstreams strmatch strncmp strstr From f84790ba623be9256754ad85342826ba8736aca1 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 15:50:36 -0500 Subject: [PATCH 10/11] add a more specific example to explain how values are rejected and how atoi() fails --- doc/src/Developer_notes.rst | 22 +++++++++++++--------- doc/src/Developer_utils.rst | 4 ++-- src/tokenizer.h | 10 ++++++++++ 3 files changed, 25 insertions(+), 11 deletions(-) diff --git a/doc/src/Developer_notes.rst b/doc/src/Developer_notes.rst index 23344de61b..a15354bb9a 100644 --- a/doc/src/Developer_notes.rst +++ b/doc/src/Developer_notes.rst @@ -48,15 +48,19 @@ split into lines, parsed, and applied only to atoms the MPI rank For splitting a string (incrementally) into words and optionally converting those to numbers, the :cpp:class:`Tokenizer ` and :cpp:class:`ValueTokenizer -` can be used. Those provide a superset -of the functionality of ``strtok()`` from the C-library and the latter -also includes conversion to different types. Any errors while processing -the string in those classes will result in an exception, which can -be caught and the error processed as needed. 
Unlike C-library functions -like ``atoi()``, ``atof()``, ``strtol()``, or ``strtod()`` the -conversion to numbers first checks of the string is a valid number -and thus will not silently return an unexpected or incorrect value. - +` can be used. Those provide a superset of +the functionality of ``strtok()`` from the C-library and the latter also +includes conversion to different types. Any errors while processing the +string in those classes will result in an exception, which can be caught +and the error processed as needed. Unlike the C-library functions +``atoi()``, ``atof()``, ``strtol()``, or ``strtod()`` the conversion +will check if the converted text is a valid integer or floating point +number and will not silently return an unexpected or incorrect value. +For example, ``atoi()`` will return 12 when converting "12.5" while the +ValueTokenizer class will throw an :cpp:class:`InvalidIntegerException +` if +:cpp:func:`ValueTokenizer::next_int() +` is called on the same string. Fix contributions to instantaneous energy, virial, and cumulative energy ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/doc/src/Developer_utils.rst b/doc/src/Developer_utils.rst index 39ac9c716b..a9df85c899 100644 --- a/doc/src/Developer_utils.rst +++ b/doc/src/Developer_utils.rst @@ -343,11 +343,11 @@ This code example should produce the following output: .. doxygenclass:: LAMMPS_NS::InvalidIntegerException :project: progguide - :members: what + :members: ..
doxygenclass:: LAMMPS_NS::InvalidFloatException :project: progguide - :members: what + :members: ---------- diff --git a/src/tokenizer.h b/src/tokenizer.h index 03afa59836..b267e89b23 100644 --- a/src/tokenizer.h +++ b/src/tokenizer.h @@ -52,10 +52,15 @@ class Tokenizer { std::vector as_vector(); }; +/** General Tokenizer exception class */ + class TokenizerException : public std::exception { std::string message; public: + // remove unused default constructor + TokenizerException() = delete; + /** Thrown during retrieving or skipping tokens * * \param msg String with error message @@ -67,7 +72,10 @@ class TokenizerException : public std::exception { const char *what() const noexcept override { return message.c_str(); } }; +/** Exception thrown by ValueTokenizer when trying to convert an invalid integer string */ + class InvalidIntegerException : public TokenizerException { + public: /** Thrown during converting string to integer number * @@ -78,6 +86,8 @@ class InvalidIntegerException : public TokenizerException { } }; +/** Exception thrown by ValueTokenizer when trying to convert an invalid floating point string */ + class InvalidFloatException : public TokenizerException { public: /** Thrown during converting string to floating point number From baf443766a0af0ae6dbcba5b4f2500e0d8c4db2c Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Mon, 14 Feb 2022 16:09:52 -0500 Subject: [PATCH 11/11] fix a few typos or mistyped words and explain some details better --- doc/src/Developer_cxx_vs_c_style.rst | 78 ++++++++++++++++------------ 1 file changed, 46 insertions(+), 32 deletions(-) diff --git a/doc/src/Developer_cxx_vs_c_style.rst b/doc/src/Developer_cxx_vs_c_style.rst index 993a6aa6a5..438e57abc7 100644 --- a/doc/src/Developer_cxx_vs_c_style.rst +++ b/doc/src/Developer_cxx_vs_c_style.rst @@ -77,22 +77,24 @@ LAMMPS makes extensive use of the object oriented programming (OOP) principles of *compositing* and *inheritance*.
Classes like the ``LAMMPS`` class are a **composite** containing pointers to instances of other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``, -``Modify``, and so on. Each of these classes implement certain +``Modify``, and so on. Each of these classes implement certain functionality by storing and manipulating data related to the simulation and providing member functions that trigger certain actions. Some of -those classes like ``Force`` and a composite again containing instances +those classes like ``Force`` are a composite again containing instances of classes describing the force interactions or ``Modify`` containing -and calling fixes and computes. In most cases there is only one instance -of those member classes allowed, but in a few cases there can also be -multiple instances and the parent class is maintaining a list of the -pointers of instantiated classes. +and calling fixes and computes. In most cases (e.g. ``AtomVec``, ``Comm``, +``Pair``, or ``Bond``) there is only one instance of those member classes +allowed, but in a few cases (e.g. ``Region``, ``Fix``, ``Compute``, or +``Dump``) there can be multiple instances and the parent class is +maintaining a list of the pointers of instantiated classes instead +of a single pointer. -Changing behavior or adjusting how LAMMPS handles the simulation is +Changing behavior or adjusting how LAMMPS handles a simulation is implemented via **inheritance** where different variants of the functionality are realized by creating *derived* classes that can share common functionality in their base class and provide a consistent interface where the derived classes replace (dummy or pure) functions in -the base class. The higher level classes can then call those methods of +the base class. The higher level classes can then call those methods of the instantiated classes without having to know which specific derived class variant was instantiated. 
In the LAMMPS documentation those derived classes are usually referred to a "styles", e.g. pair styles, @@ -108,6 +110,15 @@ classes. Whenever a new :doc:`pair_style` or :doc:`bond_style` or any existing class instance is deleted and a new instance created in it place. +Classes derived from ``Fix`` or ``Compute`` represent a different facet +of LAMMPS' flexibility as there can be multiple instances of them and +their member functions will be called at different phases of the time +integration process (as explained in `Developer_flow`). This way +multiple manipulations of the entire system or parts of it can be +programmed (with fix styles) or different computations can be performed +and accessed and further processed or output through a common interface +(with compute styles). + Further code sharing is possible by creating derived classes from the derived classes (for instance to implement an accelerated version of a pair style) where then only a subset of the methods are replaced with @@ -179,19 +190,20 @@ are virtual functions that are initialized to 0 in the class declaration virtual void pure() = 0; }; -This has the effect that it will no longer be possible to create an instance -of the base class and that derived classes **must** implement these classes. -Many of the functions listed with the various styles in the section :doc:`Modify` -are such pure functions. The motivation for this is to define the interface -or API of functions but defer the implementation of those functionality to -the derived classes. +This has the effect that it will no longer be possible to create an +instance of the base class and that derived classes **must** implement +these functions. Many of the functions listed with the various class +styles in the section :doc:`Modify` are such pure functions. The +motivation for this is to define the interface or API of the functions +but defer the implementation to the derived classes. -However, there are downsides to this.
For example, calls to virtual functions -from within a constructor, will not be in the scope of the derived class and thus -it is good practice to either avoid calling them or to provide an explicit scope like -in ``Base::poly()``. Furthermore, any destructors in classes containing -virtual functions should be declared virtual, too, so they are processed -in the expected order before types are removed from dynamic dispatch. +However, there are downsides to this. For example, calls to virtual +functions from within a constructor, will not be in the scope of the +derived class and thus it is good practice to either avoid calling them +or to provide an explicit scope like in ``Base::poly()``. Furthermore, +any destructors in classes containing virtual functions should be +declared virtual, too, so they are processed in the expected order +before types are removed from dynamic dispatch. .. admonition:: Important Notes @@ -348,20 +360,22 @@ the member functions ``Memory::smalloc()``, ``Memory::srealloc()``, and Using those custom memory allocation functions is motivated by the following considerations: -- memory allocation failures on *any* MPI rank during a parallel run will trigger - an immediate abort of the entire parallel calculation instead of stalling it -- a failing "new" will trigger an exception which is also captured by LAMMPS and - triggers a global abort -- allocation of multi-dimensional arrays will be done in a C compatible fashion - but so that the storage of the actual data is stored in one large consecutive block - and thus when MPI communication is needed, only this storage needs to be - communicated (similar to Fortran arrays) -- the "destroy()" and "sfree()" functions may safely be called on NULL pointers +- memory allocation failures on *any* MPI rank during a parallel run + will trigger an immediate abort of the entire parallel calculation + instead of stalling it +- a failing "new" will trigger an exception which is also captured by + LAMMPS and 
triggers a global abort +- allocation of multi-dimensional arrays will be done in a C compatible + fashion but so that the storage of the actual data is stored in one + large consecutive block and thus when MPI communication is needed, + only this storage needs to be communicated (similar to Fortran arrays) +- the "destroy()" and "sfree()" functions may safely be called on NULL + pointers - the "destroy()" functions will nullify the pointer variables making "use after free" errors easy to detect -- it is possible to use a large than default memory alignment (not on all operating - systems, since the allocated storage pointers must be compatible with ``free()`` - for technical reasons) +- it is possible to use a larger than default memory alignment (not on + all operating systems, since the allocated storage pointers must be + compatible with ``free()`` for technical reasons) In the practical implementation of code this means that any pointer variables that are class members should be initialized to a ``nullptr`` value in their