Merge pull request #3141 from lammps/developer-doc-tweaks

tweaks to 2 sub-sections of Developer manual
2022-02-18 21:48:29 -05:00
parent 01bbb28418 d88ffcf880
commit 86cb37f7ad
4 changed files with 465 additions and 413 deletions
--- a/doc/src/Developer.rst
+++ b/doc/src/Developer.rst
@ -11,7 +11,7 @@ of time and requests from the LAMMPS user community.
   :maxdepth: 1

   Developer_org
-   Developer_cxx_vs_c_style
+   Developer_code_design
   Developer_parallel
   Developer_flow
   Developer_write
--- a/doc/src/Developer_code_design.rst
+++ b/doc/src/Developer_code_design.rst
@ -0,0 +1,433 @@
+Code design
+-----------
+
+This section explains some of the code design choices in LAMMPS with
+the goal of helping developers write new code similar to the existing
+code.  Please see the section on :doc:`Requirements for contributed
+code <Modify_style>` for more specific recommendations and guidelines.
+While that section is organized more in the form of a checklist for
+code contributors, the focus here is on overall code design strategy,
+choices made between possible alternatives, and discussing some
+relevant C++ programming language constructs.
+
+Historically, the basic design philosophy of the LAMMPS C++ code was a
+"C with classes" style.  The motivation was to make it easy to modify
+LAMMPS for people without significant training in C++ programming.
+Data structures and code constructs were used that resemble the
+previous implementation(s) in Fortran.  A contributing factor to this
+choice also was that at the time, C++ compilers were often not mature
+and some of the advanced features contained bugs or did not function
+as the standard required.  There were also disagreements between
+compiler vendors as to how to interpret the C++ standard documents.
+
+However, C++ compilers have now advanced significantly.  In 2020 we
+decided to to require the C++11 standard as the minimum C++ language
+standard for LAMMPS.  Since then we have begun to also replace some of
+the C-style constructs with equivalent C++ functionality, either from
+the C++ standard library or as custom classes or functions, in order
+to improve readability of the code and to increase code reuse through
+abstraction of commonly used functionality.
+
+.. note::
+
+   Please note that as of spring 2022 there is still a sizable chunk
+   of legacy code in LAMMPS that has not yet been refactored to
+   reflect these style conventions in full.  LAMMPS has a large code
+   base and many different contributors and there also is a hierarchy
+   of precedence in which the code is adapted.  Highest priority has
+   been the code in the ``src`` folder, followed by code in packages
+   in order of their popularity and complexity (simpler code is
+   adapted sooner), followed by code in the ``lib`` folder.  Source
+   code that is downloaded from external packages or libraries during
+   compilation is not subject to the conventions discussed here.
+
+Object oriented code
+^^^^^^^^^^^^^^^^^^^^
+
+LAMMPS is designed to be an object oriented code.  Each simulation is
+represented by an instance of the LAMMPS class.  When running in
+parallel each MPI process creates such an instance.  This can be seen
+in the ``main.cpp`` file where the core steps of running a LAMMPS
+simulation are the following 3 lines of code:
+
+.. code-block:: C++
+
+    LAMMPS *lammps = new LAMMPS(argc, argv, lammps_comm);
+    lammps->input->file();
+    delete lammps;
+
+The first line creates a LAMMPS class instance and passes the command
+line arguments and the global communicator to its constructor.  The
+second line triggers the LAMMPS instance to process the input (either
+from standard input or a provided input file) until the simulation
+ends.  The third line deletes the LAMMPS instance.  The remainder of
+the main.cpp file has code for error handling, MPI configuration, and
+other special features.
+
+The basic LAMMPS class hierarchy which is created by the LAMMPS class
+constructor is shown in :ref:`class-topology`.  When input commands
+are processed, additional class instances are created, or deleted, or
+replaced.  Likewise specific member functions of specific classes are
+called to trigger actions such creating atoms, computing forces,
+computing properties, time-propagating the system, or writing output.
+
+Compositing and Inheritance
+===========================
+
+LAMMPS makes extensive use of the object oriented programming (OOP)
+principles of *compositing* and *inheritance*. Classes like the
+``LAMMPS`` class are a **composite** containing pointers to instances
+of other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``,
+``Modify``, and so on.  Each of these classes implement certain
+functionality by storing and manipulating data related to the
+simulation and providing member functions that trigger certain
+actions.  Some of those classes like ``Force`` are themselves
+composites, containing instances of classes describing different force
+interactions.  Similarly the ``Modify`` class contains a list of
+``Fix`` and ``Compute`` classes.  If the input commands that
+correspond to these classes include the word *style*, then LAMMPS
+stores only a single instance of that class.  E.g. *atom_style*,
+*comm_style*, *pair_style*, *bond_style*.  It the input command does
+not include the word *style*, there can be many instances of that
+class defined.  E.g. *region*, *fix*, *compute*, *dump*.
+
+**Inheritance** enables creation of *derived* classes that can share
+common functionality in their base class while providing a consistent
+interface.  The derived classes replace (dummy or pure) functions in
+the base class.  The higher level classes can then call those methods
+of the instantiated classes without having to know which specific
+derived class variant was instantiated.  In LAMMPS these derived
+classes are often referred to as "styles", e.g.  pair styles, fix
+styles, atom styles and so on.
+
+This is the origin of the flexibility of LAMMPS.  For example pair
+styles implement a variety of different non-bonded interatomic
+potentials functions.  All details for the implementation of a
+potential are stored and executed in a single class.
+
+As mentioned above, there can be multiple instances of classes derived
+from the ``Fix`` or ``Compute`` base classes.  They represent a
+different facet of LAMMPS flexibility as they provide methods which
+can be called at different points in time within a timestep, as
+explained in `Developer_flow`.  This allows the input script to tailor
+how a specific simulation is run, what diagnostic computations are
+performed, and how the output of those computations is further
+processed or output.
+
+Additional code sharing is possible by creating derived classes from the
+derived classes (e.g., to implement an accelerated version of a pair
+style) where only a subset of the derived class methods are replaced
+with accelerated versions.
+
+Polymorphism
+============
+
+Polymorphism and dynamic dispatch are another OOP feature that play an
+important role in how LAMMPS selects what code to execute.  In a
+nutshell, this is a mechanism where the decision of which member
+function to call from a class is determined at runtime and not when
+the code is compiled.  To enable it, the function has to be declared
+as ``virtual`` and all corresponding functions in derived classes
+should use the ``override`` property. Below is a brief example.
+
+.. code-block:: c++
+
+   class Base {
+   public:
+    virtual ~Base() = default;
+    void call();
+    void normal();
+    virtual void poly();
+   };
+
+   void Base::call() {
+    normal();
+    poly();
+   }
+
+   class Derived : public Base {
+   public:
+    ~Derived() override = default;
+    void normal();
+    void poly() override;
+   };
+
+   // [....]
+
+   Base *base1 = new Base();
+   Base *base2 = new Derived();
+
+   base1->call();
+   base2->call();
+
+The difference in behavior of the ``normal()`` and the ``poly()`` member
+functions is which of the two member functions is called when executing
+`base1->call()` versus `base2->call()`.  Without polymorphism, a
+function within the base class can only call member functions within the
+same scope, that is ``Base::call()`` will always call
+``Base::normal()``.  But for the `base2->call()` case the call of the
+virtual member function will be dispatched to ``Derived::poly()``
+instead.  This mechanism means that functions are called within the
+scope of the class type that was used to *create* the class instance are
+invoked; even if they are assigned to a pointer using the type of a base
+class.  This is the desired behavior and this way LAMMPS can even use
+styles that are loaded at runtime from a shared object file with the
+:doc:`plugin command <plugin>`.
+
+A special case of virtual functions are so-called pure functions.  These
+are virtual functions that are initialized to 0 in the class declaration
+(see example below).
+
+.. code-block:: c++
+
+   class Base {
+   public:
+    virtual void pure() = 0;
+   };
+
+This has the effect that an instance of the base class cannot be
+created and that derived classes **must** implement these functions.
+Many of the functions listed with the various class styles in the
+section :doc:`Modify` are pure functions.  The motivation for this is
+to define the interface or API of the functions but defer their
+implementation to the derived classes.
+
+However, there are downsides to this. For example, calls to virtual
+functions from within a constructor, will not be in the scope of the
+derived class and thus it is good practice to either avoid calling them
+or to provide an explicit scope such as ``Base::poly()`` or
+``Derived::poly()``.  Furthermore, any destructors in classes containing
+virtual functions should be declared virtual too, so they will be
+processed in the expected order before types are removed from dynamic
+dispatch.
+
+.. admonition:: Important Notes
+
+   In order to be able to detect incompatibilities at compile time and
+   to avoid unexpected behavior, it is crucial that all member functions
+   that are intended to replace a virtual or pure function use the
+   ``override`` property keyword.  For the same reason, the use of
+   overloads or default arguments for virtual functions should be
+   avoided as they lead to confusion over which function is supposed to
+   override which and which arguments need to be declared.
+
+Style Factories
+===============
+
+In order to create class instances for different styles, LAMMPS often
+uses a programming pattern called `Factory`.  Those are functions that
+create an instance of a specific derived class, say ``PairLJCut`` and
+return a pointer to the type of the common base class of that style,
+``Pair`` in this case.  To associate the factory function with the
+style keyword, an ``std::map`` class is used with function pointers
+indexed by their keyword (for example "lj/cut" for ``PairLJCut`` and
+"morse" for ``PairMorse``).  A couple of typedefs help keep the code
+readable and a template function is used to implement the actual
+factory functions for the individual classes.  Below is an example
+of such a factory function from the ``Force`` class as declared in
+``force.h`` and implemented in ``force.cpp``.  The file ``style_pair.h``
+is generated during compilation and includes all main header files
+(i.e. those starting with ``pair_``) of pair styles and then the
+macro ``PairStyle()`` will associate the style name "lj/cut"
+with a factory function creating an instance of the ``PairLJCut``
+class.
+
+.. code-block:: C++
+
+   // from force.h
+   typedef Pair *(*PairCreator)(LAMMPS *);
+   typedef std::map<std::string, PairCreator> PairCreatorMap;
+   PairCreatorMap *pair_map;
+
+   // from force.cpp
+   template <typename S, typename T> static S *style_creator(LAMMPS *lmp)
+   {
+     return new T(lmp);
+   }
+
+   // [...]
+
+   pair_map = new PairCreatorMap();
+
+   #define PAIR_CLASS
+   #define PairStyle(key, Class) (*pair_map)[#key] = &style_creator<Pair, Class>;
+   #include "style_pair.h"
+   #undef PairStyle
+   #undef PAIR_CLASS
+
+   // from pair_lj_cut.h
+
+   #ifdef PAIR_CLASS
+   PairStyle(lj/cut,PairLJCut);
+   #else
+   // [...]
+
+Similar code constructs are present in other files like ``modify.cpp`` and
+``modify.h`` or ``neighbor.cpp`` and ``neighbor.h``.  Those contain
+similar macros and include ``style_*.h`` files for creating class instances
+of styles they manage.
+
+
+I/O and output formatting
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+C-style stdio versus C++ style iostreams
+========================================
+
+LAMMPS uses the "stdio" library of the standard C library for reading
+from and writing to files and console instead of C++ "iostreams".
+This is mainly motivated by better performance, better control over
+formatting, and less effort to achieve specific formatting.
+
+Since mixing "stdio" and "iostreams" can lead to unexpected
+behavior. use of the latter is strongly discouraged.  Also output to
+the screen should not use the predefined ``stdout`` FILE pointer, but
+rather the ``screen`` and ``logfile`` FILE pointers managed by the
+LAMMPS class.  Furthermore, output should generally only be done by
+MPI rank 0 (``comm->me == 0``).  Output that is sent to both
+``screen`` and ``logfile`` should use the :cpp:func:`utils::logmesg()
+convenience function <LAMMPS_NS::utils::logmesg>`.
+
+We also discourage the use of stringstreams because the bundled {fmt}
+library and the customized tokenizer classes can provide the same
+functionality in a cleaner way with better performance.  This also
+helps maintain a consistent programming syntax with code from many
+different contributors.
+
+Formatting with the {fmt} library
+===================================
+
+The LAMMPS source code includes a copy of the `{fmt} library
+<https://fmt.dev>`_ which is preferred over formatting with the
+"printf()" family of functions.  The primary reason is that it allows
+a typesafe default format for any type of supported data.  This is
+particularly useful for formatting integers of a given size (32-bit or
+64-bit) which may require different format strings depending on
+compile time settings or compilers/operating systems.  Furthermore,
+{fmt} gives better performance, has more functionality, a familiar
+formatting syntax that has similarities to ``format()`` in Python, and
+provides a facility that can be used to integrate format strings and a
+variable number of arguments into custom functions in a much simpler
+way than the varargs mechanism of the C library.  Finally, {fmt} has
+been included into the C++20 language standard, so changes to adopt it
+are future-proof.
+
+Formatted strings are frequently created by calling the
+``fmt::format()`` function which will return a string as a
+``std::string`` class instance.  In contrast to the ``%`` placeholder
+in ``printf()``, the {fmt} library uses ``{}`` to embed format
+descriptors.  In the simplest case, no additional characters are
+needed as {fmt} will choose the default format based on the data type
+of the argument.  Otherwise the ``fmt::print()`` function may be
+used instead of ``printf()`` or ``fprintf()``.  In addition, several
+LAMMPS output functions, that originally accepted a single string as
+argument have been overloaded to accept a format string with optional
+arguments as well (e.g., ``Error::all()``, ``Error::one()``,
+``utils::logmesg()``).
+
+Summary of the {fmt} format syntax
+==================================
+
+The syntax of the format string is "{[<argument id>][:<format spec>]}",
+where either the argument id or the format spec (separated by a colon
+':') is optional.  The argument id is usually a number starting from 0
+that is the index to the arguments following the format string.  By
+default these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.).  The most
+common case for using argument id would be to use the same argument in
+multiple places in the format string without having to provide it as an
+argument multiple times. In LAMMPS the argument id is rarely used.
+
+More common is the use of a format specifier, which starts with a colon.
+This may optionally be followed by a fill character (default is ' '). If
+provided, the fill character **must** be followed by an alignment
+character ('<', '^', '>' for left, centered, or right alignment
+(default)).  The alignment character may be used without a fill
+character.  The next important format parameter would be the minimum
+width, which may be followed by a dot '.' and a precision for floating
+point numbers.  The final character in the format string would be an
+indicator for the "presentation", i.e. 'd' for decimal presentation of
+integers, 'x' for hexadecimal, 'o' for octal, 'c' for character etc.
+This mostly follows the "printf()" scheme but without requiring an
+additional length parameter to distinguish between different integer
+widths.  The {fmt} library will detect those and adapt the formatting
+accordingly.  For floating point numbers there are correspondingly, 'g'
+for generic presentation, 'e' for exponential presentation, and 'f' for
+fixed point presentation.
+
+Thus "{:8}" would represent *any* type argument using at least 8
+characters; "{:<8}" would do this as left aligned, "{:^8}" as centered,
+"{:>8}" as right aligned.  If a specific presentation is selected, the
+argument type must be compatible or else the {fmt} formatting code will
+throw an exception. Some format string examples are given below:
+
+.. code-block:: C
+
+   auto mesg = fmt::format("  CPU time: {:4d}:{:02d}:{:02d}\n", cpuh, cpum, cpus);
+   mesg = fmt::format("{:<8s}| {:<10.5g} | {:<10.5g} | {:<10.5g} |{:6.1f} |{:6.2f}\n",
+                      label, time_min, time, time_max, time_sq, tmp);
+   utils::logmesg(lmp,"{:>6} = max # of 1-2 neighbors\n",maxall);
+   utils::logmesg(lmp,"Lattice spacing in x,y,z = {:.8} {:.8} {:.8}\n",
+                  xlattice,ylattice,zlattice);
+
+which will create the following output lines:
+
+.. parsed-literal::
+
+     CPU time:    0:02:16
+     Pair    | 2.0133     | 2.0133     | 2.0133     |   0.0 | 84.21
+          4 = max # of 1-2 neighbors
+     Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
+
+Finally, a special feature of the {fmt} library is that format
+parameters like the width or the precision may be also provided as
+arguments. In that case a nested format is used where a pair of curly
+braces (with an optional argument id) "{}" are used instead of the
+value, for example "{:{}d}" will consume two integer arguments, the
+first will be the value shown and the second the minimum width.
+
+For more details and examples, please consult the `{fmt} syntax
+documentation <https://fmt.dev/latest/syntax.html>`_ website.
+
+
+Memory management
+^^^^^^^^^^^^^^^^^
+
+Dynamical allocation of small data and objects can be done with the
+the C++ commands "new" and "delete/delete[].  Large data should use
+the member functions of the ``Memory`` class, most commonly,
+``Memory::create()``, ``Memory::grow()``, and ``Memory::destroy()``,
+which provide variants for vectors, 2d arrays, 3d arrays, etc.
+These can also be used for small data.
+
+The use of ``malloc()``, ``calloc()``, ``realloc()`` and ``free()``
+directly is strongly discouraged.  To simplify adapting legacy code
+into the LAMMPS code base the member functions ``Memory::smalloc()``,
+``Memory::srealloc()``, and ``Memory::sfree()`` are available, which
+perform additional error checks for safety.
+
+Use of these custom memory allocation functions is motivated by the
+following considerations:
+
+- memory allocation failures on *any* MPI rank during a parallel run
+  will trigger an immediate abort of the entire parallel calculation
+  instead of stalling it
+- a failing "new" will trigger an exception which is also captured by
+  LAMMPS and triggers a global abort
+- allocation of multi-dimensional arrays will be done in a C compatible
+  fashion but so that the storage of the actual data is stored in one
+  large contiguous block.  Thus when MPI communication is needed,
+  the data can be communicated directly (similar to Fortran arrays).
+- the "destroy()" and "sfree()" functions may safely be called on NULL
+  pointers
+- the "destroy()" functions will nullify the pointer variables making
+  "use after free" errors easy to detect
+- it is possible to use a larger than default memory alignment (not on
+  all operating systems, since the allocated storage pointers must be
+  compatible with ``free()`` for technical reasons)
+
+In the practical implementation of code this means that any pointer
+variables that are class members should be initialized to a
+``nullptr`` value in their respective constructors.  That way it is
+safe to call ``Memory::destroy()`` or ``delete[]`` on them before
+*any* allocation outside the constructor.  This helps prevent memory
+leaks.
--- a/doc/src/Developer_cxx_vs_c_style.rst
+++ b/doc/src/Developer_cxx_vs_c_style.rst
@ -1,384 +0,0 @@
-Code design
-----------
-
-This section discusses some of the code design choices in LAMMPS and
-overall strategy in order to assist developers to write new code that
-will fit well with the remaining code.  Please see the section on
-:doc:`Requirements for contributed code <Modify_style>` for more
-specific recommendations and guidelines.  While that section is
-organized more in the form of a checklist for code contributors, the
-focus here is on overall code design strategy, choices made between
-possible alternatives, and to discuss of some relevant C++ programming
-language constructs.
-
-Historically, the basic design philosophy of the LAMMPS C++ code was
-that of a "C with classes" style.  The was motivated by the desire to
-make it easier to modify LAMMPS for people without significant training
-in C++ programming and by trying to use data structures and code constructs
-that somewhat resemble the previous implementation(s) in Fortran.
-A contributing factor for this choice also was that at the time the
-implementation of C++ compilers was not always very mature and some of
-the advanced features contained bugs or were not functioning exactly
-as the standard required; plus there was some disagreement between
-compiler vendors about how to interpret the C++ standard documents.
-
-However, C++ compilers have advanced a lot since then and with the
-transition to requiring the C++11 standard in 2020 as the minimum C++ language
-standard for LAMMPS, the decision was made to also replace some of the
-C-style constructs with equivalent C++ functionality, either from the
-C++ standard library or as custom classes or function, in order to
-improve readability of the code and to increase code reuse through
-abstraction of commonly used functionality.
-
-.. note::
-
-   Please note that as of spring 2022 there is still a sizable chunk of
-   legacy code in LAMMPS that has not yet been refactored to reflect these
-   style conventions in full.  LAMMPS has a large code base and many
-   different contributors and there also is a hierarchy of precedence
-   in which the code is adapted.  Highest priority has the code in the
-   ``src`` folder, followed by code in packages in order of their popularity
-   and complexity (simpler code is adapted sooner), followed by code
-   in the ``lib`` folder.  Source code that is downloaded during compilation
-   is not subject to the conventions discussed here.
-
-Object oriented code
-^^^^^^^^^^^^^^^^^^^^
-
-LAMMPS is designed to be an object oriented code, that is each
-simulation is represented by an instance of the LAMMPS class.  When
-running in parallel, of course, each MPI process will create such an
-instance.  This can be seen in the ``main.cpp`` file where the core
-steps of running a LAMMPS simulation are the following 3 lines of code:
-
-.. code-block:: C++
-
-    LAMMPS *lammps = new LAMMPS(argc, argv, lammps_comm);
-    lammps->input->file();
-    delete lammps;
-
-The first line creates a LAMMPS class instance and passes the command
-line arguments and the global communicator to its constructor.  The
-second line tells the LAMMPS instance to process the input (either from
-standard input or the provided input file) until the end.  And the third
-line deletes that instance again.  The remainder of the main.cpp file
-are for error handling, MPI configuration and other special features.
-
-In the constructor of the LAMMPS class instance the basic LAMMPS class hierarchy
-is created as shown in :ref:`class-topology`.  While processing the input further
-class instances are created, or deleted, or replaced and specific member functions
-of specific classes are called to trigger actions like creating atoms, computing
-forces, computing properties, propagating the system, or writing output.
-
-Compositing and Inheritance
-===========================
-
-LAMMPS makes extensive use of the object oriented programming (OOP)
-principles of *compositing* and *inheritance*. Classes like the
-``LAMMPS`` class are a **composite** containing pointers to instances of
-other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``,
-``Modify``, and so on.  Each of these classes implement certain
-functionality by storing and manipulating data related to the simulation
-and providing member functions that trigger certain actions.  Some of
-those classes like ``Force`` are a composite again containing instances
-of classes describing the force interactions or ``Modify`` containing
-and calling fixes and computes.  In most cases (e.g. ``AtomVec``, ``Comm``,
-``Pair``, or ``Bond``) there is only one instance of those member classes
-allowed, but in a few cases (e.g. ``Region``, ``Fix``, ``Compute``, or
-``Dump``) there can be multiple instances and the parent class is
-maintaining a list of the pointers of instantiated classes instead
-of a single pointer.
-
-Changing behavior or adjusting how LAMMPS handles a simulation is
-implemented via **inheritance** where different variants of the
-functionality are realized by creating *derived* classes that can share
-common functionality in their base class and provide a consistent
-interface where the derived classes replace (dummy or pure) functions in
-the base class.  The higher level classes can then call those methods of
-the instantiated classes without having to know which specific derived
-class variant was instantiated.  In the LAMMPS documentation those
-derived classes are usually referred to a "styles", e.g.  pair styles,
-fix styles, atom styles and so on.
-
-This is the origin of the flexibility of LAMMPS and facilitates for
-example to compute forces for very different non-bonded potential
-functions by having different pair styles (implemented as different
-classes derived from the ``Pair`` class) where the evaluation of the
-potential function is confined to the implementation of the individual
-classes.  Whenever a new :doc:`pair_style` or :doc:`bond_style` or
-:doc:`comm_style` or similar command is processed in the LAMMPS input
-any existing class instance is deleted and a new instance created in
-it place.
-
-Classes derived from ``Fix`` or ``Compute`` represent a different facet
-of LAMMPS' flexibility as there can be multiple instances of them an
-their member functions will be called at different phases of the time
-integration process (as explained in `Developer_flow`).  This way
-multiple manipulations of the entire or parts of the system can be
-programmed (with fix styles) or different computations can be performed
-and accessed and further processed or output through a common interface
-(with compute styles).
-
-Further code sharing is possible by creating derived classes from the
-derived classes (for instance to implement an accelerated version of a
-pair style) where then only a subset of the methods are replaced with
-the accelerated versions.
-
-Polymorphism
-============
-
-Polymorphism and dynamic dispatch are another OOP feature that play an
-important part of how LAMMPS selects which code to execute.  In a nutshell,
-this is a mechanism where the decision of which member function to call
-from a class is determined at runtime and not when the code is compiled.
-To enable it, the function has to be declared as ``virtual`` and all
-corresponding functions in derived classes should be using the ``override``
-property. Below is a brief example.
-
-.. code-block:: c++
-
-   class Base {
-   public:
-    virtual ~Base() = default;
-    void call();
-    void normal();
-    virtual void poly();
-   };
-
-   void Base::call() {
-    normal();
-    poly();
-   }
-
-   class Derived : public Base {
-   public:
-    ~Derived() override = default;
-    void normal();
-    void poly() override;
-   };
-
-   // [....]
-
-   Base *base1 = new Base();
-   Base *base2 = new Derived();
-
-   base1->call();
-   base2->call();
-
-The difference in behavior of the ``normal()`` and the ``poly()`` member
-functions is in which of the two member functions is called when
-executing `base1->call()` and `base2->call()`.  Without polymorphism, a
-function within the base class will call only member functions within
-the same scope, that is ``Base::call()`` will always call
-``Base::normal()``.  But for the `base2->call()` the call for the
-virtual member function will be dispatched to ``Derived::poly()``
-instead.  This mechanism allows to always call functions within the
-scope of the class type that was used to create the class instance, even
-if they are assigned to a pointer using the type of a base class. This
-is the desired behavior, and thanks to dynamic dispatch, LAMMPS can even
-use styles that are loaded at runtime from a shared object file with the
-:doc:`plugin command <plugin>`.
-
-A special case of virtual functions are so-called pure functions. These
-are virtual functions that are initialized to 0 in the class declaration
-(see example below).
-
-.. code-block:: c++
-
-   class Base {
-   public:
-    virtual void pure() = 0;
-   };
-
-This has the effect that it will no longer be possible to create an
-instance of the base class and that derived classes **must** implement
-these functions.  Many of the functions listed with the various class
-styles in the section :doc:`Modify` are such pure functions.  The
-motivation for this is to define the interface or API of the functions
-but defer the implementation to the derived classes.
-
-However, there are downsides to this. For example, calls to virtual
-functions from within a constructor, will not be in the scope of the
-derived class and thus it is good practice to either avoid calling them
-or to provide an explicit scope like in ``Base::poly()``.  Furthermore,
-any destructors in classes containing virtual functions should be
-declared virtual, too, so they are processed in the expected order
-before types are removed from dynamic dispatch.
-
-.. admonition:: Important Notes
-
-   In order to be able to detect incompatibilities and to avoid unexpected
-   behavior already at compile time, it is crucial that all member functions
-   that are intended to replace a virtual or pure function use the ``override``
-   property keyword.  For the same reason it should be avoided to use overloads
-   or default arguments for virtual functions as they lead to confusion over
-   which function is supposed to override which and which arguments need to be
-   declared.
-
-Style Factories
-===============
-
-In order to create class instances of the different styles, LAMMPS often
-uses a programming pattern called `Factory`.  Those are functions that create
-an instance of a specific derived class, say ``PairLJCut`` and return a pointer
-to the type of the common base class of that style, ``Pair`` in this case.
-To associate the factory function with the style keyword, an ``std::map``
-class is used in which function pointers are indexed by their keyword
-(for example "lj/cut" for ``PairLJCut`` and "morse" ``PairMorse``).
-A couple of typedefs help to keep the code readable and a template function
-is used to implement the actual factory functions for the individual classes.
-
-I/O and output formatting
-^^^^^^^^^^^^^^^^^^^^^^^^^
-
-C-style stdio versus C++ style iostreams
-========================================
-
-LAMMPS chooses to use the "stdio" library of the standard C library for
-reading from and writing to files and console instead of C++
-"iostreams".  This is mainly motivated by the better performance, better
-control over formatting, and less effort to achieve specific formatting.
-
-Since mixing "stdio" and "iostreams" can lead to unexpected behavior using
-the latter is strongly discouraged.  Also output to the screen should not
-use the predefined ``stdout`` FILE pointer, but rather the ``screen`` and
-``logfile`` FILE pointers managed by the LAMMPS class.  Furthermore, output
-should only be done by MPI rank 0 (``comm->me == 0``) and output that is
-send to both ``screen`` and ``logfile`` should use the
-:cpp:func:`utils::logmesg() convenience function <LAMMPS_NS::utils::logmesg>`.
-
-We also discourage the use for stringstreams as the bundled {fmt} library
-and the customized tokenizer classes can provide the same functionality
-in a cleaner way with better performance. This will also help to retain
-a consistent programming style despite the many different contributors.
-
-Formatting with the {fmt} library
-===================================
-
-The LAMMPS source code includes a copy of the `{fmt} library
-<https://fmt.dev>`_ which is preferred over formatting with the
-"printf()" family of functions.  The primary reason is that it allows a
-typesafe default format for any type of supported data.  This is
-particularly useful for formatting integers of a given size (32-bit or
-64-bit) which may require different format strings depending on compile
-time settings or compilers/operating systems.  Furthermore, {fmt} gives
-better performance, has more functionality, a familiar formatting syntax
-that has similarities to ``format()`` in Python, and provides a facility
-that can be used to integrate format strings and a variable number of
-arguments into custom functions in a much simpler way that the varargs
-mechanism of the C library.  Finally, {fmt} has been included into the
-C++20 language standard, so changes to adopt it are future proof.
-
-Formatted strings are frequently created by calling the
-``fmt::format()`` function which will return a string as ``std::string``
-class instance.  In contrast to the ``%`` placeholder in ``printf()``,
-the {fmt} library uses ``{}`` to embed format descriptors.  In the
-simplest case, no additional characters are needed as {fmt} will choose
-the default format based on the data type of the argument. Alternatively
-The ``fmt::print()`` function may be used instead of ``printf()`` or
-``fprintf()``.  In addition, several LAMMPS output functions, that
-originally accepted a single string as arguments have been overloaded to
-accept a format string with optional arguments as well (e.g.
-``Error::all()``, ``Error::one()``, ``utils::logmesg()``).
-
-Summary of the {fmt} format syntax
-==================================
-
-The syntax of the format string is "{[<argument id>][:<format spec>]}",
-where either the argument id or the format spec (separated by a colon
-':') is optional.  The argument id is usually a number starting from 0
-that is the index to the arguments following the format string.  By
-default these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.).  The most
-common case for using argument id would be to use the same argument in
-multiple places in the format string without having to provide it as an
-argument multiple times. In LAMMPS the argument id is rarely used.
-
-More common is the use of the format specifier, which starts with a
-colon.  This may optionally be followed by a fill character (default is
-' '). If provided, the fill character **must** be followed by an
-alignment character ('<', '^', '>' for left, centered, or right
-alignment (default)). The alignment character may be used without a fill
-character. The next important format parameter would be the minimum
-width, which may be followed by a dot '.'  and a precision for floating
-point numbers. The final character in the format string would be an
-indicator for the "presentation", i.e. 'd' for decimal presentation of
-integers, 'x' for hexadecimal, 'o' for octal, 'c' for character
-etc. This mostly follows the "printf()" scheme but without requiring an
-additional length parameter to distinguish between different integer
-widths. The {fmt} library will detect those and adapt the formatting
-accordingly.  For floating point numbers there are correspondingly, 'g'
-for generic presentation, 'e' for exponential presentation, and 'f' for
-fixed point presentation.
-
-Thus "{:8}" would represent *any* type argument using at least 8
-characters; "{:<8}" would do this as left aligned, "{:^8}" as centered,
-"{:>8}" as right aligned.  If a specific presentation is selected, the
-argument type must be compatible or else the {fmt} formatting code will
-throw an exception. Some format string examples are given below:
-
-.. code-block:: C
-
-   auto mesg = fmt::format("  CPU time: {:4d}:{:02d}:{:02d}\n", cpuh, cpum, cpus);
-   mesg = fmt::format("{:<8s}| {:<10.5g} | {:<10.5g} | {:<10.5g} |{:6.1f} |{:6.2f}\n",
-                      label, time_min, time, time_max, time_sq, tmp);
-   utils::logmesg(lmp,"{:>6} = max # of 1-2 neighbors\n",maxall);
-   utils::logmesg(lmp,"Lattice spacing in x,y,z = {:.8} {:.8} {:.8}\n",
-                  xlattice,ylattice,zlattice);
-
-which will create the following output lines:
-
-.. parsed-literal::
-
-     CPU time:    0:02:16
-     Pair    | 2.0133     | 2.0133     | 2.0133     |   0.0 | 84.21
-          4 = max # of 1-2 neighbors
-     Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
-
-A special feature of the {fmt} library is that format parameters like
-the width or the precision may be also provided as arguments. In that
-case a nested format is used where a pair of curly braces (with an
-optional argument id) "{}" are used instead of the value, for example
-"{:{}d}" will consume two integer arguments, the first will be the value
-shown and the second the minimum width.
-
-For more details and examples, please consult the `{fmt} syntax
-documentation <https://fmt.dev/latest/syntax.html>`_ website.
-
-
-Memory management
-^^^^^^^^^^^^^^^^^
-
-Dynamical allocation of data and objects should be done with either the
-C++ commands "new" and "delete/delete[]" or using member functions of
-the ``Memory`` class, most commonly, ``Memory::create()``,
-``Memory::grow()``, and ``Memory::destroy()``.  The use of ``malloc()``,
-``calloc()``, ``realloc()`` and ``free()`` directly is strongly
-discouraged.  To simplify adapting legacy code into the LAMMPS code base
-the member functions ``Memory::smalloc()``, ``Memory::srealloc()``, and
-``Memory::sfree()`` are available.
-
-Using those custom memory allocation functions is motivated by the
-following considerations:
-
- memory allocation failures on *any* MPI rank during a parallel run
-  will trigger an immediate abort of the entire parallel calculation
-  instead of stalling it
- a failing "new" will trigger an exception which is also captured by
-  LAMMPS and triggers a global abort
- allocation of multi-dimensional arrays will be done in a C compatible
-  fashion but so that the storage of the actual data is stored in one
-  large consecutive block and thus when MPI communication is needed,
-  only this storage needs to be communicated (similar to Fortran arrays)
- the "destroy()" and "sfree()" functions may safely be called on NULL
-  pointers
- the "destroy()" functions will nullify the pointer variables making
-  "use after free" errors easy to detect
- it is possible to use a larger than default memory alignment (not on
-  all operating systems, since the allocated storage pointers must be
-  compatible with ``free()`` for technical reasons)
-
-In the practical implementation of code this means that any pointer variables
-that are class members should be initialized to a ``nullptr`` value in their
-respective constructors.  That way it would be safe to call ``Memory::destroy()``
-or ``delete[]`` on them before *any* allocation outside the constructor.
-This helps to prevent memory leaks.
--- a/doc/src/Developer_notes.rst
+++ b/doc/src/Developer_notes.rst
@ -11,9 +11,9 @@ Reading and parsing of text and text files
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 It is frequently required for a class in LAMMPS to read in additional
-data from a file, most commonly potential parameters from a potential
-file for manybody potentials.  LAMMPS provides several custom classes
-and convenience functions to simplify the process.  This offers the
+data from a file, e.g. potential parameters from a potential file for
+manybody potentials.  LAMMPS provides several custom classes and
+convenience functions to simplify the process.  They offer the
 following benefits:

 - better code reuse and fewer lines of code needed to implement reading
@ -23,24 +23,26 @@ following benefits:
  text to a number or returning a 0 on unrecognized text and thus reading incorrect values
 - re-entrant code through avoiding global static variables (as used by ``strtok()``)
 - transparent support for translating unsupported UTF-8 characters to their ASCII equivalents
-  (the text to value conversion functions **only** accept ASCII characters)
+  (the text-to-value conversion functions **only** accept ASCII characters)

-In most cases (e.g. potential files) the same data is needed on all MPI
-ranks.  Then it is best to do the reading and parsing only on MPI rank
-0, and communicate the data later with one or more ``MPI_Bcast()``
-calls.  For reading generic text and potential parameter files the
-custom classes :cpp:class:`TextFileReader <LAMMPS_NS::TextFileReader>`
-and :cpp:class:`PotentialFileReader <LAMMPS_NS::PotentialFileReader>`
-are available. Those classes allow to read the file as individual lines
-for which they can return a tokenizer class (see below) for parsing the
-line, or they can return blocks of numbers as a vector directly.  The
-documentation on `File reader classes <file-reader-classes>`_ contains
-an example for a typical case.
+In most cases (e.g. potential files) the same data is needed on all
+MPI ranks.  Then it is best to do the reading and parsing only on MPI
+rank 0, and communicate the data later with one or more
+``MPI_Bcast()`` calls.  For reading generic text and potential
+parameter files the custom classes :cpp:class:`TextFileReader
+<LAMMPS_NS::TextFileReader>` and :cpp:class:`PotentialFileReader
+<LAMMPS_NS::PotentialFileReader>` are available. They allow reading
+the file as individual lines for which they can return a tokenizer
+class (see below) for parsing the line.  Or they can return blocks of
+numbers as a vector directly.  The documentation on `File reader
+classes <file-reader-classes>`_ contains an example for a typical
+case.

-When reading per-atom data, the data in the file usually needs include
-an atom ID so it can be associated with a particular atom.  In that case
-the data can be read in multi-line chunks and broadcast to all MPI ranks
-with :cpp:func:`utils::read_lines_from_file()
+When reading per-atom data, the data on each line of the file usually
+needs to include an atom ID so it can be associated with a particular
+atom.  In that case the data can be read in multi-line chunks and
+broadcast to all MPI ranks with
+:cpp:func:`utils::read_lines_from_file()
 <LAMMPS_NS::utils::read_lines_from_file>`.  Those chunks are then
 split into lines, parsed, and applied only to atoms the MPI rank
 "owns".
@ -49,15 +51,16 @@ For splitting a string (incrementally) into words and optionally
 converting those to numbers, the :cpp:class:`Tokenizer
 <LAMMPS_NS::Tokenizer>` and :cpp:class:`ValueTokenizer
 <LAMMPS_NS::ValueTokenizer>` can be used.  Those provide a superset of
-the functionality of ``strtok()`` from the C-library and the latter also
-includes conversion to different types.  Any errors while processing the
-string in those classes will result in an exception, which can be caught
-and the error processed as needed.  Unlike the C-library functions
-``atoi()``, ``atof()``, ``strtol()``, or ``strtod()`` the conversion
-will check if the converted text is a valid integer of floating point
-number and will not silently return an unexpected or incorrect value.
-For example, ``atoi()`` will return 12 when converting "12.5" while the
-ValueTokenizer class will throw an :cpp:class:`InvalidIntegerException
+the functionality of ``strtok()`` from the C-library and the latter
+also includes conversion to different types.  Any errors while
+processing the string in those classes will result in an exception,
+which can be caught and the error processed as needed.  Unlike the
+C-library functions ``atoi()``, ``atof()``, ``strtol()``, or
+``strtod()`` the conversion will check if the converted text is a
+valid integer or floating point number and will not silently return an
+unexpected or incorrect value.  For example, ``atoi()`` will return 12
+when converting "12.5", while the ValueTokenizer class will throw an
+:cpp:class:`InvalidIntegerException
 <LAMMPS_NS::InvalidIntegerException>` if
 :cpp:func:`ValueTokenizer::next_int()
 <LAMMPS_NS::ValueTokenizer::next_int>` is called on the same string.