starting grammar, punctuation, and spelling review for developer info sections

2023-02-04 18:03:30 -05:00
parent a0a7e76cc3
commit d1550bf9f6
15 changed files with 366 additions and 357 deletions
--- a/doc/src/Developer_code_design.rst
+++ b/doc/src/Developer_code_design.rst
@ -1,52 +1,53 @@
 Code design
 -----------

-This section explains some of the code design choices in LAMMPS with
-the goal of helping developers write new code similar to the existing
-code.  Please see the section on :doc:`Requirements for contributed
-code <Modify_style>` for more specific recommendations and guidelines.
-While that section is organized more in the form of a checklist for
-code contributors, the focus here is on overall code design strategy,
-choices made between possible alternatives, and discussing some
-relevant C++ programming language constructs.
+This section explains some code design choices in LAMMPS with the goal
+of helping developers write new code similar to the existing code.
+Please see the section on :doc:`Requirements for contributed code
+<Modify_style>` for more specific recommendations and guidelines.  While
+that section is organized more in the form of a checklist for code
+contributors, the focus here is on overall code design strategy, choices
+made between possible alternatives, and discussing some relevant C++
+programming language constructs.

 Historically, the basic design philosophy of the LAMMPS C++ code was a
 "C with classes" style.  The motivation was to make it easy to modify
-LAMMPS for people without significant training in C++ programming.
-Data structures and code constructs were used that resemble the
-previous implementation(s) in Fortran.  A contributing factor to this
-choice also was that at the time, C++ compilers were often not mature
-and some of the advanced features contained bugs or did not function
-as the standard required.  There were also disagreements between
-compiler vendors as to how to interpret the C++ standard documents.
+LAMMPS for people without significant training in C++ programming.  Data
+structures and code constructs were used that resemble the previous
+implementation(s) in Fortran.  A contributing factor to this choice was
+that at the time, C++ compilers were often not mature and some advanced
+features contained bugs or did not function as the standard required.
+There were also disagreements between compiler vendors as to how to
+interpret the C++ standard documents.

-However, C++ compilers have now advanced significantly.  In 2020 we
-decided to to require the C++11 standard as the minimum C++ language
-standard for LAMMPS.  Since then we have begun to also replace some of
-the C-style constructs with equivalent C++ functionality, either from
-the C++ standard library or as custom classes or functions, in order
-to improve readability of the code and to increase code reuse through
-abstraction of commonly used functionality.
+However, C++ compilers and the C++ programming language have advanced
+significantly.  In 2020, the LAMMPS developers decided to require the
+C++11 standard as the minimum C++ language standard for LAMMPS.  Since
+then, we have begun to replace C-style constructs with equivalent C++
+functionality.  This is either taken from the C++ standard library or
+implemented as custom classes or functions.  The goal is to improve
+readability of the code and to increase code reuse through abstraction
+of commonly used functionality.

 .. note::

-   Please note that as of spring 2022 there is still a sizable chunk
-   of legacy code in LAMMPS that has not yet been refactored to
-   reflect these style conventions in full.  LAMMPS has a large code
-   base and many different contributors and there also is a hierarchy
-   of precedence in which the code is adapted.  Highest priority has
-   been the code in the ``src`` folder, followed by code in packages
-   in order of their popularity and complexity (simpler code is
-   adapted sooner), followed by code in the ``lib`` folder.  Source
-   code that is downloaded from external packages or libraries during
-   compilation is not subject to the conventions discussed here.
+   Please note that as of spring 2023 there is still a sizable chunk of
+   legacy code in LAMMPS that has not yet been refactored to reflect
+   these style conventions in full.  LAMMPS has a large code base and
+   many contributors.  There is also a hierarchy of precedence in which
+   the code is adapted.  Highest priority has been the code in the
+   ``src`` folder, followed by code in packages in order of their
+   popularity and complexity (simpler code gets adapted sooner), followed
+   by code in the ``lib`` folder.  Source code that is downloaded from
+   external packages or libraries during compilation is not subject to
+   the conventions discussed here.

-Object oriented code
+Object-oriented code
 ^^^^^^^^^^^^^^^^^^^^

-LAMMPS is designed to be an object oriented code.  Each simulation is
+LAMMPS is designed to be an object-oriented code.  Each simulation is
 represented by an instance of the LAMMPS class.  When running in
-parallel each MPI process creates such an instance.  This can be seen
+parallel, each MPI process creates such an instance.  This can be seen
 in the ``main.cpp`` file where the core steps of running a LAMMPS
 simulation are the following 3 lines of code:

@ -67,14 +68,14 @@ other special features.
 The basic LAMMPS class hierarchy which is created by the LAMMPS class
 constructor is shown in :ref:`class-topology`.  When input commands
 are processed, additional class instances are created, or deleted, or
-replaced.  Likewise specific member functions of specific classes are
+replaced.  Likewise, specific member functions of specific classes are
 called to trigger actions such as creating atoms, computing forces,
 computing properties, time-propagating the system, or writing output.

 Compositing and Inheritance
 ===========================

-LAMMPS makes extensive use of the object oriented programming (OOP)
+LAMMPS makes extensive use of the object-oriented programming (OOP)
 principles of *compositing* and *inheritance*. Classes like the
 ``LAMMPS`` class are a **composite** containing pointers to instances
 of other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``,
@ -83,7 +84,7 @@ functionality by storing and manipulating data related to the
 simulation and providing member functions that trigger certain
 actions.  Some of those classes like ``Force`` are themselves
 composites, containing instances of classes describing different force
-interactions.  Similarly the ``Modify`` class contains a list of
+interactions.  Similarly, the ``Modify`` class contains a list of
 ``Fix`` and ``Compute`` classes.  If the input commands that
 correspond to these classes include the word *style*, then LAMMPS
 stores only a single instance of that class.  E.g. *atom_style*,
@ -100,19 +101,18 @@ derived class variant was instantiated.  In LAMMPS these derived
 classes are often referred to as "styles", e.g.  pair styles, fix
 styles, atom styles and so on.

-This is the origin of the flexibility of LAMMPS.  For example pair
+This is the origin of the flexibility of LAMMPS.  For example, pair
 styles implement a variety of different non-bonded interatomic
 potential functions.  All details for the implementation of a
 potential are stored and executed in a single class.

 As mentioned above, there can be multiple instances of classes derived
 from the ``Fix`` or ``Compute`` base classes.  They represent a
-different facet of LAMMPS flexibility as they provide methods which
-can be called at different points in time within a timestep, as
-explained in `Developer_flow`.  This allows the input script to tailor
-how a specific simulation is run, what diagnostic computations are
-performed, and how the output of those computations is further
-processed or output.
+different facet of LAMMPS' flexibility, as they provide methods which
+can be called at different points within a timestep, as explained in
+:doc:`Developer_flow`.  This allows the input script to tailor how a specific
+simulation is run, what diagnostic computations are performed, and how
+the output of those computations is further processed or output.

 Additional code sharing is possible by creating derived classes from the
 derived classes (e.g., to implement an accelerated version of a pair
@ -164,15 +164,15 @@ The difference in behavior of the ``normal()`` and the ``poly()`` member
 functions is which of the two member functions is called when executing
 `base1->call()` versus `base2->call()`.  Without polymorphism, a
 function within the base class can only call member functions within the
-same scope, that is ``Base::call()`` will always call
-``Base::normal()``.  But for the `base2->call()` case the call of the
+same scope: that is, ``Base::call()`` will always call
+``Base::normal()``.  But for the `base2->call()` case, the call of the
 virtual member function will be dispatched to ``Derived::poly()``
-instead.  This mechanism means that functions are called within the
-scope of the class type that was used to *create* the class instance are
-invoked; even if they are assigned to a pointer using the type of a base
-class.  This is the desired behavior and this way LAMMPS can even use
-styles that are loaded at runtime from a shared object file with the
-:doc:`plugin command <plugin>`.
+instead.  This mechanism results in calling functions that are within
+the scope of the class that was used to *create* the instance, even if
+they are assigned to a pointer for their base class.  This is the
+desired behavior, and this way LAMMPS can even use styles that are loaded
+at runtime from a shared object file with the :doc:`plugin command
+<plugin>`.

 A special case of virtual functions are so-called pure functions.  These
 are virtual functions that are initialized to 0 in the class declaration
@ -189,12 +189,12 @@ This has the effect that an instance of the base class cannot be
 created and that derived classes **must** implement these functions.
 Many of the functions listed with the various class styles in the
 section :doc:`Modify` are pure functions.  The motivation for this is
-to define the interface or API of the functions but defer their
+to define the interface or API of the functions, but defer their
 implementation to the derived classes.

 However, there are downsides to this. For example, calls to virtual
-functions from within a constructor, will not be in the scope of the
-derived class and thus it is good practice to either avoid calling them
+functions from within a constructor will *not* be in the scope of the
+derived class, and thus it is good practice to either avoid calling them
 or to provide an explicit scope such as ``Base::poly()`` or
 ``Derived::poly()``.  Furthermore, any destructors in classes containing
 virtual functions should be declared virtual too, so they will be
@ -208,8 +208,8 @@ dispatch.
   that are intended to replace a virtual or pure function use the
   ``override`` property keyword.  For the same reason, the use of
   overloads or default arguments for virtual functions should be
-   avoided as they lead to confusion over which function is supposed to
-   override which and which arguments need to be declared.
+   avoided, as they lead to confusion over which function is supposed to
+   override which, and which arguments need to be declared.
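
The points above can be illustrated with a minimal sketch.  It reuses the
``Base``, ``Derived``, ``poly()``, and ``call()`` names from the discussion;
the ``seen_in_constructor`` member is a hypothetical addition for
demonstration only, and none of this is actual LAMMPS code.

```cpp
#include <string>

// Illustrative classes only; they are not part of LAMMPS.
class Base {
 public:
  Base()
  {
    // Inside the constructor the virtual call resolves to Base::poly(),
    // because the Derived part of the object does not exist yet.
    seen_in_constructor = poly();
  }

  // Virtual destructor, so that deleting through a Base pointer also
  // runs the destructor of the derived class.
  virtual ~Base() = default;

  virtual std::string poly() const { return "Base::poly"; }

  // Non-virtual function; the poly() call inside is dispatched at run time.
  std::string call() const { return poly(); }

  std::string seen_in_constructor;    // hypothetical member for this demo
};

class Derived : public Base {
 public:
  // 'override' makes the compiler reject this declaration if it does
  // not match a virtual function in the base class.
  std::string poly() const override { return "Derived::poly"; }
};
```

With ``Base *b = new Derived;``, calling ``b->call()`` should return
``"Derived::poly"``, while ``b->seen_in_constructor`` holds ``"Base::poly"``.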

 Style Factories
 ===============
@ -219,10 +219,10 @@ uses a programming pattern called `Factory`.  Those are functions that
 create an instance of a specific derived class, say ``PairLJCut`` and
 return a pointer to the type of the common base class of that style,
 ``Pair`` in this case.  To associate the factory function with the
-style keyword, an ``std::map`` class is used with function pointers
+style keyword, a ``std::map`` class is used with function pointers
 indexed by their keyword (for example "lj/cut" for ``PairLJCut`` and
 "morse" for ``PairMorse``).  A couple of typedefs help keep the code
-readable and a template function is used to implement the actual
+readable, and a template function is used to implement the actual
 factory functions for the individual classes.  Below is an example
 of such a factory function from the ``Force`` class as declared in
 ``force.h`` and implemented in ``force.cpp``.  The file ``style_pair.h``
@ -279,26 +279,26 @@ from and writing to files and console instead of C++ "iostreams".
 This is mainly motivated by better performance, better control over
 formatting, and less effort to achieve specific formatting.

-Since mixing "stdio" and "iostreams" can lead to unexpected
-behavior. use of the latter is strongly discouraged.  Also output to
-the screen should not use the predefined ``stdout`` FILE pointer, but
-rather the ``screen`` and ``logfile`` FILE pointers managed by the
-LAMMPS class.  Furthermore, output should generally only be done by
-MPI rank 0 (``comm->me == 0``).  Output that is sent to both
-``screen`` and ``logfile`` should use the :cpp:func:`utils::logmesg()
-convenience function <LAMMPS_NS::utils::logmesg>`.
+Since mixing "stdio" and "iostreams" can lead to unexpected behavior,
+use of the latter is strongly discouraged.  Output to the screen should
+*not* use the predefined ``stdout`` FILE pointer, but rather the
+``screen`` and ``logfile`` FILE pointers managed by the LAMMPS class.
+Furthermore, output should generally only be done by MPI rank 0
+(``comm->me == 0``).  Output that is sent to both ``screen`` and
+``logfile`` should use the :cpp:func:`utils::logmesg() convenience
+function <LAMMPS_NS::utils::logmesg>`.

-We also discourage the use of stringstreams because the bundled {fmt}
-library and the customized tokenizer classes can provide the same
-functionality in a cleaner way with better performance.  This also
-helps maintain a consistent programming syntax with code from many
-different contributors.
+We discourage the use of stringstreams because the bundled {fmt} library
+and the customized tokenizer classes provide the same functionality in a
+cleaner way with better performance.  This also helps maintain a
+consistent programming syntax with code from many different
+contributors.

 Formatting with the {fmt} library
 ===================================

 The LAMMPS source code includes a copy of the `{fmt} library
-<https://fmt.dev>`_ which is preferred over formatting with the
+<https://fmt.dev>`_, which is preferred over formatting with the
 "printf()" family of functions.  The primary reason is that it allows
 a typesafe default format for any type of supported data.  This is
 particularly useful for formatting integers of a given size (32-bit or
@ -313,17 +313,16 @@ been included into the C++20 language standard, so changes to adopt it
 are future-proof.

 Formatted strings are frequently created by calling the
-``fmt::format()`` function which will return a string as a
-``std::string`` class instance.  In contrast to the ``%`` placeholder
-in ``printf()``, the {fmt} library uses ``{}`` to embed format
-descriptors.  In the simplest case, no additional characters are
-needed as {fmt} will choose the default format based on the data type
-of the argument.  Otherwise the ``fmt::print()`` function may be
-used instead of ``printf()`` or ``fprintf()``.  In addition, several
-LAMMPS output functions, that originally accepted a single string as
-argument have been overloaded to accept a format string with optional
-arguments as well (e.g., ``Error::all()``, ``Error::one()``,
-``utils::logmesg()``).
+``fmt::format()`` function, which will return a string as a
+``std::string`` class instance.  In contrast to the ``%`` placeholder in
+``printf()``, the {fmt} library uses ``{}`` to embed format descriptors.
+In the simplest case, no additional characters are needed, as {fmt} will
+choose the default format based on the data type of the argument.
+Otherwise, the ``fmt::print()`` function may be used instead of
+``printf()`` or ``fprintf()``.  In addition, several LAMMPS output
+functions that originally accepted a single string as an argument have
+been overloaded to accept a format string with optional arguments as
+well (e.g., ``Error::all()``, ``Error::one()``, ``utils::logmesg()``).

 Summary of the {fmt} format syntax
 ==================================
@ -332,10 +331,11 @@ The syntax of the format string is "{[<argument id>][:<format spec>]}",
 where either the argument id or the format spec (separated by a colon
 ':') is optional.  The argument id is usually a number starting from 0
 that is the index to the arguments following the format string.  By
-default these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.).  The most
-common case for using argument id would be to use the same argument in
-multiple places in the format string without having to provide it as an
-argument multiple times. In LAMMPS the argument id is rarely used.
+default, these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.).  The
+most common case for using argument id would be to use the same argument
+in multiple places in the format string without having to provide it as
+an argument multiple times. The argument id is rarely used in the LAMMPS
+source code.

 More common is the use of a format specifier, which starts with a colon.
 This may optionally be followed by a fill character (default is ' '). If
@ -347,18 +347,19 @@ width, which may be followed by a dot '.' and a precision for floating
 point numbers.  The final character in the format string would be an
 indicator for the "presentation", i.e. 'd' for decimal presentation of
 integers, 'x' for hexadecimal, 'o' for octal, 'c' for character etc.
-This mostly follows the "printf()" scheme but without requiring an
+This mostly follows the "printf()" scheme, but without requiring an
 additional length parameter to distinguish between different integer
 widths.  The {fmt} library will detect those and adapt the formatting
 accordingly.  For floating point numbers, there are correspondingly 'g'
 for generic presentation, 'e' for exponential presentation, and 'f' for
 fixed point presentation.

-Thus "{:8}" would represent *any* type argument using at least 8
-characters; "{:<8}" would do this as left aligned, "{:^8}" as centered,
-"{:>8}" as right aligned.  If a specific presentation is selected, the
-argument type must be compatible or else the {fmt} formatting code will
-throw an exception. Some format string examples are given below:
+The format string "{:8}" would thus represent *any* type argument and be
+replaced by at least 8 characters; "{:<8}" would do this as left
+aligned, "{:^8}" as centered, "{:>8}" as right aligned.  If a specific
+presentation is selected, the argument type must be compatible or else
+the {fmt} formatting code will throw an exception.  Some format string
+examples are given below:

 .. code-block:: c++

@ -392,12 +393,12 @@ documentation <https://fmt.dev/latest/syntax.html>`_ website.
 Memory management
 ^^^^^^^^^^^^^^^^^

-Dynamical allocation of small data and objects can be done with the
-the C++ commands "new" and "delete/delete[].  Large data should use
-the member functions of the ``Memory`` class, most commonly,
-``Memory::create()``, ``Memory::grow()``, and ``Memory::destroy()``,
-which provide variants for vectors, 2d arrays, 3d arrays, etc.
-These can also be used for small data.
+Dynamic allocation of small data and objects can be done with the C++
+commands "new" and "delete/delete[]".  Large data should use the member
+functions of the ``Memory`` class, most commonly, ``Memory::create()``,
+``Memory::grow()``, and ``Memory::destroy()``, which provide variants
+for vectors, 2d arrays, 3d arrays, etc.  These can also be used for
+small data.

 The use of ``malloc()``, ``calloc()``, ``realloc()`` and ``free()``
 directly is strongly discouraged.  To simplify adapting legacy code
@ -408,26 +409,24 @@ perform additional error checks for safety.
 Use of these custom memory allocation functions is motivated by the
 following considerations:

- memory allocation failures on *any* MPI rank during a parallel run
-  will trigger an immediate abort of the entire parallel calculation
-  instead of stalling it
- a failing "new" will trigger an exception which is also captured by
-  LAMMPS and triggers a global abort
- allocation of multi-dimensional arrays will be done in a C compatible
-  fashion but so that the storage of the actual data is stored in one
-  large contiguous block.  Thus when MPI communication is needed,
+- Memory allocation failures on *any* MPI rank during a parallel run
+  will trigger an immediate abort of the entire parallel calculation.
+- A failing "new" will trigger an exception, which is also captured by
+  LAMMPS and triggers a global abort.
+- Allocation of multidimensional arrays will be done in a C-compatible
+  fashion, but such that the actual data is stored in one
+  large contiguous block.  Thus, when MPI communication is needed,
  the data can be communicated directly (similar to Fortran arrays).
- the "destroy()" and "sfree()" functions may safely be called on NULL
-  pointers
- the "destroy()" functions will nullify the pointer variables making
-  "use after free" errors easy to detect
- it is possible to use a larger than default memory alignment (not on
+- The "destroy()" and "sfree()" functions may safely be called on NULL
+  pointers.
+- The "destroy()" functions will nullify the pointer variables, thus
+  making "use after free" errors easy to detect.
+- It is possible to use a larger than default memory alignment (not on
  all operating systems, since the allocated storage pointers must be
-  compatible with ``free()`` for technical reasons)
+  compatible with ``free()`` for technical reasons).
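
The contiguous storage and the pointer nullification described in the list
above can be sketched as follows.  The ``create2d()`` and ``destroy2d()``
helpers are hypothetical stand-ins written for this illustration; they only
mimic the idea and are not the actual ``Memory`` class implementation.

```cpp
#include <cstddef>

// Hypothetical helpers that only mimic the idea described above;
// they are NOT the actual LAMMPS Memory class functions.
template <typename T>
T **create2d(T **&array, std::size_t n1, std::size_t n2)
{
  if (n1 == 0 || n2 == 0) {
    array = nullptr;
    return array;
  }
  T *data = new T[n1 * n2]();    // one contiguous, zero-initialized block
  array = new T *[n1];           // row pointers allow array[i][j] access
  for (std::size_t i = 0; i < n1; ++i) array[i] = data + i * n2;
  return array;
}

template <typename T>
void destroy2d(T **&array)
{
  if (array == nullptr) return;  // safe to call on a null pointer
  delete[] array[0];             // release the contiguous data block
  delete[] array;                // release the row pointers
  array = nullptr;               // nullify to expose "use after free" errors
}
```

Because all elements live in one block starting at ``&array[0][0]``, a
buffer of ``n1 * n2`` elements could be handed to a communication routine
without copying.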

-In the practical implementation of code this means that any pointer
-variables that are class members should be initialized to a
-``nullptr`` value in their respective constructors.  That way it is
-safe to call ``Memory::destroy()`` or ``delete[]`` on them before
-*any* allocation outside the constructor.  This helps prevent memory
-leaks.
+In the practical implementation of code, this means that any pointer
+variables that are class members should be initialized to a ``nullptr``
+value in their respective constructors.  That way, it is safe to call
+``Memory::destroy()`` or ``delete[]`` on them before *any* allocation
+outside the constructor.  This helps prevent memory leaks.
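
A minimal sketch of this pattern is shown below.  The ``Histogram`` class
and its members are hypothetical names chosen for illustration, and plain
``new[]``/``delete[]`` is used here instead of the ``Memory`` class to keep
the example self-contained.

```cpp
#include <cstddef>

// Hypothetical class for illustration; the names are not taken from LAMMPS.
class Histogram {
 public:
  // Pointer members start out as nullptr in the constructor.
  Histogram() : bins(nullptr), nbins(0) {}
  ~Histogram() { delete[] bins; }

  // Copying is disabled, since the class owns a raw pointer.
  Histogram(const Histogram &) = delete;
  Histogram &operator=(const Histogram &) = delete;

  // May be called repeatedly, e.g. whenever the requested size changes.
  void allocate(std::size_t n)
  {
    delete[] bins;            // safe on the first call: bins is nullptr
    bins = nullptr;           // stays valid if the allocation below throws
    bins = new double[n]();   // zero-initialized storage
    nbins = n;
  }

  std::size_t size() const { return nbins; }
  double *data() { return bins; }

 private:
  double *bins;
  std::size_t nbins;
};
```

Since ``allocate()`` always releases the previous storage first, calling it
several times does not leak memory, and no separate "was this allocated?"
flag is needed.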