starting grammar, punctuation, and spelling review for developer info sections
This commit is contained in:
@ -1,52 +1,53 @@
|
||||
Code design
|
||||
-----------
|
||||
|
||||
This section explains some of the code design choices in LAMMPS with
|
||||
the goal of helping developers write new code similar to the existing
|
||||
code. Please see the section on :doc:`Requirements for contributed
|
||||
code <Modify_style>` for more specific recommendations and guidelines.
|
||||
While that section is organized more in the form of a checklist for
|
||||
code contributors, the focus here is on overall code design strategy,
|
||||
choices made between possible alternatives, and discussing some
|
||||
relevant C++ programming language constructs.
|
||||
This section explains some code design choices in LAMMPS with the goal
|
||||
of helping developers write new code similar to the existing code.
|
||||
Please see the section on :doc:`Requirements for contributed code
|
||||
<Modify_style>` for more specific recommendations and guidelines. While
|
||||
that section is organized more in the form of a checklist for code
|
||||
contributors, the focus here is on overall code design strategy, choices
|
||||
made between possible alternatives, and discussing some relevant C++
|
||||
programming language constructs.
|
||||
|
||||
Historically, the basic design philosophy of the LAMMPS C++ code was a
|
||||
"C with classes" style. The motivation was to make it easy to modify
|
||||
LAMMPS for people without significant training in C++ programming.
|
||||
Data structures and code constructs were used that resemble the
|
||||
previous implementation(s) in Fortran. A contributing factor to this
|
||||
choice also was that at the time, C++ compilers were often not mature
|
||||
and some of the advanced features contained bugs or did not function
|
||||
as the standard required. There were also disagreements between
|
||||
compiler vendors as to how to interpret the C++ standard documents.
|
||||
LAMMPS for people without significant training in C++ programming. Data
|
||||
structures and code constructs were used that resemble the previous
|
||||
implementation(s) in Fortran. A contributing factor to this choice was
|
||||
that at the time, C++ compilers were often not mature and some advanced
|
||||
features contained bugs or did not function as the standard required.
|
||||
There were also disagreements between compiler vendors as to how to
|
||||
interpret the C++ standard documents.
|
||||
|
||||
However, C++ compilers have now advanced significantly. In 2020 we
|
||||
decided to to require the C++11 standard as the minimum C++ language
|
||||
standard for LAMMPS. Since then we have begun to also replace some of
|
||||
the C-style constructs with equivalent C++ functionality, either from
|
||||
the C++ standard library or as custom classes or functions, in order
|
||||
to improve readability of the code and to increase code reuse through
|
||||
abstraction of commonly used functionality.
|
||||
However, C++ compilers and the C++ programming language have advanced
|
||||
significantly. In 2020, the LAMMPS developers decided to require the
|
||||
C++11 standard as the minimum C++ language standard for LAMMPS. Since
|
||||
then, we have begun to replace C-style constructs with equivalent C++
|
||||
functionality. This was taken either from the C++ standard library or
|
||||
implemented as custom classes or functions. The goal is to improve
|
||||
readability of the code and to increase code reuse through abstraction
|
||||
of commonly used functionality.
|
||||
|
||||
.. note::
|
||||
|
||||
Please note that as of spring 2022 there is still a sizable chunk
|
||||
of legacy code in LAMMPS that has not yet been refactored to
|
||||
reflect these style conventions in full. LAMMPS has a large code
|
||||
base and many different contributors and there also is a hierarchy
|
||||
of precedence in which the code is adapted. Highest priority has
|
||||
been the code in the ``src`` folder, followed by code in packages
|
||||
in order of their popularity and complexity (simpler code is
|
||||
adapted sooner), followed by code in the ``lib`` folder. Source
|
||||
code that is downloaded from external packages or libraries during
|
||||
compilation is not subject to the conventions discussed here.
|
||||
Please note that as of spring 2023 there is still a sizable chunk of
|
||||
legacy code in LAMMPS that has not yet been refactored to reflect
|
||||
these style conventions in full. LAMMPS has a large code base and
|
||||
many contributors. There is also a hierarchy of precedence in which
|
||||
the code is adapted. Highest priority has been the code in the
|
||||
``src`` folder, followed by code in packages in order of their
|
||||
popularity and complexity (simpler code gets adapted sooner), followed
|
||||
by code in the ``lib`` folder. Source code that is downloaded from
|
||||
external packages or libraries during compilation is not subject to
|
||||
the conventions discussed here.
|
||||
|
||||
Object oriented code
|
||||
Object-oriented code
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
LAMMPS is designed to be an object oriented code. Each simulation is
|
||||
LAMMPS is designed to be an object-oriented code. Each simulation is
|
||||
represented by an instance of the LAMMPS class. When running in
|
||||
parallel each MPI process creates such an instance. This can be seen
|
||||
parallel, each MPI process creates such an instance. This can be seen
|
||||
in the ``main.cpp`` file where the core steps of running a LAMMPS
|
||||
simulation are the following 3 lines of code:
|
||||
|
||||
@ -67,14 +68,14 @@ other special features.
|
||||
The basic LAMMPS class hierarchy which is created by the LAMMPS class
|
||||
constructor is shown in :ref:`class-topology`. When input commands
|
||||
are processed, additional class instances are created, or deleted, or
|
||||
replaced. Likewise specific member functions of specific classes are
|
||||
replaced. Likewise, specific member functions of specific classes are
|
||||
called to trigger actions such creating atoms, computing forces,
|
||||
computing properties, time-propagating the system, or writing output.
|
||||
|
||||
Compositing and Inheritance
|
||||
===========================
|
||||
|
||||
LAMMPS makes extensive use of the object oriented programming (OOP)
|
||||
LAMMPS makes extensive use of the object-oriented programming (OOP)
|
||||
principles of *compositing* and *inheritance*. Classes like the
|
||||
``LAMMPS`` class are a **composite** containing pointers to instances
|
||||
of other classes like ``Atom``, ``Comm``, ``Force``, ``Neighbor``,
|
||||
@ -83,7 +84,7 @@ functionality by storing and manipulating data related to the
|
||||
simulation and providing member functions that trigger certain
|
||||
actions. Some of those classes like ``Force`` are themselves
|
||||
composites, containing instances of classes describing different force
|
||||
interactions. Similarly the ``Modify`` class contains a list of
|
||||
interactions. Similarly, the ``Modify`` class contains a list of
|
||||
``Fix`` and ``Compute`` classes. If the input commands that
|
||||
correspond to these classes include the word *style*, then LAMMPS
|
||||
stores only a single instance of that class. E.g. *atom_style*,
|
||||
@ -100,19 +101,18 @@ derived class variant was instantiated. In LAMMPS these derived
|
||||
classes are often referred to as "styles", e.g. pair styles, fix
|
||||
styles, atom styles and so on.
|
||||
|
||||
This is the origin of the flexibility of LAMMPS. For example pair
|
||||
This is the origin of the flexibility of LAMMPS. For example, pair
|
||||
styles implement a variety of different non-bonded interatomic
|
||||
potentials functions. All details for the implementation of a
|
||||
potential are stored and executed in a single class.
|
||||
|
||||
As mentioned above, there can be multiple instances of classes derived
|
||||
from the ``Fix`` or ``Compute`` base classes. They represent a
|
||||
different facet of LAMMPS flexibility as they provide methods which
|
||||
can be called at different points in time within a timestep, as
|
||||
explained in `Developer_flow`. This allows the input script to tailor
|
||||
how a specific simulation is run, what diagnostic computations are
|
||||
performed, and how the output of those computations is further
|
||||
processed or output.
|
||||
different facet of LAMMPS' flexibility, as they provide methods which
|
||||
can be called at different points within a timestep, as explained in
|
||||
`Developer_flow`. This allows the input script to tailor how a specific
|
||||
simulation is run, what diagnostic computations are performed, and how
|
||||
the output of those computations is further processed or output.
|
||||
|
||||
Additional code sharing is possible by creating derived classes from the
|
||||
derived classes (e.g., to implement an accelerated version of a pair
|
||||
@ -164,15 +164,15 @@ The difference in behavior of the ``normal()`` and the ``poly()`` member
|
||||
functions is which of the two member functions is called when executing
|
||||
`base1->call()` versus `base2->call()`. Without polymorphism, a
|
||||
function within the base class can only call member functions within the
|
||||
same scope, that is ``Base::call()`` will always call
|
||||
``Base::normal()``. But for the `base2->call()` case the call of the
|
||||
same scope: that is, ``Base::call()`` will always call
|
||||
``Base::normal()``. But for the `base2->call()` case, the call of the
|
||||
virtual member function will be dispatched to ``Derived::poly()``
|
||||
instead. This mechanism means that functions are called within the
|
||||
scope of the class type that was used to *create* the class instance are
|
||||
invoked; even if they are assigned to a pointer using the type of a base
|
||||
class. This is the desired behavior and this way LAMMPS can even use
|
||||
styles that are loaded at runtime from a shared object file with the
|
||||
:doc:`plugin command <plugin>`.
|
||||
instead. This mechanism results in calling functions that are within
|
||||
the scope of the class that was used to *create* the instance, even if
|
||||
they are assigned to a pointer for their base class. This is the
|
||||
desired behavior, and this way LAMMPS can even use styles that are loaded
|
||||
at runtime from a shared object file with the :doc:`plugin command
|
||||
<plugin>`.
|
||||
|
||||
A special case of virtual functions are so-called pure functions. These
|
||||
are virtual functions that are initialized to 0 in the class declaration
|
||||
@ -189,12 +189,12 @@ This has the effect that an instance of the base class cannot be
|
||||
created and that derived classes **must** implement these functions.
|
||||
Many of the functions listed with the various class styles in the
|
||||
section :doc:`Modify` are pure functions. The motivation for this is
|
||||
to define the interface or API of the functions but defer their
|
||||
to define the interface or API of the functions, but defer their
|
||||
implementation to the derived classes.
|
||||
|
||||
However, there are downsides to this. For example, calls to virtual
|
||||
functions from within a constructor, will not be in the scope of the
|
||||
derived class and thus it is good practice to either avoid calling them
|
||||
functions from within a constructor, will *not* be in the scope of the
|
||||
derived class, and thus it is good practice to either avoid calling them
|
||||
or to provide an explicit scope such as ``Base::poly()`` or
|
||||
``Derived::poly()``. Furthermore, any destructors in classes containing
|
||||
virtual functions should be declared virtual too, so they will be
|
||||
@ -208,8 +208,8 @@ dispatch.
|
||||
that are intended to replace a virtual or pure function use the
|
||||
``override`` property keyword. For the same reason, the use of
|
||||
overloads or default arguments for virtual functions should be
|
||||
avoided as they lead to confusion over which function is supposed to
|
||||
override which and which arguments need to be declared.
|
||||
avoided, as they lead to confusion over which function is supposed to
|
||||
override which, and which arguments need to be declared.
|
||||
|
||||
Style Factories
|
||||
===============
|
||||
@ -219,10 +219,10 @@ uses a programming pattern called `Factory`. Those are functions that
|
||||
create an instance of a specific derived class, say ``PairLJCut`` and
|
||||
return a pointer to the type of the common base class of that style,
|
||||
``Pair`` in this case. To associate the factory function with the
|
||||
style keyword, an ``std::map`` class is used with function pointers
|
||||
style keyword, a ``std::map`` class is used with function pointers
|
||||
indexed by their keyword (for example "lj/cut" for ``PairLJCut`` and
|
||||
"morse" for ``PairMorse``). A couple of typedefs help keep the code
|
||||
readable and a template function is used to implement the actual
|
||||
readable, and a template function is used to implement the actual
|
||||
factory functions for the individual classes. Below is an example
|
||||
of such a factory function from the ``Force`` class as declared in
|
||||
``force.h`` and implemented in ``force.cpp``. The file ``style_pair.h``
|
||||
@ -279,26 +279,26 @@ from and writing to files and console instead of C++ "iostreams".
|
||||
This is mainly motivated by better performance, better control over
|
||||
formatting, and less effort to achieve specific formatting.
|
||||
|
||||
Since mixing "stdio" and "iostreams" can lead to unexpected
|
||||
behavior. use of the latter is strongly discouraged. Also output to
|
||||
the screen should not use the predefined ``stdout`` FILE pointer, but
|
||||
rather the ``screen`` and ``logfile`` FILE pointers managed by the
|
||||
LAMMPS class. Furthermore, output should generally only be done by
|
||||
MPI rank 0 (``comm->me == 0``). Output that is sent to both
|
||||
``screen`` and ``logfile`` should use the :cpp:func:`utils::logmesg()
|
||||
convenience function <LAMMPS_NS::utils::logmesg>`.
|
||||
Since mixing "stdio" and "iostreams" can lead to unexpected behavior,
|
||||
use of the latter is strongly discouraged. Output to the screen should
|
||||
*not* use the predefined ``stdout`` FILE pointer, but rather the
|
||||
``screen`` and ``logfile`` FILE pointers managed by the LAMMPS class.
|
||||
Furthermore, output should generally only be done by MPI rank 0
|
||||
(``comm->me == 0``). Output that is sent to both ``screen`` and
|
||||
``logfile`` should use the :cpp:func:`utils::logmesg() convenience
|
||||
function <LAMMPS_NS::utils::logmesg>`.
|
||||
|
||||
We also discourage the use of stringstreams because the bundled {fmt}
|
||||
library and the customized tokenizer classes can provide the same
|
||||
functionality in a cleaner way with better performance. This also
|
||||
helps maintain a consistent programming syntax with code from many
|
||||
different contributors.
|
||||
We discourage the use of stringstreams because the bundled {fmt} library
|
||||
and the customized tokenizer classes provide the same functionality in a
|
||||
cleaner way with better performance. This also helps maintain a
|
||||
consistent programming syntax with code from many different
|
||||
contributors.
|
||||
|
||||
Formatting with the {fmt} library
|
||||
===================================
|
||||
|
||||
The LAMMPS source code includes a copy of the `{fmt} library
|
||||
<https://fmt.dev>`_ which is preferred over formatting with the
|
||||
<https://fmt.dev>`_, which is preferred over formatting with the
|
||||
"printf()" family of functions. The primary reason is that it allows
|
||||
a typesafe default format for any type of supported data. This is
|
||||
particularly useful for formatting integers of a given size (32-bit or
|
||||
@ -313,17 +313,16 @@ been included into the C++20 language standard, so changes to adopt it
|
||||
are future-proof.
|
||||
|
||||
Formatted strings are frequently created by calling the
|
||||
``fmt::format()`` function which will return a string as a
|
||||
``std::string`` class instance. In contrast to the ``%`` placeholder
|
||||
in ``printf()``, the {fmt} library uses ``{}`` to embed format
|
||||
descriptors. In the simplest case, no additional characters are
|
||||
needed as {fmt} will choose the default format based on the data type
|
||||
of the argument. Otherwise the ``fmt::print()`` function may be
|
||||
used instead of ``printf()`` or ``fprintf()``. In addition, several
|
||||
LAMMPS output functions, that originally accepted a single string as
|
||||
argument have been overloaded to accept a format string with optional
|
||||
arguments as well (e.g., ``Error::all()``, ``Error::one()``,
|
||||
``utils::logmesg()``).
|
||||
``fmt::format()`` function, which will return a string as a
|
||||
``std::string`` class instance. In contrast to the ``%`` placeholder in
|
||||
``printf()``, the {fmt} library uses ``{}`` to embed format descriptors.
|
||||
In the simplest case, no additional characters are needed, as {fmt} will
|
||||
choose the default format based on the data type of the argument.
|
||||
Otherwise, the ``fmt::print()`` function may be used instead of
|
||||
``printf()`` or ``fprintf()``. In addition, several LAMMPS output
|
||||
functions, that originally accepted a single string as argument have
|
||||
been overloaded to accept a format string with optional arguments as
|
||||
well (e.g., ``Error::all()``, ``Error::one()``, ``utils::logmesg()``).
|
||||
|
||||
Summary of the {fmt} format syntax
|
||||
==================================
|
||||
@ -332,10 +331,11 @@ The syntax of the format string is "{[<argument id>][:<format spec>]}",
|
||||
where either the argument id or the format spec (separated by a colon
|
||||
':') is optional. The argument id is usually a number starting from 0
|
||||
that is the index to the arguments following the format string. By
|
||||
default these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.). The most
|
||||
common case for using argument id would be to use the same argument in
|
||||
multiple places in the format string without having to provide it as an
|
||||
argument multiple times. In LAMMPS the argument id is rarely used.
|
||||
default, these are assigned in order (i.e. 0, 1, 2, 3, 4 etc.). The
|
||||
most common case for using argument id would be to use the same argument
|
||||
in multiple places in the format string without having to provide it as
|
||||
an argument multiple times. The argument id is rarely used in the LAMMPS
|
||||
source code.
|
||||
|
||||
More common is the use of a format specifier, which starts with a colon.
|
||||
This may optionally be followed by a fill character (default is ' '). If
|
||||
@ -347,18 +347,19 @@ width, which may be followed by a dot '.' and a precision for floating
|
||||
point numbers. The final character in the format string would be an
|
||||
indicator for the "presentation", i.e. 'd' for decimal presentation of
|
||||
integers, 'x' for hexadecimal, 'o' for octal, 'c' for character etc.
|
||||
This mostly follows the "printf()" scheme but without requiring an
|
||||
This mostly follows the "printf()" scheme, but without requiring an
|
||||
additional length parameter to distinguish between different integer
|
||||
widths. The {fmt} library will detect those and adapt the formatting
|
||||
accordingly. For floating point numbers there are correspondingly, 'g'
|
||||
for generic presentation, 'e' for exponential presentation, and 'f' for
|
||||
fixed point presentation.
|
||||
|
||||
Thus "{:8}" would represent *any* type argument using at least 8
|
||||
characters; "{:<8}" would do this as left aligned, "{:^8}" as centered,
|
||||
"{:>8}" as right aligned. If a specific presentation is selected, the
|
||||
argument type must be compatible or else the {fmt} formatting code will
|
||||
throw an exception. Some format string examples are given below:
|
||||
The format string "{:8}" would thus represent *any* type argument and be
|
||||
replaced by at least 8 characters; "{:<8}" would do this as left
|
||||
aligned, "{:^8}" as centered, "{:>8}" as right aligned. If a specific
|
||||
presentation is selected, the argument type must be compatible or else
|
||||
the {fmt} formatting code will throw an exception. Some format string
|
||||
examples are given below:
|
||||
|
||||
.. code-block:: c++
|
||||
|
||||
@ -392,12 +393,12 @@ documentation <https://fmt.dev/latest/syntax.html>`_ website.
|
||||
Memory management
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
Dynamical allocation of small data and objects can be done with the
|
||||
the C++ commands "new" and "delete/delete[]. Large data should use
|
||||
the member functions of the ``Memory`` class, most commonly,
|
||||
``Memory::create()``, ``Memory::grow()``, and ``Memory::destroy()``,
|
||||
which provide variants for vectors, 2d arrays, 3d arrays, etc.
|
||||
These can also be used for small data.
|
||||
Dynamical allocation of small data and objects can be done with the C++
|
||||
commands "new" and "delete/delete[]". Large data should use the member
|
||||
functions of the ``Memory`` class, most commonly, ``Memory::create()``,
|
||||
``Memory::grow()``, and ``Memory::destroy()``, which provide variants
|
||||
for vectors, 2d arrays, 3d arrays, etc. These can also be used for
|
||||
small data.
|
||||
|
||||
The use of ``malloc()``, ``calloc()``, ``realloc()`` and ``free()``
|
||||
directly is strongly discouraged. To simplify adapting legacy code
|
||||
@ -408,26 +409,24 @@ perform additional error checks for safety.
|
||||
Use of these custom memory allocation functions is motivated by the
|
||||
following considerations:
|
||||
|
||||
- memory allocation failures on *any* MPI rank during a parallel run
|
||||
will trigger an immediate abort of the entire parallel calculation
|
||||
instead of stalling it
|
||||
- a failing "new" will trigger an exception which is also captured by
|
||||
LAMMPS and triggers a global abort
|
||||
- allocation of multi-dimensional arrays will be done in a C compatible
|
||||
fashion but so that the storage of the actual data is stored in one
|
||||
large contiguous block. Thus when MPI communication is needed,
|
||||
- Memory allocation failures on *any* MPI rank during a parallel run
|
||||
will trigger an immediate abort of the entire parallel calculation.
|
||||
- A failing "new" will trigger an exception, which is also captured by
|
||||
LAMMPS and triggers a global abort.
|
||||
- Allocation of multidimensional arrays will be done in a C compatible
|
||||
fashion, but such that the storage of the actual data is stored in one
|
||||
large contiguous block. Thus, when MPI communication is needed,
|
||||
the data can be communicated directly (similar to Fortran arrays).
|
||||
- the "destroy()" and "sfree()" functions may safely be called on NULL
|
||||
pointers
|
||||
- the "destroy()" functions will nullify the pointer variables making
|
||||
"use after free" errors easy to detect
|
||||
- it is possible to use a larger than default memory alignment (not on
|
||||
- The "destroy()" and "sfree()" functions may safely be called on NULL
|
||||
pointers.
|
||||
- The "destroy()" functions will nullify the pointer variables, thus
|
||||
making "use after free" errors easy to detect.
|
||||
- It is possible to use a larger than default memory alignment (not on
|
||||
all operating systems, since the allocated storage pointers must be
|
||||
compatible with ``free()`` for technical reasons)
|
||||
compatible with ``free()`` for technical reasons).
|
||||
|
||||
In the practical implementation of code this means that any pointer
|
||||
variables that are class members should be initialized to a
|
||||
``nullptr`` value in their respective constructors. That way it is
|
||||
safe to call ``Memory::destroy()`` or ``delete[]`` on them before
|
||||
*any* allocation outside the constructor. This helps prevent memory
|
||||
leaks.
|
||||
In the practical implementation of code this means, that any pointer
|
||||
variables, that are class members should be initialized to a ``nullptr``
|
||||
value in their respective constructors. That way, it is safe to call
|
||||
``Memory::destroy()`` or ``delete[]`` on them before *any* allocation
|
||||
outside the constructor. This helps prevent memory leaks.
|
||||
|
||||
Reference in New Issue
Block a user