Edits to developer doc files
@@ -2,7 +2,7 @@ Accessing per-atom data
 -----------------------

 This page discusses how per-atom data is managed in LAMMPS, how it can
-be accessed, what communication patters apply, and some of the utility
+be accessed, what communication patterns apply, and some of the utility
 functions that exist for a variety of purposes.

@@ -14,11 +14,11 @@ As described on the :doc:`parallel partitioning algorithms
 simulation domain, either in a *brick* or *tiled* manner. Each MPI
 process *owns* exactly one subdomain and the atoms within it. To compute
 forces for tuples of atoms that are spread across sub-domain boundaries,
-also a "halo" of *ghost* atoms are maintained within a the communication
+also a "halo" of *ghost* atoms are maintained within the communication
 cutoff distance of its subdomain.

 The total number of atoms is stored in `Atom::natoms` (within any
-typical class this can be referred to at `atom->natoms`. The number of
+typical class this can be referred to at `atom->natoms`). The number of
 *owned* (or "local" atoms) are stored in `Atom::nlocal`; the number of
 *ghost* atoms is stored in `Atom::nghost`. The sum of `Atom::nlocal`
 over all MPI processes should be `Atom::natoms`. This is by default
@@ -27,8 +27,8 @@ LAMMPS stops with a "lost atoms" error. For convenience also the
 property `Atom::nmax` is available, this is the maximum of
 `Atom::nlocal + Atom::nghost` across all MPI processes.

-Per-atom properties are either managed by the atom style, or individual
-classes. or as custom arrays by the individual classes. If only access
+Per-atom properties are either managed by the atom style, individual
+classes, or as custom arrays by the individual classes. If only access
 to *owned* atoms is needed, they are usually allocated to be of size
 `Atom::nlocal`, otherwise of size `Atom::nmax`. Please note that not all
 per-atom properties are available or updated on *ghost* atoms. For
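
For readers unfamiliar with these counters, here is a minimal sketch (not taken
from the LAMMPS sources) of how a class with access to the `Atom` instance
typically loops over owned atoms only versus owned plus ghost atoms:

.. code-block:: c++

   // "atom" is the Atom class pointer available in most LAMMPS classes
   double **x = atom->x;                          // per-atom coordinates
   const int nlocal = atom->nlocal;               // owned atoms only
   const int nall = atom->nlocal + atom->nghost;  // owned + ghost atoms

   for (int i = 0; i < nlocal; ++i) {
     // work on owned atoms, e.g. accumulate a property from x[i]
   }
   for (int i = nlocal; i < nall; ++i) {
     // ghost atoms follow the owned atoms in the per-atom arrays
   }
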
@@ -61,7 +61,7 @@ can be found via the `Atom::sametag` array. It points to the next atom
 index with the same tag or -1 if there are no more atoms with the same
 tag. The list will be exhaustive when starting with an index of an
 *owned* atom, since the atom IDs are unique, so there can only be one
-such atom. Example code to count atoms with same atom ID in subdomain:
+such atom. Example code to count atoms with same atom ID in a subdomain:

 .. code-block:: c++

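
The body of that code block is not shown in this hunk; a rough sketch of what
such a counting loop can look like, using `Atom::map()` to locate the owned
atom and `Atom::sametag` to walk further copies (`someID` is a placeholder
value, and an atom map must be enabled):

.. code-block:: c++

   // count owned + ghost copies of a given atom ID in this subdomain
   tagint someID = 1000;           // placeholder atom ID
   int count = 0;
   int i = atom->map(someID);      // index of the atom with this ID, or -1
   while (i >= 0) {
     ++count;
     i = atom->sametag[i];         // next index with the same ID, or -1
   }
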
@@ -69,7 +69,7 @@ The basic LAMMPS class hierarchy which is created by the LAMMPS class
 constructor is shown in :ref:`class-topology`. When input commands
 are processed, additional class instances are created, or deleted, or
 replaced. Likewise, specific member functions of specific classes are
-called to trigger actions such creating atoms, computing forces,
+called to trigger actions such as creating atoms, computing forces,
 computing properties, time-propagating the system, or writing output.

 Compositing and Inheritance
@@ -110,9 +110,10 @@ As mentioned above, there can be multiple instances of classes derived
 from the ``Fix`` or ``Compute`` base classes. They represent a
 different facet of LAMMPS' flexibility, as they provide methods which
 can be called at different points within a timestep, as explained in
-`Developer_flow`. This allows the input script to tailor how a specific
-simulation is run, what diagnostic computations are performed, and how
-the output of those computations is further processed or output.
+the :doc:`How a timestep works <Developer_flow>` doc page. This allows
+the input script to tailor how a specific simulation is run, what
+diagnostic computations are performed, and how the output of those
+computations is further processed or output.

 Additional code sharing is possible by creating derived classes from the
 derived classes (e.g., to implement an accelerated version of a pair
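
As an illustration of this mechanism (the class name below is made up, not an
actual LAMMPS fix), a fix requests the timestep hooks it needs via
``setmask()`` and implements only those methods:

.. code-block:: c++

   // sketch of a fix that only hooks into the end of each timestep
   class FixMyDiagnostic : public Fix {
    public:
     FixMyDiagnostic(class LAMMPS *lmp, int narg, char **arg)
       : Fix(lmp, narg, arg) {}
     int setmask() override { return FixConst::END_OF_STEP; }
     void end_of_step() override {
       // diagnostic computation on the owned atoms goes here
     }
   };
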
@@ -128,7 +128,7 @@ reflect particles off box boundaries in the :doc:`FixWallReflect class
 The ``decide()`` method in the Neighbor class determines whether
 neighbor lists need to be rebuilt on the current timestep (conditions
 can be changed using the :doc:`neigh_modify every/delay/check
-<neigh_modify>` command. If not, coordinates of ghost atoms are
+<neigh_modify>` command). If not, coordinates of ghost atoms are
 acquired by each processor via the ``forward_comm()`` method of the Comm
 class. If neighbor lists need to be built, several operations within
 the inner if clause of the pseudocode are first invoked. The
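
Condensed into simplified pseudocode (not verbatim LAMMPS source), the
per-timestep decision described here looks roughly like this:

.. code-block:: c++

   if (neighbor->decide()) {     // neighbor lists must be rebuilt
     comm->exchange();           // migrate atoms that changed subdomains
     comm->borders();            // re-create the list of ghost atoms
     neighbor->build(1);         // build new neighbor lists
   } else {
     comm->forward_comm();       // only update ghost atom coordinates
   }
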
@@ -4,8 +4,7 @@ Communication
 Following the selected partitioning scheme, all per-atom data is
 distributed across the MPI processes, which allows LAMMPS to handle very
 large systems provided it uses a correspondingly large number of MPI
-processes. Since The per-atom data (atom IDs, positions, velocities,
-types, etc.) To be able to compute the short-range interactions, MPI
+processes. To be able to compute the short-range interactions, MPI
 processes need not only access to the data of atoms they "own" but also
 information about atoms from neighboring subdomains, in LAMMPS referred
 to as "ghost" atoms. These are copies of atoms storing required
@@ -37,7 +36,7 @@ be larger than half the simulation domain.

 Efficient communication patterns are needed to update the "ghost" atom
 data, since that needs to be done at every MD time step or minimization
-step. The diagrams of the `ghost-atom-comm` figure illustrate how ghost
+step. The diagrams of the :ref:`ghost-atom-comm` figure illustrate how ghost
 atom communication is performed in two stages for a 2d simulation (three
 in 3d) for both a regular and irregular partitioning of the simulation
 box. For the regular case (left) atoms are exchanged first in the
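
The staged pattern can be illustrated with a generic (non-LAMMPS) MPI fragment
on a 2d Cartesian process grid; because the second stage forwards data that
already arrived in the first stage, diagonal neighbors are reached without
extra messages. The communicator ``cart`` and the buffers are placeholders and
are kept at fixed size only for brevity:

.. code-block:: c++

   // exchange one buffer per direction, first along x (dim 0), then y (dim 1)
   for (int dim = 0; dim < 2; ++dim) {
     for (int dir : {-1, 1}) {
       int src, dest;
       MPI_Cart_shift(cart, dim, dir, &src, &dest);
       MPI_Sendrecv(sendbuf, nsend, MPI_DOUBLE, dest, 0,
                    recvbuf, nrecv, MPI_DOUBLE, src, 0,
                    cart, MPI_STATUS_IGNORE);
       // atoms received within the cutoff would be appended to the ghost
       // list here, so they are forwarded again in the next stage
     }
   }
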
@@ -93,7 +93,7 @@ processors, since each tile in the initial tiling overlaps with a
 handful of tiles in the final tiling.

 The transformations could also be done using collective communication
-across all $P$ processors with a single call to ``MPI_Alltoall()``, but
+across all :math:`P` processors with a single call to ``MPI_Alltoall()``, but
 this is typically much slower. However, for the specialized brick and
 pencil tiling illustrated in :ref:`fft-parallel` figure, collective
 communication across the entire MPI communicator is not required. In
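
A sketch of that idea in plain MPI (``pencil_id``, ``ncols``, and the buffers
are placeholders, not LAMMPS identifiers): the collective involves only the
ranks sharing one pencil of the grid, obtained via a sub-communicator, instead
of all :math:`P` ranks:

.. code-block:: c++

   // split off a communicator containing only the ranks of one pencil
   MPI_Comm pencil;
   MPI_Comm_split(MPI_COMM_WORLD, pencil_id, 0, &pencil);

   // all-to-all only within that (much smaller) group of ranks
   MPI_Alltoall(sendbuf, ncols, MPI_DOUBLE,
                recvbuf, ncols, MPI_DOUBLE, pencil);
   MPI_Comm_free(&pencil);
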
@@ -138,7 +138,7 @@ grid/particle operations that LAMMPS supports:
   :math:`O(P^{\frac{1}{2}})`.

 - For efficiency in performing 1d FFTs, the grid transpose
-  operations illustrated in Figure \ref{fig:fft} also involve
+  operations illustrated in Figure :ref:`fft-parallel` also involve
   reordering the 3d data so that a different dimension is contiguous
   in memory. This reordering can be done during the packing or
   unpacking of buffers for MPI communication.
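
A minimal sketch (not LAMMPS code; ``grid``, ``buf``, and the extents are
placeholders) of such a reordering pack loop, turning x-contiguous grid data
into a y-contiguous send buffer as part of the transpose:

.. code-block:: c++

   // grid is stored x-fastest; fill buf so that y becomes the fastest index
   int m = 0;
   for (int ix = 0; ix < nx; ++ix)
     for (int iz = 0; iz < nz; ++iz)
       for (int iy = 0; iy < ny; ++iy)
         buf[m++] = grid[ix + nx*(iy + ny*iz)];
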
@@ -149,7 +149,7 @@ supports:

 - Dependent on the "pair" setting of the :doc:`newton <newton>` command,
   the "half" neighbor lists may contain **all** pairs of atoms where
-  atom *j* is a ghost atom (i.e. when the newton pair setting is *off*)
+  atom *j* is a ghost atom (i.e. when the newton pair setting is *off*).
   For the newton pair *on* setting the atom *j* is only added to the
   list if its *z* coordinate is larger, or if equal the *y* coordinate
   is larger, and that is equal, too, the *x* coordinate is larger. For
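
Expressed as code, the tie-breaking rule for the newton pair *on* case reads
roughly as follows (a sketch of the rule as stated above, not the literal
neighbor-list build code):

.. code-block:: c++

   // decide whether ghost atom j is stored in the half list of owned atom i
   bool add;
   if      (x[j][2] > x[i][2]) add = true;
   else if (x[j][2] < x[i][2]) add = false;
   else if (x[j][1] > x[i][1]) add = true;
   else if (x[j][1] < x[i][1]) add = false;
   else                        add = (x[j][0] > x[i][0]);
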
@@ -1,13 +1,13 @@
 OpenMP Parallelism
 ^^^^^^^^^^^^^^^^^^

-The styles in the INTEL, KOKKOS, and OPENMP package offer to use OpenMP
+The styles in the INTEL, KOKKOS, and OPENMP packages offer to use OpenMP
 thread parallelism to predominantly distribute loops over local data
 and thus follow an orthogonal parallelization strategy to the
 decomposition into spatial domains used by the :doc:`MPI partitioning
 <Developer_par_part>`. For clarity, this section discusses only the
 implementation in the OPENMP package, as it is the simplest. The INTEL
-and KOKKOS package offer additional options and are more complex since
+and KOKKOS packages offer additional options and are more complex since
 they support more features and different hardware like co-processors
 or GPUs.

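
The basic pattern behind distributing loops over local data is an OpenMP
worksharing loop over the owned atoms, sketched here in generic form rather
than as the actual OPENMP package implementation:

.. code-block:: c++

   const int nlocal = atom->nlocal;
   double **f = atom->f;

   #pragma omp parallel for schedule(static)
   for (int i = 0; i < nlocal; ++i) {
     // per-atom work that only writes data of atom i is thread safe
     // without further synchronization, e.g. adding a constant force
     f[i][2] -= 1.0;
   }
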
@@ -16,7 +16,7 @@ keep the changes to the source code small, so that it would be easier to
 maintain the code and keep it in sync with the non-threaded standard
 implementation. This is achieved by a) making the OPENMP version a
 derived class from the regular version (e.g. ``PairLJCutOMP`` from
-``PairLJCut``) and overriding only methods that are multi-threaded or
+``PairLJCut``) and only overriding methods that are multi-threaded or
 need to be modified to support multi-threading (similar to what was done
 in the OPT package), b) keeping the structure in the modified code very
 similar so that side-by-side comparisons are still useful, and c)
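
Schematically, point a) amounts to a declaration like the following sketch
(illustrative only, not the actual class declaration in the OPENMP package):

.. code-block:: c++

   // the OPENMP variant inherits everything from the plain pair style and
   // overrides only the method(s) doing the thread-parallel work
   class PairLJCutOMP : public PairLJCut {
    public:
     PairLJCutOMP(class LAMMPS *lmp);
     void compute(int eflag, int vflag) override;   // threaded force kernel
     // settings(), coeff(), init_one(), ... are inherited unchanged
   };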