Edits to developer doc files
@ -2,7 +2,7 @@ Accessing per-atom data
|
|||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
This page discusses how per-atom data is managed in LAMMPS, how it can
|
This page discusses how per-atom data is managed in LAMMPS, how it can
|
||||||
be accessed, what communication patters apply, and some of the utility
|
be accessed, what communication patterns apply, and some of the utility
|
||||||
functions that exist for a variety of purposes.
|
functions that exist for a variety of purposes.
|
||||||
|
|
||||||
|
|
||||||
@ -14,11 +14,11 @@ As described on the :doc:`parallel partitioning algorithms
|
|||||||
simulation domain, either in a *brick* or *tiled* manner. Each MPI
|
simulation domain, either in a *brick* or *tiled* manner. Each MPI
|
||||||
process *owns* exactly one subdomain and the atoms within it. To compute
|
process *owns* exactly one subdomain and the atoms within it. To compute
|
||||||
forces for tuples of atoms that are spread across sub-domain boundaries,
|
forces for tuples of atoms that are spread across sub-domain boundaries,
|
||||||
also a "halo" of *ghost* atoms are maintained within a the communication
|
also a "halo" of *ghost* atoms are maintained within the communication
|
||||||
cutoff distance of its subdomain.
|
cutoff distance of its subdomain.
|
||||||
|
|
||||||
The total number of atoms is stored in `Atom::natoms` (within any
|
The total number of atoms is stored in `Atom::natoms` (within any
|
||||||
typical class this can be referred to at `atom->natoms`. The number of
|
typical class this can be referred to at `atom->natoms`). The number of
|
||||||
*owned* (or "local" atoms) are stored in `Atom::nlocal`; the number of
|
*owned* (or "local" atoms) are stored in `Atom::nlocal`; the number of
|
||||||
*ghost* atoms is stored in `Atom::nghost`. The sum of `Atom::nlocal`
|
*ghost* atoms is stored in `Atom::nghost`. The sum of `Atom::nlocal`
|
||||||
over all MPI processes should be `Atom::natoms`. This is by default
|
over all MPI processes should be `Atom::natoms`. This is by default
|
||||||
@ -27,8 +27,8 @@ LAMMPS stops with a "lost atoms" error. For convenience also the
|
|||||||
property `Atom::nmax` is available, this is the maximum of
|
property `Atom::nmax` is available, this is the maximum of
|
||||||
`Atom::nlocal + Atom::nghost` across all MPI processes.
|
`Atom::nlocal + Atom::nghost` across all MPI processes.
|
||||||
|
|
||||||
Per-atom properties are either managed by the atom style, or individual
|
Per-atom properties are either managed by the atom style, individual
|
||||||
classes. or as custom arrays by the individual classes. If only access
|
classes, or as custom arrays by the individual classes. If only access
|
||||||
to *owned* atoms is needed, they are usually allocated to be of size
|
to *owned* atoms is needed, they are usually allocated to be of size
|
||||||
`Atom::nlocal`, otherwise of size `Atom::nmax`. Please note that not all
|
`Atom::nlocal`, otherwise of size `Atom::nmax`. Please note that not all
|
||||||
per-atom properties are available or updated on *ghost* atoms. For
|
per-atom properties are available or updated on *ghost* atoms. For
|
||||||
@ -61,7 +61,7 @@ can be found via the `Atom::sametag` array. It points to the next atom
|
|||||||
index with the same tag or -1 if there are no more atoms with the same
|
index with the same tag or -1 if there are no more atoms with the same
|
||||||
tag. The list will be exhaustive when starting with an index of an
|
tag. The list will be exhaustive when starting with an index of an
|
||||||
*owned* atom, since the atom IDs are unique, so there can only be one
|
*owned* atom, since the atom IDs are unique, so there can only be one
|
||||||
such atom. Example code to count atoms with same atom ID in subdomain:
|
such atom. Example code to count atoms with same atom ID in a subdomain:
|
||||||
|
|
||||||
.. code-block:: c++
|
.. code-block:: c++
|
||||||
|
|
||||||
|
|||||||
@ -69,7 +69,7 @@ The basic LAMMPS class hierarchy which is created by the LAMMPS class
|
|||||||
constructor is shown in :ref:`class-topology`. When input commands
|
constructor is shown in :ref:`class-topology`. When input commands
|
||||||
are processed, additional class instances are created, or deleted, or
|
are processed, additional class instances are created, or deleted, or
|
||||||
replaced. Likewise, specific member functions of specific classes are
|
replaced. Likewise, specific member functions of specific classes are
|
||||||
called to trigger actions such creating atoms, computing forces,
|
called to trigger actions such as creating atoms, computing forces,
|
||||||
computing properties, time-propagating the system, or writing output.
|
computing properties, time-propagating the system, or writing output.
|
||||||
|
|
||||||
Compositing and Inheritance
|
Compositing and Inheritance
|
||||||
@ -110,9 +110,10 @@ As mentioned above, there can be multiple instances of classes derived
|
|||||||
from the ``Fix`` or ``Compute`` base classes. They represent a
|
from the ``Fix`` or ``Compute`` base classes. They represent a
|
||||||
different facet of LAMMPS' flexibility, as they provide methods which
|
different facet of LAMMPS' flexibility, as they provide methods which
|
||||||
can be called at different points within a timestep, as explained in
|
can be called at different points within a timestep, as explained in
|
||||||
`Developer_flow`. This allows the input script to tailor how a specific
|
the :doc:`How a timestep works <Developer_flow>` doc page. This allows
|
||||||
simulation is run, what diagnostic computations are performed, and how
|
the input script to tailor how a specific simulation is run, what
|
||||||
the output of those computations is further processed or output.
|
diagnostic computations are performed, and how the output of those
|
||||||
|
computations is further processed or output.
|
||||||
|
|
||||||
Additional code sharing is possible by creating derived classes from the
|
Additional code sharing is possible by creating derived classes from the
|
||||||
derived classes (e.g., to implement an accelerated version of a pair
|
derived classes (e.g., to implement an accelerated version of a pair
|
||||||
|
|||||||
@ -128,7 +128,7 @@ reflect particles off box boundaries in the :doc:`FixWallReflect class
|
|||||||
The ``decide()`` method in the Neighbor class determines whether
|
The ``decide()`` method in the Neighbor class determines whether
|
||||||
neighbor lists need to be rebuilt on the current timestep (conditions
|
neighbor lists need to be rebuilt on the current timestep (conditions
|
||||||
can be changed using the :doc:`neigh_modify every/delay/check
|
can be changed using the :doc:`neigh_modify every/delay/check
|
||||||
<neigh_modify>` command. If not, coordinates of ghost atoms are
|
<neigh_modify>` command). If not, coordinates of ghost atoms are
|
||||||
acquired by each processor via the ``forward_comm()`` method of the Comm
|
acquired by each processor via the ``forward_comm()`` method of the Comm
|
||||||
class. If neighbor lists need to be built, several operations within
|
class. If neighbor lists need to be built, several operations within
|
||||||
the inner if clause of the pseudocode are first invoked. The
|
the inner if clause of the pseudocode are first invoked. The
|
||||||
|
|||||||
@ -4,8 +4,7 @@ Communication
|
|||||||
Following the selected partitioning scheme, all per-atom data is
|
Following the selected partitioning scheme, all per-atom data is
|
||||||
distributed across the MPI processes, which allows LAMMPS to handle very
|
distributed across the MPI processes, which allows LAMMPS to handle very
|
||||||
large systems provided it uses a correspondingly large number of MPI
|
large systems provided it uses a correspondingly large number of MPI
|
||||||
processes. Since The per-atom data (atom IDs, positions, velocities,
|
processes. To be able to compute the short-range interactions, MPI
|
||||||
types, etc.) To be able to compute the short-range interactions, MPI
|
|
||||||
processes need not only access to the data of atoms they "own" but also
|
processes need not only access to the data of atoms they "own" but also
|
||||||
information about atoms from neighboring subdomains, in LAMMPS referred
|
information about atoms from neighboring subdomains, in LAMMPS referred
|
||||||
to as "ghost" atoms. These are copies of atoms storing required
|
to as "ghost" atoms. These are copies of atoms storing required
|
||||||
@ -37,7 +36,7 @@ be larger than half the simulation domain.
|
|||||||
|
|
||||||
Efficient communication patterns are needed to update the "ghost" atom
|
Efficient communication patterns are needed to update the "ghost" atom
|
||||||
data, since that needs to be done at every MD time step or minimization
|
data, since that needs to be done at every MD time step or minimization
|
||||||
step. The diagrams of the `ghost-atom-comm` figure illustrate how ghost
|
step. The diagrams of the :ref:`ghost-atom-comm` figure illustrate how ghost
|
||||||
atom communication is performed in two stages for a 2d simulation (three
|
atom communication is performed in two stages for a 2d simulation (three
|
||||||
in 3d) for both a regular and irregular partitioning of the simulation
|
in 3d) for both a regular and irregular partitioning of the simulation
|
||||||
box. For the regular case (left) atoms are exchanged first in the
|
box. For the regular case (left) atoms are exchanged first in the
|
||||||
|
|||||||
@ -93,7 +93,7 @@ processors, since each tile in the initial tiling overlaps with a
|
|||||||
handful of tiles in the final tiling.
|
handful of tiles in the final tiling.
|
||||||
|
|
||||||
The transformations could also be done using collective communication
|
The transformations could also be done using collective communication
|
||||||
across all $P$ processors with a single call to ``MPI_Alltoall()``, but
|
across all :math:`P` processors with a single call to ``MPI_Alltoall()``, but
|
||||||
this is typically much slower. However, for the specialized brick and
|
this is typically much slower. However, for the specialized brick and
|
||||||
pencil tiling illustrated in :ref:`fft-parallel` figure, collective
|
pencil tiling illustrated in :ref:`fft-parallel` figure, collective
|
||||||
communication across the entire MPI communicator is not required. In
|
communication across the entire MPI communicator is not required. In
|
||||||
@ -138,7 +138,7 @@ grid/particle operations that LAMMPS supports:
|
|||||||
:math:`O(P^{\frac{1}{2}})`.
|
:math:`O(P^{\frac{1}{2}})`.
|
||||||
|
|
||||||
- For efficiency in performing 1d FFTs, the grid transpose
|
- For efficiency in performing 1d FFTs, the grid transpose
|
||||||
operations illustrated in Figure \ref{fig:fft} also involve
|
operations illustrated in Figure :ref:`fft-parallel` also involve
|
||||||
reordering the 3d data so that a different dimension is contiguous
|
reordering the 3d data so that a different dimension is contiguous
|
||||||
in memory. This reordering can be done during the packing or
|
in memory. This reordering can be done during the packing or
|
||||||
unpacking of buffers for MPI communication.
|
unpacking of buffers for MPI communication.
|
||||||
|
|||||||
@ -149,7 +149,7 @@ supports:
|
|||||||
|
|
||||||
- Dependent on the "pair" setting of the :doc:`newton <newton>` command,
|
- Dependent on the "pair" setting of the :doc:`newton <newton>` command,
|
||||||
the "half" neighbor lists may contain **all** pairs of atoms where
|
the "half" neighbor lists may contain **all** pairs of atoms where
|
||||||
atom *j* is a ghost atom (i.e. when the newton pair setting is *off*)
|
atom *j* is a ghost atom (i.e. when the newton pair setting is *off*).
|
||||||
For the newton pair *on* setting the atom *j* is only added to the
|
For the newton pair *on* setting the atom *j* is only added to the
|
||||||
list if its *z* coordinate is larger, or if equal the *y* coordinate
|
list if its *z* coordinate is larger, or if equal the *y* coordinate
|
||||||
is larger, and that is equal, too, the *x* coordinate is larger. For
|
is larger, and that is equal, too, the *x* coordinate is larger. For
|
||||||
|
|||||||
@ -1,13 +1,13 @@
|
|||||||
OpenMP Parallelism
|
OpenMP Parallelism
|
||||||
^^^^^^^^^^^^^^^^^^
|
^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
The styles in the INTEL, KOKKOS, and OPENMP package offer to use OpenMP
|
The styles in the INTEL, KOKKOS, and OPENMP packages offer to use OpenMP
|
||||||
thread parallelism to predominantly distribute loops over local data
|
thread parallelism to predominantly distribute loops over local data
|
||||||
and thus follow an orthogonal parallelization strategy to the
|
and thus follow an orthogonal parallelization strategy to the
|
||||||
decomposition into spatial domains used by the :doc:`MPI partitioning
|
decomposition into spatial domains used by the :doc:`MPI partitioning
|
||||||
<Developer_par_part>`. For clarity, this section discusses only the
|
<Developer_par_part>`. For clarity, this section discusses only the
|
||||||
implementation in the OPENMP package, as it is the simplest. The INTEL
|
implementation in the OPENMP package, as it is the simplest. The INTEL
|
||||||
and KOKKOS package offer additional options and are more complex since
|
and KOKKOS packages offer additional options and are more complex since
|
||||||
they support more features and different hardware like co-processors
|
they support more features and different hardware like co-processors
|
||||||
or GPUs.
|
or GPUs.
|
||||||
|
|
||||||
@ -16,7 +16,7 @@ keep the changes to the source code small, so that it would be easier to
|
|||||||
maintain the code and keep it in sync with the non-threaded standard
|
maintain the code and keep it in sync with the non-threaded standard
|
||||||
implementation. This is achieved by a) making the OPENMP version a
|
implementation. This is achieved by a) making the OPENMP version a
|
||||||
derived class from the regular version (e.g. ``PairLJCutOMP`` from
|
derived class from the regular version (e.g. ``PairLJCutOMP`` from
|
||||||
``PairLJCut``) and overriding only methods that are multi-threaded or
|
``PairLJCut``) and only overriding methods that are multi-threaded or
|
||||||
need to be modified to support multi-threading (similar to what was done
|
need to be modified to support multi-threading (similar to what was done
|
||||||
in the OPT package), b) keeping the structure in the modified code very
|
in the OPT package), b) keeping the structure in the modified code very
|
||||||
similar so that side-by-side comparisons are still useful, and c)
|
similar so that side-by-side comparisons are still useful, and c)
|
||||||
|
|||||||