From 60fe24acb47aa47b65da4f1331bce0fbab0d38e5 Mon Sep 17 00:00:00 2001
From: jtclemm
Date: Wed, 14 Aug 2024 10:41:52 -0600
Subject: [PATCH] Edits to developer doc files

---
 doc/src/Developer_atom.rst        | 12 ++++++------
 doc/src/Developer_code_design.rst |  9 +++++----
 doc/src/Developer_flow.rst        |  2 +-
 doc/src/Developer_par_comm.rst    |  5 ++---
 doc/src/Developer_par_long.rst    |  4 ++--
 doc/src/Developer_par_neigh.rst   |  2 +-
 doc/src/Developer_par_openmp.rst  |  6 +++---
 7 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/doc/src/Developer_atom.rst b/doc/src/Developer_atom.rst
index 8bc187ae7f..5171943c85 100644
--- a/doc/src/Developer_atom.rst
+++ b/doc/src/Developer_atom.rst
@@ -2,7 +2,7 @@ Accessing per-atom data
 -----------------------

 This page discusses how per-atom data is managed in LAMMPS, how it can
-be accessed, what communication patters apply, and some of the utility
+be accessed, what communication patterns apply, and some of the utility
 functions that exist for a variety of purposes.


@@ -14,11 +14,11 @@ As described on the :doc:`parallel partitioning algorithms
 simulation domain, either in a *brick* or *tiled* manner. Each MPI
 process *owns* exactly one subdomain and the atoms within it. To compute
 forces for tuples of atoms that are spread across sub-domain boundaries,
-also a "halo" of *ghost* atoms are maintained within a the communication
+also a "halo" of *ghost* atoms are maintained within the communication
 cutoff distance of its subdomain.

 The total number of atoms is stored in `Atom::natoms` (within any
-typical class this can be referred to at `atom->natoms`. The number of
+typical class this can be referred to at `atom->natoms`). The number of
 *owned* (or "local" atoms) are stored in `Atom::nlocal`; the number of
 *ghost* atoms is stored in `Atom::nghost`. The sum of `Atom::nlocal`
 over all MPI processes should be `Atom::natoms`. This is by default
@@ -27,8 +27,8 @@ LAMMPS stops with a "lost atoms" error.
 For convenience also the property `Atom::nmax` is available, this is the
 maximum of `Atom::nlocal + Atom::nghost` across all MPI processes.

-Per-atom properties are either managed by the atom style, or individual
-classes. or as custom arrays by the individual classes. If only access
+Per-atom properties are either managed by the atom style, individual
+classes, or as custom arrays by the individual classes. If only access
 to *owned* atoms is needed, they are usually allocated to be of size
 `Atom::nlocal`, otherwise of size `Atom::nmax`. Please note that not
 all per-atom properties are available or updated on *ghost* atoms. For
@@ -61,7 +61,7 @@ can be found via the `Atom::sametag` array. It points to the next atom
 index with the same tag or -1 if there are no more atoms with the same
 tag. The list will be exhaustive when starting with an index of an
 *owned* atom, since the atom IDs are unique, so there can only be one
-such atom. Example code to count atoms with same atom ID in subdomain:
+such atom. Example code to count atoms with same atom ID in a subdomain:

 .. code-block:: c++

diff --git a/doc/src/Developer_code_design.rst b/doc/src/Developer_code_design.rst
index 844dbd0512..974266ec7f 100644
--- a/doc/src/Developer_code_design.rst
+++ b/doc/src/Developer_code_design.rst
@@ -69,7 +69,7 @@ The basic LAMMPS class hierarchy which is created by the LAMMPS class
 constructor is shown in :ref:`class-topology`. When input commands
 are processed, additional class instances are created, or deleted, or
 replaced. Likewise, specific member functions of specific classes are
-called to trigger actions such creating atoms, computing forces,
+called to trigger actions such as creating atoms, computing forces,
 computing properties, time-propagating the system, or writing output.

 Compositing and Inheritance
@@ -110,9 +110,10 @@ As mentioned above, there can be multiple instances of classes derived
 from the ``Fix`` or ``Compute`` base classes. They represent a
 different facet of LAMMPS' flexibility, as they provide methods which
 can be called at different points within a timestep, as explained in
-`Developer_flow`. This allows the input script to tailor how a specific
-simulation is run, what diagnostic computations are performed, and how
-the output of those computations is further processed or output.
+the :doc:`How a timestep works ` doc page. This allows
+the input script to tailor how a specific simulation is run, what
+diagnostic computations are performed, and how the output of those
+computations is further processed or output.

 Additional code sharing is possible by creating derived classes from the
 derived classes (e.g., to implement an accelerated version of a pair
diff --git a/doc/src/Developer_flow.rst b/doc/src/Developer_flow.rst
index 115d6ee6ae..17d75879ca 100644
--- a/doc/src/Developer_flow.rst
+++ b/doc/src/Developer_flow.rst
@@ -128,7 +128,7 @@ reflect particles off box boundaries in the :doc:`FixWallReflect class
 The ``decide()`` method in the Neighbor class determines whether
 neighbor lists need to be rebuilt on the current timestep (conditions
 can be changed using the :doc:`neigh_modify every/delay/check
-` command. If not, coordinates of ghost atoms are
+` command). If not, coordinates of ghost atoms are
 acquired by each processor via the ``forward_comm()`` method of the Comm
 class. If neighbor lists need to be built, several operations within
 the inner if clause of the pseudocode are first invoked. The
diff --git a/doc/src/Developer_par_comm.rst b/doc/src/Developer_par_comm.rst
index 88f0bf7fe9..2f94beb96f 100644
--- a/doc/src/Developer_par_comm.rst
+++ b/doc/src/Developer_par_comm.rst
@@ -4,8 +4,7 @@ Communication
 Following the selected partitioning scheme, all per-atom data is
 distributed across the MPI processes, which allows LAMMPS to handle very
 large systems provided it uses a correspondingly large number of MPI
-processes. Since The per-atom data (atom IDs, positions, velocities,
-types, etc.) To be able to compute the short-range interactions, MPI
+processes. To be able to compute the short-range interactions, MPI
 processes need not only access to the data of atoms they "own" but also
 information about atoms from neighboring subdomains, in LAMMPS referred
 to as "ghost" atoms. These are copies of atoms storing required
@@ -37,7 +36,7 @@ be larger than half the simulation domain.

 Efficient communication patterns are needed to update the "ghost" atom
 data, since that needs to be done at every MD time step or minimization
-step. The diagrams of the `ghost-atom-comm` figure illustrate how ghost
+step. The diagrams of the :ref:`ghost-atom-comm` figure illustrate how ghost
 atom communication is performed in two stages for a 2d simulation (three
 in 3d) for both a regular and irregular partitioning of the simulation
 box. For the regular case (left) atoms are exchanged first in the
diff --git a/doc/src/Developer_par_long.rst b/doc/src/Developer_par_long.rst
index 73b92e47c2..ecbcd81e64 100644
--- a/doc/src/Developer_par_long.rst
+++ b/doc/src/Developer_par_long.rst
@@ -93,7 +93,7 @@ processors, since each tile in the initial tiling overlaps with a
 handful of tiles in the final tiling.

 The transformations could also be done using collective communication
-across all $P$ processors with a single call to ``MPI_Alltoall()``, but
+across all :math:`P` processors with a single call to ``MPI_Alltoall()``, but
 this is typically much slower. However, for the specialized brick and
 pencil tiling illustrated in :ref:`fft-parallel` figure, collective
 communication across the entire MPI communicator is not required. In
@@ -138,7 +138,7 @@ grid/particle operations that LAMMPS supports:
   :math:`O(P^{\frac{1}{2}})`.

 - For efficiency in performing 1d FFTs, the grid transpose
-  operations illustrated in Figure \ref{fig:fft} also involve
+  operations illustrated in Figure :ref:`fft-parallel` also involve
   reordering the 3d data so that a different dimension is contiguous
   in memory. This reordering can be done during the packing or
   unpacking of buffers for MPI communication.
diff --git a/doc/src/Developer_par_neigh.rst b/doc/src/Developer_par_neigh.rst
index c3eaf9c0f7..9fa279c7d0 100644
--- a/doc/src/Developer_par_neigh.rst
+++ b/doc/src/Developer_par_neigh.rst
@@ -149,7 +149,7 @@ supports:

 - Dependent on the "pair" setting of the :doc:`newton ` command,
   the "half" neighbor lists may contain **all** pairs of atoms where
-  atom *j* is a ghost atom (i.e. when the newton pair setting is *off*)
+  atom *j* is a ghost atom (i.e. when the newton pair setting is *off*).
   For the newton pair *on* setting the atom *j* is only added to the
   list if its *z* coordinate is larger, or if equal the *y* coordinate
   is larger, and that is equal, too, the *x* coordinate is larger. For
diff --git a/doc/src/Developer_par_openmp.rst b/doc/src/Developer_par_openmp.rst
index 7e2021de67..5551e185df 100644
--- a/doc/src/Developer_par_openmp.rst
+++ b/doc/src/Developer_par_openmp.rst
@@ -1,13 +1,13 @@
 OpenMP Parallelism
 ^^^^^^^^^^^^^^^^^^

-The styles in the INTEL, KOKKOS, and OPENMP package offer to use OpenMP
+The styles in the INTEL, KOKKOS, and OPENMP packages offer to use OpenMP
 thread parallelism to predominantly distribute loops over local data
 and thus follow an orthogonal parallelization strategy to the
 decomposition into spatial domains used by the :doc:`MPI partitioning
 `. For clarity, this section discusses only the
 implementation in the OPENMP package, as it is the simplest. The INTEL
-and KOKKOS package offer additional options and are more complex since
+and KOKKOS packages offer additional options and are more complex since
 they support more features and different hardware like co-processors or
 GPUs.

@@ -16,7 +16,7 @@ keep the changes to the source code small, so that it would be easier
 to maintain the code and keep it in sync with the non-threaded standard
 implementation. This is achieved by a) making the OPENMP version a
 derived class from the regular version (e.g. ``PairLJCutOMP`` from
-``PairLJCut``) and overriding only methods that are multi-threaded or
+``PairLJCut``) and only overriding methods that are multi-threaded or
 need to be modified to support multi-threading (similar to what was done
 in the OPT package), b) keeping the structure in the modified code very
 similar so that side-by-side comparisons are still useful, and c)