From f65f79ef82a08f56dfad8577b58b34deb0288cc6 Mon Sep 17 00:00:00 2001 From: Axel Kohlmeyer Date: Sun, 22 Jan 2023 08:33:04 -0500 Subject: [PATCH] revise based on suggestions from languagetool.org --- doc/src/Developer_comm_ops.rst | 4 +- doc/src/Developer_flow.rst | 6 +- doc/src/Developer_grid.rst | 22 +++--- doc/src/Developer_notes.rst | 4 +- doc/src/Developer_par_comm.rst | 20 +++--- doc/src/Developer_par_long.rst | 10 +-- doc/src/Developer_par_neigh.rst | 14 ++-- doc/src/Developer_par_part.rst | 28 ++++---- doc/src/Developer_parallel.rst | 2 +- doc/src/Developer_utils.rst | 2 +- doc/src/Errors_messages.rst | 14 ++-- doc/src/Errors_warnings.rst | 12 ++-- doc/src/Howto_grid.rst | 4 +- doc/src/Howto_output.rst | 4 +- doc/src/Howto_peri.rst | 10 +-- doc/src/Howto_triclinic.rst | 2 +- doc/src/Intro_citing.rst | 10 +-- doc/src/Intro_features.rst | 8 +-- doc/src/Intro_nonfeatures.rst | 78 +++++++++++---------- doc/src/Intro_overview.rst | 30 ++++---- doc/src/Intro_portability.rst | 34 ++++----- doc/src/Library_properties.rst | 2 +- doc/src/Manual.rst | 25 +++---- doc/src/Manual_version.rst | 35 ++++----- doc/src/Python_atoms.rst | 2 +- doc/src/Python_properties.rst | 2 +- doc/src/Speed.rst | 2 +- doc/src/Speed_gpu.rst | 8 +-- doc/src/Speed_kokkos.rst | 4 +- doc/src/Speed_omp.rst | 4 +- doc/src/atom_modify.rst | 2 +- doc/src/balance.rst | 32 ++++----- doc/src/boundary.rst | 2 +- doc/src/comm_modify.rst | 4 +- doc/src/compute.rst | 2 +- doc/src/compute_pressure.rst | 4 +- doc/src/compute_property_grid.rst | 2 +- doc/src/compute_sna_atom.rst | 2 +- doc/src/dump_image.rst | 20 +++--- doc/src/fix.rst | 2 +- doc/src/fix_ave_grid.rst | 2 +- doc/src/fix_balance.rst | 20 +++--- doc/src/fix_box_relax.rst | 2 +- doc/src/fix_deform.rst | 2 +- doc/src/fix_lb_fluid.rst | 2 +- doc/src/fix_nh.rst | 2 +- doc/src/fix_npt_cauchy.rst | 2 +- doc/src/fix_shardlow.rst | 2 +- doc/src/fix_ttm.rst | 6 +- doc/src/package.rst | 4 +- doc/src/pair_dsmc.rst | 2 +- doc/src/pair_none.rst | 2 +- doc/src/processors.rst | 12 ++-- doc/src/replicate.rst | 4 +- doc/src/thermo_modify.rst | 2 +- doc/utils/sphinx-config/false_positives.txt | 2 + 56 files changed, 275 insertions(+), 267 deletions(-) diff --git a/doc/src/Developer_comm_ops.rst b/doc/src/Developer_comm_ops.rst index d343fc073a..5eac8e46de 100644 --- a/doc/src/Developer_comm_ops.rst +++ b/doc/src/Developer_comm_ops.rst @@ -14,8 +14,8 @@ Owned and ghost atoms As described on the :doc:`parallel partitioning algorithms ` page, LAMMPS spatially decomposes the simulation domain, either in a *brick* or *tiled* manner. Each processor (MPI -task) owns atoms within its sub-domain and additionally stores ghost -atoms within a cutoff distance of its sub-domain. +task) owns atoms within its subdomain and additionally stores ghost +atoms within a cutoff distance of its subdomain. Forward and reverse communication ================================= diff --git a/doc/src/Developer_flow.rst b/doc/src/Developer_flow.rst index d6f35a4b70..d582043492 100644 --- a/doc/src/Developer_flow.rst +++ b/doc/src/Developer_flow.rst @@ -139,7 +139,7 @@ Periodic boundary conditions are then applied by the Domain class via its ``pbc()`` method to remap particles that have moved outside the simulation box back into the box. Note that this is not done every timestep, but only when neighbor lists are rebuilt. This is so that -each processor's sub-domain will have consistent (nearby) atom +each processor's subdomain will have consistent (nearby) atom coordinates for its owned and ghost atoms. 
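On reneighboring steps this remapping, performed by ``Domain::pbc()``, conceptually amounts to wrapping each owned atom back into the periodic box. The sketch below is illustrative only; it is not the actual LAMMPS implementation, omits the image-flag bookkeeping, and all names (``x``, ``boxlo``, ``boxhi``, ``prd``, ``periodic``) are placeholders for the corresponding Domain/Atom data.

.. code-block:: c++

   // conceptual sketch only (not Domain::pbc): wrap owned atoms of an
   // orthogonal box back into the periodic box; all names are placeholders
   void wrap_into_box(int nlocal, double **x, const double *boxlo,
                      const double *boxhi, const double *prd, const int *periodic)
   {
     for (int i = 0; i < nlocal; ++i)
       for (int d = 0; d < 3; ++d) {
         if (!periodic[d]) continue;
         if (x[i][d] < boxlo[d]) x[i][d] += prd[d];
         else if (x[i][d] >= boxhi[d]) x[i][d] -= prd[d];
       }
   }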
It is also why dumped atom coordinates may be slightly outside the simulation box if not dumped on a step where the neighbor lists are rebuilt. @@ -153,10 +153,10 @@ method of the Comm class and ``setup_bins()`` method of the Neighbor class perform the update. The code is now ready to migrate atoms that have left a processor's -geometric sub-domain to new processors. The ``exchange()`` method of +geometric subdomain to new processors. The ``exchange()`` method of the Comm class performs this operation. The ``borders()`` method of the Comm class then identifies ghost atoms surrounding each processor's -sub-domain and communicates ghost atom information to neighboring +subdomain and communicates ghost atom information to neighboring processors. It does this by looping over all the atoms owned by a processor to make lists of those to send to each neighbor processor. On subsequent timesteps, the lists are used by the ``Comm::forward_comm()`` diff --git a/doc/src/Developer_grid.rst b/doc/src/Developer_grid.rst index f60a688c5b..cbc3610c9f 100644 --- a/doc/src/Developer_grid.rst +++ b/doc/src/Developer_grid.rst @@ -28,9 +28,9 @@ grid. More specifically, a grid point is defined for each cell (by default the center point), and a processor owns a grid cell if its point is -within the processor's spatial sub-domain. The union of processor -sub-domains is the global simulation box. If a grid point is on the -boundary of two sub-domains, the lower processor owns the grid cell. A +within the processor's spatial subdomain. The union of processor +subdomains is the global simulation box. If a grid point is on the +boundary of two subdomains, the lower processor owns the grid cell. A processor may also store copies of ghost cells which surround its owned cells. @@ -62,7 +62,7 @@ y-dimension. It is even possible to define a 1x1x1 3d grid, though it may be inefficient to use it in a computational sense. Note that the choice of grid size is independent of the number of -processors or their layout in a grid of processor sub-domains which +processors or their layout in a grid of processor subdomains which overlays the simulations domain. Depending on the distributed grid size, a single processor may own many 1000s or no grid cells. @@ -235,7 +235,7 @@ invoked, because they influence its operation. void set_zfactor(double factor); Processors own a grid cell if a point within the grid cell is inside -the processor's sub-domain. By default this is the center point of the +the processor's subdomain. By default this is the center point of the grid cell. The *set_shift_grid()* method can change this. The *shift* argument is a value from 0.0 to 1.0 (inclusive) which is the offset of the point within the grid cell in each dimension. The default is 0.5 @@ -245,9 +245,9 @@ typically no need to change the default as it is optimal for minimizing the number of ghost cells needed. If a processor maps its particles to grid cells, it needs to allow for -its particles being outside its sub-domain between reneighboring. The +its particles being outside its subdomain between reneighboring. The *distance* argument of the *set_distance()* method sets the furthest -distance outside a processor's sub-domain which a particle can move. +distance outside a processor's subdomain which a particle can move. Typically this is half the neighbor skin distance, assuming reneighboring is done appropriately. 
This distance is used in determining how many ghost cells a processor needs to store to enable @@ -295,7 +295,7 @@ to the Grid class via the *set_zfactor()* method (*set_yfactor()* for 2d grids). The Grid class will then assign ownership of the 1/3 of grid cells that overlay the simulation box to the processors which also overlay the simulation box. The remaining 2/3 of the grid cells -are assigned to processors whose sub-domains are adjacent to the upper +are assigned to processors whose subdomains are adjacent to the upper z boundary of the simulation box. ---------- @@ -549,13 +549,13 @@ Grid class remap methods for load balancing The following methods are used when a load-balancing operation, triggered by the :doc:`balance ` or :doc:`fix balance ` commands, changes the partitioning of the simulation -domain into processor sub-domains. +domain into processor subdomains. In order to work with load-balancing, any style command (compute, fix, pair, or kspace style) which allocates a grid and stores per-grid data should define a *reset_grid()* method; it takes no arguments. It will be called by the two balance commands after they have reset processor -sub-domains and migrated atoms (particles) to new owning processors. +subdomains and migrated atoms (particles) to new owning processors. The *reset_grid()* method will typically perform some or all of the following operations. See the src/fix_ave_grid.cpp and src/EXTRA_FIX/fix_ttm_grid.cpp files for examples of *reset_grid()* @@ -564,7 +564,7 @@ functions. First, the *reset_grid()* method can instantiate new grid(s) of the same global size, then call *setup_grid()* to partition them via the -new processor sub-domains. At this point, it can invoke the +new processor subdomains. At this point, it can invoke the *identical()* method which compares the owned and ghost grid cell index bounds between two grids, the old grid passed as a pointer argument, and the new grid whose *identical()* method is being called. diff --git a/doc/src/Developer_notes.rst b/doc/src/Developer_notes.rst index a781737d6f..92121cca15 100644 --- a/doc/src/Developer_notes.rst +++ b/doc/src/Developer_notes.rst @@ -102,7 +102,7 @@ build is then :doc:`processed in parallel `. The most commonly required neighbor list is a so-called "half" neighbor list, where each pair of atoms is listed only once (except when the :doc:`newton command setting ` for pair is off; in that case -pairs straddling sub-domains or periodic boundaries will be listed twice). +pairs straddling subdomains or periodic boundaries will be listed twice). Thus these are the default settings when a neighbor list request is created in: .. code-block:: c++ @@ -361,7 +361,7 @@ allocated as a 1d vector or 3d array. Either way, the ordering of values within contiguous memory x fastest, then y, z slowest. For the ``3d decomposition`` of the grid, the global grid is -partitioned into bricks that correspond to the sub-domains of the +partitioned into bricks that correspond to the subdomains of the simulation box that each processor owns. Often, this is a regular 3d array (Px by Py by Pz) of bricks, where P = number of processors = Px * Py * Pz. More generally it can be a tiled decomposition, where diff --git a/doc/src/Developer_par_comm.rst b/doc/src/Developer_par_comm.rst index 2e108dda13..ec3ef87f47 100644 --- a/doc/src/Developer_par_comm.rst +++ b/doc/src/Developer_par_comm.rst @@ -7,16 +7,16 @@ large systems provided it uses a correspondingly large number of MPI processes. 
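Relating the neighbor-list defaults described in the Developer_notes.rst hunk above to code: a pair style files its list request in ``init_style()``. The fragment below is a sketch only, assuming the current ``Neighbor::add_request()`` API and a hypothetical ``PairFoo`` class; it is not a complete, compilable pair style.

.. code-block:: c++

   #include "neighbor.h"

   void PairFoo::init_style()
   {
     // default request: a perpetual "half" neighbor list for this pair style
     neighbor->add_request(this);

     // a "full" list (each i,j pair listed twice) would instead be requested as:
     // neighbor->add_request(this, NeighConst::REQ_FULL);
   }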
Since The per-atom data (atom IDs, positions, velocities, types, etc.) To be able to compute the short-range interactions MPI processes need not only access to data of atoms they "own" but also -information about atoms from neighboring sub-domains, in LAMMPS referred +information about atoms from neighboring subdomains, in LAMMPS referred to as "ghost" atoms. These are copies of atoms storing required per-atom data for up to the communication cutoff distance. The green dashed-line boxes in the :ref:`domain-decomposition` figure illustrate -the extended ghost-atom sub-domain for one processor. +the extended ghost-atom subdomain for one processor. This approach is also used to implement periodic boundary conditions: atoms that lie within the cutoff distance across a periodic boundary are also stored as ghost atoms and taken from the periodic -replication of the sub-domain, which may be the same sub-domain, e.g. if +replication of the subdomain, which may be the same subdomain, e.g. if running in serial. As a consequence of this, force computation in LAMMPS is not subject to minimum image conventions and thus cutoffs may be larger than half the simulation domain. @@ -28,10 +28,10 @@ be larger than half the simulation domain. ghost atom communication This figure shows the ghost atom communication patterns between - sub-domains for "brick" (left) and "tiled" communication styles for + subdomains for "brick" (left) and "tiled" communication styles for 2d simulations. The numbers indicate MPI process ranks. Here the - sub-domains are drawn spatially separated for clarity. The - dashed-line box is the extended sub-domain of processor 0 which + subdomains are drawn spatially separated for clarity. The + dashed-line box is the extended subdomain of processor 0 which includes its ghost atoms. The red- and blue-shaded boxes are the regions of communicated ghost atoms. @@ -42,7 +42,7 @@ atom communication is performed in two stages for a 2d simulation (three in 3d) for both a regular and irregular partitioning of the simulation box. For the regular case (left) atoms are exchanged first in the *x*-direction, then in *y*, with four neighbors in the grid of processor -sub-domains. +subdomains. In the *x* stage, processor ranks 1 and 2 send owned atoms in their red-shaded regions to rank 0 (and vice versa). Then in the *y* stage, @@ -55,7 +55,7 @@ For the irregular case (right) the two stages are similar, but a processor can have more than one neighbor in each direction. In the *x* stage, MPI ranks 1,2,3 send owned atoms in their red-shaded regions to rank 0 (and vice versa). These include only atoms between the lower -and upper *y*-boundary of rank 0's sub-domain. In the *y* stage, ranks +and upper *y*-boundary of rank 0's subdomain. In the *y* stage, ranks 4,5,6 send atoms in their blue-shaded regions to rank 0. This may include ghost atoms they received in the *x* stage, but only if they are needed by rank 0 to fill its extended ghost atom regions in the @@ -110,11 +110,11 @@ performed in LAMMPS: over 3x the length of a stretched bond for dihedral interactions. It can also exceed the periodic box size. For the regular communication pattern (left), if the cutoff distance extends beyond a neighbor - processor's sub-domain, then multiple exchanges are performed in the + processor's subdomain, then multiple exchanges are performed in the same direction. Each exchange is with the same neighbor processor, but buffers are packed/unpacked using a different list of atoms. 
For forward communication, in the first exchange a processor sends only owned atoms. In subsequent exchanges, it sends ghost atoms received in previous exchanges. For the irregular pattern (right) overlaps of - a processor's extended ghost-atom sub-domain with all other processors + a processor's extended ghost-atom subdomain with all other processors in each dimension are detected. diff --git a/doc/src/Developer_par_long.rst b/doc/src/Developer_par_long.rst index f297cf3fa6..cd240ffd21 100644 --- a/doc/src/Developer_par_long.rst +++ b/doc/src/Developer_par_long.rst @@ -20,7 +20,7 @@ e) electric field values from grid points near each atom are interpolated to com For any of the spatial-decomposition partitioning schemes each processor owns the brick-shaped portion of FFT grid points contained within its -sub-domain. The two interpolation operations use a stencil of grid +subdomain. The two interpolation operations use a stencil of grid points surrounding each atom. To accommodate the stencil size, each processor also stores a few layers of ghost grid points surrounding its brick. Forward and reverse communication of grid point values is @@ -64,7 +64,7 @@ direction of the 1d FFTs it has to perform. LAMMPS uses the pencil-decomposition algorithm as shown in the :ref:`fft-parallel` figure. Initially (far left), each processor owns a brick of same-color grid -cells (actually grid points) contained within in its sub-domain. A +cells (actually grid points) contained within in its subdomain. A brick-to-pencil communication operation converts this layout to 1d pencils in the *x*-dimension (center left). Again, cells of the same color are owned by the same processor. Each processor can then compute @@ -161,8 +161,8 @@ grid/particle operations that LAMMPS supports: ` calculation and then use the :doc:`verlet/split integrator ` to perform the PPPM computation on a dedicated, separate partition of MPI processes. This uses an integer - "1:*p*" mapping of *p* sub-domains of the atom decomposition to one - sub-domain of the FFT grid decomposition and where pairwise non-bonded + "1:*p*" mapping of *p* subdomains of the atom decomposition to one + subdomain of the FFT grid decomposition and where pairwise non-bonded and bonded forces and energies are computed on the larger partition and the PPPM kspace computation concurrently on the smaller partition. @@ -172,7 +172,7 @@ grid/particle operations that LAMMPS supports: - LAMMPS implements a ``GridComm`` class which overlays the simulation domain with a regular grid, partitions it across processors in a - manner consistent with processor sub-domains, and provides methods for + manner consistent with processor subdomains, and provides methods for forward and reverse communication of owned and ghost grid point values. It is used for PPPM as an FFT grid (as outlined above) and also for the MSM algorithm which uses a cascade of grid sizes from diff --git a/doc/src/Developer_par_neigh.rst b/doc/src/Developer_par_neigh.rst index 4b286d77d8..43cbf1b0e2 100644 --- a/doc/src/Developer_par_neigh.rst +++ b/doc/src/Developer_par_neigh.rst @@ -22,7 +22,7 @@ last reneighboring; this and other options of the neighbor list rebuild can be adjusted with the :doc:`neigh_modify ` command. On steps when reneighboring is performed, atoms which have moved outside -their owning processor's sub-domain are first migrated to new processors +their owning processor's subdomain are first migrated to new processors via communication. 
Periodic boundary conditions are also (only) enforced on these steps to ensure each atom is re-assigned to the correct processor. After migration, the atoms owned by each processor @@ -39,12 +39,12 @@ its settings modified with the :doc:`atom_modify ` command. neighbor list stencils - A 2d simulation sub-domain (thick black line) and the corresponding + A 2d simulation subdomain (thick black line) and the corresponding ghost atom cutoff region (dashed blue line) for both orthogonal (left) and triclinic (right) domains. A regular grid of neighbor bins (thin lines) overlays the entire simulation domain and need not - align with sub-domain boundaries; only the portion overlapping the - augmented sub-domain is shown. In the triclinic case it overlaps the + align with subdomain boundaries; only the portion overlapping the + augmented subdomain is shown. In the triclinic case it overlaps the bounding box of the tilted rectangle. The blue- and red-shaded bins represent a stencil of bins searched to find neighbors of a particular atom (black dot). @@ -52,8 +52,8 @@ its settings modified with the :doc:`atom_modify ` command. To build a local neighbor list in linear time, the simulation domain is overlaid (conceptually) with a regular 3d (or 2d) grid of neighbor bins, as shown in the :ref:`neighbor-stencil` figure for 2d models and a -single MPI processor's sub-domain. Each processor stores a set of -neighbor bins which overlap its sub-domain extended by the neighbor +single MPI processor's subdomain. Each processor stores a set of +neighbor bins which overlap its subdomain extended by the neighbor cutoff distance :math:`R_n`. As illustrated, the bins need not align with processor boundaries; an integer number in each dimension is fit to the size of the entire simulation box. @@ -144,7 +144,7 @@ supports: - For small and sparse systems and as a fallback method, LAMMPS also supports neighbor list construction without binning by using a full - :math:`O(N^2)` loop over all *i,j* atom pairs in a sub-domain when + :math:`O(N^2)` loop over all *i,j* atom pairs in a subdomain when using the :doc:`neighbor nsq ` command. - Dependent on the "pair" setting of the :doc:`newton ` command, diff --git a/doc/src/Developer_par_part.rst b/doc/src/Developer_par_part.rst index f797f559e2..be42857f63 100644 --- a/doc/src/Developer_par_part.rst +++ b/doc/src/Developer_par_part.rst @@ -15,8 +15,8 @@ distributed-memory parallelism is set with the :doc:`comm_style command for MPI parallelization: "brick" on the left with an orthogonal (left) and a triclinic (middle) simulation domain, and a "tiled" decomposition (right). The black lines show the division into - sub-domains and the contained atoms are "owned" by the corresponding - MPI process. The green dashed lines indicate how sub-domains are + subdomains and the contained atoms are "owned" by the corresponding + MPI process. The green dashed lines indicate how subdomains are extended with "ghost" atoms up to the communication cutoff distance. The LAMMPS simulation box is a 3d or 2d volume, which can be orthogonal @@ -32,14 +32,14 @@ means the position of the box face adjusts continuously to enclose all the atoms. For distributed-memory MPI parallelism, the simulation box is spatially -decomposed (partitioned) into non-overlapping sub-domains which fill the +decomposed (partitioned) into non-overlapping subdomains which fill the box. 
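For the default regular partitioning of an orthogonal box into a Px by Py by Pz grid of equal subdomains, the MPI rank that owns a given coordinate can be computed directly. The sketch below is schematic only; it assumes uniform splits and an x-fastest rank ordering, whereas the real code also handles non-uniform splits, triclinic boxes, and the "tiled" style.

.. code-block:: c++

   // schematic sketch: owning rank of (x,y,z) for a uniform "brick" decomposition;
   // boxlo/boxhi are global box bounds, px/py/pz the processor grid (placeholders)
   int owning_rank(double x, double y, double z,
                   const double *boxlo, const double *boxhi,
                   int px, int py, int pz)
   {
     auto cell = [](double r, double lo, double hi, int p) -> int {
       int i = static_cast<int>((r - lo) / (hi - lo) * p);
       if (i < 0) i = 0;
       if (i >= p) i = p - 1;       // guard against round-off at the upper face
       return i;
     };
     int ix = cell(x, boxlo[0], boxhi[0], px);
     int iy = cell(y, boxlo[1], boxhi[1], py);
     int iz = cell(z, boxlo[2], boxhi[2], pz);
     return (iz * py + iy) * px + ix;
   }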
The default partitioning, "brick", is most suitable when atom density is roughly uniform, as shown in the left-side images of the -:ref:`domain-decomposition` figure. The sub-domains comprise a regular -grid and all sub-domains are identical in size and shape. Both the +:ref:`domain-decomposition` figure. The subdomains comprise a regular +grid and all subdomains are identical in size and shape. Both the orthogonal and triclinic boxes can deform continuously during a simulation, e.g. to compress a solid or shear a liquid, in which case -the processor sub-domains likewise deform. +the processor subdomains likewise deform. For models with non-uniform density, the number of particles per @@ -50,14 +50,14 @@ load. For such models, LAMMPS supports multiple strategies to reduce the load imbalance: - The processor grid decomposition is by default based on the simulation - cell volume and tries to optimize the volume to surface ratio for the sub-domains. + cell volume and tries to optimize the volume to surface ratio for the subdomains. This can be changed with the :doc:`processors command `. -- The parallel planes defining the size of the sub-domains can be shifted +- The parallel planes defining the size of the subdomains can be shifted with the :doc:`balance command `. Which can be done in addition to choosing a more optimal processor grid. - The recursive bisectioning algorithm in combination with the "tiled" communication style can produce a partitioning with equal numbers of - particles in each sub-domain. + particles in each subdomain. .. |decomp1| image:: img/decomp-regular.png @@ -76,14 +76,14 @@ the load imbalance: The pictures above demonstrate different decompositions for a 2d system with 12 MPI ranks. The atom colors indicate the load imbalance of each -sub-domain with green being optimal and red the least optimal. +subdomain with green being optimal and red the least optimal. Due to the vacuum in the system, the default decomposition is unbalanced with several MPI ranks without atoms (left). By forcing a 1x12x1 processor grid, every MPI rank does computations now, but number of -atoms per sub-domain is still uneven and the thin slice shape increases -the amount of communication between sub-domains (center left). With a -2x6x1 processor grid and shifting the sub-domain divisions, the load +atoms per subdomain is still uneven and the thin slice shape increases +the amount of communication between subdomains (center left). With a +2x6x1 processor grid and shifting the subdomain divisions, the load imbalance is further reduced and the amount of communication required -between sub-domains is less (center right). And using the recursive +between subdomains is less (center right). And using the recursive bisectioning leads to further improved decomposition (right). diff --git a/doc/src/Developer_parallel.rst b/doc/src/Developer_parallel.rst index f649920dc5..cbe61a1754 100644 --- a/doc/src/Developer_parallel.rst +++ b/doc/src/Developer_parallel.rst @@ -7,7 +7,7 @@ decomposition. The parallelization aims to be efficient, and resulting in good strong scaling (= good speedup for the same system) and good weak scaling (= the computational cost of enlarging the system is proportional to the system size). Additional parallelization using GPUs -or OpenMP can also be applied within the sub-domain assigned to an MPI +or OpenMP can also be applied within the subdomain assigned to an MPI process. For clarity, most of the following illustrations show the 2d simulation case. 
The underlying algorithms in those cases, however, apply to both 2d and 3d cases equally well. diff --git a/doc/src/Developer_utils.rst b/doc/src/Developer_utils.rst index 35a763b7a1..53ec0ad343 100644 --- a/doc/src/Developer_utils.rst +++ b/doc/src/Developer_utils.rst @@ -647,7 +647,7 @@ Communication buffer coding with *ubuf* --------------------------------------- LAMMPS uses communication buffers where it collects data from various -class instances and then exchanges the data with neighboring sub-domains. +class instances and then exchanges the data with neighboring subdomains. For simplicity those buffers are defined as ``double`` buffers and used for doubles and integer numbers. This presents a unique problem when 64-bit integers are used. While the storage needed for a ``double`` diff --git a/doc/src/Errors_messages.rst b/doc/src/Errors_messages.rst index b2a3c3cafb..b157d53007 100644 --- a/doc/src/Errors_messages.rst +++ b/doc/src/Errors_messages.rst @@ -5635,7 +5635,7 @@ Doc page with :doc:`WARNING messages ` Lost atoms are checked for each time thermo output is done. See the thermo_modify lost command for options. Lost atoms usually indicate bad dynamics, e.g. atoms have been blown far out of the simulation - box, or moved further than one processor's sub-domain away before + box, or moved further than one processor's subdomain away before reneighboring. *MEAM library error %d* @@ -6266,14 +6266,14 @@ keyword to allow for additional bonds to be formed One or more atoms are attempting to map their charge to a MSM grid point that is not owned by a processor. This is likely for one of two reasons, both of them bad. First, it may mean that an atom near the - boundary of a processor's sub-domain has moved more than 1/2 the + boundary of a processor's subdomain has moved more than 1/2 the :doc:`neighbor skin distance ` without neighbor lists being rebuilt and atoms being migrated to new processors. This also means you may be missing pairwise interactions that need to be computed. The solution is to change the re-neighboring criteria via the :doc:`neigh_modify ` command. The safest settings are "delay 0 every 1 check yes". Second, it may mean that an atom has - moved far outside a processor's sub-domain or even the entire + moved far outside a processor's subdomain or even the entire simulation box. This indicates bad physics, e.g. due to highly overlapping atoms, too large a timestep, etc. @@ -6281,14 +6281,14 @@ keyword to allow for additional bonds to be formed One or more atoms are attempting to map their charge to a PPPM grid point that is not owned by a processor. This is likely for one of two reasons, both of them bad. First, it may mean that an atom near the - boundary of a processor's sub-domain has moved more than 1/2 the + boundary of a processor's subdomain has moved more than 1/2 the :doc:`neighbor skin distance ` without neighbor lists being rebuilt and atoms being migrated to new processors. This also means you may be missing pairwise interactions that need to be computed. The solution is to change the re-neighboring criteria via the :doc:`neigh_modify ` command. The safest settings are "delay 0 every 1 check yes". Second, it may mean that an atom has - moved far outside a processor's sub-domain or even the entire + moved far outside a processor's subdomain or even the entire simulation box. This indicates bad physics, e.g. due to highly overlapping atoms, too large a timestep, etc. 
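Returning to the *ubuf* union described in the Developer_utils.rst hunk above, the pack/unpack pattern looks as follows. This is a minimal, self-contained sketch; the surrounding buffer allocation and MPI exchange are omitted, and the function name is made up.

.. code-block:: c++

   #include "lmptype.h"    // defines LAMMPS_NS::ubuf and the tagint type
   using namespace LAMMPS_NS;

   // sketch: round-trip a 64-bit atom ID through a double-typed comm buffer
   void pack_unpack_id(tagint id_in, double *buf)
   {
     int m = 0;
     buf[m++] = ubuf(id_in).d;                    // bit-copy the integer into a double slot
     // ... buf would be exchanged with another MPI rank as MPI_DOUBLE data ...
     m = 0;
     tagint id_out = (tagint) ubuf(buf[m++]).i;   // recover the integer without precision loss
     (void) id_out;
   }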
@@ -6296,14 +6296,14 @@ keyword to allow for additional bonds to be formed One or more atoms are attempting to map their charge to a PPPM grid point that is not owned by a processor. This is likely for one of two reasons, both of them bad. First, it may mean that an atom near the - boundary of a processor's sub-domain has moved more than 1/2 the + boundary of a processor's subdomain has moved more than 1/2 the :doc:`neighbor skin distance ` without neighbor lists being rebuilt and atoms being migrated to new processors. This also means you may be missing pairwise interactions that need to be computed. The solution is to change the re-neighboring criteria via the :doc:`neigh_modify ` command. The safest settings are "delay 0 every 1 check yes". Second, it may mean that an atom has - moved far outside a processor's sub-domain or even the entire + moved far outside a processor's subdomain or even the entire simulation box. This indicates bad physics, e.g. due to highly overlapping atoms, too large a timestep, etc. diff --git a/doc/src/Errors_warnings.rst b/doc/src/Errors_warnings.rst index f488eaaa88..95f4aa773e 100644 --- a/doc/src/Errors_warnings.rst +++ b/doc/src/Errors_warnings.rst @@ -109,9 +109,9 @@ Doc page with :doc:`ERROR messages ` *Communication cutoff is shorter than a bond length based estimate. This may lead to errors.* Since LAMMPS stores topology data with individual atoms, all atoms comprising a bond, angle, dihedral or improper must be present on any - sub-domain that "owns" the atom with the information, either as a + subdomain that "owns" the atom with the information, either as a local or a ghost atom. The communication cutoff is what determines up - to what distance from a sub-domain boundary ghost atoms are created. + to what distance from a subdomain boundary ghost atoms are created. The communication cutoff is by default the largest non-bonded cutoff plus the neighbor skin distance, but for short or non-bonded cutoffs and/or long bonds, this may not be sufficient. This warning indicates @@ -398,7 +398,7 @@ This will most likely cause errors in kinetic fluctuations. Lost atoms are checked for each time thermo output is done. See the thermo_modify lost command for options. Lost atoms usually indicate bad dynamics, e.g. atoms have been blown far out of the simulation - box, or moved further than one processor's sub-domain away before + box, or moved further than one processor's subdomain away before reneighboring. *MSM mesh too small, increasing to 2 points in each direction* @@ -582,13 +582,13 @@ This will most likely cause errors in kinetic fluctuations. needed. The requested volume fraction may be too high, or other atoms may be in the insertion region. -*Proc sub-domain size < neighbor skin, could lead to lost atoms* +*Proc subdomain size < neighbor skin, could lead to lost atoms* The decomposition of the physical domain (likely due to load - balancing) has led to a processor's sub-domain being smaller than the + balancing) has led to a processor's subdomain being smaller than the neighbor skin in one or more dimensions. Since reneighboring is triggered by atoms moving the skin distance, this may lead to lost atoms, if an atom moves all the way across a neighboring processor's - sub-domain before reneighboring is triggered. + subdomain before reneighboring is triggered. *Reducing PPPM order b/c stencil extends beyond nearest neighbor processor* This may lead to a larger grid than desired. 
See the kspace_modify overlap diff --git a/doc/src/Howto_grid.rst b/doc/src/Howto_grid.rst index 74646b4bd2..f8e4ccfb75 100644 --- a/doc/src/Howto_grid.rst +++ b/doc/src/Howto_grid.rst @@ -11,7 +11,7 @@ more values (data). The grid cells and data they store are distributed across processors. Each processor owns the grid cells (and data) whose center points lie -within the spatial sub-domain of the processor. If needed for its +within the spatial subdomain of the processor. If needed for its computations, a processor may also store ghost grid cells with their data. @@ -28,7 +28,7 @@ box size, as set by the :doc:`boundary ` command for fixed or shrink-wrapped boundaries. If load-balancing is invoked by the :doc:`balance ` or -:doc:`fix balance ` commands, then the sub-domain owned +:doc:`fix balance ` commands, then the subdomain owned by a processor can change which may also change which grid cells they own. diff --git a/doc/src/Howto_output.rst b/doc/src/Howto_output.rst index 771ecad8f0..6c76eef184 100644 --- a/doc/src/Howto_output.rst +++ b/doc/src/Howto_output.rst @@ -59,7 +59,7 @@ of bond distances. A per-grid datum is one or more values per grid cell, for a grid which overlays the simulation domain. The grid cells and the data they store are distributed across processors; each processor owns the grid -cells whose center point falls within its sub-domain. +cells whose center point falls within its subdomain. .. _scalar: @@ -322,7 +322,7 @@ The chief difference between the :doc:`fix ave/grid ` and :doc:`fix ave/chunk ` commands when used in this context is that the former uses a distributed grid, while the latter uses a global grid. Distributed means that each processor owns the -subset of grid cells within its sub-domain. Global means that each +subset of grid cells within its subdomain. Global means that each processor owns a copy of the entire grid. The :doc:`fix ave/grid ` command is thus more efficient for large grids. diff --git a/doc/src/Howto_peri.rst b/doc/src/Howto_peri.rst index 4c78d04833..8a2aa4cab6 100644 --- a/doc/src/Howto_peri.rst +++ b/doc/src/Howto_peri.rst @@ -783,19 +783,19 @@ Pitfalls **Parallel Scalability** LAMMPS operates in parallel in a :doc:`spatial-decomposition mode -`, where each processor owns a spatial sub-domain of +`, where each processor owns a spatial subdomain of the overall simulation domain and communicates with its neighboring processors via distributed-memory message passing (MPI) to acquire ghost atom information to allow forces on the atoms it owns to be computed. LAMMPS also uses Verlet neighbor lists which are recomputed every few timesteps as particles move. On these timesteps, particles also migrate to new processors as needed. LAMMPS decomposes the overall -simulation domain so that spatial sub-domains of nearly equal volume are -assigned to each processor. When each sub-domain contains nearly the +simulation domain so that spatial subdomains of nearly equal volume are +assigned to each processor. When each subdomain contains nearly the same number of particles, this results in a reasonable load balance among all processors. As is more typical with some peridynamic -simulations, some sub-domains may contain many particles while other -sub-domains contain few particles, resulting in a load imbalance that +simulations, some subdomains may contain many particles while other +subdomains contain few particles, resulting in a load imbalance that impacts parallel scalability. 
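The load imbalance described above is commonly quantified as the ratio of the maximum per-process load to the average load, the quantity that the *balance* and *fix balance* commands try to drive toward 1.0. A small MPI sketch follows; the ``nlocal`` and ``world`` arguments stand in for the caller's local atom count and communicator.

.. code-block:: c++

   #include <mpi.h>

   // sketch: imbalance factor = maximum atoms per process / average atoms per process
   double imbalance_factor(int nlocal, MPI_Comm world)
   {
     int nprocs, nmax;
     long nsum, nmine = nlocal;
     MPI_Comm_size(world, &nprocs);
     MPI_Allreduce(&nlocal, &nmax, 1, MPI_INT, MPI_MAX, world);
     MPI_Allreduce(&nmine, &nsum, 1, MPI_LONG, MPI_SUM, world);
     double average = static_cast<double>(nsum) / nprocs;
     return (average > 0.0) ? nmax / average : 1.0;
   }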
**Setting the "skin" distance** diff --git a/doc/src/Howto_triclinic.rst b/doc/src/Howto_triclinic.rst index dd0a949f68..0efadbcc8c 100644 --- a/doc/src/Howto_triclinic.rst +++ b/doc/src/Howto_triclinic.rst @@ -150,7 +150,7 @@ option with either of the commands. Note that if a simulation box has a large tilt factor, LAMMPS will run less efficiently, due to the large volume of communication needed to -acquire ghost atoms around a processor's irregular-shaped sub-domain. +acquire ghost atoms around a processor's irregular-shaped subdomain. For extreme values of tilt, LAMMPS may also lose atoms and generate an error. diff --git a/doc/src/Intro_citing.rst b/doc/src/Intro_citing.rst index 56955cae3a..8a6cfde878 100644 --- a/doc/src/Intro_citing.rst +++ b/doc/src/Intro_citing.rst @@ -38,11 +38,11 @@ to create digital object identifiers (DOI) for stable releases of the LAMMPS source code. There are two types of DOIs for the LAMMPS source code. The canonical DOI for **all** versions of LAMMPS, which will always -point to the **latest** stable release version is: +point to the **latest** stable release version, is: - DOI: `10.5281/zenodo.3726416 `_ -In addition there are DOIs for individual stable releases. Currently there are: +In addition there are DOIs generated for individual stable releases: - 3 March 2020 version: `DOI:10.5281/zenodo.3726417 `_ - 29 October 2020 version: `DOI:10.5281/zenodo.4157471 `_ @@ -65,6 +65,6 @@ for optional features used in a specific run is printed to the screen and log file. Style and output location can be selected with the :ref:`-cite command-line switch `. Additional references are given in the documentation of the :doc:`corresponding commands -` or in the :doc:`Howto tutorials `. So please -make certain, that you provide the proper acknowledgments and citations -in any published works using LAMMPS. +` or in the :doc:`Howto tutorials `. Please make +certain, that you provide the proper acknowledgments and citations in +any published works using LAMMPS. diff --git a/doc/src/Intro_features.rst b/doc/src/Intro_features.rst index e6269793f9..76b989ad69 100644 --- a/doc/src/Intro_features.rst +++ b/doc/src/Intro_features.rst @@ -27,7 +27,7 @@ General features * distributed memory message-passing parallelism (MPI) * shared memory multi-threading parallelism (OpenMP) * spatial decomposition of simulation domain for MPI parallelism -* particle decomposition inside of spatial decomposition for OpenMP and GPU parallelism +* particle decomposition inside spatial decomposition for OpenMP and GPU parallelism * GPLv2 licensed open-source distribution * highly portable C++-11 * modular code with most functionality in optional packages @@ -113,7 +113,7 @@ Atom creation :doc:`create_atoms `, :doc:`delete_atoms `, :doc:`displace_atoms `, :doc:`replicate ` commands) -* read in atom coords from files +* read in atom coordinates from files * create atoms on one or more lattices (e.g. grain boundaries) * delete geometric or logical groups of atoms (e.g. 
voids) * replicate existing atoms multiple times @@ -173,11 +173,11 @@ Output (:doc:`dump `, :doc:`restart ` commands) * log file of thermodynamic info -* text dump files of atom coords, velocities, other per-atom quantities +* text dump files of atom coordinates, velocities, other per-atom quantities * dump output on fixed and variable intervals, based timestep or simulated time * binary restart files * parallel I/O of dump and restart files -* per-atom quantities (energy, stress, centro-symmetry parameter, CNA, etc) +* per-atom quantities (energy, stress, centro-symmetry parameter, CNA, etc.) * user-defined system-wide (log file) or per-atom (dump file) calculations * custom partitioning (chunks) for binning, and static or dynamic grouping of atoms for analysis * spatial, time, and per-chunk averaging of per-atom quantities diff --git a/doc/src/Intro_nonfeatures.rst b/doc/src/Intro_nonfeatures.rst index b1d68c8594..65889efd2e 100644 --- a/doc/src/Intro_nonfeatures.rst +++ b/doc/src/Intro_nonfeatures.rst @@ -20,22 +20,23 @@ that either closely interface with LAMMPS or extend LAMMPS. Here are suggestions on how to perform these tasks: -* **GUI:** LAMMPS can be built as a library and a Python wrapper that wraps - the library interface is provided. Thus, GUI interfaces can be - written in Python (or C or C++ if desired) that run LAMMPS and - visualize or plot its output. Examples of this are provided in the - python directory and described on the :doc:`Python ` doc - page. Also, there are several external wrappers or GUI front ends. -* **Builder:** Several pre-processing tools are packaged with LAMMPS. Some - of them convert input files in formats produced by other MD codes such - as CHARMM, AMBER, or Insight into LAMMPS input formats. Some of them - are simple programs that will build simple molecular systems, such as - linear bead-spring polymer chains. The moltemplate program is a true - molecular builder that will generate complex molecular models. See - the :doc:`Tools ` page for details on tools packaged with - LAMMPS. The `Pre/post processing page `_ of the LAMMPS website +* **GUI:** LAMMPS can be built as a library and a Python module that + wraps the library interface is provided. Thus, GUI interfaces can be + written in Python or C/C++ that run LAMMPS and visualize or plot its + output. Examples of this are provided in the python directory and + described on the :doc:`Python ` doc page. Also, there + are several external wrappers or GUI front ends. +* **Builder:** Several pre-processing tools are packaged with LAMMPS. + Some of them convert input files in formats produced by other MD codes + such as CHARMM, AMBER, or Insight into LAMMPS input formats. Some of + them are simple programs that will build simple molecular systems, + such as linear bead-spring polymer chains. The moltemplate program is + a true molecular builder that will generate complex molecular models. + See the :doc:`Tools ` page for details on tools packaged with + LAMMPS. The `Pre-/post-processing page + `_ of the LAMMPS homepage describes a variety of third party tools for this task. Furthermore, - some LAMMPS internal commands allow to reconstruct, or selectively add + some internal LAMMPS commands allow reconstructing, or selectively adding topology information, as well as provide the option to insert molecule templates instead of atoms for building bulk molecular systems. 
* **Force-field assignment:** The conversion tools described in the previous @@ -47,33 +48,34 @@ Here are suggestions on how to perform these tasks: powerful and flexible in converting force field and topology data between various MD simulation programs. * **Simulation analysis:** If you want to perform analysis on-the-fly as - your simulation runs, see the :doc:`compute ` and - :doc:`fix ` doc pages, which list commands that can be used in a - LAMMPS input script. Also see the :doc:`Modify ` page for - info on how to add your own analysis code or algorithms to LAMMPS. - For post-processing, LAMMPS output such as :doc:`dump file snapshots ` can be converted into formats used by other MD or + your simulation runs, see the :doc:`compute ` and :doc:`fix + ` doc pages, which list commands that can be used in a LAMMPS + input script. Also see the :doc:`Modify ` page for info on + how to add your own analysis code or algorithms to LAMMPS. For + post-processing, LAMMPS output such as :doc:`dump file snapshots + ` can be converted into formats used by other MD or post-processing codes. To some degree, that conversion can be done - directly inside of LAMMPS by interfacing to the VMD molfile plugins. - The :doc:`rerun ` command also allows to do some post-processing - of existing trajectories, and through being able to read a variety - of file formats, this can also be used for analyzing trajectories - from other MD codes. Some post-processing tools packaged with - LAMMPS will do these conversions. Scripts provided in the - tools/python directory can extract and massage data in dump files to - make it easier to import into other programs. See the - :doc:`Tools ` page for details on these various options. -* **Visualization:** LAMMPS can produce NETPBM, JPG or PNG snapshot images - on-the-fly via its :doc:`dump image ` command and pass - them to an external program, `FFmpeg `_ to generate - movies from them. For high-quality, interactive visualization there are - many excellent and free tools available. See the - `Visualization Tools `_ page of the - LAMMPS website for + directly inside LAMMPS by interfacing to the VMD molfile plugins. The + :doc:`rerun ` command also allows post-processing of existing + trajectories, and through being able to read a variety of file + formats, this can also be used for analyzing trajectories from other + MD codes. Some post-processing tools packaged with LAMMPS will do + these conversions. Scripts provided in the tools/python directory can + extract and massage data in dump files to make it easier to import + into other programs. See the :doc:`Tools ` page for details on + these various options. +* **Visualization:** LAMMPS can produce NETPBM, JPG, or PNG format + snapshot images on-the-fly via its :doc:`dump image ` + command and pass them to an external program, `FFmpeg + `_, to generate movies from them. For + high-quality, interactive visualization, there are many excellent and + free tools available. See the `Visualization Tools + `_ page of the LAMMPS website for visualization packages that can process LAMMPS output data. * **Plotting:** See the next bullet about Pizza.py as well as the :doc:`Python ` page for examples of plotting LAMMPS - output. Scripts provided with the *python* tool in the tools - directory will extract and massage data in log and dump files to make + output. Scripts provided with the *python* tool in the ``tools`` + directory will extract and process data in log and dump files to make it easier to analyze and plot. 
See the :doc:`Tools ` doc page for more discussion of the various tools. * **Pizza.py:** Our group has also written a separate toolkit called diff --git a/doc/src/Intro_overview.rst b/doc/src/Intro_overview.rst index 0fa2214218..57fa7fbfb6 100644 --- a/doc/src/Intro_overview.rst +++ b/doc/src/Intro_overview.rst @@ -1,20 +1,20 @@ Overview of LAMMPS ------------------ -LAMMPS is a classical molecular dynamics (MD) code that models -ensembles of particles in a liquid, solid, or gaseous state. It can -model atomic, polymeric, biological, solid-state (metals, ceramics, -oxides), granular, coarse-grained, or macroscopic systems using a -variety of interatomic potentials (force fields) and boundary -conditions. It can model 2d or 3d systems with only a few particles -up to millions or billions. +LAMMPS is a classical molecular dynamics (MD) code that models ensembles +of particles in a liquid, solid, or gaseous state. It can model atomic, +polymeric, biological, solid-state (metals, ceramics, oxides), granular, +coarse-grained, or macroscopic systems using a variety of interatomic +potentials (force fields) and boundary conditions. It can model 2d or +3d systems with sizes ranging from only a few particles up to billions. -LAMMPS can be built and run on a laptop or desktop machine, but is +LAMMPS can be built and run on single laptop or desktop machines, but is designed for parallel computers. It will run in serial and on any parallel machine that supports the `MPI `_ message-passing -library. This includes shared-memory boxes and distributed-memory -clusters and supercomputers. Parts of LAMMPS also support -`OpenMP multi-threading `_, vectorization and GPU acceleration. +library. This includes shared-memory multicore, multi-CPU servers and +distributed-memory clusters and supercomputers. Parts of LAMMPS also +support `OpenMP multi-threading `_, vectorization, and GPU +acceleration. .. _mpi: https://en.wikipedia.org/wiki/Message_Passing_Interface .. _lws: https://www.lammps.org @@ -42,11 +42,11 @@ LAMMPS uses neighbor lists to keep track of nearby particles. The lists are optimized for systems with particles that are repulsive at short distances, so that the local density of particles never becomes too large. This is in contrast to methods used for modeling plasma or -gravitational bodies (e.g. galaxy formation). +gravitational bodies (like galaxy formation). On parallel machines, LAMMPS uses spatial-decomposition techniques with -MPI parallelization to partition the simulation domain into sub-domains +MPI parallelization to partition the simulation domain into subdomains of equal computational cost, one of which is assigned to each processor. Processors communicate and store "ghost" atom information for atoms that -border their sub-domain. Multi-threading parallelization and GPU -acceleration with with particle-decomposition can be used in addition. +border their subdomain. Multi-threading parallelization and GPU +acceleration with particle-decomposition can be used in addition. diff --git a/doc/src/Intro_portability.rst b/doc/src/Intro_portability.rst index 49db13a6be..119453eb27 100644 --- a/doc/src/Intro_portability.rst +++ b/doc/src/Intro_portability.rst @@ -30,17 +30,17 @@ can be created using CMake. CMake must be at least version 3.10. Operating systems ^^^^^^^^^^^^^^^^^ -The primary development platform for LAMMPS is Linux. Thus the chances +The primary development platform for LAMMPS is Linux. Thus, the chances for LAMMPS to compile without problems on Linux machines are the best. 
-Also compilation and correct execution on macOS and Windows (using +Also, compilation and correct execution on macOS and Windows (using Microsoft Visual C++) is checked automatically for largest part of the source code. Some (optional) features are not compatible with all -operating systems either through limitations of the source code or -source code compatibility or the build system requirements of required -libraries. +operating systems, either through limitations of the corresponding +LAMMPS source code or through source code or build system +incompatibilities of required libraries. -Executables for Windows may be created using either Cygwin or Visual -Studio or a Linux to Windows MinGW cross-compiler. +Executables for Windows may be created natively using either Cygwin or +Visual Studio or with a Linux to Windows MinGW cross-compiler. Additionally, FreeBSD and Solaris have been tested successfully. @@ -49,7 +49,7 @@ Compilers The most commonly used compilers are the GNU compilers, but also Clang and the Intel compilers have been successfully used on Linux, macOS, and -Windows. Also the Nvidia HPC SDK (formerly PGI compilers) will compile +Windows. Also, the Nvidia HPC SDK (formerly PGI compilers) will compile LAMMPS (tested on Linux). CPU architectures @@ -62,12 +62,14 @@ regularly tested. Portability compliance ^^^^^^^^^^^^^^^^^^^^^^ -Not all of the LAMMPS source code is fully compliant to all of the above -mentioned standards. This is rather typical for projects like LAMMPS -that largely depend on contributions of features from the community. +Only a subset of the LAMMPS source code is fully compliant to all of the +above mentioned standards. This is rather typical for projects like +LAMMPS that largely depend on contributions from the user community. Not all contributors are trained as programmers and not all of them have -access to a variety of platforms. As part of the continuous integration -process, however, all contributions are automatically tested to compile, -link, and pass some runtime tests on a selection of Linux flavors, -macOS, and Windows with different compilers. Other platforms may be -checked occasionally or when portability bug are reported. +access to multiple platforms for testing. As part of the continuous +integration process, however, all contributions are automatically tested +to compile, link, and pass some runtime tests on a selection of Linux +flavors, macOS, and Windows, and on Linux with different compilers. +Thus portability issues are often found before a pull request is merged. +Other platforms may be checked occasionally or when portability bugs are +reported. diff --git a/doc/src/Library_properties.rst b/doc/src/Library_properties.rst index 0cabefb379..21e5609fc0 100644 --- a/doc/src/Library_properties.rst +++ b/doc/src/Library_properties.rst @@ -30,7 +30,7 @@ course, changing values should be done with care. When accessing per-atom data, please note that these data are the per-processor **local** data and are indexed accordingly. Per-atom data can change sizes and ordering at every neighbor list rebuild or atom sort event as atoms migrate between -sub-domains and processors. +subdomains and processors. .. code-block:: c diff --git a/doc/src/Manual.rst b/doc/src/Manual.rst index d898d3b250..0af27e5f71 100644 --- a/doc/src/Manual.rst +++ b/doc/src/Manual.rst @@ -5,16 +5,17 @@ LAMMPS Documentation (|version| version) LAMMPS stands for **L**\ arge-scale **A**\ tomic/**M**\ olecular **M**\ assively **P**\ arallel **S**\ imulator. 
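To connect the per-atom data caveat in the Library_properties.rst hunk above to actual usage, here is a sketch that reads the local coordinates of one MPI rank through the C-style library interface. It assumes a recent ``library.h`` API and uses a placeholder input file name; the returned pointer becomes stale after the next re-neighboring or atom-sort event.

.. code-block:: c++

   #include "library.h"    // LAMMPS C-style library interface
   #include <cstdio>

   int main()
   {
     void *lmp = lammps_open_no_mpi(0, nullptr, nullptr);
     lammps_file(lmp, "in.melt");                          // placeholder input script
     int nlocal = lammps_extract_setting(lmp, "nlocal");   // atoms owned by this process only
     auto x = (double **) lammps_extract_atom(lmp, "x");   // per-processor local coordinates
     if (nlocal > 0 && x)
       std::printf("first local atom: %g %g %g\n", x[0][0], x[0][1], x[0][2]);
     lammps_close(lmp);
     return 0;
   }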
-LAMMPS is a classical molecular dynamics simulation code with a focus -on materials modeling. It was designed to run efficiently on parallel -computers. It was developed originally at Sandia National -Laboratories, a US Department of Energy facility. The majority of -funding for LAMMPS has come from the US Department of Energy (DOE). -LAMMPS is an open-source code, distributed freely under the terms of -the GNU Public License Version 2 (GPLv2). +LAMMPS is a classical molecular dynamics simulation code focusing on +materials modeling. It was designed to run efficiently on parallel +computers and to be easy to extend and modify. Originally developed at +Sandia National Laboratories, a US Department of Energy facility, LAMMPS +now includes contributions from many research groups and individuals +from many institutions. Most of the funding for LAMMPS has come from +the US Department of Energy (DOE). LAMMPS is open-source software +distributed under the terms of the GNU Public License Version 2 (GPLv2). The `LAMMPS website `_ has a variety of information about the -code. It includes links to an on-line version of this manual, an +code. It includes links to an online version of this manual, an `online forum `_ where users can post questions and discuss LAMMPS, and a `GitHub site `_ where all LAMMPS development is @@ -26,14 +27,14 @@ The content for this manual is part of the LAMMPS distribution. The online version always corresponds to the latest feature release version. If needed, you can build a local copy of the manual as HTML pages or a PDF file by following the steps on the :doc:`Build_manual` page. If you -have difficulties viewing the pages please :ref:`see this note +have difficulties viewing the pages, please :ref:`see this note `. ----------- -The manual is organized in three parts: +The manual is organized into three parts: -1. the :ref:`User Guide ` with information about how +1. The :ref:`User Guide ` with information about how to obtain, configure, compile, install, and use LAMMPS, 2. the :ref:`Programmer Guide ` with information about how to use the LAMMPS library interface from @@ -47,7 +48,7 @@ The manual is organized in three parts: .. only:: html - Once you are familiar with LAMMPS, you may want to bookmark + After becoming familiar with LAMMPS, consider bookmarking :doc:`this page `, since it gives quick access to tables with links to the documentation for all LAMMPS commands. diff --git a/doc/src/Manual_version.rst b/doc/src/Manual_version.rst index f06617885a..6f99b9f117 100644 --- a/doc/src/Manual_version.rst +++ b/doc/src/Manual_version.rst @@ -2,43 +2,44 @@ What does a LAMMPS version mean ------------------------------- The LAMMPS "version" is the date when it was released, such as 1 May -2014. LAMMPS is updated continuously and we aim to keep it working +2014. LAMMPS is updated continuously, and we aim to keep it working correctly and reliably at all times. You can follow its development in a public `git repository on GitHub `_. -Modifications of the LAMMPS source code - like bug fixes, code -refactors, updates to existing features, or addition of new features - -are organized into pull requests, and will be merged into the *develop* -branch of the git repository when they pass automated testing and code +Modifications of the LAMMPS source code (like bug fixes, code refactors, +updates to existing features, or addition of new features) are organized +into pull requests. 
Pull requests will be merged into the *develop* +branch of the git repository after they pass automated testing and code review by the LAMMPS developers. When a sufficient number of changes -have accumulated *and* the software passes a set of automated tests, we -release it as a *feature release* (or patch release), which are -currently made every 4-8 weeks. The *release* branch of the git -repository is updated with every such release. A summary of the most -important changes of the patch releases are on `this website page -`_. More detailed release notes are -`available on GitHub `_. +have accumulated *and* the *develop* branch version passes an extended +set of automated tests, we release it as a *feature release* (or patch +release), which are currently made every 4 to 8 weeks. The *release* +branch of the git repository is updated with every such release. A +summary of the most important changes of the patch releases are on `this +website page `_. More detailed release +notes are `available on GitHub +`_. Once or twice a year, we have a "stabilization period" where we apply only bug fixes and small, non-intrusive changes to the *develop* -branch. At the same time the code is subjected to more detailed and -thorough manual testing than the default automated testing. Also +branch. At the same time, the code is subjected to more detailed and +thorough manual testing than the default automated testing. Also, several variants of static code analysis are run to improve the overall code quality, consistency, and compliance with programming standards, best practices and style conventions. The latest patch release after such a period is then also labeled as a *stable* version and the *stable* branch is updated with it. Between -stable releases we occasionally release updates to the stable release +stable releases, we occasionally release updates to the stable release containing only bug fixes and updates back-ported from the *develop* branch and update the *stable* branch accordingly. Each version of LAMMPS contains all the documented features up to and -including its version date. For recently added features we add markers +including its version date. For recently added features, we add markers to the documentation at which specific LAMMPS version a feature or keyword was added or significantly changed. -The version date is printed to the screen and logfile every time you run +The version date is printed to the screen and log file every time you run LAMMPS. It is also in the file src/version.h and in the LAMMPS directory name created when you unpack a tarball. And it is on the first page of the :doc:`manual `. diff --git a/doc/src/Python_atoms.rst b/doc/src/Python_atoms.rst index 2cb5c695e8..f01559a524 100644 --- a/doc/src/Python_atoms.rst +++ b/doc/src/Python_atoms.rst @@ -23,7 +23,7 @@ against invalid accesses. When accessing per-atom data, please note that this data is the per-processor local data and indexed accordingly. These arrays can change sizes and order at every neighbor list - rebuild and atom sort event as atoms are migrating between sub-domains. + rebuild and atom sort event as atoms are migrating between subdomains. .. tabs:: diff --git a/doc/src/Python_properties.rst b/doc/src/Python_properties.rst index 227a729622..d8e772379c 100644 --- a/doc/src/Python_properties.rst +++ b/doc/src/Python_properties.rst @@ -23,7 +23,7 @@ against invalid accesses. When accessing per-atom data, please note that this data is the per-processor local data and indexed accordingly. 
These arrays can change sizes and order at every neighbor list - rebuild and atom sort event as atoms are migrating between sub-domains. + rebuild and atom sort event as atoms are migrating between subdomains. .. tabs:: diff --git a/doc/src/Speed.rst b/doc/src/Speed.rst index dfd5f4bf3e..1208dc9ce7 100644 --- a/doc/src/Speed.rst +++ b/doc/src/Speed.rst @@ -9,7 +9,7 @@ There are two thrusts to the discussion that follows. The first is using code options that implement alternate algorithms that can speed-up a simulation. The second is to use one of the several accelerator packages provided with LAMMPS that contain code optimized -for certain kinds of hardware, including multi-core CPUs, GPUs, and +for certain kinds of hardware, including multicore CPUs, GPUs, and Intel Xeon Phi co-processors. The `Benchmark page `_ of the LAMMPS diff --git a/doc/src/Speed_gpu.rst b/doc/src/Speed_gpu.rst index e95787ebee..8eac8b9c21 100644 --- a/doc/src/Speed_gpu.rst +++ b/doc/src/Speed_gpu.rst @@ -11,7 +11,7 @@ parts of the :doc:`kspace_style pppm ` for long-range Coulombics. It has the following general features: * It is designed to exploit common GPU hardware configurations where one - or more GPUs are coupled to many cores of one or more multi-core CPUs, + or more GPUs are coupled to many cores of one or more multicore CPUs, e.g. within a node of a parallel machine. * Atom-based data (e.g. coordinates, forces) are moved back-and-forth between the CPU(s) and GPU every timestep. @@ -28,7 +28,7 @@ Coulombics. It has the following general features: * LAMMPS-specific code is in the GPU package. It makes calls to a generic GPU library in the lib/gpu directory. This library provides either Nvidia support, AMD support, or more general OpenCL support - (for Nvidia GPUs, AMD GPUs, Intel GPUs, and multi-core CPUs). + (for Nvidia GPUs, AMD GPUs, Intel GPUs, and multicore CPUs). so that the same functionality is supported on a variety of hardware. **Required hardware/software:** @@ -146,7 +146,7 @@ GPUs/node to use, as well as other options. **Speed-ups to expect:** -The performance of a GPU versus a multi-core CPU is a function of your +The performance of a GPU versus a multicore CPU is a function of your hardware, which pair style is used, the number of atoms/GPU, and the precision used on the GPU (double, single, mixed). Using the GPU package in OpenCL mode on CPUs (which uses vectorization and multithreading) is @@ -174,7 +174,7 @@ deterministic results. **Guidelines for best performance:** * Using multiple MPI tasks per GPU will often give the best performance, - as allowed my most multi-core CPU/GPU configurations. + as allowed my most multicore CPU/GPU configurations. * If the number of particles per MPI task is small (e.g. 100s of particles), it can be more efficient to run with fewer MPI tasks per GPU, even if you do not use all the cores on the compute node. diff --git a/doc/src/Speed_kokkos.rst b/doc/src/Speed_kokkos.rst index 4c0d6ae768..569a24f1c2 100644 --- a/doc/src/Speed_kokkos.rst +++ b/doc/src/Speed_kokkos.rst @@ -79,7 +79,7 @@ manner via the ``mpirun`` or ``mpiexec`` commands, and is independent of Kokkos. E.g. the mpirun command in OpenMPI does this via its ``-np`` and ``-npernode`` switches. Ditto for MPICH via ``-np`` and ``-ppn``. -Running on a multi-core CPU +Running on a multicore CPU ^^^^^^^^^^^^^^^^^^^^^^^^^^^ Here is a quick overview of how to use the KOKKOS package @@ -254,7 +254,7 @@ is recommended in this scenario. Using a GPU-aware MPI library is highly recommended. 
GPU-aware MPI use can be avoided by using :doc:`-pk kokkos gpu/aware off `. As above for -multi-core CPUs (and no GPU), if N is the number of physical cores/node, +multicore CPUs (and no GPU), if N is the number of physical cores/node, then the number of MPI tasks/node should not exceed N. .. parsed-literal:: diff --git a/doc/src/Speed_omp.rst b/doc/src/Speed_omp.rst index 7f8913d20f..f23198e36f 100644 --- a/doc/src/Speed_omp.rst +++ b/doc/src/Speed_omp.rst @@ -12,7 +12,7 @@ Required hardware/software """""""""""""""""""""""""" To enable multi-threading, your compiler must support the OpenMP interface. -You should have one or more multi-core CPUs, as multiple threads can only be +You should have one or more multicore CPUs, as multiple threads can only be launched by each MPI task on the local node (using shared memory). Building LAMMPS with the OPENMP package @@ -157,7 +157,7 @@ Additional performance tips are as follows: affinity setting that restricts each MPI task to a single CPU core. Using multi-threading in this mode will force all threads to share the one core and thus is likely to be counterproductive. Instead, binding - MPI tasks to a (multi-core) socket, should solve this issue. + MPI tasks to a (multicore) socket, should solve this issue. Restrictions """""""""""" diff --git a/doc/src/atom_modify.rst b/doc/src/atom_modify.rst index 2f01877ac7..9049a24fde 100644 --- a/doc/src/atom_modify.rst +++ b/doc/src/atom_modify.rst @@ -113,7 +113,7 @@ your input script. LAMMPS does not use the group until a simulation is run. The *sort* keyword turns on a spatial sorting or reordering of atoms -within each processor's sub-domain every *Nfreq* timesteps. If +within each processor's subdomain every *Nfreq* timesteps. If *Nfreq* is set to 0, then sorting is turned off. Sorting can improve cache performance and thus speed-up a LAMMPS simulation, as discussed in a paper by :ref:`(Meloni) `. Its efficacy depends on the problem diff --git a/doc/src/balance.rst b/doc/src/balance.rst index 4873fc35c9..bb66598546 100644 --- a/doc/src/balance.rst +++ b/doc/src/balance.rst @@ -54,7 +54,7 @@ Syntax *store* name = store weight in custom atom property defined by :doc:`fix property/atom ` command name = atom property name (without d\_ prefix) *out* arg = filename - filename = write each processor's sub-domain to a file + filename = write each processor's subdomain to a file Examples """""""" @@ -72,14 +72,14 @@ Examples Description """"""""""" -This command adjusts the size and shape of processor sub-domains +This command adjusts the size and shape of processor subdomains within the simulation box, to attempt to balance the number of atoms or particles and thus indirectly the computational cost (load) more evenly across processors. The load balancing is "static" in the sense that this command performs the balancing once, before or between -simulations. The processor sub-domains will then remain static during +simulations. The processor subdomains will then remain static during the subsequent run. To perform "dynamic" balancing, see the :doc:`fix -balance ` command, which can adjust processor sub-domain +balance ` command, which can adjust processor subdomain sizes and shapes on-the-fly during a :doc:`run `. Load-balancing is typically most useful if the particles in the @@ -90,7 +90,7 @@ an irregular-shaped geometry containing void regions, or :doc:`hybrid pair style simulations ` which combine pair styles with different computational cost. 
In these cases, the LAMMPS default of dividing the simulation box volume into a regular-spaced grid of 3d -bricks, with one equal-volume sub-domain per processor, may assign +bricks, with one equal-volume subdomain per processor, may assign numbers of particles per processor in a way that the computational effort varies significantly. This can lead to poor performance when the simulation is run in parallel. @@ -109,7 +109,7 @@ Specifically, for a Px by Py by Pz grid of processors, it allows choice of Px, Py, and Pz, subject to the constraint that Px \* Py \* Pz = P, the total number of processors. This is sufficient to achieve good load-balance for some problems on some processor counts. -However, all the processor sub-domains will still have the same shape +However, all the processor subdomains will still have the same shape and same volume. The requested load-balancing operation is only performed if the @@ -162,7 +162,7 @@ fractions of the box length) are also printed. simulation could run up to 20% faster if it were perfectly balanced, versus when imbalanced. However, computational cost is not strictly proportional to particle count, and changing the relative size and - shape of processor sub-domains may lead to additional computational + shape of processor subdomains may lead to additional computational and communication overheads, e.g. in the PPPM solver used via the :doc:`kspace_style ` command. Thus you should benchmark the run times of a simulation before and after balancing. @@ -177,7 +177,7 @@ The *x*, *y*, *z*, and *shift* styles are "grid" methods which produce a logical 3d grid of processors. They operate by changing the cutting planes (or lines) between processors in 3d (or 2d), to adjust the volume (area in 2d) assigned to each processor, as in the -following 2d diagram where processor sub-domains are shown and +following 2d diagram where processor subdomains are shown and particles are colored by the processor that owns them. .. |balance1| image:: img/balance_uniform.jpg @@ -226,7 +226,7 @@ The *x*, *y*, and *z* styles invoke a "grid" method for balancing, as described above. Note that any or all of these 3 styles can be specified together, one after the other, but they cannot be used with any other style. This style adjusts the position of cutting planes -between processor sub-domains in specific dimensions. Only the +between processor subdomains in specific dimensions. Only the specified dimensions are altered. The *uniform* argument spaces the planes evenly, as in the left @@ -245,8 +245,8 @@ the cutting place. The left (or lower) edge of the box is 0.0, and the right (or upper) edge is 1.0. Neither of these values is specified. Only the interior Ps-1 positions are specified. Thus is there are 2 processors in the x dimension, you specify a single value -such as 0.75, which would make the left processor's sub-domain 3x -larger than the right processor's sub-domain. +such as 0.75, which would make the left processor's subdomain 3x +larger than the right processor's subdomain. ---------- @@ -288,10 +288,10 @@ adjacent planes are closer together than the neighbor skin distance (as specified by the :doc:`neigh_modify ` command), then the plane positions are shifted to separate them by at least this amount. This is to prevent particles being lost when dynamics are run -with processor sub-domains that are too narrow in one or more +with processor subdomains that are too narrow in one or more dimensions. 
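
As a minimal sketch of how a one-time static rebalancing is invoked (the
threshold, dimensions, iteration count, and stop tolerance below are
illustrative assumptions, not recommendations), the *shift* style could
appear in an input script as:

.. parsed-literal::

   # rebalance once before the run: adjust the x and y cuts,
   # at most 20 bisection iterations, stop once imbalance <= 1.05
   balance 1.1 shift xy 20 1.05
   run 10000

Because the command acts only once, it is typically issued after the
system is set up and before the :doc:`run <run>` that should benefit
from the improved partitioning.
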
-Once the re-balancing is complete and final processor sub-domains +Once the re-balancing is complete and final processor subdomains assigned, particles are migrated to their new owning processor, and the balance procedure ends. @@ -299,7 +299,7 @@ the balance procedure ends. At each re-balance operation, the bisectioning for each cutting plane (line in 2d) typically starts with low and high bounds separated - by the extent of a processor's sub-domain in one dimension. The size + by the extent of a processor's subdomain in one dimension. The size of this bracketing region shrinks by 1/2 every iteration. Thus if *Niter* is specified as 10, the cutting plane will typically be positioned to 1 part in 1000 accuracy (relative to the perfect target @@ -494,7 +494,7 @@ different kinds of custom atom vectors or arrays as arguments. The *out* keyword writes a text file to the specified *filename* with the results of the balancing operation. The file contains the bounds -of the sub-domain for each processor after the balancing operation +of the subdomain for each processor after the balancing operation completes. The format of the file is compatible with the `Pizza.py `_ *mdump* tool which has support for manipulating and visualizing mesh files. An example is shown here for a balancing by 4 @@ -538,7 +538,7 @@ processors for a 2d problem: 4 1 13 14 15 16 The coordinates of all the vertices are listed in the NODES section, 5 -per processor. Note that the 4 sub-domains share vertices, so there +per processor. Note that the 4 subdomains share vertices, so there will be duplicate nodes in the list. The "SQUARES" section lists the node IDs of the 4 vertices in a diff --git a/doc/src/boundary.rst b/doc/src/boundary.rst index 3af0e2fa43..8e019e801c 100644 --- a/doc/src/boundary.rst +++ b/doc/src/boundary.rst @@ -61,7 +61,7 @@ move. Note that when the difference between the current box dimensions and the shrink-wrap box dimensions is large, this can lead to lost atoms at the beginning of a run when running in parallel. This is due to the large change in the (global) box dimensions also causing -significant changes in the individual sub-domain sizes. If these +significant changes in the individual subdomain sizes. If these changes are farther than the communication cutoff, atoms will be lost. This is best addressed by setting initial box dimensions to match the shrink-wrapped dimensions more closely, by using *m* style boundaries diff --git a/doc/src/comm_modify.rst b/doc/src/comm_modify.rst index d0526b792b..1a551a7659 100644 --- a/doc/src/comm_modify.rst +++ b/doc/src/comm_modify.rst @@ -62,7 +62,7 @@ distances are used to determine which atoms to communicate. The default mode is *single* which means each processor acquires information for ghost atoms that are within a single distance from its -sub-domain. The distance is by default the maximum of the neighbor +subdomain. The distance is by default the maximum of the neighbor cutoff across all atom type pairs. For many systems this is an efficient algorithm, but for systems with @@ -81,7 +81,7 @@ with both the *multi* and *multi/old* neighbor styles. The *cutoff* keyword allows you to extend the ghost cutoff distance for communication mode *single*, which is the distance from the borders -of a processor's sub-domain at which ghost atoms are acquired from other +of a processor's subdomain at which ghost atoms are acquired from other processors. By default the ghost cutoff = neighbor cutoff = pairwise force cutoff + neighbor skin. 
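
For illustration only (the 12.0 distance-unit value is an arbitrary
assumption, not a recommendation), extending the ghost communication
cutoff while staying in *single* mode could be written as:

.. parsed-literal::

   # acquire ghost atoms up to 12.0 distance units from each subdomain
   comm_modify mode single cutoff 12.0
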
See the :doc:`neighbor ` command for more information about the skin distance. If the specified Rcut is diff --git a/doc/src/compute.rst b/doc/src/compute.rst index 27f2609696..060ed73f53 100644 --- a/doc/src/compute.rst +++ b/doc/src/compute.rst @@ -54,7 +54,7 @@ per atom, e.g. a list of bond distances. Per-grid quantities are calculated on a regular 2d or 3d grid which overlays a 2d or 3d simulation domain. The grid points and the data they store are distributed across processors; each processor owns the grid points -which fall within its sub-domain. +which fall within its subdomain. Computes that produce per-atom quantities have the word "atom" at the end of their style, e.g. *ke/atom*\ . Computes that produce local diff --git a/doc/src/compute_pressure.rst b/doc/src/compute_pressure.rst index bf344be270..52195ec5f8 100644 --- a/doc/src/compute_pressure.rst +++ b/doc/src/compute_pressure.rst @@ -48,9 +48,9 @@ the virial, equal to :math:`-dU/dV`, computed for all pairwise as well as 2-body, 3-body, 4-body, many-body, and long-range interactions, where :math:`\vec r_i` and :math:`\vec f_i` are the position and force vector of atom *i*, and the dot indicates the dot product (scalar product). -This is computed in parallel for each sub-domain and then summed over +This is computed in parallel for each subdomain and then summed over all parallel processes. Thus :math:`N'` necessarily includes atoms from -neighboring sub-domains (so-called ghost atoms) and the position and +neighboring subdomains (so-called ghost atoms) and the position and force vectors of ghost atoms are thus included in the summation. Only when running in serial and without periodic boundary conditions is :math:`N' = N` the number of atoms in the system. :doc:`Fixes ` diff --git a/doc/src/compute_property_grid.rst b/doc/src/compute_property_grid.rst index 90b48d3a7c..e65e822516 100644 --- a/doc/src/compute_property_grid.rst +++ b/doc/src/compute_property_grid.rst @@ -39,7 +39,7 @@ Description Define a computation that stores the specified attributes of a distributed grid. In LAMMPS, distributed grids are regular 2d or 3d grids which overlay a 2d or 3d simulation domain. Each processor owns -the grid cells whose center points lie within its sub-domain. See the +the grid cells whose center points lie within its subdomain. See the :doc:`Howto grid ` doc page for details of how distributed grids can be defined by various commands and referenced. diff --git a/doc/src/compute_sna_atom.rst b/doc/src/compute_sna_atom.rst index ac55aebd08..b17db625f8 100644 --- a/doc/src/compute_sna_atom.rst +++ b/doc/src/compute_sna_atom.rst @@ -259,7 +259,7 @@ layout in the global array. Compute *sna/grid/local* calculates bispectrum components of a regular grid of points similarly to compute *sna/grid* described above. However, because the array is local, it contains only rows for grid points -that are local to the processor sub-domain. The global grid +that are local to the processor subdomain. The global grid of :math:`nx \times ny \times nz` points is still laid out in space the same as for *sna/grid*, but grid points are strictly partitioned, so that every grid point appears in one and only one local array. 
The array contains one row for each of the diff --git a/doc/src/dump_image.rst b/doc/src/dump_image.rst index fe13e9e5a4..d97a570613 100644 --- a/doc/src/dump_image.rst +++ b/doc/src/dump_image.rst @@ -80,9 +80,9 @@ Syntax axes = *yes* or *no* = do or do not draw xyz axes lines next to simulation box length = length of axes lines as fraction of respective box lengths diam = diameter of axes lines as fraction of shortest box length - *subbox* values = lines diam = draw outline of processor sub-domains - lines = *yes* or *no* = do or do not draw sub-domain lines - diam = diameter of sub-domain lines as fraction of shortest box length + *subbox* values = lines diam = draw outline of processor subdomains + lines = *yes* or *no* = do or do not draw subdomain lines + diam = diameter of subdomain lines as fraction of shortest box length *shiny* value = sfactor = shinyness of spheres and cylinders sfactor = shinyness of spheres and cylinders from 0.0 to 1.0 *ssao* value = shading seed dfactor = SSAO depth shading @@ -145,7 +145,7 @@ Syntax *bitrate* arg = rate rate = target bitrate for movie in kbps *boxcolor* arg = color - color = name of color for simulation box lines and processor sub-domain lines + color = name of color for simulation box lines and processor subdomain lines *color* args = name R G B name = name of color R,G,B = red/green/blue numeric values from 0.0 to 1.0 @@ -581,13 +581,13 @@ respective box lengths. The *diam* setting determines their thickness as a fraction of the shortest box length in x,y,z (for 3d) or x,y (for 2d). -The *subbox* keyword determines if and how processor sub-domain +The *subbox* keyword determines if and how processor subdomain boundaries are rendered as thin cylinders in the image. If *no* is -set (default), then the sub-domain boundaries are not drawn and the +set (default), then the subdomain boundaries are not drawn and the *diam* setting is ignored. If *yes* is set, the 12 edges of each -processor sub-domain are drawn, with a diameter that is a fraction of +processor subdomain are drawn, with a diameter that is a fraction of the shortest box length in x,y,z (for 3d) or x,y (for 2d). The color -of the sub-domain boundaries can be set with the "dump_modify +of the subdomain boundaries can be set with the "dump_modify boxcolor" command. ---------- @@ -921,8 +921,8 @@ formats. The *boxcolor* keyword sets the color of the simulation box drawn around the atoms in each image as well as the color of processor -sub-domain boundaries. See the "dump image box" command for how to -specify that a box be drawn via the *box* keyword, and the sub-domain +subdomain boundaries. See the "dump image box" command for how to +specify that a box be drawn via the *box* keyword, and the subdomain boundaries via the *subbox* keyword. The color name can be any of the 140 pre-defined colors (see below) or a color name defined by the dump_modify color option. diff --git a/doc/src/fix.rst b/doc/src/fix.rst index d155e70493..0a926d570c 100644 --- a/doc/src/fix.rst +++ b/doc/src/fix.rst @@ -89,7 +89,7 @@ owns, but there may be zero or more per atoms. Per-grid quantities are calculated on a regular 2d or 3d grid which overlays a 2d or 3d simulation domain. The grid points and the data they store are distributed across processors; each processor owns the grid points -which fall within its sub-domain. +which fall within its subdomain. Note that a single fix typically produces either global or per-atom or local or per-grid values (or none at all). 
It does not produce both diff --git a/doc/src/fix_ave_grid.rst b/doc/src/fix_ave_grid.rst index fe22e5e7ef..5760bb4508 100644 --- a/doc/src/fix_ave_grid.rst +++ b/doc/src/fix_ave_grid.rst @@ -84,7 +84,7 @@ produced by other computes or fixes. This fix operates in either per-grid inputs in the same command. The grid created by this command is distributed; each processor owns -the grid points that are within its sub-domain. This is similar to +the grid points that are within its subdomain. This is similar to the :doc:`fix ave/chunk ` command when it uses chunks from the :doc:`compute chunk/atom ` command which are 2d or 3d regular bins. However, the per-bin outputs in that case diff --git a/doc/src/fix_balance.rst b/doc/src/fix_balance.rst index bf4f77ecd9..d3f4c48248 100644 --- a/doc/src/fix_balance.rst +++ b/doc/src/fix_balance.rst @@ -44,7 +44,7 @@ Syntax *store* name = store weight in custom atom property defined by :doc:`fix property/atom ` command name = atom property name (without d\_ prefix) *out* arg = filename - filename = write each processor's sub-domain to a file, at each re-balancing + filename = write each processor's subdomain to a file, at each re-balancing Examples """""""" @@ -61,7 +61,7 @@ Examples Description """"""""""" -This command adjusts the size and shape of processor sub-domains +This command adjusts the size and shape of processor subdomains within the simulation box, to attempt to balance the number of particles and thus the computational cost (load) evenly across processors. The load balancing is "dynamic" in the sense that @@ -77,7 +77,7 @@ an irregular-shaped geometry containing void regions, or :doc:`hybrid pair style simulations ` that combine pair styles with different computational cost). In these cases, the LAMMPS default of dividing the simulation box volume into a -regular-spaced grid of 3d bricks, with one equal-volume sub-domain +regular-spaced grid of 3d bricks, with one equal-volume subdomain per processor, may assign numbers of particles per processor in a way that the computational effort varies significantly. This can lead to poor performance when the simulation is run in parallel. @@ -105,7 +105,7 @@ a :math:`P_x \times P_y \times P_z` grid of processors, it allows choices of :math:`P_x P_y P_z = P`, the total number of processors. This is sufficient to achieve good load-balance for some problems on some processor counts. However, all the processor -sub-domains will still have the same shape and the same volume. +subdomains will still have the same shape and the same volume. On a particular time step, a load-balancing operation is only performed if the current "imbalance factor" in particles owned by each processor @@ -141,7 +141,7 @@ forced even if the current balance is perfect (1.0) be specifying a simulation could run up to 20% faster if it were perfectly balanced, versus when imbalanced. However, computational cost is not strictly proportional to particle count, and changing the relative size and - shape of processor sub-domains may lead to additional computational + shape of processor subdomains may lead to additional computational and communication overheads (e.g., in the PPPM solver used via the :doc:`kspace_style ` command). Thus, you should benchmark the run times of a simulation before and after balancing. @@ -156,7 +156,7 @@ The *shift* style is a "grid" method which produces a logical 3d grid of processors. 
It operates by changing the cutting planes (or lines) between processors in 3d (or 2d), to adjust the volume (area in 2d) assigned to each processor, as in the following 2d diagram where -processor sub-domains are shown and atoms are colored by the processor +processor subdomains are shown and atoms are colored by the processor that owns them. .. |balance1| image:: img/balance_uniform.jpg @@ -258,7 +258,7 @@ from balanced, and converge more slowly. In this case you probably want to use the :doc:`balance ` command before starting a run, so that you begin the run with a balanced system. -Once the re-balancing is complete and final processor sub-domains +Once the re-balancing is complete and final processor subdomains assigned, particles migrate to their new owning processor as part of the normal reneighboring procedure. @@ -266,7 +266,7 @@ the normal reneighboring procedure. At each re-balance operation, the bisectioning for each cutting plane (line in 2d) typically starts with low and high bounds separated - by the extent of a processor's sub-domain in one dimension. The size + by the extent of a processor's subdomain in one dimension. The size of this bracketing region shrinks based on the local density, as described above, which should typically be 1/2 or more every iteration. Thus if :math:`N_\text{iter}` is specified as 10, the cutting @@ -310,7 +310,7 @@ in that sub-box. The *out* keyword writes text to the specified *filename* with the results of each re-balancing operation. The file contains the bounds -of the sub-domain for each processor after the balancing operation +of the subdomain for each processor after the balancing operation completes. The format of the file is compatible with the `Pizza.py `_ *mdump* tool which has support for manipulating and visualizing mesh files. An example is shown here for a balancing by four @@ -354,7 +354,7 @@ processors for a 2d problem: 4 1 13 14 15 16 The coordinates of all the vertices are listed in the NODES section, five -per processor. Note that the four sub-domains share vertices, so there +per processor. Note that the four subdomains share vertices, so there will be duplicate nodes in the list. The "SQUARES" section lists the node IDs of the four vertices in a diff --git a/doc/src/fix_box_relax.rst b/doc/src/fix_box_relax.rst index 5827fe3732..5efbfcab1b 100644 --- a/doc/src/fix_box_relax.rst +++ b/doc/src/fix_box_relax.rst @@ -118,7 +118,7 @@ displaced by the same amount, different on each iteration. all. Also note that if the box shape tilts to an extreme shape, LAMMPS will run less efficiently, due to the large volume of communication needed to acquire ghost atoms around a processor's - irregular-shaped sub-domain. For extreme values of tilt, LAMMPS may + irregular-shaped subdomain. For extreme values of tilt, LAMMPS may also lose atoms and generate an error. .. note:: diff --git a/doc/src/fix_deform.rst b/doc/src/fix_deform.rst index 805bd84382..9adf5b4aa2 100644 --- a/doc/src/fix_deform.rst +++ b/doc/src/fix_deform.rst @@ -546,7 +546,7 @@ flipping the box when it is exceeded. If the *flip* value is set to you apply large deformations, this means the box shape can tilt dramatically LAMMPS will run less efficiently, due to the large volume of communication needed to acquire ghost atoms around a processor's -irregular-shaped sub-domain. For extreme values of tilt, LAMMPS may +irregular-shaped subdomain. For extreme values of tilt, LAMMPS may also lose atoms and generate an error. 
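
As a hedged sketch (the strain rate is an arbitrary illustrative value,
the fix ID and group are placeholders, and the box is assumed to
already be triclinic), a shear deformation that never flips the box,
even at extreme tilt, could look like:

.. parsed-literal::

   # shear the box in xy at a constant engineering strain rate,
   # remap velocities and disable box flipping
   fix 1 all deform 1 xy erate 0.001 remap v flip no
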
The *units* keyword determines the meaning of the distance units used diff --git a/doc/src/fix_lb_fluid.rst b/doc/src/fix_lb_fluid.rst index 5f20bb10b1..04f306156a 100644 --- a/doc/src/fix_lb_fluid.rst +++ b/doc/src/fix_lb_fluid.rst @@ -198,7 +198,7 @@ dt}{\rho dx^2}` is approximately equal to 1. and a simulation domain size. This fix uses the same subdivision of the simulation domain among processors as the main LAMMPS program. In order to uniformly cover the simulation domain with lattice sites, the - lengths of the individual LAMMPS sub-domains must all be evenly + lengths of the individual LAMMPS subdomains must all be evenly divisible by :math:`dx_{LB}`. If the simulation domain size is cubic, with equal lengths in all dimensions, and the default value for :math:`dx_{LB}` is used, this will automatically be satisfied. diff --git a/doc/src/fix_nh.rst b/doc/src/fix_nh.rst index 5e8f313323..7f2ec69f9b 100644 --- a/doc/src/fix_nh.rst +++ b/doc/src/fix_nh.rst @@ -371,7 +371,7 @@ flipping the box when it is exceeded. If the *flip* value is set to applied stress induces large deformations (e.g. in a liquid), this means the box shape can tilt dramatically and LAMMPS will run less efficiently, due to the large volume of communication needed to -acquire ghost atoms around a processor's irregular-shaped sub-domain. +acquire ghost atoms around a processor's irregular-shaped subdomain. For extreme values of tilt, LAMMPS may also lose atoms and generate an error. diff --git a/doc/src/fix_npt_cauchy.rst b/doc/src/fix_npt_cauchy.rst index 6bff29f9dd..36b22037bf 100644 --- a/doc/src/fix_npt_cauchy.rst +++ b/doc/src/fix_npt_cauchy.rst @@ -311,7 +311,7 @@ flipping the box when it is exceeded. If the *flip* value is set to applied stress induces large deformations (e.g. in a liquid), this means the box shape can tilt dramatically and LAMMPS will run less efficiently, due to the large volume of communication needed to -acquire ghost atoms around a processor's irregular-shaped sub-domain. +acquire ghost atoms around a processor's irregular-shaped subdomain. For extreme values of tilt, LAMMPS may also lose atoms and generate an error. diff --git a/doc/src/fix_shardlow.rst b/doc/src/fix_shardlow.rst index 94e4b557f5..468985be1b 100644 --- a/doc/src/fix_shardlow.rst +++ b/doc/src/fix_shardlow.rst @@ -69,7 +69,7 @@ geometries. This fix must be used with an additional fix that specifies time integration, e.g. :doc:`fix nve ` or :doc:`fix nph `. -The Shardlow splitting algorithm requires the sizes of the sub-domain +The Shardlow splitting algorithm requires the sizes of the subdomain lengths to be larger than twice the cutoff+skin. Generally, the domain decomposition is dependent on the number of processors requested. diff --git a/doc/src/fix_ttm.rst b/doc/src/fix_ttm.rst index e75f33bb4e..1953aee7e7 100644 --- a/doc/src/fix_ttm.rst +++ b/doc/src/fix_ttm.rst @@ -90,7 +90,7 @@ The description in this sub-section applies to all 3 fix styles: *ttm*, *ttm/grid*, and *ttm/mod*. Fix *ttm/grid* distributes the regular grid across processors consistent -with the sub-domains of atoms owned by each processor, but is otherwise +with the subdomains of atoms owned by each processor, but is otherwise identical to fix ttm. Note that fix *ttm* stores a copy of the grid on each processor, which is acceptable when the overall grid is reasonably small. For larger grids you should use fix *ttm/grid* instead. @@ -170,11 +170,11 @@ ttm/mod. periodic boundary conditions in all dimensions. 
They also require that the size and shape of the simulation box do not vary dynamically, e.g. due to use of the :doc:`fix npt ` command. - Likewise, the size/shape of processor sub-domains cannot vary due to + Likewise, the size/shape of processor subdomains cannot vary due to dynamic load-balancing via use of the :doc:`fix balance ` command. It is possible however to load balance before the simulation starts using the :doc:`balance ` - command, so that each processor has a different size sub-domain. + command, so that each processor has a different size subdomain. Periodic boundary conditions are also used in the heat equation solve for the electronic subsystem. This varies from the approach of diff --git a/doc/src/package.rst b/doc/src/package.rst index 84dab41ab3..d992d51281 100644 --- a/doc/src/package.rst +++ b/doc/src/package.rst @@ -399,7 +399,7 @@ automatically throughout the run. This typically give performance within 5 to 10 percent of the optimal fixed fraction. The *ghost* keyword determines whether or not ghost atoms, i.e. atoms -at the boundaries of processor sub-domains, are offloaded for neighbor +at the boundaries of processor subdomains, are offloaded for neighbor and force calculations. When the value = "no", ghost atoms are not offloaded. This option can reduce the amount of data transfer with the co-processor and can also overlap MPI communication of forces with @@ -521,7 +521,7 @@ the comm keywords. The value options for the keywords are *no* or *host* or *device*\ . A value of *no* means to use the standard non-KOKKOS method of packing/unpacking data for the communication. A value of *host* means to -use the host, typically a multi-core CPU, and perform the +use the host, typically a multicore CPU, and perform the packing/unpacking in parallel with threads. A value of *device* means to use the device, typically a GPU, to perform the packing/unpacking operation. diff --git a/doc/src/pair_dsmc.rst b/doc/src/pair_dsmc.rst index b0a508b054..edac1d7a65 100644 --- a/doc/src/pair_dsmc.rst +++ b/doc/src/pair_dsmc.rst @@ -56,7 +56,7 @@ commands: The global DSMC *max_cell_size* determines the maximum cell length used in the DSMC calculation. A structured mesh is overlayed on the simulation box such that an integer number of cells are created in -each direction for each processor's sub-domain. Cell lengths are +each direction for each processor's subdomain. Cell lengths are adjusted up to the user-specified maximum cell size. ---------- diff --git a/doc/src/pair_none.rst b/doc/src/pair_none.rst index 0bb366bce1..5eb08a2eaa 100644 --- a/doc/src/pair_none.rst +++ b/doc/src/pair_none.rst @@ -31,7 +31,7 @@ and the neighbor skin distance (see the documentation of the ` command). When you have bonds, angles, dihedrals, or impropers defined at the same time, you must set the communication cutoff so that communication cutoff distance is large enough to acquire -and communicate sufficient ghost atoms from neighboring sub-domains as +and communicate sufficient ghost atoms from neighboring subdomains as needed for computing bonds, angles, etc. A pair style of *none* will also not request a pairwise neighbor list. diff --git a/doc/src/processors.rst b/doc/src/processors.rst index e4279c00ea..64959ec6b2 100644 --- a/doc/src/processors.rst +++ b/doc/src/processors.rst @@ -66,7 +66,7 @@ parameters can be specified with an asterisk "\*", which means LAMMPS will choose the number of processors in that dimension of the grid. 
It will do this based on the size and shape of the global simulation box so as to minimize the surface-to-volume ratio of each processor's -sub-domain. +subdomain. Choosing explicit values for Px or Py or Pz can be used to override the default manner in which LAMMPS will create the regular 3d grid of @@ -81,7 +81,7 @@ equal 1. Note that if you run on a prime number of processors P, then a grid such as 1 x P x 1 will be required, which may incur extra communication costs due to the high surface area of each processor's -sub-domain. +subdomain. Also note that if multiple partitions are being used then P is the number of processors in this partition; see the :doc:`-partition command-line switch ` page for details. Also note @@ -113,10 +113,10 @@ will persist for all simulations. If balancing is performed, some of the methods invoked by those commands retain the logical topology of the initial 3d grid, and the mapping of processors to the grid specified by the processors command. However the grid spacings in -different dimensions may change, so that processors own sub-domains of +different dimensions may change, so that processors own subdomains of different sizes. If the :doc:`comm_style tiled ` command is used, methods invoked by the balancing commands may discard the 3d -grid of processors and tile the simulation domain with sub-domains of +grid of processors and tile the simulation domain with subdomains of different sizes and shapes which no longer have a logical 3d connectivity. If that occurs, all the information specified by the processors command is ignored. @@ -129,7 +129,7 @@ processors. The *onelevel* style creates a 3d grid that is compatible with the Px,Py,Pz settings, and which minimizes the surface-to-volume ratio of -each processor's sub-domain, as described above. The mapping of +each processor's subdomain, as described above. The mapping of processors to the grid is determined by the *map* keyword setting. The *twolevel* style can be used on machines with multicore nodes to @@ -145,7 +145,7 @@ parameters can be specified with an asterisk "\*", which means LAMMPS will choose the number of cores in that dimension of the node's sub-grid. As with Px,Py,Pz, it will do this based on the size and shape of the global simulation box so as to minimize the -surface-to-volume ratio of each processor's sub-domain. +surface-to-volume ratio of each processor's subdomain. .. note:: diff --git a/doc/src/replicate.rst b/doc/src/replicate.rst index 24ec52cbb0..ed4e844c35 100644 --- a/doc/src/replicate.rst +++ b/doc/src/replicate.rst @@ -16,7 +16,7 @@ nx,ny,nz = replication factors in each dimension .. parsed-literal:: - *bbox* = only check atoms in replicas that overlap with a processor's sub-domain + *bbox* = only check atoms in replicas that overlap with a processor's subdomain Examples """""""" @@ -52,7 +52,7 @@ image flags that differ by 1. This will allow the bond to be unwrapped appropriately. The optional keyword *bbox* uses a bounding box to only check atoms in -replicas that overlap with a processor's sub-domain when assigning +replicas that overlap with a processor's subdomain when assigning atoms to processors. It typically results in a substantial speedup when using the replicate command on a large number of processors. 
It does require temporary use of more memory, specifically that each diff --git a/doc/src/thermo_modify.rst b/doc/src/thermo_modify.rst index 1bee26c289..fe6f56488d 100644 --- a/doc/src/thermo_modify.rst +++ b/doc/src/thermo_modify.rst @@ -64,7 +64,7 @@ The *lost* keyword determines whether LAMMPS checks for lost atoms each time it computes thermodynamics and what it does if atoms are lost. An atom can be "lost" if it moves across a non-periodic simulation box :doc:`boundary ` or if it moves more than a box length outside -the simulation domain (or more than a processor sub-domain length) +the simulation domain (or more than a processor subdomain length) before reneighboring occurs. The latter case is typically due to bad dynamics (e.g., too large a time step and/or huge forces and velocities). If the value is *ignore*, LAMMPS does not check for lost atoms. If the diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index 204cf47a05..bcc269e451 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -3432,6 +3432,8 @@ Subclassed subcutoff subcycle subcycling +subdomain +subdomains subhi sublo Subramaniyan