revise based on suggestions from languagetool.org

Axel Kohlmeyer
2023-01-22 08:33:04 -05:00
parent 57349b042e
commit f65f79ef82
56 changed files with 275 additions and 267 deletions

View File

@ -14,8 +14,8 @@ Owned and ghost atoms
As described on the :doc:`parallel partitioning algorithms
<Developer_par_part>` page, LAMMPS spatially decomposes the simulation
domain, either in a *brick* or *tiled* manner. Each processor (MPI
task) owns atoms within its subdomain and additionally stores ghost
atoms within a cutoff distance of its subdomain.

Forward and reverse communication
=================================

View File

@ -139,7 +139,7 @@ Periodic boundary conditions are then applied by the Domain class via
its ``pbc()`` method to remap particles that have moved outside the
simulation box back into the box. Note that this is not done every
timestep, but only when neighbor lists are rebuilt. This is so that
each processor's subdomain will have consistent (nearby) atom
coordinates for its owned and ghost atoms. It is also why dumped atom
coordinates may be slightly outside the simulation box if not dumped
on a step where the neighbor lists are rebuilt.
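
As an illustration of the remapping described above, here is a minimal
sketch of wrapping one coordinate back into a periodic box. The names
``lo``, ``hi``, and ``period`` are illustrative placeholders; the real
``Domain::pbc()`` additionally updates image flags and handles
triclinic boxes.

.. code-block:: c++

   // Minimal sketch: wrap a single coordinate x back into a periodic
   // box [lo, hi).  Atoms move at most a small distance past a boundary
   // between neighbor list rebuilds, so one shift by the box length
   // (period) is sufficient here.
   inline double wrap(double x, double lo, double hi)
   {
     const double period = hi - lo;
     if (x >= hi) x -= period;
     else if (x < lo) x += period;
     return x;
   }
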
@ -153,10 +153,10 @@ method of the Comm class and ``setup_bins()`` method of the Neighbor
class perform the update.

The code is now ready to migrate atoms that have left a processor's
geometric subdomain to new processors. The ``exchange()`` method of
the Comm class performs this operation. The ``borders()`` method of the
Comm class then identifies ghost atoms surrounding each processor's
subdomain and communicates ghost atom information to neighboring
processors. It does this by looping over all the atoms owned by a
processor to make lists of those to send to each neighbor processor. On
subsequent timesteps, the lists are used by the ``Comm::forward_comm()``
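
To make the send-list construction described in this hunk concrete,
here is a minimal, self-contained sketch. It is not the actual
``Comm::borders()`` code, and the array and bound names (``x``,
``nlocal``, ``sublo_x``, ``cut``) are illustrative placeholders.

.. code-block:: c++

   #include <vector>

   // Sketch: collect indices of owned atoms that lie within distance
   // 'cut' of the lower x boundary 'sublo_x' of this subdomain; these
   // would be sent to the neighboring processor in the -x direction.
   // On later timesteps, forward communication packs its buffers by
   // walking such a list instead of re-scanning all atoms.
   std::vector<int> make_sendlist(double **x, int nlocal,
                                  double sublo_x, double cut)
   {
     std::vector<int> sendlist;
     for (int i = 0; i < nlocal; ++i)
       if (x[i][0] <= sublo_x + cut) sendlist.push_back(i);
     return sendlist;
   }
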

View File

@ -28,9 +28,9 @@ grid.
More specifically, a grid point is defined for each cell (by default
the center point), and a processor owns a grid cell if its point is
within the processor's spatial subdomain. The union of processor
subdomains is the global simulation box. If a grid point is on the
boundary of two subdomains, the lower processor owns the grid cell. A
processor may also store copies of ghost cells which surround its
owned cells.
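
The ownership rule just described can be sketched as follows; the
bound and spacing names are illustrative placeholders, not members of
the actual Grid classes.

.. code-block:: c++

   // Sketch: does this processor own grid cell ix (in one dimension)?
   // The grid point is the cell center (default shift of 0.5), and a
   // point exactly on a shared boundary is claimed by the lower of the
   // two processors, matching the rule stated above.
   bool owns_cell(int ix, double boxlo_x, double delta_x,
                  double sublo_x, double subhi_x)
   {
     const double center = boxlo_x + (ix + 0.5) * delta_x;
     return (center > sublo_x) && (center <= subhi_x);
   }
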
@ -62,7 +62,7 @@ y-dimension. It is even possible to define a 1x1x1 3d grid, though it
may be inefficient to use it in a computational sense.

Note that the choice of grid size is independent of the number of
processors or their layout in a grid of processor subdomains which
overlays the simulation domain. Depending on the distributed grid
size, a single processor may own many thousands of grid cells or none at all.
@ -235,7 +235,7 @@ invoked, because they influence its operation.
   void set_zfactor(double factor);

Processors own a grid cell if a point within the grid cell is inside
the processor's subdomain. By default this is the center point of the
grid cell. The *set_shift_grid()* method can change this. The *shift*
argument is a value from 0.0 to 1.0 (inclusive) which is the offset of
the point within the grid cell in each dimension. The default is 0.5
@ -245,9 +245,9 @@ typically no need to change the default as it is optimal for
minimizing the number of ghost cells needed.

If a processor maps its particles to grid cells, it needs to allow for
its particles being outside its subdomain between reneighboring. The
*distance* argument of the *set_distance()* method sets the furthest
distance outside a processor's subdomain which a particle can move.
Typically this is half the neighbor skin distance, assuming
reneighboring is done appropriately. This distance is used in
determining how many ghost cells a processor needs to store to enable
@ -295,7 +295,7 @@ to the Grid class via the *set_zfactor()* method (*set_yfactor()* for
2d grids). The Grid class will then assign ownership of the 1/3 of
grid cells that overlay the simulation box to the processors which
also overlay the simulation box. The remaining 2/3 of the grid cells
are assigned to processors whose subdomains are adjacent to the upper
z boundary of the simulation box.

----------
@ -549,13 +549,13 @@ Grid class remap methods for load balancing
The following methods are used when a load-balancing operation,
triggered by the :doc:`balance <balance>` or :doc:`fix balance
<fix_balance>` commands, changes the partitioning of the simulation
domain into processor subdomains.

In order to work with load-balancing, any style command (compute, fix,
pair, or kspace style) which allocates a grid and stores per-grid data
should define a *reset_grid()* method; it takes no arguments. It will
be called by the two balance commands after they have reset processor
subdomains and migrated atoms (particles) to new owning processors.
The *reset_grid()* method will typically perform some or all of the
following operations. See the src/fix_ave_grid.cpp and
src/EXTRA_FIX/fix_ttm_grid.cpp files for examples of *reset_grid()*
@ -564,7 +564,7 @@ functions.
First, the *reset_grid()* method can instantiate new grid(s) of the
same global size, then call *setup_grid()* to partition them via the
new processor subdomains. At this point, it can invoke the
*identical()* method which compares the owned and ghost grid cell
index bounds between two grids, the old grid passed as a pointer
argument, and the new grid whose *identical()* method is being called.
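
As a hedged illustration of what the *identical()* comparison
described above amounts to, here is a self-contained sketch using a
placeholder ``Bounds`` struct (not a LAMMPS type):

.. code-block:: c++

   // Sketch of the comparison that identical() is described as doing:
   // two decompositions of the same global grid match on this processor
   // if both the owned and the ghost grid-cell index bounds agree.
   struct Bounds { int xlo, xhi, ylo, yhi, zlo, zhi; };

   bool same_bounds(const Bounds &a, const Bounds &b)
   {
     return a.xlo == b.xlo && a.xhi == b.xhi && a.ylo == b.ylo &&
            a.yhi == b.yhi && a.zlo == b.zlo && a.zhi == b.zhi;
   }

   bool grids_identical(const Bounds &own_old, const Bounds &ghost_old,
                        const Bounds &own_new, const Bounds &ghost_new)
   {
     // if nothing changed, existing per-grid data can be kept as is;
     // otherwise it has to be remapped to the new decomposition
     return same_bounds(own_old, own_new) && same_bounds(ghost_old, ghost_new);
   }
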

View File

@ -102,7 +102,7 @@ build is then :doc:`processed in parallel <Developer_par_neigh>`.
The most commonly required neighbor list is a so-called "half" neighbor
list, where each pair of atoms is listed only once (except when the
:doc:`newton command setting <newton>` for pair is off; in that case
pairs straddling subdomains or periodic boundaries will be listed twice).
Thus these are the default settings when a neighbor list request is created in:

.. code-block:: c++
@ -361,7 +361,7 @@ allocated as a 1d vector or 3d array. Either way, the ordering of
values within contiguous memory is x fastest, then y, and z slowest.
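
The x-fastest ordering corresponds to the usual nested-index offset
calculation; a minimal sketch with illustrative bound names:

.. code-block:: c++

   // Sketch: offset of grid value (ix,iy,iz) in the contiguous 1d
   // storage when x varies fastest and z slowest.  nxlo..nzhi are the
   // inclusive index bounds of the local brick (illustrative names).
   inline int grid_offset(int ix, int iy, int iz,
                          int nxlo, int nxhi, int nylo, int nyhi, int nzlo)
   {
     const int nx = nxhi - nxlo + 1;
     const int ny = nyhi - nylo + 1;
     return ((iz - nzlo) * ny + (iy - nylo)) * nx + (ix - nxlo);
   }
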
For the ``3d decomposition`` of the grid, the global grid is
partitioned into bricks that correspond to the subdomains of the
simulation box that each processor owns. Often, this is a regular 3d
array (Px by Py by Pz) of bricks, where P = number of processors =
Px * Py * Pz. More generally it can be a tiled decomposition, where

View File

@ -7,16 +7,16 @@ large systems provided it uses a correspondingly large number of MPI
processes. To be able to compute the short-range interactions, MPI
processes need access not only to the per-atom data (atom IDs,
positions, velocities, types, etc.) of the atoms they "own" but also to
information about atoms from neighboring subdomains, in LAMMPS referred
to as "ghost" atoms. These are copies of atoms storing required
per-atom data for up to the communication cutoff distance. The green
dashed-line boxes in the :ref:`domain-decomposition` figure illustrate
the extended ghost-atom subdomain for one processor.

This approach is also used to implement periodic boundary
conditions: atoms that lie within the cutoff distance across a periodic
boundary are also stored as ghost atoms and taken from the periodic
replication of the subdomain, which may be the same subdomain, e.g. if
running in serial. As a consequence of this, force computation in
LAMMPS is not subject to minimum image conventions and thus cutoffs may
be larger than half the simulation domain.
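
A minimal sketch of the periodic-replication idea, with placeholder
names for the box origin, box length, and communication cutoff:

.. code-block:: c++

   // Sketch: an owned atom within the communication cutoff 'cut' of the
   // lower x face of the box also appears as a ghost atom on the far
   // (+x) side, shifted by the box length 'xprd'.  When running on a
   // single processor the same subdomain receives its own shifted copy.
   bool needs_ghost_image(double xi, double boxlo_x, double cut)
   {
     return xi < boxlo_x + cut;
   }

   double ghost_image(double xi, double xprd)
   {
     return xi + xprd;
   }
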
@ -28,10 +28,10 @@ be larger than half the simulation domain.
ghost atom communication

This figure shows the ghost atom communication patterns between
subdomains for "brick" (left) and "tiled" communication styles for
2d simulations. The numbers indicate MPI process ranks. Here the
subdomains are drawn spatially separated for clarity. The
dashed-line box is the extended subdomain of processor 0 which
includes its ghost atoms. The red- and blue-shaded boxes are the
regions of communicated ghost atoms.
@ -42,7 +42,7 @@ atom communication is performed in two stages for a 2d simulation (three
in 3d) for both a regular and irregular partitioning of the simulation
box. For the regular case (left) atoms are exchanged first in the
*x*-direction, then in *y*, with four neighbors in the grid of processor
subdomains.

In the *x* stage, processor ranks 1 and 2 send owned atoms in their
red-shaded regions to rank 0 (and vice versa). Then in the *y* stage,
@ -55,7 +55,7 @@ For the irregular case (right) the two stages are similar, but a
processor can have more than one neighbor in each direction. In the
*x* stage, MPI ranks 1,2,3 send owned atoms in their red-shaded regions to
rank 0 (and vice versa). These include only atoms between the lower
and upper *y*-boundary of rank 0's subdomain. In the *y* stage, ranks
4,5,6 send atoms in their blue-shaded regions to rank 0. This may
include ghost atoms they received in the *x* stage, but only if they
are needed by rank 0 to fill its extended ghost atom regions in the
@ -110,11 +110,11 @@ performed in LAMMPS:
over 3x the length of a stretched bond for dihedral interactions. It
can also exceed the periodic box size. For the regular communication
pattern (left), if the cutoff distance extends beyond a neighbor
processor's subdomain, then multiple exchanges are performed in the
same direction. Each exchange is with the same neighbor processor,
but buffers are packed/unpacked using a different list of atoms. For
forward communication, in the first exchange a processor sends only
owned atoms. In subsequent exchanges, it sends ghost atoms received
in previous exchanges. For the irregular pattern (right) overlaps of
a processor's extended ghost-atom subdomain with all other processors
in each dimension are detected.

View File

@ -20,7 +20,7 @@ e) electric field values from grid points near each atom are interpolated to com
For any of the spatial-decomposition partitioning schemes each processor
owns the brick-shaped portion of FFT grid points contained within its
subdomain. The two interpolation operations use a stencil of grid
points surrounding each atom. To accommodate the stencil size, each
processor also stores a few layers of ghost grid points surrounding its
brick. Forward and reverse communication of grid point values is
@ -64,7 +64,7 @@ direction of the 1d FFTs it has to perform. LAMMPS uses the
pencil-decomposition algorithm as shown in the :ref:`fft-parallel` figure.

Initially (far left), each processor owns a brick of same-color grid
cells (actually grid points) contained within its subdomain. A
brick-to-pencil communication operation converts this layout to 1d
pencils in the *x*-dimension (center left). Again, cells of the same
color are owned by the same processor. Each processor can then compute
@ -161,8 +161,8 @@ grid/particle operations that LAMMPS supports:
  <partition>` calculation and then use the :doc:`verlet/split
  integrator <run_style>` to perform the PPPM computation on a
  dedicated, separate partition of MPI processes. This uses an integer
  "1:*p*" mapping of *p* subdomains of the atom decomposition to one
  subdomain of the FFT grid decomposition, where pairwise non-bonded
  and bonded forces and energies are computed on the larger partition
  and the PPPM kspace computation runs concurrently on the smaller partition.
@ -172,7 +172,7 @@ grid/particle operations that LAMMPS supports:
- LAMMPS implements a ``GridComm`` class which overlays the simulation
  domain with a regular grid, partitions it across processors in a
  manner consistent with processor subdomains, and provides methods for
  forward and reverse communication of owned and ghost grid point
  values. It is used for PPPM as an FFT grid (as outlined above) and
  also for the MSM algorithm which uses a cascade of grid sizes from

View File

@ -22,7 +22,7 @@ last reneighboring; this and other options of the neighbor list rebuild
can be adjusted with the :doc:`neigh_modify <neigh_modify>` command.

On steps when reneighboring is performed, atoms which have moved outside
their owning processor's subdomain are first migrated to new processors
via communication. Periodic boundary conditions are also (only)
enforced on these steps to ensure each atom is re-assigned to the
correct processor. After migration, the atoms owned by each processor
@ -39,12 +39,12 @@ its settings modified with the :doc:`atom_modify <atom_modify>` command.
neighbor list stencils

A 2d simulation subdomain (thick black line) and the corresponding
ghost atom cutoff region (dashed blue line) for both orthogonal
(left) and triclinic (right) domains. A regular grid of neighbor
bins (thin lines) overlays the entire simulation domain and need not
align with subdomain boundaries; only the portion overlapping the
augmented subdomain is shown. In the triclinic case it overlaps the
bounding box of the tilted rectangle. The blue- and red-shaded bins
represent a stencil of bins searched to find neighbors of a particular
atom (black dot).
@ -52,8 +52,8 @@ its settings modified with the :doc:`atom_modify <atom_modify>` command.
To build a local neighbor list in linear time, the simulation domain is
overlaid (conceptually) with a regular 3d (or 2d) grid of neighbor bins,
as shown in the :ref:`neighbor-stencil` figure for 2d models and a
single MPI processor's subdomain. Each processor stores a set of
neighbor bins which overlap its subdomain extended by the neighbor
cutoff distance :math:`R_n`. As illustrated, the bins need not align
with processor boundaries; an integer number in each dimension is fit to
the size of the entire simulation box.
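
A minimal sketch of mapping a coordinate to its neighbor bin; the
names are illustrative, and the real binning code also clamps indices
and handles bins in the ghost region:

.. code-block:: c++

   #include <cmath>

   // Sketch: map one coordinate to a neighbor-bin index.  The bin size
   // is derived from the global box so that an integer number of bins
   // spans the whole box; bins therefore need not align with subdomain
   // boundaries.  Neighbors of an atom are then searched by scanning a
   // stencil of bins around the atom's own bin.
   inline int coord2bin(double x, double boxlo, double binsize)
   {
     return static_cast<int>(std::floor((x - boxlo) / binsize));
   }
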
@ -144,7 +144,7 @@ supports:
- For small and sparse systems and as a fallback method, LAMMPS also
  supports neighbor list construction without binning by using a full
  :math:`O(N^2)` loop over all *i,j* atom pairs in a subdomain when
  using the :doc:`neighbor nsq <neighbor>` command.

- Dependent on the "pair" setting of the :doc:`newton <newton>` command,

View File

@ -15,8 +15,8 @@ distributed-memory parallelism is set with the :doc:`comm_style command
for MPI parallelization: "brick" on the left with an orthogonal
(left) and a triclinic (middle) simulation domain, and a "tiled"
decomposition (right). The black lines show the division into
subdomains and the contained atoms are "owned" by the corresponding
MPI process. The green dashed lines indicate how subdomains are
extended with "ghost" atoms up to the communication cutoff distance.

The LAMMPS simulation box is a 3d or 2d volume, which can be orthogonal
@ -32,14 +32,14 @@ means the position of the box face adjusts continuously to enclose all
the atoms.

For distributed-memory MPI parallelism, the simulation box is spatially
decomposed (partitioned) into non-overlapping subdomains which fill the
box. The default partitioning, "brick", is most suitable when atom
density is roughly uniform, as shown in the left-side images of the
:ref:`domain-decomposition` figure. The subdomains comprise a regular
grid and all subdomains are identical in size and shape. Both the
orthogonal and triclinic boxes can deform continuously during a
simulation, e.g. to compress a solid or shear a liquid, in which case
the processor subdomains likewise deform.

For models with non-uniform density, the number of particles per
@ -50,14 +50,14 @@ load. For such models, LAMMPS supports multiple strategies to reduce
the load imbalance:

- The processor grid decomposition is by default based on the simulation
  cell volume and tries to optimize the volume to surface ratio for the
  subdomains. This can be changed with the :doc:`processors command
  <processors>`.
- The parallel planes defining the size of the subdomains can be shifted
  with the :doc:`balance command <balance>`, which can be done in addition
  to choosing a more optimal processor grid.
- The recursive bisectioning algorithm in combination with the "tiled"
  communication style can produce a partitioning with equal numbers of
  particles in each subdomain.

.. |decomp1| image:: img/decomp-regular.png
@ -76,14 +76,14 @@ the load imbalance:
The pictures above demonstrate different decompositions for a 2d system
with 12 MPI ranks. The atom colors indicate the load imbalance of each
subdomain with green being optimal and red the least optimal.

Due to the vacuum in the system, the default decomposition is unbalanced
with several MPI ranks without atoms (left). By forcing a 1x12x1
processor grid, every MPI rank now does computations, but the number of
atoms per subdomain is still uneven and the thin slice shape increases
the amount of communication between subdomains (center left). With a
2x6x1 processor grid and shifting the subdomain divisions, the load
imbalance is further reduced and the amount of communication required
between subdomains is smaller (center right). Using the recursive
bisectioning leads to a further improved decomposition (right).

View File

@ -7,7 +7,7 @@ decomposition. The parallelization aims to be efficient, and resulting
in good strong scaling (= good speedup for the same system) and good
weak scaling (= the computational cost of enlarging the system is
proportional to the system size). Additional parallelization using GPUs
or OpenMP can also be applied within the subdomain assigned to an MPI
process. For clarity, most of the following illustrations show the 2d
simulation case. The underlying algorithms in those cases, however,
apply to both 2d and 3d cases equally well.

View File

@ -647,7 +647,7 @@ Communication buffer coding with *ubuf*
---------------------------------------

LAMMPS uses communication buffers where it collects data from various
class instances and then exchanges the data with neighboring subdomains.
For simplicity those buffers are defined as ``double`` buffers and
used for doubles and integer numbers. This presents a unique problem
when 64-bit integers are used. While the storage needed for a ``double``
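
The hunk above is cut off before the explanation of the trick itself;
for orientation, here is a sketch of the union-based idiom that
``ubuf`` implements (written from memory as an illustration, not
copied from the LAMMPS sources):

.. code-block:: c++

   #include <cstdint>

   // Sketch of the idea: store the bit pattern of a 64-bit integer in a
   // double-sized buffer slot without a lossy value conversion, and read
   // it back as an integer on the receiving side.
   union ubuf {
     double d;
     int64_t i;
     ubuf(double arg) : d(arg) {}
     ubuf(int64_t arg) : i(arg) {}
     ubuf(int arg) : i(arg) {}
   };

   // packing:    buf[m++] = ubuf(tag[i]).d;
   // unpacking:  tag[j] = (int64_t) ubuf(buf[m++]).i;
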

View File

@ -5635,7 +5635,7 @@ Doc page with :doc:`WARNING messages <Errors_warnings>`
Lost atoms are checked for each time thermo output is done. See the
thermo_modify lost command for options. Lost atoms usually indicate
bad dynamics, e.g. atoms have been blown far out of the simulation
box, or moved further than one processor's subdomain away before
reneighboring.

*MEAM library error %d*
@ -6266,14 +6266,14 @@ keyword to allow for additional bonds to be formed
One or more atoms are attempting to map their charge to an MSM grid point
that is not owned by a processor. This is likely for one of two
reasons, both of them bad. First, it may mean that an atom near the
boundary of a processor's subdomain has moved more than 1/2 the
:doc:`neighbor skin distance <neighbor>` without neighbor lists being
rebuilt and atoms being migrated to new processors. This also means
you may be missing pairwise interactions that need to be computed.
The solution is to change the re-neighboring criteria via the
:doc:`neigh_modify <neigh_modify>` command. The safest settings are
"delay 0 every 1 check yes". Second, it may mean that an atom has
moved far outside a processor's subdomain or even the entire
simulation box. This indicates bad physics, e.g. due to highly
overlapping atoms, too large a timestep, etc.
@ -6281,14 +6281,14 @@ keyword to allow for additional bonds to be formed
One or more atoms are attempting to map their charge to a PPPM grid
point that is not owned by a processor. This is likely for one of two
reasons, both of them bad. First, it may mean that an atom near the
boundary of a processor's subdomain has moved more than 1/2 the
:doc:`neighbor skin distance <neighbor>` without neighbor lists being
rebuilt and atoms being migrated to new processors. This also means
you may be missing pairwise interactions that need to be computed.
The solution is to change the re-neighboring criteria via the
:doc:`neigh_modify <neigh_modify>` command. The safest settings are
"delay 0 every 1 check yes". Second, it may mean that an atom has
moved far outside a processor's subdomain or even the entire
simulation box. This indicates bad physics, e.g. due to highly
overlapping atoms, too large a timestep, etc.
@ -6296,14 +6296,14 @@ keyword to allow for additional bonds to be formed
One or more atoms are attempting to map their charge to a PPPM grid
point that is not owned by a processor. This is likely for one of two
reasons, both of them bad. First, it may mean that an atom near the
boundary of a processor's subdomain has moved more than 1/2 the
:doc:`neighbor skin distance <neighbor>` without neighbor lists being
rebuilt and atoms being migrated to new processors. This also means
you may be missing pairwise interactions that need to be computed.
The solution is to change the re-neighboring criteria via the
:doc:`neigh_modify <neigh_modify>` command. The safest settings are
"delay 0 every 1 check yes". Second, it may mean that an atom has
moved far outside a processor's subdomain or even the entire
simulation box. This indicates bad physics, e.g. due to highly
overlapping atoms, too large a timestep, etc.

View File

@ -109,9 +109,9 @@ Doc page with :doc:`ERROR messages <Errors_messages>`
*Communication cutoff is shorter than a bond length based estimate. This may lead to errors.*

Since LAMMPS stores topology data with individual atoms, all atoms
comprising a bond, angle, dihedral or improper must be present on any
subdomain that "owns" the atom with the information, either as a
local or a ghost atom. The communication cutoff is what determines up
to what distance from a subdomain boundary ghost atoms are created.
The communication cutoff is by default the largest non-bonded cutoff
plus the neighbor skin distance, but for short or non-bonded cutoffs
and/or long bonds, this may not be sufficient. This warning indicates
@ -398,7 +398,7 @@ This will most likely cause errors in kinetic fluctuations.
Lost atoms are checked for each time thermo output is done. See the
thermo_modify lost command for options. Lost atoms usually indicate
bad dynamics, e.g. atoms have been blown far out of the simulation
box, or moved further than one processor's subdomain away before
reneighboring.

*MSM mesh too small, increasing to 2 points in each direction*
@ -582,13 +582,13 @@ This will most likely cause errors in kinetic fluctuations.
needed. The requested volume fraction may be too high, or other atoms
may be in the insertion region.

*Proc subdomain size < neighbor skin, could lead to lost atoms*

The decomposition of the physical domain (likely due to load
balancing) has led to a processor's subdomain being smaller than the
neighbor skin in one or more dimensions. Since reneighboring is
triggered by atoms moving the skin distance, this may lead to lost
atoms, if an atom moves all the way across a neighboring processor's
subdomain before reneighboring is triggered.

*Reducing PPPM order b/c stencil extends beyond nearest neighbor processor*

This may lead to a larger grid than desired. See the kspace_modify overlap

View File

@ -11,7 +11,7 @@ more values (data).
The grid cells and data they store are distributed across processors.
Each processor owns the grid cells (and data) whose center points lie
within the spatial subdomain of the processor. If needed for its
computations, a processor may also store ghost grid cells with their
data.
@ -28,7 +28,7 @@ box size, as set by the :doc:`boundary <boundary>` command for fixed
or shrink-wrapped boundaries.

If load-balancing is invoked by the :doc:`balance <balance>` or
:doc:`fix balance <fix_balance>` commands, then the subdomain owned
by a processor can change, which may also change which grid cells it
owns.

View File

@ -59,7 +59,7 @@ of bond distances.
A per-grid datum is one or more values per grid cell, for a grid which
overlays the simulation domain. The grid cells and the data they
store are distributed across processors; each processor owns the grid
cells whose center point falls within its subdomain.

.. _scalar:
@ -322,7 +322,7 @@ The chief difference between the :doc:`fix ave/grid <fix_ave_grid>`
and :doc:`fix ave/chunk <fix_ave_chunk>` commands when used in this
context is that the former uses a distributed grid, while the latter
uses a global grid. Distributed means that each processor owns the
subset of grid cells within its subdomain. Global means that each
processor owns a copy of the entire grid. The :doc:`fix ave/grid
<fix_ave_grid>` command is thus more efficient for large grids.
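
As a rough worked example of the difference (the numbers are assumed
here purely for illustration): for a 100 x 100 x 100 grid storing one
double per cell on 64 processors, a distributed grid holds about
1,000,000/64, roughly 15,600 owned cells, or about 125 KB per processor
(plus ghost cells), whereas a global grid replicates all 1,000,000
cells, about 8 MB, on every processor.
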

View File

@ -783,19 +783,19 @@ Pitfalls
**Parallel Scalability**

LAMMPS operates in parallel in a :doc:`spatial-decomposition mode
<Developer_par_part>`, where each processor owns a spatial subdomain of
the overall simulation domain and communicates with its neighboring
processors via distributed-memory message passing (MPI) to acquire ghost
atom information to allow forces on the atoms it owns to be
computed. LAMMPS also uses Verlet neighbor lists which are recomputed
every few timesteps as particles move. On these timesteps, particles
also migrate to new processors as needed. LAMMPS decomposes the overall
simulation domain so that spatial subdomains of nearly equal volume are
assigned to each processor. When each subdomain contains nearly the
same number of particles, this results in a reasonable load balance
among all processors. As is typical for some peridynamic simulations,
however, some subdomains may contain many particles while other
subdomains contain few particles, resulting in a load imbalance that
impacts parallel scalability.

**Setting the "skin" distance**

View File

@ -150,7 +150,7 @@ option with either of the commands.
Note that if a simulation box has a large tilt factor, LAMMPS will run
less efficiently, due to the large volume of communication needed to
acquire ghost atoms around a processor's irregular-shaped subdomain.
For extreme values of tilt, LAMMPS may also lose atoms and generate an
error.

View File

@ -38,11 +38,11 @@ to create digital object identifiers (DOI) for stable releases of the
LAMMPS source code. There are two types of DOIs for the LAMMPS source code.

The canonical DOI for **all** versions of LAMMPS, which will always
point to the **latest** stable release version, is:

- DOI: `10.5281/zenodo.3726416 <https://dx.doi.org/10.5281/zenodo.3726416>`_

In addition there are DOIs generated for individual stable releases:

- 3 March 2020 version: `DOI:10.5281/zenodo.3726417 <https://dx.doi.org/10.5281/zenodo.3726417>`_
- 29 October 2020 version: `DOI:10.5281/zenodo.4157471 <https://dx.doi.org/10.5281/zenodo.4157471>`_
@ -65,6 +65,6 @@ for optional features used in a specific run is printed to the screen
and log file. Style and output location can be selected with the
:ref:`-cite command-line switch <cite>`. Additional references are
given in the documentation of the :doc:`corresponding commands
<Commands_all>` or in the :doc:`Howto tutorials <Howto>`. Please make
certain that you provide the proper acknowledgments and citations in
any published works using LAMMPS.

View File

@ -27,7 +27,7 @@ General features
* distributed memory message-passing parallelism (MPI)
* shared memory multi-threading parallelism (OpenMP)
* spatial decomposition of simulation domain for MPI parallelism
* particle decomposition inside spatial decomposition for OpenMP and GPU parallelism
* GPLv2 licensed open-source distribution
* highly portable C++-11
* modular code with most functionality in optional packages
@ -113,7 +113,7 @@ Atom creation
:doc:`create_atoms <create_atoms>`, :doc:`delete_atoms <delete_atoms>`,
:doc:`displace_atoms <displace_atoms>`, :doc:`replicate <replicate>` commands)

* read in atom coordinates from files
* create atoms on one or more lattices (e.g. grain boundaries)
* delete geometric or logical groups of atoms (e.g. voids)
* replicate existing atoms multiple times
@ -173,11 +173,11 @@ Output
(:doc:`dump <dump>`, :doc:`restart <restart>` commands)

* log file of thermodynamic info
* text dump files of atom coordinates, velocities, other per-atom quantities
* dump output on fixed and variable intervals, based on timestep or simulated time
* binary restart files
* parallel I/O of dump and restart files
* per-atom quantities (energy, stress, centro-symmetry parameter, CNA, etc.)
* user-defined system-wide (log file) or per-atom (dump file) calculations
* custom partitioning (chunks) for binning, and static or dynamic grouping of atoms for analysis
* spatial, time, and per-chunk averaging of per-atom quantities

View File

@ -20,22 +20,23 @@ that either closely interface with LAMMPS or extend LAMMPS.
Here are suggestions on how to perform these tasks:

* **GUI:** LAMMPS can be built as a library, and a Python module that
  wraps the library interface is provided. Thus, GUI interfaces can be
  written in Python or C/C++ that run LAMMPS and visualize or plot its
  output. Examples of this are provided in the python directory and
  described on the :doc:`Python <Python_head>` doc page. Also, there
  are several external wrappers or GUI front ends.
* **Builder:** Several pre-processing tools are packaged with LAMMPS.
  Some of them convert input files in formats produced by other MD codes
  such as CHARMM, AMBER, or Insight into LAMMPS input formats. Some of
  them are simple programs that will build simple molecular systems,
  such as linear bead-spring polymer chains. The moltemplate program is
  a true molecular builder that will generate complex molecular models.
  See the :doc:`Tools <Tools>` page for details on tools packaged with
  LAMMPS. The `Pre-/post-processing page
  <https://www.lammps.org/prepost.html>`_ of the LAMMPS homepage
  describes a variety of third party tools for this task. Furthermore,
  some internal LAMMPS commands allow reconstructing or selectively adding
  topology information, as well as provide the option to insert molecule
  templates instead of atoms for building bulk molecular systems.
* **Force-field assignment:** The conversion tools described in the previous
@ -47,33 +48,34 @@ Here are suggestions on how to perform these tasks:
  powerful and flexible in converting force field and topology data
  between various MD simulation programs.
* **Simulation analysis:** If you want to perform analysis on-the-fly as
  your simulation runs, see the :doc:`compute <compute>` and :doc:`fix
  <fix>` doc pages, which list commands that can be used in a LAMMPS
  input script. Also see the :doc:`Modify <Modify>` page for info on
  how to add your own analysis code or algorithms to LAMMPS. For
  post-processing, LAMMPS output such as :doc:`dump file snapshots
  <dump>` can be converted into formats used by other MD or
  post-processing codes. To some degree, that conversion can be done
  directly inside LAMMPS by interfacing to the VMD molfile plugins. The
  :doc:`rerun <rerun>` command also allows post-processing of existing
  trajectories, and through being able to read a variety of file
  formats, this can also be used for analyzing trajectories from other
  MD codes. Some post-processing tools packaged with LAMMPS will do
  these conversions. Scripts provided in the tools/python directory can
  extract and massage data in dump files to make it easier to import
  into other programs. See the :doc:`Tools <Tools>` page for details on
  these various options.
* **Visualization:** LAMMPS can produce NETPBM, JPG, or PNG format
  snapshot images on-the-fly via its :doc:`dump image <dump_image>`
  command and pass them to an external program, `FFmpeg
  <https://www.ffmpeg.org>`_, to generate movies from them. For
  high-quality, interactive visualization, there are many excellent and
  free tools available. See the `Visualization Tools
  <https://www.lammps.org/viz.html>`_ page of the LAMMPS website for
  visualization packages that can process LAMMPS output data.
* **Plotting:** See the next bullet about Pizza.py as well as the
  :doc:`Python <Python_head>` page for examples of plotting LAMMPS
  output. Scripts provided with the *python* tool in the ``tools``
  directory will extract and process data in log and dump files to make
  it easier to analyze and plot. See the :doc:`Tools <Tools>` doc page
  for more discussion of the various tools.
* **Pizza.py:** Our group has also written a separate toolkit called

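As a rough illustration of the Python-wrapper route mentioned in the bullets above, here is a minimal sketch that drives LAMMPS through its library interface and pulls a thermodynamic quantity back for plotting or analysis. It assumes the ``lammps`` Python module from a recent LAMMPS version is installed and that a hypothetical input script named ``in.melt`` exists; exact call names can vary between versions.

.. code-block:: python

   from lammps import lammps

   lmp = lammps(cmdargs=["-log", "none"])   # run without writing a log file
   lmp.file("in.melt")                      # hypothetical input script
   lmp.command("run 1000")                  # continue the run from Python
   print("atoms:", lmp.get_natoms(), " T =", lmp.get_thermo("temp"))
   lmp.close()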
View File

@ -1,20 +1,20 @@
Overview of LAMMPS Overview of LAMMPS
------------------ ------------------
LAMMPS is a classical molecular dynamics (MD) code that models LAMMPS is a classical molecular dynamics (MD) code that models ensembles
ensembles of particles in a liquid, solid, or gaseous state. It can of particles in a liquid, solid, or gaseous state. It can model atomic,
model atomic, polymeric, biological, solid-state (metals, ceramics, polymeric, biological, solid-state (metals, ceramics, oxides), granular,
oxides), granular, coarse-grained, or macroscopic systems using a coarse-grained, or macroscopic systems using a variety of interatomic
variety of interatomic potentials (force fields) and boundary potentials (force fields) and boundary conditions. It can model 2d or
conditions. It can model 2d or 3d systems with only a few particles 3d systems with sizes ranging from only a few particles up to billions.
up to millions or billions.
LAMMPS can be built and run on a laptop or desktop machine, but is LAMMPS can be built and run on single laptop or desktop machines, but is
designed for parallel computers. It will run in serial and on any designed for parallel computers. It will run in serial and on any
parallel machine that supports the `MPI <mpi_>`_ message-passing parallel machine that supports the `MPI <mpi_>`_ message-passing
library. This includes shared-memory boxes and distributed-memory library. This includes shared-memory multicore, multi-CPU servers and
clusters and supercomputers. Parts of LAMMPS also support distributed-memory clusters and supercomputers. Parts of LAMMPS also
`OpenMP multi-threading <omp_>`_, vectorization and GPU acceleration. support `OpenMP multi-threading <omp_>`_, vectorization, and GPU
acceleration.
.. _mpi: https://en.wikipedia.org/wiki/Message_Passing_Interface .. _mpi: https://en.wikipedia.org/wiki/Message_Passing_Interface
.. _lws: https://www.lammps.org .. _lws: https://www.lammps.org
@ -42,11 +42,11 @@ LAMMPS uses neighbor lists to keep track of nearby particles. The lists
are optimized for systems with particles that are repulsive at short are optimized for systems with particles that are repulsive at short
distances, so that the local density of particles never becomes too distances, so that the local density of particles never becomes too
large. This is in contrast to methods used for modeling plasma or large. This is in contrast to methods used for modeling plasma or
gravitational bodies (e.g. galaxy formation). gravitational bodies (like galaxy formation).
On parallel machines, LAMMPS uses spatial-decomposition techniques with On parallel machines, LAMMPS uses spatial-decomposition techniques with
MPI parallelization to partition the simulation domain into sub-domains MPI parallelization to partition the simulation domain into subdomains
of equal computational cost, one of which is assigned to each processor. of equal computational cost, one of which is assigned to each processor.
Processors communicate and store "ghost" atom information for atoms that Processors communicate and store "ghost" atom information for atoms that
border their sub-domain. Multi-threading parallelization and GPU border their subdomain. Multi-threading parallelization and GPU
acceleration with with particle-decomposition can be used in addition. acceleration with particle-decomposition can be used in addition.
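A minimal sketch of the MPI-parallel usage described above, assuming the ``lammps`` and ``mpi4py`` Python modules are installed and a hypothetical input script ``in.lj`` exists; the script would be launched with, e.g., ``mpirun -np 4 python run.py`` so that the spatial decomposition is done across 4 MPI tasks.

.. code-block:: python

   from mpi4py import MPI          # lets the script query its MPI rank
   from lammps import lammps

   lmp = lammps()                  # every MPI rank takes part in the same run
   lmp.file("in.lj")               # hypothetical input script
   if MPI.COMM_WORLD.Get_rank() == 0:
       print("total atoms:", lmp.get_natoms())
   lmp.close()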

View File

@ -30,17 +30,17 @@ can be created using CMake. CMake must be at least version 3.10.
Operating systems Operating systems
^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^
The primary development platform for LAMMPS is Linux. Thus the chances The primary development platform for LAMMPS is Linux. Thus, the chances
for LAMMPS to compile without problems on Linux machines are the best. for LAMMPS to compile without problems on Linux machines are the best.
Also compilation and correct execution on macOS and Windows (using Also, compilation and correct execution on macOS and Windows (using
Microsoft Visual C++) are checked automatically for the largest part of the Microsoft Visual C++) are checked automatically for the largest part of the
source code. Some (optional) features are not compatible with all source code. Some (optional) features are not compatible with all
operating systems either through limitations of the source code or operating systems, either through limitations of the corresponding
source code compatibility or the build system requirements of required LAMMPS source code or through source code or build system
libraries. incompatibilities of required libraries.
Executables for Windows may be created using either Cygwin or Visual Executables for Windows may be created natively using either Cygwin or
Studio or a Linux to Windows MinGW cross-compiler. Visual Studio or with a Linux to Windows MinGW cross-compiler.
Additionally, FreeBSD and Solaris have been tested successfully. Additionally, FreeBSD and Solaris have been tested successfully.
@ -49,7 +49,7 @@ Compilers
The most commonly used compilers are the GNU compilers, but also Clang The most commonly used compilers are the GNU compilers, but also Clang
and the Intel compilers have been successfully used on Linux, macOS, and and the Intel compilers have been successfully used on Linux, macOS, and
Windows. Also the Nvidia HPC SDK (formerly PGI compilers) will compile Windows. Also, the Nvidia HPC SDK (formerly PGI compilers) will compile
LAMMPS (tested on Linux). LAMMPS (tested on Linux).
CPU architectures CPU architectures
@ -62,12 +62,14 @@ regularly tested.
Portability compliance Portability compliance
^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^
Not all of the LAMMPS source code is fully compliant to all of the above Only a subset of the LAMMPS source code is fully compliant with all of the
mentioned standards. This is rather typical for projects like LAMMPS above-mentioned standards. This is rather typical for projects like
that largely depend on contributions of features from the community. LAMMPS that largely depend on contributions from the user community.
Not all contributors are trained as programmers and not all of them have Not all contributors are trained as programmers and not all of them have
access to a variety of platforms. As part of the continuous integration access to multiple platforms for testing. As part of the continuous
process, however, all contributions are automatically tested to compile, integration process, however, all contributions are automatically tested
link, and pass some runtime tests on a selection of Linux flavors, to compile, link, and pass some runtime tests on a selection of Linux
macOS, and Windows with different compilers. Other platforms may be flavors, macOS, and Windows, and on Linux with different compilers.
checked occasionally or when portability bug are reported. Thus portability issues are often found before a pull request is merged.
Other platforms may be checked occasionally or when portability bugs are
reported.

View File

@ -30,7 +30,7 @@ course, changing values should be done with care. When accessing per-atom
data, please note that these data are the per-processor **local** data and are data, please note that these data are the per-processor **local** data and are
indexed accordingly. Per-atom data can change sizes and ordering at indexed accordingly. Per-atom data can change sizes and ordering at
every neighbor list rebuild or atom sort event as atoms migrate between every neighbor list rebuild or atom sort event as atoms migrate between
sub-domains and processors. subdomains and processors.
.. code-block:: c .. code-block:: c

View File

@ -5,16 +5,17 @@ LAMMPS Documentation (|version| version)
LAMMPS stands for **L**\ arge-scale **A**\ tomic/**M**\ olecular LAMMPS stands for **L**\ arge-scale **A**\ tomic/**M**\ olecular
**M**\ assively **P**\ arallel **S**\ imulator. **M**\ assively **P**\ arallel **S**\ imulator.
LAMMPS is a classical molecular dynamics simulation code with a focus LAMMPS is a classical molecular dynamics simulation code focusing on
on materials modeling. It was designed to run efficiently on parallel materials modeling. It was designed to run efficiently on parallel
computers. It was developed originally at Sandia National computers and to be easy to extend and modify. Originally developed at
Laboratories, a US Department of Energy facility. The majority of Sandia National Laboratories, a US Department of Energy facility, LAMMPS
funding for LAMMPS has come from the US Department of Energy (DOE). now includes contributions from many research groups and individuals
LAMMPS is an open-source code, distributed freely under the terms of from many institutions. Most of the funding for LAMMPS has come from
the GNU Public License Version 2 (GPLv2). the US Department of Energy (DOE). LAMMPS is open-source software
distributed under the terms of the GNU Public License Version 2 (GPLv2).
The `LAMMPS website <lws_>`_ has a variety of information about the The `LAMMPS website <lws_>`_ has a variety of information about the
code. It includes links to an on-line version of this manual, an code. It includes links to an online version of this manual, an
`online forum <https://www.lammps.org/forum.html>`_ where users can post `online forum <https://www.lammps.org/forum.html>`_ where users can post
questions and discuss LAMMPS, and a `GitHub site questions and discuss LAMMPS, and a `GitHub site
<https://github.com/lammps/lammps>`_ where all LAMMPS development is <https://github.com/lammps/lammps>`_ where all LAMMPS development is
@ -26,14 +27,14 @@ The content for this manual is part of the LAMMPS distribution. The
online version always corresponds to the latest feature release version. online version always corresponds to the latest feature release version.
If needed, you can build a local copy of the manual as HTML pages or a If needed, you can build a local copy of the manual as HTML pages or a
PDF file by following the steps on the :doc:`Build_manual` page. If you PDF file by following the steps on the :doc:`Build_manual` page. If you
have difficulties viewing the pages please :ref:`see this note have difficulties viewing the pages, please :ref:`see this note
<webbrowser>`. <webbrowser>`.
----------- -----------
The manual is organized in three parts: The manual is organized into three parts:
1. the :ref:`User Guide <user_documentation>` with information about how 1. The :ref:`User Guide <user_documentation>` with information about how
to obtain, configure, compile, install, and use LAMMPS, to obtain, configure, compile, install, and use LAMMPS,
2. the :ref:`Programmer Guide <programmer_documentation>` with 2. the :ref:`Programmer Guide <programmer_documentation>` with
information about how to use the LAMMPS library interface from information about how to use the LAMMPS library interface from
@ -47,7 +48,7 @@ The manual is organized in three parts:
.. only:: html .. only:: html
Once you are familiar with LAMMPS, you may want to bookmark After becoming familiar with LAMMPS, consider bookmarking
:doc:`this page <Commands_all>`, since it gives quick access to :doc:`this page <Commands_all>`, since it gives quick access to
tables with links to the documentation for all LAMMPS commands. tables with links to the documentation for all LAMMPS commands.

View File

@ -2,43 +2,44 @@ What does a LAMMPS version mean
------------------------------- -------------------------------
The LAMMPS "version" is the date when it was released, such as 1 May The LAMMPS "version" is the date when it was released, such as 1 May
2014. LAMMPS is updated continuously and we aim to keep it working 2014. LAMMPS is updated continuously, and we aim to keep it working
correctly and reliably at all times. You can follow its development correctly and reliably at all times. You can follow its development
in a public `git repository on GitHub <https://github.com/lammps/lammps>`_. in a public `git repository on GitHub <https://github.com/lammps/lammps>`_.
Modifications of the LAMMPS source code - like bug fixes, code Modifications of the LAMMPS source code (like bug fixes, code refactors,
refactors, updates to existing features, or addition of new features - updates to existing features, or addition of new features) are organized
are organized into pull requests, and will be merged into the *develop* into pull requests. Pull requests will be merged into the *develop*
branch of the git repository when they pass automated testing and code branch of the git repository after they pass automated testing and code
review by the LAMMPS developers. When a sufficient number of changes review by the LAMMPS developers. When a sufficient number of changes
have accumulated *and* the software passes a set of automated tests, we have accumulated *and* the *develop* branch version passes an extended
release it as a *feature release* (or patch release), which are set of automated tests, we release it as a *feature release* (or patch
currently made every 4-8 weeks. The *release* branch of the git release), which are currently made every 4 to 8 weeks. The *release*
repository is updated with every such release. A summary of the most branch of the git repository is updated with every such release. A
important changes of the patch releases are on `this website page summary of the most important changes of the patch releases is on `this
<https://www.lammps.org/bug.html>`_. More detailed release notes are website page <https://www.lammps.org/bug.html>`_. More detailed release
`available on GitHub <https://github.com/lammps/lammps/releases/>`_. notes are `available on GitHub
<https://github.com/lammps/lammps/releases/>`_.
Once or twice a year, we have a "stabilization period" where we apply Once or twice a year, we have a "stabilization period" where we apply
only bug fixes and small, non-intrusive changes to the *develop* only bug fixes and small, non-intrusive changes to the *develop*
branch. At the same time the code is subjected to more detailed and branch. At the same time, the code is subjected to more detailed and
thorough manual testing than the default automated testing. Also thorough manual testing than the default automated testing. Also,
several variants of static code analysis are run to improve the overall several variants of static code analysis are run to improve the overall
code quality, consistency, and compliance with programming standards, code quality, consistency, and compliance with programming standards,
best practices and style conventions. best practices and style conventions.
The latest patch release after such a period is then also labeled as a The latest patch release after such a period is then also labeled as a
*stable* version and the *stable* branch is updated with it. Between *stable* version and the *stable* branch is updated with it. Between
stable releases we occasionally release updates to the stable release stable releases, we occasionally release updates to the stable release
containing only bug fixes and updates back-ported from the *develop* containing only bug fixes and updates back-ported from the *develop*
branch and update the *stable* branch accordingly. branch and update the *stable* branch accordingly.
Each version of LAMMPS contains all the documented features up to and Each version of LAMMPS contains all the documented features up to and
including its version date. For recently added features we add markers including its version date. For recently added features, we add markers
to the documentation at which specific LAMMPS version a feature or to the documentation at which specific LAMMPS version a feature or
keyword was added or significantly changed. keyword was added or significantly changed.
The version date is printed to the screen and logfile every time you run The version date is printed to the screen and log file every time you run
LAMMPS. It is also in the file src/version.h and in the LAMMPS LAMMPS. It is also in the file src/version.h and in the LAMMPS
directory name created when you unpack a tarball. And it is on the directory name created when you unpack a tarball. And it is on the
first page of the :doc:`manual <Manual>`. first page of the :doc:`manual <Manual>`.

View File

@ -23,7 +23,7 @@ against invalid accesses.
When accessing per-atom data, When accessing per-atom data,
please note that this data is the per-processor local data and indexed please note that this data is the per-processor local data and indexed
accordingly. These arrays can change sizes and order at every neighbor list accordingly. These arrays can change sizes and order at every neighbor list
rebuild and atom sort event as atoms are migrating between sub-domains. rebuild and atom sort event as atoms are migrating between subdomains.
.. tabs:: .. tabs::

View File

@ -23,7 +23,7 @@ against invalid accesses.
When accessing per-atom data, When accessing per-atom data,
please note that this data is the per-processor local data and indexed please note that this data is the per-processor local data and indexed
accordingly. These arrays can change sizes and order at every neighbor list accordingly. These arrays can change sizes and order at every neighbor list
rebuild and atom sort event as atoms are migrating between sub-domains. rebuild and atom sort event as atoms are migrating between subdomains.
.. tabs:: .. tabs::

View File

@ -9,7 +9,7 @@ There are two thrusts to the discussion that follows. The first is
using code options that implement alternate algorithms that can using code options that implement alternate algorithms that can
speed up a simulation. The second is to use one of the several speed up a simulation. The second is to use one of the several
accelerator packages provided with LAMMPS that contain code optimized accelerator packages provided with LAMMPS that contain code optimized
for certain kinds of hardware, including multi-core CPUs, GPUs, and for certain kinds of hardware, including multicore CPUs, GPUs, and
Intel Xeon Phi co-processors. Intel Xeon Phi co-processors.
The `Benchmark page <https://www.lammps.org/bench.html>`_ of the LAMMPS The `Benchmark page <https://www.lammps.org/bench.html>`_ of the LAMMPS

View File

@ -11,7 +11,7 @@ parts of the :doc:`kspace_style pppm <kspace_style>` for long-range
Coulombics. It has the following general features: Coulombics. It has the following general features:
* It is designed to exploit common GPU hardware configurations where one * It is designed to exploit common GPU hardware configurations where one
or more GPUs are coupled to many cores of one or more multi-core CPUs, or more GPUs are coupled to many cores of one or more multicore CPUs,
e.g. within a node of a parallel machine. e.g. within a node of a parallel machine.
* Atom-based data (e.g. coordinates, forces) are moved back-and-forth * Atom-based data (e.g. coordinates, forces) are moved back-and-forth
between the CPU(s) and GPU every timestep. between the CPU(s) and GPU every timestep.
@ -28,7 +28,7 @@ Coulombics. It has the following general features:
* LAMMPS-specific code is in the GPU package. It makes calls to a * LAMMPS-specific code is in the GPU package. It makes calls to a
generic GPU library in the lib/gpu directory. This library provides generic GPU library in the lib/gpu directory. This library provides
either Nvidia support, AMD support, or more general OpenCL support either Nvidia support, AMD support, or more general OpenCL support
(for Nvidia GPUs, AMD GPUs, Intel GPUs, and multi-core CPUs). (for Nvidia GPUs, AMD GPUs, Intel GPUs, and multicore CPUs),
so that the same functionality is supported on a variety of hardware. so that the same functionality is supported on a variety of hardware.
**Required hardware/software:** **Required hardware/software:**
@ -146,7 +146,7 @@ GPUs/node to use, as well as other options.
**Speed-ups to expect:** **Speed-ups to expect:**
The performance of a GPU versus a multi-core CPU is a function of your The performance of a GPU versus a multicore CPU is a function of your
hardware, which pair style is used, the number of atoms/GPU, and the hardware, which pair style is used, the number of atoms/GPU, and the
precision used on the GPU (double, single, mixed). Using the GPU package precision used on the GPU (double, single, mixed). Using the GPU package
in OpenCL mode on CPUs (which uses vectorization and multithreading) is in OpenCL mode on CPUs (which uses vectorization and multithreading) is
@ -174,7 +174,7 @@ deterministic results.
**Guidelines for best performance:** **Guidelines for best performance:**
* Using multiple MPI tasks per GPU will often give the best performance, * Using multiple MPI tasks per GPU will often give the best performance,
as allowed my most multi-core CPU/GPU configurations. as allowed by most multicore CPU/GPU configurations.
* If the number of particles per MPI task is small (e.g. 100s of * If the number of particles per MPI task is small (e.g. 100s of
particles), it can be more efficient to run with fewer MPI tasks per particles), it can be more efficient to run with fewer MPI tasks per
GPU, even if you do not use all the cores on the compute node. GPU, even if you do not use all the cores on the compute node.
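As a hedged sketch of how the GPU package is typically enabled from the Python interface (assuming a LAMMPS build that includes the GPU package, at least one supported GPU, and a hypothetical input script ``in.lj``); following the guideline above, several MPI tasks per GPU can be launched with, e.g., ``mpirun -np 4``:

.. code-block:: python

   from lammps import lammps

   # "-sf gpu" appends the /gpu suffix to supported styles,
   # "-pk gpu 1" is equivalent to the "package gpu 1" command (1 GPU per node)
   lmp = lammps(cmdargs=["-sf", "gpu", "-pk", "gpu", "1"])
   lmp.file("in.lj")   # hypothetical input script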

View File

@ -79,7 +79,7 @@ manner via the ``mpirun`` or ``mpiexec`` commands, and is independent of
Kokkos. E.g. the mpirun command in OpenMPI does this via its ``-np`` and Kokkos. E.g. the mpirun command in OpenMPI does this via its ``-np`` and
``-npernode`` switches. Ditto for MPICH via ``-np`` and ``-ppn``. ``-npernode`` switches. Ditto for MPICH via ``-np`` and ``-ppn``.
Running on a multi-core CPU Running on a multicore CPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^
Here is a quick overview of how to use the KOKKOS package Here is a quick overview of how to use the KOKKOS package
@ -254,7 +254,7 @@ is recommended in this scenario.
Using a GPU-aware MPI library is highly recommended. GPU-aware MPI use can be Using a GPU-aware MPI library is highly recommended. GPU-aware MPI use can be
avoided by using :doc:`-pk kokkos gpu/aware off <package>`. As above for avoided by using :doc:`-pk kokkos gpu/aware off <package>`. As above for
multi-core CPUs (and no GPU), if N is the number of physical cores/node, multicore CPUs (and no GPU), if N is the number of physical cores/node,
then the number of MPI tasks/node should not exceed N. then the number of MPI tasks/node should not exceed N.
.. parsed-literal:: .. parsed-literal::

View File

@ -12,7 +12,7 @@ Required hardware/software
"""""""""""""""""""""""""" """"""""""""""""""""""""""
To enable multi-threading, your compiler must support the OpenMP interface. To enable multi-threading, your compiler must support the OpenMP interface.
You should have one or more multi-core CPUs, as multiple threads can only be You should have one or more multicore CPUs, as multiple threads can only be
launched by each MPI task on the local node (using shared memory). launched by each MPI task on the local node (using shared memory).
Building LAMMPS with the OPENMP package Building LAMMPS with the OPENMP package
@ -157,7 +157,7 @@ Additional performance tips are as follows:
affinity setting that restricts each MPI task to a single CPU core. affinity setting that restricts each MPI task to a single CPU core.
Using multi-threading in this mode will force all threads to share the Using multi-threading in this mode will force all threads to share the
one core and thus is likely to be counterproductive. Instead, binding one core and thus is likely to be counterproductive. Instead, binding
MPI tasks to a (multi-core) socket, should solve this issue. MPI tasks to a (multicore) socket should solve this issue.
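A minimal sketch of enabling OPENMP-package multi-threading from the Python interface (assuming a build with the OPENMP package and a hypothetical ``in.lj`` input script); the thread count per MPI task should be chosen so that tasks times threads does not exceed the physical cores of a node:

.. code-block:: python

   import os
   from lammps import lammps

   os.environ.setdefault("OMP_NUM_THREADS", "4")            # threads per MPI task
   lmp = lammps(cmdargs=["-sf", "omp", "-pk", "omp", "4"])  # /omp suffix, 4 threads
   lmp.file("in.lj")   # hypothetical input script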
Restrictions Restrictions
"""""""""""" """"""""""""

View File

@ -113,7 +113,7 @@ your input script. LAMMPS does not use the group until a simulation
is run. is run.
The *sort* keyword turns on a spatial sorting or reordering of atoms The *sort* keyword turns on a spatial sorting or reordering of atoms
within each processor's sub-domain every *Nfreq* timesteps. If within each processor's subdomain every *Nfreq* timesteps. If
*Nfreq* is set to 0, then sorting is turned off. Sorting can improve *Nfreq* is set to 0, then sorting is turned off. Sorting can improve
cache performance and thus speed up a LAMMPS simulation, as discussed cache performance and thus speed up a LAMMPS simulation, as discussed
in a paper by :ref:`(Meloni) <Meloni>`. Its efficacy depends on the problem in a paper by :ref:`(Meloni) <Meloni>`. Its efficacy depends on the problem

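A minimal sketch of turning on the spatial sorting discussed above (the values are assumed for illustration; the command is normally placed near the top of an input script):

.. code-block:: python

   from lammps import lammps

   lmp = lammps()
   lmp.commands_string("""
   units lj
   atom_modify sort 500 2.0      # re-sort atoms every 500 steps into 2.0 sigma bins
   lattice fcc 0.8442
   region box block 0 10 0 10 0 10
   create_box 1 box
   create_atoms 1 box
   """)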
View File

@ -54,7 +54,7 @@ Syntax
*store* name = store weight in custom atom property defined by :doc:`fix property/atom <fix_property_atom>` command *store* name = store weight in custom atom property defined by :doc:`fix property/atom <fix_property_atom>` command
name = atom property name (without d\_ prefix) name = atom property name (without d\_ prefix)
*out* arg = filename *out* arg = filename
filename = write each processor's sub-domain to a file filename = write each processor's subdomain to a file
Examples Examples
"""""""" """"""""
@ -72,14 +72,14 @@ Examples
Description Description
""""""""""" """""""""""
This command adjusts the size and shape of processor sub-domains This command adjusts the size and shape of processor subdomains
within the simulation box, to attempt to balance the number of atoms within the simulation box, to attempt to balance the number of atoms
or particles and thus indirectly the computational cost (load) more or particles and thus indirectly the computational cost (load) more
evenly across processors. The load balancing is "static" in the sense evenly across processors. The load balancing is "static" in the sense
that this command performs the balancing once, before or between that this command performs the balancing once, before or between
simulations. The processor sub-domains will then remain static during simulations. The processor subdomains will then remain static during
the subsequent run. To perform "dynamic" balancing, see the :doc:`fix the subsequent run. To perform "dynamic" balancing, see the :doc:`fix
balance <fix_balance>` command, which can adjust processor sub-domain balance <fix_balance>` command, which can adjust processor subdomain
sizes and shapes on-the-fly during a :doc:`run <run>`. sizes and shapes on-the-fly during a :doc:`run <run>`.
Load-balancing is typically most useful if the particles in the Load-balancing is typically most useful if the particles in the
@ -90,7 +90,7 @@ an irregular-shaped geometry containing void regions, or :doc:`hybrid
pair style simulations <pair_hybrid>` which combine pair styles with pair style simulations <pair_hybrid>` which combine pair styles with
different computational cost. In these cases, the LAMMPS default of different computational cost. In these cases, the LAMMPS default of
dividing the simulation box volume into a regular-spaced grid of 3d dividing the simulation box volume into a regular-spaced grid of 3d
bricks, with one equal-volume sub-domain per processor, may assign bricks, with one equal-volume subdomain per processor, may assign
numbers of particles per processor in a way that the computational numbers of particles per processor in a way that the computational
effort varies significantly. This can lead to poor performance when effort varies significantly. This can lead to poor performance when
the simulation is run in parallel. the simulation is run in parallel.
@ -109,7 +109,7 @@ Specifically, for a Px by Py by Pz grid of processors, it allows
choice of Px, Py, and Pz, subject to the constraint that Px \* Py \* choice of Px, Py, and Pz, subject to the constraint that Px \* Py \*
Pz = P, the total number of processors. This is sufficient to achieve Pz = P, the total number of processors. This is sufficient to achieve
good load-balance for some problems on some processor counts. good load-balance for some problems on some processor counts.
However, all the processor sub-domains will still have the same shape However, all the processor subdomains will still have the same shape
and same volume. and same volume.
The requested load-balancing operation is only performed if the The requested load-balancing operation is only performed if the
@ -162,7 +162,7 @@ fractions of the box length) are also printed.
simulation could run up to 20% faster if it were perfectly balanced, simulation could run up to 20% faster if it were perfectly balanced,
versus when imbalanced. However, computational cost is not strictly versus when imbalanced. However, computational cost is not strictly
proportional to particle count, and changing the relative size and proportional to particle count, and changing the relative size and
shape of processor sub-domains may lead to additional computational shape of processor subdomains may lead to additional computational
and communication overheads, e.g. in the PPPM solver used via the and communication overheads, e.g. in the PPPM solver used via the
:doc:`kspace_style <kspace_style>` command. Thus you should benchmark :doc:`kspace_style <kspace_style>` command. Thus you should benchmark
the run times of a simulation before and after balancing. the run times of a simulation before and after balancing.
@ -177,7 +177,7 @@ The *x*, *y*, *z*, and *shift* styles are "grid" methods which
produce a logical 3d grid of processors. They operate by changing the produce a logical 3d grid of processors. They operate by changing the
cutting planes (or lines) between processors in 3d (or 2d), to adjust cutting planes (or lines) between processors in 3d (or 2d), to adjust
the volume (area in 2d) assigned to each processor, as in the the volume (area in 2d) assigned to each processor, as in the
following 2d diagram where processor sub-domains are shown and following 2d diagram where processor subdomains are shown and
particles are colored by the processor that owns them. particles are colored by the processor that owns them.
.. |balance1| image:: img/balance_uniform.jpg .. |balance1| image:: img/balance_uniform.jpg
@ -226,7 +226,7 @@ The *x*, *y*, and *z* styles invoke a "grid" method for balancing, as
described above. Note that any or all of these 3 styles can be described above. Note that any or all of these 3 styles can be
specified together, one after the other, but they cannot be used with specified together, one after the other, but they cannot be used with
any other style. This style adjusts the position of cutting planes any other style. This style adjusts the position of cutting planes
between processor sub-domains in specific dimensions. Only the between processor subdomains in specific dimensions. Only the
specified dimensions are altered. specified dimensions are altered.
The *uniform* argument spaces the planes evenly, as in the left The *uniform* argument spaces the planes evenly, as in the left
@ -245,8 +245,8 @@ the cutting place. The left (or lower) edge of the box is 0.0, and
the right (or upper) edge is 1.0. Neither of these values is the right (or upper) edge is 1.0. Neither of these values is
specified. Only the interior Ps-1 positions are specified. Thus if specified. Only the interior Ps-1 positions are specified. Thus if
there are 2 processors in the x dimension, you specify a single value there are 2 processors in the x dimension, you specify a single value
such as 0.75, which would make the left processor's sub-domain 3x such as 0.75, which would make the left processor's subdomain 3x
larger than the right processor's sub-domain. larger than the right processor's subdomain.
---------- ----------
@ -288,10 +288,10 @@ adjacent planes are closer together than the neighbor skin distance
(as specified by the :doc:`neigh_modify <neigh_modify>` command), then (as specified by the :doc:`neigh_modify <neigh_modify>` command), then
the plane positions are shifted to separate them by at least this the plane positions are shifted to separate them by at least this
amount. This is to prevent particles being lost when dynamics are run amount. This is to prevent particles being lost when dynamics are run
with processor sub-domains that are too narrow in one or more with processor subdomains that are too narrow in one or more
dimensions. dimensions.
Once the re-balancing is complete and final processor sub-domains Once the re-balancing is complete and final processor subdomains
assigned, particles are migrated to their new owning processor, and assigned, particles are migrated to their new owning processor, and
the balance procedure ends. the balance procedure ends.
@ -299,7 +299,7 @@ the balance procedure ends.
At each re-balance operation, the bisectioning for each cutting At each re-balance operation, the bisectioning for each cutting
plane (line in 2d) typically starts with low and high bounds separated plane (line in 2d) typically starts with low and high bounds separated
by the extent of a processor's sub-domain in one dimension. The size by the extent of a processor's subdomain in one dimension. The size
of this bracketing region shrinks by 1/2 every iteration. Thus if of this bracketing region shrinks by 1/2 every iteration. Thus if
*Niter* is specified as 10, the cutting plane will typically be *Niter* is specified as 10, the cutting plane will typically be
positioned to 1 part in 1000 accuracy (relative to the perfect target positioned to 1 part in 1000 accuracy (relative to the perfect target
@ -494,7 +494,7 @@ different kinds of custom atom vectors or arrays as arguments.
The *out* keyword writes a text file to the specified *filename* with The *out* keyword writes a text file to the specified *filename* with
the results of the balancing operation. The file contains the bounds the results of the balancing operation. The file contains the bounds
of the sub-domain for each processor after the balancing operation of the subdomain for each processor after the balancing operation
completes. The format of the file is compatible with the completes. The format of the file is compatible with the
`Pizza.py <pizza_>`_ *mdump* tool which has support for manipulating and `Pizza.py <pizza_>`_ *mdump* tool which has support for manipulating and
visualizing mesh files. An example is shown here for a balancing by 4 visualizing mesh files. An example is shown here for a balancing by 4
@ -538,7 +538,7 @@ processors for a 2d problem:
4 1 13 14 15 16 4 1 13 14 15 16
The coordinates of all the vertices are listed in the NODES section, 5 The coordinates of all the vertices are listed in the NODES section, 5
per processor. Note that the 4 sub-domains share vertices, so there per processor. Note that the 4 subdomains share vertices, so there
will be duplicate nodes in the list. will be duplicate nodes in the list.
The "SQUARES" section lists the node IDs of the 4 vertices in a The "SQUARES" section lists the node IDs of the 4 vertices in a

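To make the static balancing concrete, here is a minimal, hypothetical sketch: half-filling the box creates an imbalance, and the *balance* command then shifts the cutting planes in *y* and writes the resulting subdomain bounds to a file (run on several MPI tasks to see an effect; all values are assumed):

.. code-block:: python

   from lammps import lammps

   lmp = lammps()
   lmp.commands_string("""
   units lj
   lattice fcc 0.8442
   region box block 0 20 0 20 0 20
   create_box 1 box
   region lower block 0 20 0 10 0 20     # fill only the lower half -> imbalance
   create_atoms 1 region lower
   mass 1 1.0
   pair_style lj/cut 2.5
   pair_coeff 1 1 1.0 1.0
   # one-time static rebalancing of the y cutting planes; write subdomains to a file
   balance 1.1 shift y 20 1.05 out tmp.balance
   """)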
View File

@ -61,7 +61,7 @@ move. Note that when the difference between the current box dimensions
and the shrink-wrap box dimensions is large, this can lead to lost and the shrink-wrap box dimensions is large, this can lead to lost
atoms at the beginning of a run when running in parallel. This is due atoms at the beginning of a run when running in parallel. This is due
to the large change in the (global) box dimensions also causing to the large change in the (global) box dimensions also causing
significant changes in the individual sub-domain sizes. If these significant changes in the individual subdomain sizes. If these
changes are farther than the communication cutoff, atoms will be lost. changes are farther than the communication cutoff, atoms will be lost.
This is best addressed by setting initial box dimensions to match the This is best addressed by setting initial box dimensions to match the
shrink-wrapped dimensions more closely, by using *m* style boundaries shrink-wrapped dimensions more closely, by using *m* style boundaries

View File

@ -62,7 +62,7 @@ distances are used to determine which atoms to communicate.
The default mode is *single* which means each processor acquires The default mode is *single* which means each processor acquires
information for ghost atoms that are within a single distance from its information for ghost atoms that are within a single distance from its
sub-domain. The distance is by default the maximum of the neighbor subdomain. The distance is by default the maximum of the neighbor
cutoff across all atom type pairs. cutoff across all atom type pairs.
For many systems this is an efficient algorithm, but for systems with For many systems this is an efficient algorithm, but for systems with
@ -81,7 +81,7 @@ with both the *multi* and *multi/old* neighbor styles.
The *cutoff* keyword allows you to extend the ghost cutoff distance The *cutoff* keyword allows you to extend the ghost cutoff distance
for communication mode *single*, which is the distance from the borders for communication mode *single*, which is the distance from the borders
of a processor's sub-domain at which ghost atoms are acquired from other of a processor's subdomain at which ghost atoms are acquired from other
processors. By default the ghost cutoff = neighbor cutoff = pairwise processors. By default the ghost cutoff = neighbor cutoff = pairwise
force cutoff + neighbor skin. See the :doc:`neighbor <neighbor>` command force cutoff + neighbor skin. See the :doc:`neighbor <neighbor>` command
for more information about the skin distance. If the specified Rcut is for more information about the skin distance. If the specified Rcut is

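A minimal, assumed sketch of extending the ghost-atom cutoff described above; the value 6.0 is arbitrary and only has an effect when it exceeds the default neighbor cutoff:

.. code-block:: python

   from lammps import lammps

   lmp = lammps()
   lmp.commands_string("""
   units lj
   region box block 0 10 0 10 0 10
   create_box 1 box
   # acquire ghost atoms out to 6.0 length units from each subdomain boundary,
   # instead of the default pair cutoff + neighbor skin
   comm_modify mode single cutoff 6.0
   """)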
View File

@ -54,7 +54,7 @@ per atom, e.g. a list of bond distances. Per-grid quantities are
calculated on a regular 2d or 3d grid which overlays a 2d or 3d calculated on a regular 2d or 3d grid which overlays a 2d or 3d
simulation domain. The grid points and the data they store are simulation domain. The grid points and the data they store are
distributed across processors; each processor owns the grid points distributed across processors; each processor owns the grid points
which fall within its sub-domain. which fall within its subdomain.
Computes that produce per-atom quantities have the word "atom" at the Computes that produce per-atom quantities have the word "atom" at the
end of their style, e.g. *ke/atom*\ . Computes that produce local end of their style, e.g. *ke/atom*\ . Computes that produce local

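As a small illustration of the per-atom case mentioned above, the following assumed sketch defines a per-atom compute and feeds it to a dump, so each processor writes values only for the atoms it owns:

.. code-block:: python

   from lammps import lammps

   lmp = lammps()
   lmp.commands_string("""
   units lj
   lattice fcc 0.8442
   region box block 0 10 0 10 0 10
   create_box 1 box
   create_atoms 1 box
   mass 1 1.0
   pair_style lj/cut 2.5
   pair_coeff 1 1 1.0 1.0
   velocity all create 1.0 87287
   compute ke all ke/atom                   # per-atom quantity, one value per owned atom
   dump d1 all custom 100 dump.ke id c_ke   # per-atom compute referenced by a dump
   run 0
   """)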
View File

@ -48,9 +48,9 @@ the virial, equal to :math:`-dU/dV`, computed for all pairwise as well
as 2-body, 3-body, 4-body, many-body, and long-range interactions, where as 2-body, 3-body, 4-body, many-body, and long-range interactions, where
:math:`\vec r_i` and :math:`\vec f_i` are the position and force vector :math:`\vec r_i` and :math:`\vec f_i` are the position and force vector
of atom *i*, and the dot indicates the dot product (scalar product). of atom *i*, and the dot indicates the dot product (scalar product).
This is computed in parallel for each sub-domain and then summed over This is computed in parallel for each subdomain and then summed over
all parallel processes. Thus :math:`N'` necessarily includes atoms from all parallel processes. Thus :math:`N'` necessarily includes atoms from
neighboring sub-domains (so-called ghost atoms) and the position and neighboring subdomains (so-called ghost atoms) and the position and
force vectors of ghost atoms are thus included in the summation. Only force vectors of ghost atoms are thus included in the summation. Only
when running in serial and without periodic boundary conditions is when running in serial and without periodic boundary conditions is
:math:`N' = N` the number of atoms in the system. :doc:`Fixes <fix>` :math:`N' = N` the number of atoms in the system. :doc:`Fixes <fix>`

View File

@ -39,7 +39,7 @@ Description
Define a computation that stores the specified attributes of a Define a computation that stores the specified attributes of a
distributed grid. In LAMMPS, distributed grids are regular 2d or 3d distributed grid. In LAMMPS, distributed grids are regular 2d or 3d
grids which overlay a 2d or 3d simulation domain. Each processor owns grids which overlay a 2d or 3d simulation domain. Each processor owns
the grid cells whose center points lie within its sub-domain. See the the grid cells whose center points lie within its subdomain. See the
:doc:`Howto grid <Howto_grid>` doc page for details of how distributed :doc:`Howto grid <Howto_grid>` doc page for details of how distributed
grids can be defined by various commands and referenced. grids can be defined by various commands and referenced.

View File

@ -259,7 +259,7 @@ layout in the global array.
Compute *sna/grid/local* calculates bispectrum components of a regular Compute *sna/grid/local* calculates bispectrum components of a regular
grid of points similarly to compute *sna/grid* described above. grid of points similarly to compute *sna/grid* described above.
However, because the array is local, it contains only rows for grid points However, because the array is local, it contains only rows for grid points
that are local to the processor sub-domain. The global grid that are local to the processor subdomain. The global grid
of :math:`nx \times ny \times nz` points is still laid out in space the same as for *sna/grid*, of :math:`nx \times ny \times nz` points is still laid out in space the same as for *sna/grid*,
but grid points are strictly partitioned, so that every grid point appears in but grid points are strictly partitioned, so that every grid point appears in
one and only one local array. The array contains one row for each of the one and only one local array. The array contains one row for each of the

View File

@ -80,9 +80,9 @@ Syntax
axes = *yes* or *no* = do or do not draw xyz axes lines next to simulation box axes = *yes* or *no* = do or do not draw xyz axes lines next to simulation box
length = length of axes lines as fraction of respective box lengths length = length of axes lines as fraction of respective box lengths
diam = diameter of axes lines as fraction of shortest box length diam = diameter of axes lines as fraction of shortest box length
*subbox* values = lines diam = draw outline of processor sub-domains *subbox* values = lines diam = draw outline of processor subdomains
lines = *yes* or *no* = do or do not draw sub-domain lines lines = *yes* or *no* = do or do not draw subdomain lines
diam = diameter of sub-domain lines as fraction of shortest box length diam = diameter of subdomain lines as fraction of shortest box length
*shiny* value = sfactor = shinyness of spheres and cylinders *shiny* value = sfactor = shinyness of spheres and cylinders
sfactor = shinyness of spheres and cylinders from 0.0 to 1.0 sfactor = shinyness of spheres and cylinders from 0.0 to 1.0
*ssao* value = shading seed dfactor = SSAO depth shading *ssao* value = shading seed dfactor = SSAO depth shading
@ -145,7 +145,7 @@ Syntax
*bitrate* arg = rate *bitrate* arg = rate
rate = target bitrate for movie in kbps rate = target bitrate for movie in kbps
*boxcolor* arg = color *boxcolor* arg = color
color = name of color for simulation box lines and processor sub-domain lines color = name of color for simulation box lines and processor subdomain lines
*color* args = name R G B *color* args = name R G B
name = name of color name = name of color
R,G,B = red/green/blue numeric values from 0.0 to 1.0 R,G,B = red/green/blue numeric values from 0.0 to 1.0
@ -581,13 +581,13 @@ respective box lengths. The *diam* setting determines their thickness
as a fraction of the shortest box length in x,y,z (for 3d) or x,y (for as a fraction of the shortest box length in x,y,z (for 3d) or x,y (for
2d). 2d).
The *subbox* keyword determines if and how processor sub-domain The *subbox* keyword determines if and how processor subdomain
boundaries are rendered as thin cylinders in the image. If *no* is boundaries are rendered as thin cylinders in the image. If *no* is
set (default), then the sub-domain boundaries are not drawn and the set (default), then the subdomain boundaries are not drawn and the
*diam* setting is ignored. If *yes* is set, the 12 edges of each *diam* setting is ignored. If *yes* is set, the 12 edges of each
processor sub-domain are drawn, with a diameter that is a fraction of processor subdomain are drawn, with a diameter that is a fraction of
the shortest box length in x,y,z (for 3d) or x,y (for 2d). The color the shortest box length in x,y,z (for 3d) or x,y (for 2d). The color
of the sub-domain boundaries can be set with the "dump_modify of the subdomain boundaries can be set with the "dump_modify
boxcolor" command. boxcolor" command.
---------- ----------
@ -921,8 +921,8 @@ formats.
The *boxcolor* keyword sets the color of the simulation box drawn The *boxcolor* keyword sets the color of the simulation box drawn
around the atoms in each image as well as the color of processor around the atoms in each image as well as the color of processor
sub-domain boundaries. See the "dump image box" command for how to subdomain boundaries. See the "dump image box" command for how to
specify that a box be drawn via the *box* keyword, and the sub-domain specify that a box be drawn via the *box* keyword, and the subdomain
boundaries via the *subbox* keyword. The color name can be any of the boundaries via the *subbox* keyword. The color name can be any of the
140 pre-defined colors (see below) or a color name defined by the 140 pre-defined colors (see below) or a color name defined by the
dump_modify color option. dump_modify color option.
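A minimal, assumed sketch combining the *subbox* and *boxcolor* settings discussed above (it presumes a LAMMPS build with JPEG support; colors, sizes, and intervals are arbitrary):

.. code-block:: python

   from lammps import lammps

   lmp = lammps()
   lmp.commands_string("""
   units lj
   lattice fcc 0.8442
   region box block 0 10 0 10 0 10
   create_box 1 box
   create_atoms 1 box
   mass 1 1.0
   pair_style lj/cut 2.5
   pair_coeff 1 1 1.0 1.0
   # write a JPG snapshot at step 0, drawing processor subdomain outlines
   dump img all image 100 img.*.jpg type type size 1024 1024 subbox yes 0.02
   dump_modify img boxcolor white backcolor black
   run 0
   """)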

View File

@ -89,7 +89,7 @@ owns, but there may be zero or more per atoms. Per-grid quantities
are calculated on a regular 2d or 3d grid which overlays a 2d or 3d are calculated on a regular 2d or 3d grid which overlays a 2d or 3d
simulation domain. The grid points and the data they store are simulation domain. The grid points and the data they store are
distributed across processors; each processor owns the grid points distributed across processors; each processor owns the grid points
which fall within its sub-domain. which fall within its subdomain.
Note that a single fix typically produces either global or per-atom or Note that a single fix typically produces either global or per-atom or
local or per-grid values (or none at all). It does not produce both local or per-grid values (or none at all). It does not produce both

View File

@ -84,7 +84,7 @@ produced by other computes or fixes. This fix operates in either
per-grid inputs in the same command. per-grid inputs in the same command.
The grid created by this command is distributed; each processor owns The grid created by this command is distributed; each processor owns
the grid points that are within its sub-domain. This is similar to the grid points that are within its subdomain. This is similar to
the :doc:`fix ave/chunk <fix_ave_chunk>` command when it uses chunks the :doc:`fix ave/chunk <fix_ave_chunk>` command when it uses chunks
from the :doc:`compute chunk/atom <compute_chunk_atom>` command which from the :doc:`compute chunk/atom <compute_chunk_atom>` command which
are 2d or 3d regular bins. However, the per-bin outputs in that case are 2d or 3d regular bins. However, the per-bin outputs in that case

View File

@ -44,7 +44,7 @@ Syntax
*store* name = store weight in custom atom property defined by :doc:`fix property/atom <fix_property_atom>` command *store* name = store weight in custom atom property defined by :doc:`fix property/atom <fix_property_atom>` command
name = atom property name (without d\_ prefix) name = atom property name (without d\_ prefix)
*out* arg = filename *out* arg = filename
filename = write each processor's sub-domain to a file, at each re-balancing filename = write each processor's subdomain to a file, at each re-balancing
Examples Examples
"""""""" """"""""
@ -61,7 +61,7 @@ Examples
Description Description
""""""""""" """""""""""
This command adjusts the size and shape of processor sub-domains This command adjusts the size and shape of processor subdomains
within the simulation box, to attempt to balance the number of within the simulation box, to attempt to balance the number of
particles and thus the computational cost (load) evenly across particles and thus the computational cost (load) evenly across
processors. The load balancing is "dynamic" in the sense that processors. The load balancing is "dynamic" in the sense that
@ -77,7 +77,7 @@ an irregular-shaped geometry containing void regions, or
:doc:`hybrid pair style simulations <pair_hybrid>` that combine :doc:`hybrid pair style simulations <pair_hybrid>` that combine
pair styles with different computational cost). In these cases, the pair styles with different computational cost). In these cases, the
LAMMPS default of dividing the simulation box volume into a LAMMPS default of dividing the simulation box volume into a
regular-spaced grid of 3d bricks, with one equal-volume sub-domain regular-spaced grid of 3d bricks, with one equal-volume subdomain
per processor, may assign numbers of particles per processor in a per processor, may assign numbers of particles per processor in a
way that the computational effort varies significantly. This can way that the computational effort varies significantly. This can
lead to poor performance when the simulation is run in parallel. lead to poor performance when the simulation is run in parallel.
@ -105,7 +105,7 @@ a :math:`P_x \times P_y \times P_z` grid of processors, it allows choices of
:math:`P_x P_y P_z = P`, the total number of processors. :math:`P_x P_y P_z = P`, the total number of processors.
This is sufficient to achieve good load-balance for This is sufficient to achieve good load-balance for
some problems on some processor counts. However, all the processor some problems on some processor counts. However, all the processor
sub-domains will still have the same shape and the same volume. subdomains will still have the same shape and the same volume.
On a particular time step, a load-balancing operation is only performed On a particular time step, a load-balancing operation is only performed
if the current "imbalance factor" in particles owned by each processor if the current "imbalance factor" in particles owned by each processor
@ -141,7 +141,7 @@ forced even if the current balance is perfect (1.0) be specifying a
simulation could run up to 20% faster if it were perfectly balanced, simulation could run up to 20% faster if it were perfectly balanced,
versus when imbalanced. However, computational cost is not strictly versus when imbalanced. However, computational cost is not strictly
proportional to particle count, and changing the relative size and proportional to particle count, and changing the relative size and
shape of processor sub-domains may lead to additional computational shape of processor subdomains may lead to additional computational
and communication overheads (e.g., in the PPPM solver used via the and communication overheads (e.g., in the PPPM solver used via the
:doc:`kspace_style <kspace_style>` command). Thus, you should benchmark :doc:`kspace_style <kspace_style>` command). Thus, you should benchmark
the run times of a simulation before and after balancing. the run times of a simulation before and after balancing.
@ -156,7 +156,7 @@ The *shift* style is a "grid" method which produces a logical 3d grid
of processors. It operates by changing the cutting planes (or lines) of processors. It operates by changing the cutting planes (or lines)
between processors in 3d (or 2d), to adjust the volume (area in 2d) between processors in 3d (or 2d), to adjust the volume (area in 2d)
assigned to each processor, as in the following 2d diagram where assigned to each processor, as in the following 2d diagram where
processor sub-domains are shown and atoms are colored by the processor processor subdomains are shown and atoms are colored by the processor
that owns them. that owns them.
.. |balance1| image:: img/balance_uniform.jpg .. |balance1| image:: img/balance_uniform.jpg
@ -258,7 +258,7 @@ from balanced, and converge more slowly. In this case you probably
want to use the :doc:`balance <balance>` command before starting a run, want to use the :doc:`balance <balance>` command before starting a run,
so that you begin the run with a balanced system. so that you begin the run with a balanced system.
Once the re-balancing is complete and final processor sub-domains Once the re-balancing is complete and final processor subdomains
assigned, particles migrate to their new owning processor as part of assigned, particles migrate to their new owning processor as part of
the normal reneighboring procedure. the normal reneighboring procedure.
@ -266,7 +266,7 @@ the normal reneighboring procedure.
At each re-balance operation, the bisectioning for each cutting At each re-balance operation, the bisectioning for each cutting
plane (line in 2d) typically starts with low and high bounds separated plane (line in 2d) typically starts with low and high bounds separated
by the extent of a processor's sub-domain in one dimension. The size by the extent of a processor's subdomain in one dimension. The size
of this bracketing region shrinks based on the local density, as of this bracketing region shrinks based on the local density, as
described above, which should typically be 1/2 or more every described above, which should typically be 1/2 or more every
iteration. Thus if :math:`N_\text{iter}` is specified as 10, the cutting iteration. Thus if :math:`N_\text{iter}` is specified as 10, the cutting
@ -310,7 +310,7 @@ in that sub-box.
The *out* keyword writes text to the specified *filename* with the
results of each re-balancing operation. The file contains the bounds
of the subdomain for each processor after the balancing operation
completes. The format of the file is compatible with the
`Pizza.py <pizza_>`_ *mdump* tool which has support for manipulating and
visualizing mesh files. An example is shown here for a balancing by four
@ -354,7 +354,7 @@ processors for a 2d problem:
4 1 13 14 15 16
The coordinates of all the vertices are listed in the NODES section, five
per processor. Note that the four subdomains share vertices, so there
will be duplicate nodes in the list.
The "SQUARES" section lists the node IDs of the four vertices in a

View File

@ -118,7 +118,7 @@ displaced by the same amount, different on each iteration.
all. Also note that if the box shape tilts to an extreme shape,
LAMMPS will run less efficiently, due to the large volume of
communication needed to acquire ghost atoms around a processor's
irregular-shaped subdomain. For extreme values of tilt, LAMMPS may
also lose atoms and generate an error.
.. note::

View File

@ -546,7 +546,7 @@ flipping the box when it is exceeded. If the *flip* value is set to
you apply large deformations, this means the box shape can tilt
dramatically and LAMMPS will run less efficiently, due to the large volume
of communication needed to acquire ghost atoms around a processor's
irregular-shaped subdomain. For extreme values of tilt, LAMMPS may
also lose atoms and generate an error.
The *units* keyword determines the meaning of the distance units used

View File

@ -198,7 +198,7 @@ dt}{\rho dx^2}` is approximately equal to 1.
and a simulation domain size. This fix uses the same subdivision of
the simulation domain among processors as the main LAMMPS program. In
order to uniformly cover the simulation domain with lattice sites, the
lengths of the individual LAMMPS subdomains must all be evenly
divisible by :math:`dx_{LB}`. If the simulation domain size is cubic,
with equal lengths in all dimensions, and the default value for
:math:`dx_{LB}` is used, this will automatically be satisfied.

View File

@ -371,7 +371,7 @@ flipping the box when it is exceeded. If the *flip* value is set to
applied stress induces large deformations (e.g. in a liquid), this
means the box shape can tilt dramatically and LAMMPS will run less
efficiently, due to the large volume of communication needed to
acquire ghost atoms around a processor's irregular-shaped subdomain.
For extreme values of tilt, LAMMPS may also lose atoms and generate an
error.

View File

@ -311,7 +311,7 @@ flipping the box when it is exceeded. If the *flip* value is set to
applied stress induces large deformations (e.g. in a liquid), this
means the box shape can tilt dramatically and LAMMPS will run less
efficiently, due to the large volume of communication needed to
acquire ghost atoms around a processor's irregular-shaped subdomain.
For extreme values of tilt, LAMMPS may also lose atoms and generate an
error.

View File

@ -69,7 +69,7 @@ geometries.
This fix must be used with an additional fix that specifies time
integration, e.g. :doc:`fix nve <fix_nve>` or :doc:`fix nph <fix_nh>`.
The Shardlow splitting algorithm requires the subdomain
lengths to be larger than twice the cutoff+skin. Generally, the
domain decomposition is dependent on the number of processors
requested.

View File

@ -90,7 +90,7 @@ The description in this sub-section applies to all 3 fix styles:
*ttm*, *ttm/grid*, and *ttm/mod*.
Fix *ttm/grid* distributes the regular grid across processors consistent
with the subdomains of atoms owned by each processor, but is otherwise
identical to fix ttm. Note that fix *ttm* stores a copy of the grid on
each processor, which is acceptable when the overall grid is reasonably
small. For larger grids you should use fix *ttm/grid* instead.
@ -170,11 +170,11 @@ ttm/mod.
periodic boundary conditions in all dimensions. They also require
that the size and shape of the simulation box do not vary
dynamically, e.g. due to use of the :doc:`fix npt <fix_nh>` command.
Likewise, the size/shape of processor subdomains cannot vary due to
dynamic load-balancing via use of the :doc:`fix balance
<fix_balance>` command. It is, however, possible to load balance
before the simulation starts using the :doc:`balance <balance>`
command, so that each processor has a different size subdomain.
Periodic boundary conditions are also used in the heat equation solve
for the electronic subsystem. This varies from the approach of

View File

@ -399,7 +399,7 @@ automatically throughout the run. This typically give performance
within 5 to 10 percent of the optimal fixed fraction.
The *ghost* keyword determines whether or not ghost atoms, i.e. atoms
at the boundaries of processor subdomains, are offloaded for neighbor
and force calculations. When the value = "no", ghost atoms are not
offloaded. This option can reduce the amount of data transfer with
the co-processor and can also overlap MPI communication of forces with
@ -521,7 +521,7 @@ the comm keywords.
The value options for the keywords are *no* or *host* or *device*\ . A
value of *no* means to use the standard non-KOKKOS method of
packing/unpacking data for the communication. A value of *host* means to
use the host, typically a multicore CPU, and perform the
packing/unpacking in parallel with threads. A value of *device* means to
use the device, typically a GPU, to perform the packing/unpacking
operation.
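For example (a hedged sketch; which choice is fastest depends on the hardware
and problem size), the packing/unpacking mode could be selected in the input
script as

.. parsed-literal::

   package kokkos comm device   # illustrative choice of mode

or equivalently from the command line via the *-pk* switch, e.g.
``-pk kokkos comm device``.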

View File

@ -56,7 +56,7 @@ commands:
The global DSMC *max_cell_size* determines the maximum cell length
used in the DSMC calculation. A structured mesh is overlaid on the
simulation box such that an integer number of cells are created in
each direction for each processor's subdomain. Cell lengths are
adjusted up to the user-specified maximum cell size.
----------

View File

@ -31,7 +31,7 @@ and the neighbor skin distance (see the documentation of the
<comm_modify>` command). When you have bonds, angles, dihedrals, or
impropers defined at the same time, you must set the communication
cutoff so that it is large enough to acquire
and communicate sufficient ghost atoms from neighboring subdomains as
needed for computing bonds, angles, etc.
A pair style of *none* will also not request a pairwise neighbor list.
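A hedged sketch of such a setup (the cutoff value is an assumption and should
be chosen to cover the longest bonded interaction plus the neighbor skin):

.. parsed-literal::

   pair_style none
   comm_modify cutoff 10.0   # illustrative ghost communication cutoff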

View File

@ -66,7 +66,7 @@ parameters can be specified with an asterisk "\*", which means LAMMPS
will choose the number of processors in that dimension of the grid.
It will do this based on the size and shape of the global simulation
box so as to minimize the surface-to-volume ratio of each processor's
subdomain.
Choosing explicit values for Px or Py or Pz can be used to override
the default manner in which LAMMPS will create the regular 3d grid of
@ -81,7 +81,7 @@ equal 1.
Note that if you run on a prime number of processors P, then a grid
such as 1 x P x 1 will be required, which may incur extra
communication costs due to the high surface area of each processor's
subdomain.
Also note that if multiple partitions are being used then P is the
number of processors in this partition; see the :doc:`-partition command-line switch <Run_options>` page for details. Also note
@ -113,10 +113,10 @@ will persist for all simulations. If balancing is performed, some of
the methods invoked by those commands retain the logical topology of
the initial 3d grid, and the mapping of processors to the grid
specified by the processors command. However, the grid spacings in
different dimensions may change, so that processors own subdomains of
different sizes. If the :doc:`comm_style tiled <comm_style>` command is
used, methods invoked by the balancing commands may discard the 3d
grid of processors and tile the simulation domain with subdomains of
different sizes and shapes which no longer have a logical 3d
connectivity. If that occurs, all the information specified by the
processors command is ignored.
@ -129,7 +129,7 @@ processors.
The *onelevel* style creates a 3d grid that is compatible with the
Px,Py,Pz settings, and which minimizes the surface-to-volume ratio of
each processor's subdomain, as described above. The mapping of
processors to the grid is determined by the *map* keyword setting.
The *twolevel* style can be used on machines with multicore nodes to
@ -145,7 +145,7 @@ parameters can be specified with an asterisk "\*", which means LAMMPS
will choose the number of cores in that dimension of the node's
sub-grid. As with Px,Py,Pz, it will do this based on the size and
shape of the global simulation box so as to minimize the
surface-to-volume ratio of each processor's subdomain.
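A hedged example of the *twolevel* style (the per-node core count and the core
sub-grid values are illustrative assumptions for nodes with four cores each):

.. parsed-literal::

   processors * * * twolevel 4 * * 1   # 4 cores/node, one core layer in z

This lets LAMMPS choose the node grid and the x,y core sub-grid while keeping
a single core layer in z.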
.. note::

View File

@ -16,7 +16,7 @@ nx,ny,nz = replication factors in each dimension
.. parsed-literal::

   *bbox* = only check atoms in replicas that overlap with a processor's subdomain
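A hedged usage sketch of the *bbox* keyword (the replication factors are
illustrative assumptions):

.. parsed-literal::

   replicate 2 2 2 bbox   # illustrative replication factors

This doubles the system in each dimension while restricting the per-processor
atom checks to overlapping replicas, as described below.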
Examples
""""""""
@ -52,7 +52,7 @@ image flags that differ by 1. This will allow the bond to be
unwrapped appropriately.
The optional keyword *bbox* uses a bounding box to only check atoms in
replicas that overlap with a processor's subdomain when assigning
atoms to processors. It typically results in a substantial speedup
when using the replicate command on a large number of processors. It
does require temporary use of more memory, specifically that each

View File

@ -64,7 +64,7 @@ The *lost* keyword determines whether LAMMPS checks for lost atoms each
time it computes thermodynamics and what it does if atoms are lost. An
atom can be "lost" if it moves across a non-periodic simulation box
:doc:`boundary <boundary>` or if it moves more than a box length outside
the simulation domain (or more than a processor subdomain length)
before reneighboring occurs. The latter case is typically due to bad
dynamics (e.g., too large a time step and/or huge forces and velocities). If
the value is *ignore*, LAMMPS does not check for lost atoms. If the

View File

@ -3432,6 +3432,8 @@ Subclassed
subcutoff
subcycle
subcycling
subdomain
subdomains
subhi
sublo
Subramaniyan