Merge branch 'develop' into collected-small-fixes

Axel Kohlmeyer
2024-02-20 20:45:15 -05:00
8 changed files with 311 additions and 64 deletions

@ -268,6 +268,12 @@ molecule can be specified in the molecule file. See the
required to be in this file are the coordinates and types of atoms in
the molecule.
.. note::
If you use the *mol* keyword together with the
:doc:`atom style template <atom_style>` command, both must use
the same molecule template-ID.
Using a lattice to add molecules, e.g. via the *box* or *region* or
*single* styles, is exactly the same as adding atoms on lattice
points, except that entire molecules are added at each point, i.e. on

@ -74,7 +74,7 @@ Syntax
*no_affinity* values = none
*kokkos* args = keyword value ...
zero or more keyword/value pairs may be appended
keywords = *neigh* or *neigh/qeq* or *neigh/thread* or *neigh/transpose* or *newton* or *binsize* or *comm* or *comm/exchange* or *comm/forward* or *comm/pair/forward* or *comm/fix/forward* or *comm/reverse* or *comm/pair/reverse* or *sort* or *gpu/aware* or *pair/only*
keywords = *neigh* or *neigh/qeq* or *neigh/thread* or *neigh/transpose* or *newton* or *binsize* or *comm* or *comm/exchange* or *comm/forward* or *comm/pair/forward* or *comm/fix/forward* or *comm/reverse* or *comm/pair/reverse* or *sort* or *atom/map* or *gpu/aware* or *pair/only*
*neigh* value = *full* or *half*
full = full neighbor list
half = half neighbor list built in thread-safe manner
@ -108,6 +108,9 @@ Syntax
*sort* value = *no* or *device*
*no* = perform atom sorting in non-KOKKOS mode
*device* = perform atom sorting on device (e.g. on GPU)
*atom/map* value = *no* or *device*
*no* = build atom map in non-KOKKOS mode
*device* = build atom map on device (e.g. on GPU)
*gpu/aware* = *off* or *on*
*off* = do not use GPU-aware MPI
*on* = use GPU-aware MPI (default)
@ -566,15 +569,19 @@ performing the exchange pack/unpack on the host CPU can give speedup
since it reduces the number of CUDA kernel launches.
The *sort* keyword determines whether the host or device performs atom
sorting, see the :doc:`atom_modify sort <atom_modify>` command. The
value options for the *sort* keyword are *no* or *device* similar to the
*comm* keywords above. If a value of *host* is used it will be
automatically be changed to *no* since the *sort* keyword does not
support *host* mode. The value of *no* will also always be used when
running on the CPU, i.e. setting the value to *device* will have no
effect if the simulation is running on the CPU. Not all fix styles with
extra atom data support *device* mode and in that case a warning will be
given and atom sorting will run in *no* mode instead.
sorting, see the :doc:`atom_modify sort <atom_modify>` command. The value
options for the *sort* keyword are *no* or *device* similar to the *comm*
keywords above. If a value of *host* is used, it will automatically be
changed to *no* since the *sort* keyword does not support *host* mode. Not
all fix styles with extra atom data support *device* mode and in that case
a warning will be given and atom sorting will run in *no* mode instead.
.. versionadded:: TBD
The *atom/map* keyword determines whether the host or device builds the
atom_map, see the :doc:`atom_modify map <atom_modify>` command. The
value options for the *atom/map* keyword are identical to the *sort*
keyword above.
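
As an illustration of the two keywords described above, the following is a
minimal sketch of an input script line, assuming a KOKKOS-enabled build
running on a GPU. The particular combination of values is chosen only for
demonstration and is not a recommendation.

.. code-block:: LAMMPS

   # keep atom sorting on the device, but build the atom map in non-KOKKOS mode
   package kokkos sort device atom/map no

Since both keywords accept the same values, they can be set independently
of each other in a single *package kokkos* command.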
The *gpu/aware* keyword chooses whether GPU-aware MPI will be used. When
this keyword is set to *on*, buffers in GPU memory are passed directly
@ -593,12 +600,13 @@ for OpenMPI 1.8 (or later versions), Mvapich2 1.9 (or later) when the
Spectrum MPI when the "-gpu" flag is used.
The *pair/only* keyword can change how the KOKKOS suffix "kk" is applied
when using an accelerator device. By default device acceleration is
always used for all available styles. With *pair/only* set to *on* the
suffix setting will choose device acceleration only for pair styles and
run all other force computations on the host CPU.
The *comm* flags will also automatically be changed to *no*\ . This can
result in better performance for certain configurations and system sizes.
when using an accelerator device. By default device acceleration is always
used for all available styles. With *pair/only* set to *on* the suffix
setting will choose device acceleration only for pair styles and run all
other force computations on the host CPU. The *comm* flags, along with the
*sort* and *atom/map* keywords, will also automatically be changed to *no*\ .
This can result in better performance for certain configurations and
system sizes.
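
A minimal sketch of enabling this mode in an input script, assuming LAMMPS
was built with the KOKKOS package and is running with an accelerator device:

.. code-block:: LAMMPS

   # accelerate only pair styles on the device; per the text above, the
   # comm flags, sort, and atom/map are then automatically set to "no"
   package kokkos pair/only on

Whether this helps depends on the configuration and system size, so it is
worth comparing timings with and without the setting.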
----------
@ -684,18 +692,18 @@ Restrictions
This command cannot be used after the simulation box is defined by a
:doc:`read_data <read_data>` or :doc:`create_box <create_box>` command.
The gpu style of this command can only be invoked if LAMMPS was built
The *gpu* style of this command can only be invoked if LAMMPS was built
with the GPU package. See the :doc:`Build package <Build_package>` doc
page for more info.
The intel style of this command can only be invoked if LAMMPS was
The *intel* style of this command can only be invoked if LAMMPS was
built with the INTEL package. See the :doc:`Build package <Build_package>` page for more info.
The kk style of this command can only be invoked if LAMMPS was built
The *kokkos* style of this command can only be invoked if LAMMPS was built
with the KOKKOS package. See the :doc:`Build package <Build_package>`
doc page for more info.
The omp style of this command can only be invoked if LAMMPS was built
The *omp* style of this command can only be invoked if LAMMPS was built
with the OPENMP package. See the :doc:`Build package <Build_package>`
doc page for more info.
@ -704,19 +712,27 @@ Related commands
:doc:`suffix <suffix>`, :doc:`-pk command-line switch <Run_options>`
Default
"""""""
Defaults
""""""""
For the GPU package, the default is Ngpu = 0 and the option defaults are neigh
= yes, newton = off, binsize = 0.0, split = 1.0, gpuID = 0 to Ngpu-1, tpa = 1,
omp = 0, and platform=-1. These settings are made automatically if the "-sf
gpu" :doc:`command-line switch <Run_options>` is used. If it is not used, you
must invoke the package gpu command in your input script or via the "-pk gpu"
:doc:`command-line switch <Run_options>`.
For the GPU package, the default parameters and settings are:
For the INTEL package, the default is Nphi = 1 and the option defaults are omp
= 0, mode = mixed, lrt = no, balance = -1, tpc = 4, tptask = 240, pppm_table =
yes. The default ghost option is determined by the pair style being used.
.. parsed-literal::
Ngpu = 0, neigh = yes, newton = off, binsize = 0.0, split = 1.0, gpuID = 0 to Ngpu-1, tpa = 1, omp = 0, platform = -1
These settings are made automatically if the "-sf gpu"
:doc:`command-line switch <Run_options>` is used. If it is not used,
you must invoke the package gpu command in your input script or via the
"-pk gpu" :doc:`command-line switch <Run_options>`.
For the INTEL package, the default parameters and settings are:
.. parsed-literal::
Nphi = 1, omp = 0, mode = mixed, lrt = no, balance = -1, tpc = 4, tptask = 240, pppm_table = yes
The default ghost option is determined by the pair style being used.
This value is output to the screen in the offload report at the end of each
run. Note that all of these settings, except "omp" and "mode", are ignored if
LAMMPS was not built with Xeon Phi co-processor support. These settings are
@ -724,20 +740,35 @@ made automatically if the "-sf intel" :doc:`command-line switch <Run_options>`
is used. If it is not used, you must invoke the package intel command in your
input script or via the "-pk intel" :doc:`command-line switch <Run_options>`.
For the KOKKOS package, the option defaults for GPUs are neigh = full,
neigh/qeq = full, newton = off, binsize for GPUs = 2x LAMMPS default value,
comm = device, sort = device, neigh/transpose = off, gpu/aware = on. When
LAMMPS can safely detect that GPU-aware MPI is not available, the default value
of gpu/aware becomes "off". For CPUs or Xeon Phis, the option defaults are
neigh = half, neigh/qeq = half, newton = on, binsize = 0.0, comm = no, and sort
= no. For GPUs, option neigh/thread = on when there are 16k atoms or less on an MPI
rank, otherwise it is "off". These settings are made automatically by the
required "-k on" :doc:`command-line switch <Run_options>`. You can change them
by using the package kokkos command in your input script or via the :doc:`-pk
kokkos command-line switch <Run_options>`.
For the KOKKOS package when using GPUs, the option defaults are:
For the OMP package, the default is Nthreads = 0 and the option defaults are
neigh = yes. These settings are made automatically if the "-sf omp"
:doc:`command-line switch <Run_options>` is used. If it is not used, you must
invoke the package omp command in your input script or via the "-pk omp"
:doc:`command-line switch <Run_options>`.
.. parsed-literal::
neigh = full, neigh/qeq = full, newton = off, binsize = 2x LAMMPS default value, comm = device, sort = device, atom/map = device, neigh/transpose = off, gpu/aware = on
For GPUs, the default is neigh/thread = on when there are 16k or fewer
atoms on an MPI rank; otherwise it is "off". When LAMMPS can safely detect that
GPU-aware MPI is not available, the default value of gpu/aware becomes
"off".
For the KOKKOS package when using CPUs or Xeon Phis, the option defaults are:
.. parsed-literal::
neigh = half, neigh/qeq = half, newton = on, binsize = 0.0, comm = no, sort = no, atom/map = no
These settings are made automatically by
the required "-k on" :doc:`command-line switch <Run_options>`. You can
change them by using the package kokkos command in your input script or
via the :doc:`-pk kokkos command-line switch <Run_options>`.
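
As a hedged example of the command-line route, the sketch below overrides
two of the GPU defaults listed above. The executable name ``lmp``, the
single-GPU setting, and the input file name ``in.script`` are assumptions
made for illustration only.

.. code-block:: bash

   # enable KOKKOS on one GPU, apply the kk suffix, and override two defaults
   lmp -k on g 1 -sf kk -pk kokkos sort no atom/map no -in in.script

The same settings could instead be placed in the input script with a
*package kokkos* command.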
For the OMP package, the default parameters and settings are:
.. parsed-literal::
Nthreads = 0, neigh = yes
These settings are made automatically if the "-sf omp"
:doc:`command-line switch <Run_options>` is used. If it is not used,
you must invoke the package omp command in your input script or via the
"-pk omp" :doc:`command-line switch <Run_options>`.