Merge pull request #3532 from stanmoore1/kk_occupancy

Update Kokkos version in LAMMPS to 3.7.1
Axel Kohlmeyer
2023-01-20 17:52:05 -05:00
committed by GitHub
67 changed files with 2295 additions and 561 deletions

@@ -105,13 +105,12 @@ Either the full word or an abbreviation can be used for the keywords.
Note that the keywords do not use a leading minus sign. I.e. the
keyword is "t", not "-t". Also note that each of the keywords has a
default setting. Examples of when to use these options and what
-settings to use on different platforms is given on the :doc:`KOKKOS package <Speed_kokkos>`
-doc page.
+settings to use on different platforms is given on the :doc:`KOKKOS
+package <Speed_kokkos>` doc page.
* d or device
* g or gpus
* t or threads
-* n or numa
.. parsed-literal::
@@ -164,19 +163,10 @@ the number of physical cores per node, to use your available hardware
optimally. This also sets the number of threads used by the host when
LAMMPS is compiled with CUDA=yes.
-.. parsed-literal::
+.. deprecated:: 22Dec2022
-numa Nm
-This option is only relevant when using pthreads with hwloc support.
-In this case Nm defines the number of NUMA regions (typically sockets)
-on a node which will be utilized by a single MPI rank. By default Nm
-= 1. If this option is used the total number of worker-threads per
-MPI rank is threads\*numa. Currently it is always almost better to
-assign at least one MPI rank per NUMA region, and leave numa set to
-its default value of 1. This is because letting a single process span
-multiple NUMA regions induces a significant amount of cross NUMA data
-traffic which is slow.
+Support for the "numa" or "n" option was removed as its functionality
+was ignored in Kokkos for some time already.
----------