96 lines
4.6 KiB
Plaintext
96 lines
4.6 KiB
Plaintext
|
|
--------------------------------
|
|
LAMMPS Intel(R) Package
|
|
--------------------------------
|
|
|
|
W. Michael Brown (Intel) michael.w.brown at intel.com
|
|
Markus Hohnerbach (RWTH Aachen University)
|
|
William McDoniel (RWTH Aachen University)
|
|
Rodrigo Canales (RWTH Aachen University)
|
|
Stan Moore (Sandia)
|
|
Ahmed E. Ismail (RWTH Aachen University)
|
|
Paolo Bientinesi (RWTH Aachen University)
|
|
Anupama Kurpad (Intel)
|
|
Biswajit Mishra (Shell)
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
This package provides LAMMPS styles that:
|
|
|
|
1. include support for single and mixed precision in addition to double.
|
|
2. include modifications to support vectorization for key routines
|
|
3. include modifications for data layouts to improve cache efficiency
|
|
3. include modifications to support offload to Intel(R) Xeon Phi(TM)
|
|
coprocessors
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
As of 2019/03/28 none of the styles provided in this package support
|
|
tallying per-atom stresses. Any attempt to compute/access it will
|
|
cause an error termination.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
For Intel server processors codenamed "Skylake", the following flags should
|
|
be added or changed in the Makefile depending on the version:
|
|
|
|
2017 update 2 - No changes needed
|
|
2017 updates 3 or 4 - Use -xCOMMON-AVX512 and not -xHost or -xCORE-AVX512
|
|
2018 initial release - Use -xCOMMON-AVX512 and not -xHost or -xCORE-AVX512
|
|
2018u1 or newer - Use -xHost or -xCORE-AVX512 and -qopt-zmm-usage=high
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
When using the suffix command with "intel", intel styles will be used if they
|
|
exist. If the suffix command is used with "hybrid intel omp" and the OPENMP
|
|
is installed, OPENMP styles will be used whenever INTEL styles are not
|
|
available. This allow for running most styles in LAMMPS with threading.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
The Long-Range Thread mode (LRT) in the Intel package is enabled through the
|
|
-DLMP_INTEL_USELRT define at compile time. All intel optimized makefiles
|
|
include this define. This feature will use pthreads by default.
|
|
Alternatively, "-DLMP_INTEL_LRT11" can be used to build with compilers that
|
|
support threads intrinsically using the C++11 standard. When using
|
|
LRT mode, you might need to disable OpenMP affinity settings (e.g.
|
|
export KMP_AFFINITY=none). LAMMPS will generate a warning if the settings
|
|
need to be changed.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
Unless Intel Math Kernel Library (MKL) is unavailable, -DLMP_USE_MKL_RNG
|
|
should be added to the compile flags. This will enable using the MKL Mersenne
|
|
Twister random number generator (RNG) for Dissipative Particle Dynamics
|
|
(DPD). This RNG can allow significantly faster performance and it also has a
|
|
significantly longer period than the standard RNG for DPD.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
In order to use offload to Intel(R) Xeon Phi(TM) coprocessors, the flag
|
|
-DLMP_INTEL_OFFLOAD should be set in the Makefile. Offload requires the use of
|
|
Intel compilers.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
For portability reasons, vectorization directives are currently only enabled
|
|
for Intel compilers. Using other compilers may result in significantly
|
|
lower performance. This behavior can be changed by defining
|
|
LMP_SIMD_COMPILER for the preprocessor (see intel_preprocess.h).
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
By default, when running with offload to Intel(R) coprocessors, affinity
|
|
for host MPI tasks and OpenMP threads is set automatically within the code.
|
|
This currently requires the use of system calls. To disable at build time,
|
|
compile with -DINTEL_OFFLOAD_NOAFFINITY.
|
|
|
|
-----------------------------------------------------------------------------
|
|
|
|
Vector intrinsics are temporarily being used for the Stillinger-Weber
|
|
potential to allow for advanced features in the AVX512 instruction set to
|
|
be exploited on early hardware. We hope to see compiler improvements for
|
|
AVX512 that will eliminate this requirement, so it is not recommended to
|
|
develop code based on the intrinsics implementation. Please e-mail the
|
|
authors for more details.
|