use FFT_FFTW_THREADS consistently, add CUFFT to CMake, minor improvements
@ -49,13 +49,19 @@ through the CMAKE\_CXX\_FLAGS variable. Example for CentOS 7:

    -D CMAKE_CXX_FLAGS="-O3 -g -fopenmp -DNDEBUG -std=c++11"

-**Makefile.machine setting**\ :
+**Makefile.machine setting**\ to bypass the C++11 test and compile in C++98 mode:


 .. parsed-literal::

    LMP_INC = -DLAMMPS_CXX98

+**Makefile.machine setting**\ to enable C++11 with an older (but not too old) GNU C++ compiler (e.g. on CentOS 7):
+
+
+.. parsed-literal::
+
+   CCFLAGS = -g -O3 -std=c++11

 ----------

@ -86,14 +92,19 @@ LAMMPS can use them if they are available on your system.
 an exception to the rule that all CMake variables can be specified
 with lower-case values.

-Usually these settings are all that is needed. If CMake cannot find
-the FFT library, you can set these variables:
+Usually these settings are all that is needed. If FFTW3 is selected,
+CMake will try to detect whether threaded FFTW libraries are available
+and enable them by default. This setting is independent of whether
+OpenMP threads are enabled and whether a package like KOKKOS or USER-OMP
+is used. If CMake cannot detect the FFT library, you can set these
+variables to assist:


 .. parsed-literal::

    -D FFTW3_INCLUDE_DIRS=path  # path to FFTW3 include files
    -D FFTW3_LIBRARIES=path     # path to FFTW3 libraries
+   -D FFT_FFTW_THREADS=on      # enable using threaded FFTW3 libraries
    -D MKL_INCLUDE_DIRS=path    # ditto for Intel MKL library
    -D MKL_LIBRARIES=path

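As a rough illustration of the hunk above, the options might be combined as follows when FFTW3 lives in a non-standard location. This is only a sketch: the ``-D FFT=FFTW3`` selection comes from elsewhere on the same doc page, not from this commit, and the ``/usr/local`` prefix and the library file name are assumptions chosen for the example.

.. parsed-literal::

   -D FFT=FFTW3                                   # select FFTW3 (option described elsewhere on the page)
   -D FFT_FFTW_THREADS=on                         # request the threaded FFTW3 libraries explicitly
   -D FFTW3_INCLUDE_DIRS=/usr/local/include       # assumed install prefix
   -D FFTW3_LIBRARIES=/usr/local/lib/libfftw3.so  # assumed library file name

If detection already succeeds, none of these should be needed, since the new text states that threaded FFTW libraries are enabled by default when found.
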
@ -105,6 +116,7 @@ the FFT library, you can set these variables:
    FFT_INC = -DFFT_FFTW3         # -DFFT_FFTW3, -DFFT_FFTW (same as -DFFT_FFTW3), -DFFT_MKL, or -DFFT_KISS
                                  # default is KISS if not specified
    FFT_INC = -DFFT_SINGLE        # do not specify for double precision
+   FFT_INC = -DFFT_FFTW_THREADS  # enable using threaded FFTW3 libraries
    FFT_INC = -DFFT_PACK_ARRAY    # or -DFFT_PACK_POINTER or -DFFT_PACK_MEMCPY

                                  # default is FFT\_PACK\_ARRAY if not specified
@ -115,6 +127,7 @@ the FFT library, you can set these variables:
    FFT_INC = -I/usr/local/include
    FFT_PATH = -L/usr/local/lib
    FFT_LIB = -lfftw3               # FFTW3 double precision
+   FFT_LIB = -lfftw3 -lfftw3_omp   # FFTW3 double precision with threads (needs -DFFT_FFTW_THREADS)
    FFT_LIB = -lfftw3 -lfftw3f      # FFTW3 single precision
    FFT_LIB = -lmkl_intel_lp64 -lmkl_sequential -lmkl_core   # MKL with Intel compiler
    FFT_LIB = -lmkl_gf_lp64 -lmkl_sequential -lmkl_core      # MKL with GNU compiler
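Putting the two Makefile hunks together, a threaded double-precision FFTW3 build might use settings along these lines. This is a sketch assembled from the individual lines shown in the diff; the ``/usr/local`` paths are the example paths from the doc page and may differ on a given system.

.. parsed-literal::

   FFT_INC  = -DFFT_FFTW3 -DFFT_FFTW_THREADS -I/usr/local/include
   FFT_PATH = -L/usr/local/lib
   FFT_LIB  = -lfftw3 -lfftw3_omp

The point of pairing them is that ``-lfftw3_omp`` is only useful when ``-DFFT_FFTW_THREADS`` is also defined, as the comment on the new ``FFT_LIB`` line notes.
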
@ -126,16 +139,19 @@ FFT\_LIB with the appropriate FFT libraries to include in the link.
 **CMake and make info**\ :

 The `KISS FFT library <http://kissfft.sf.net>`_ is included in the LAMMPS
-distribution. It is portable across all platforms. Depending on the
-size of the FFTs and the number of processors used, the other
-libraries listed here can be faster.
+distribution. It is portable across all platforms. Depending on the size
+of the FFTs and the number of processors used, the other libraries listed
+here can be faster.

 However, note that long-range Coulombics are only a portion of the
 per-timestep CPU cost, FFTs are only a portion of long-range
 Coulombics, and 1d FFTs are only a portion of the FFT cost (parallel
 communication can be costly). A breakdown of these timings is printed
-to the screen at the end of a run using the :doc:`kspace\_style pppm <kspace_style>` command. The :doc:`Run output <Run_output>`
-doc page gives more details.
+to the screen at the end of a run when using the
+:doc:`kspace_style pppm <kspace_style>` command. The :doc:`Run output <Run_output>`
+doc page gives more details. A more detailed (and time-consuming)
+report of the FFT performance is generated with the
+:doc:`kspace_modify fftbench yes <kspace_modify>` command.
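
For orientation, the two commands named in the added text might appear in an input script roughly as follows. This is a minimal sketch; the accuracy value ``1.0e-4`` is chosen only for illustration and is not part of the commit.

.. parsed-literal::

   kspace_style  pppm 1.0e-4     # accuracy value is an illustrative assumption
   kspace_modify fftbench yes    # request the more detailed FFT performance report
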

 FFTW is a fast, portable FFT library that should also work on any
 platform and can be faster than the KISS FFT library. You can
@ -166,7 +182,7 @@ When using -DFFT\_SINGLE with FFTW3 you may need to build the FFTW
 library a second time with support for single-precision.

 For FFTW3, do the following, which should produce the additional
-library libfftw3f.a
+library libfftw3f.a or libfftw3f.so.


 .. parsed-literal::
