clarify CUDA versus OpenCL build and runtime restrictions
@@ -148,7 +148,6 @@ CMake build
 * sm_70 for Volta (supported since CUDA 9)
 * sm_75 for Turing (supported since CUDA 10)
 * sm_80 for Ampere (supported since CUDA 11)
-.. * sm_90 for Hopper (supported since CUDA 12)

 A more detailed list can be found, for example,
 at `Wikipedia's CUDA article <https://en.wikipedia.org/wiki/CUDA#GPUs_supported>`_
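The "supported since" entries in the list above amount to a lower bound on the CUDA toolkit version for each architecture. A minimal sketch of that lookup follows. The table holds only the four entries visible in this hunk, and the names `MIN_CUDA_MAJOR` and `meets_minimum_toolkit` are hypothetical, not part of LAMMPS. It checks the lower bound only and does not model newer toolkits dropping support for older architectures.

```python
# Minimum CUDA toolkit major version per architecture, copied from the list
# in the hunk above. Architectures not shown there are deliberately omitted.
MIN_CUDA_MAJOR = {
    "sm_70": 9,   # Volta
    "sm_75": 10,  # Turing
    "sm_80": 11,  # Ampere
    "sm_90": 12,  # Hopper
}


def meets_minimum_toolkit(gpu_arch, cuda_major):
    """Return True if cuda_major is at least the 'supported since' version.

    Hypothetical helper: it only checks the lower bound from the table and
    says nothing about whether a newer toolkit still supports gpu_arch.
    """
    if gpu_arch not in MIN_CUDA_MAJOR:
        raise ValueError("architecture %r is not in the table" % gpu_arch)
    return cuda_major >= MIN_CUDA_MAJOR[gpu_arch]
```

For example, `meets_minimum_toolkit("sm_80", 11)` is true, while `meets_minimum_toolkit("sm_90", 11)` is false because Hopper is listed as supported since CUDA 12.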
@@ -159,9 +158,11 @@ Thus the GPU_ARCH setting is merely an optimization, to have code for
 the preferred GPU architecture directly included rather than having to wait
 for the JIT compiler of the CUDA driver to translate it.

-Version 8.0 or later of the CUDA toolkit is required and a GPU architecture
-of Kepler or laters, which must *also* be supported by the CUDA toolkit in use
-**and** the CUDA driver in use.
+When compiling for CUDA or HIP with CUDA, version 8.0 or later of the CUDA toolkit
+is required and a GPU architecture of Kepler or later, which must *also* be
+supported by the CUDA toolkit in use **and** the CUDA driver in use.
+When compiling for OpenCL, OpenCL version 1.2 or later is required and the
+GPU must be supported by the GPU driver and OpenCL runtime bundled with the driver.

 When building with CMake, you **must NOT** build the GPU library in ``lib/gpu``
 using the traditional build procedure. CMake will detect files generated by that
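The version requirements added in the hunk above can be sketched as a small check. This is an illustration under stated assumptions, not code from LAMMPS: the function name, the API labels `"cuda"`, `"hip-cuda"` and `"opencl"`, and the message texts are hypothetical, and Kepler is taken to mean compute capability 3.0. The check covers only the stated minimum versions. Whether the toolkit, driver, or OpenCL runtime actually supports a given GPU cannot be decided this way.

```python
# Minimum versions as stated in the documentation text above.
MIN_CUDA_TOOLKIT = (8, 0)
MIN_COMPUTE_CAPABILITY = (3, 0)  # assumption: Kepler == compute capability 3.0
MIN_OPENCL = (1, 2)


def parse_version(text):
    """Turn a dotted version string such as '11.8' into a tuple (11, 8)."""
    return tuple(int(part) for part in text.split("."))


def check_requirements(api, version, compute_capability=None):
    """Return a list of problems; an empty list means the minimums are met.

    api is a hypothetical label: 'cuda', 'hip-cuda' or 'opencl'.
    """
    problems = []
    found = parse_version(version)
    if api in ("cuda", "hip-cuda"):
        if found < MIN_CUDA_TOOLKIT:
            problems.append("CUDA toolkit %s is older than 8.0" % version)
        if compute_capability is None:
            problems.append("GPU compute capability is unknown")
        elif parse_version(compute_capability) < MIN_COMPUTE_CAPABILITY:
            problems.append(
                "compute capability %s is older than Kepler (3.0)"
                % compute_capability
            )
    elif api == "opencl":
        if found < MIN_OPENCL:
            problems.append("OpenCL %s is older than 1.2" % version)
    else:
        raise ValueError("unknown API label %r" % api)
    return problems
```

A call such as `check_requirements("cuda", "7.5", "2.0")` reports two problems, one for the toolkit and one for the architecture, while `check_requirements("opencl", "1.2")` reports none.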
@@ -176,7 +176,8 @@ Makefile.linux_multi after adjusting the settings for the CUDA toolkit in use.

 Only CUDA toolkit version 8.0 and later and only GPU architecture 3.0
 (aka Kepler) and later are supported by this version of LAMMPS. If you want
-to use older hard- or software you have to use an older version of LAMMPS.
+to use older hard- or software you have to compile for OpenCL or use an older
+version of LAMMPS.

 If you do not want to use a fat binary, that supports multiple CUDA
 architectures, the CUDA_ARCH must be set to match the GPU architecture. This
@@ -230,7 +231,8 @@ If GERYON_NUMA_FISSION is defined at build time, LAMMPS will consider separate
 NUMA nodes on GPUs or accelerators as separate devices. For example, a 2-socket
 CPU would appear as two separate devices for OpenCL (and LAMMPS would require
 two MPI processes to use both sockets with the GPU library - each with its
-own device ID as output by ocl_get_devices).
+own device ID as output by ocl_get_devices). OpenCL version 1.2 or later is
+required.

 For a debug build, use "-DUCL_DEBUG -DGERYON_KERNEL_DUMP" and remove
 "-DUCL_NO_EXIT" and "-DMPI_GERYON" from the build options.
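The debug-build instruction at the end of the hunk above is a simple edit of the build option list: drop two defines and add two others. A minimal sketch of that edit follows. The four flag names come from the documentation text, while the helper name `to_debug_options` and the sample input are hypothetical.

```python
# Flags named in the documentation text above.
DEBUG_ADD = ["-DUCL_DEBUG", "-DGERYON_KERNEL_DUMP"]
DEBUG_REMOVE = {"-DUCL_NO_EXIT", "-DMPI_GERYON"}


def to_debug_options(options):
    """Return a new option list adjusted for a debug build.

    Removes the flags in DEBUG_REMOVE, keeps everything else in order,
    and appends the flags in DEBUG_ADD that are not already present.
    """
    kept = [opt for opt in options if opt not in DEBUG_REMOVE]
    for flag in DEBUG_ADD:
        if flag not in kept:
            kept.append(flag)
    return kept


# Hypothetical starting options, for illustration only.
release_options = ["-DMPI_GERYON", "-DUCL_NO_EXIT", "-O3"]
debug_options = to_debug_options(release_options)
```

With the sample input, `debug_options` is `["-O3", "-DUCL_DEBUG", "-DGERYON_KERNEL_DUMP"]`, and the original list is left unchanged.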