clarify CUDA versus OpenCL build and runtime restrictions

Axel Kohlmeyer
2022-04-13 14:22:32 -04:00
parent f3363070e7
commit bd6d7b9136
2 changed files with 9 additions and 6 deletions

@@ -148,7 +148,6 @@ CMake build
* sm_70 for Volta (supported since CUDA 9)
* sm_75 for Turing (supported since CUDA 10)
* sm_80 for Ampere (supported since CUDA 11)
.. * sm_90 for Hopper (supported since CUDA 12)
A more detailed list can be found, for example,
at `Wikipedia's CUDA article <https://en.wikipedia.org/wiki/CUDA#GPUs_supported>`_
@@ -159,9 +158,11 @@ Thus the GPU_ARCH setting is merely an optimization, to have code for
 the preferred GPU architecture directly included rather than having to wait
 for the JIT compiler of the CUDA driver to translate it.
-Version 8.0 or later of the CUDA toolkit is required and a GPU architecture
-of Kepler or laters, which must *also* be supported by the CUDA toolkit in use
-**and** the CUDA driver in use.
+When compiling for CUDA or HIP with CUDA, version 8.0 or later of the CUDA toolkit
+is required and a GPU architecture of Kepler or later, which must *also* be
+supported by the CUDA toolkit in use **and** the CUDA driver in use.
+When compiling for OpenCL, OpenCL version 1.2 or later is required and the
+GPU must be supported by the GPU driver and OpenCL runtime bundled with the driver.
 When building with CMake, you **must NOT** build the GPU library in ``lib/gpu``
 using the traditional build procedure. CMake will detect files generated by that

@@ -176,7 +176,8 @@ Makefile.linux_multi after adjusting the settings for the CUDA toolkit in use.
 Only CUDA toolkit version 8.0 and later and only GPU architecture 3.0
 (aka Kepler) and later are supported by this version of LAMMPS. If you want
-to use older hard- or software you have to use an older version of LAMMPS.
+to use older hard- or software you have to compile for OpenCL or use an older
+version of LAMMPS.
 If you do not want to use a fat binary, that supports multiple CUDA
 architectures, the CUDA_ARCH must be set to match the GPU architecture. This
@@ -230,7 +231,8 @@ If GERYON_NUMA_FISSION is defined at build time, LAMMPS will consider separate
 NUMA nodes on GPUs or accelerators as separate devices. For example, a 2-socket
 CPU would appear as two separate devices for OpenCL (and LAMMPS would require
 two MPI processes to use both sockets with the GPU library - each with its
-own device ID as output by ocl_get_devices).
+own device ID as output by ocl_get_devices). OpenCL version 1.2 or later is
+required.
 For a debug build, use "-DUCL_DEBUG -DGERYON_KERNEL_DUMP" and remove
 "-DUCL_NO_EXIT" and "-DMPI_GERYON" from the build options.
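
Note on the restrictions documented by this commit: the version rules in the changed text can be summarized as a small check. The following is only a sketch. The function and constant names are hypothetical and not part of LAMMPS, the architecture table covers just the three `sm_*` entries listed in the first hunk, and the sketch does not model the driver check or toolkits that have dropped older architectures.

```python
# Sketch only: all names here are hypothetical and not part of LAMMPS.

# Minimum CUDA toolkit major version per GPU_ARCH value,
# copied from the list shown in the first hunk of this commit.
MIN_TOOLKIT_FOR_ARCH = {
    "sm_70": 9,   # Volta
    "sm_75": 10,  # Turing
    "sm_80": 11,  # Ampere
}

MIN_CUDA_TOOLKIT = 8    # CUDA toolkit 8.0 or later is required
MIN_COMPUTE_LEVEL = 30  # GPU architecture 3.0 (Kepler) or later
MIN_OPENCL = (1, 2)     # OpenCL 1.2 or later is required


def cuda_build_allowed(gpu_arch, toolkit_major):
    """Return True if an arch/toolkit pair satisfies the documented minimums."""
    # "sm_80" -> 80; the numeric part encodes the architecture level.
    level = int(gpu_arch.split("_", 1)[1])
    if toolkit_major < MIN_CUDA_TOOLKIT or level < MIN_COMPUTE_LEVEL:
        return False
    # Architectures missing from the table only get the general toolkit check.
    return toolkit_major >= MIN_TOOLKIT_FOR_ARCH.get(gpu_arch, MIN_CUDA_TOOLKIT)


def opencl_build_allowed(version):
    """Return True if an OpenCL (major, minor) version meets the 1.2 minimum."""
    return tuple(version) >= MIN_OPENCL
```

For instance, `cuda_build_allowed("sm_80", 10)` is rejected because Ampere needs CUDA 11, while `opencl_build_allowed((1, 2))` passes. Hardware below the Kepler level fails the CUDA check, which is the case where the updated text points to an OpenCL build instead.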