clarify CUDA versus OpenCL build and runtime restrictions

2022-04-13 14:22:32 -04:00
parent f3363070e7
commit bd6d7b9136
2 changed files with 9 additions and 6 deletions
--- a/doc/src/Build_extras.rst
+++ b/doc/src/Build_extras.rst
@@ -148,7 +148,6 @@ CMake build
 * sm_70 for Volta (supported since CUDA 9)
 * sm_75 for Turing (supported since CUDA 10)
 * sm_80 for Ampere (supported since CUDA 11)
-.. * sm_90 for Hopper (supported since CUDA 12)

 A more detailed list can be found, for example,
 at `Wikipedia's CUDA article <https://en.wikipedia.org/wiki/CUDA#GPUs_supported>`_
@@ -159,9 +158,11 @@ Thus the GPU_ARCH setting is merely an optimization, to have code for
 the preferred GPU architecture directly included rather than having to wait
 for the JIT compiler of the CUDA driver to translate it.

-Version 8.0 or later of the CUDA toolkit is required and a GPU architecture
-of Kepler or laters, which must *also* be supported by the CUDA toolkit in use
-**and** the CUDA driver in use.
+When compiling for CUDA or HIP with CUDA, version 8.0 or later of the CUDA toolkit
+is required, along with a GPU architecture of Kepler or later that is *also*
+supported by both the CUDA toolkit **and** the CUDA driver in use.
+When compiling for OpenCL, OpenCL version 1.2 or later is required and the
+GPU must be supported by the GPU driver and the OpenCL runtime bundled with it.

 When building with CMake, you **must NOT** build the GPU library in ``lib/gpu``
 using the traditional build procedure. CMake will detect files generated by that
--- a/lib/gpu/README
+++ b/lib/gpu/README
@@ -176,7 +176,8 @@ Makefile.linux_multi after adjusting the settings for the CUDA toolkit in use.

 Only CUDA toolkit version 8.0 and later and only GPU architecture 3.0
 (aka Kepler) and later are supported by this version of LAMMPS. If you want
-to use older hard- or software you have to use an older version of LAMMPS.
+to use older hard- or software you have to compile for OpenCL or use an older
+version of LAMMPS.

 If you do not want to use a fat binary, that supports multiple CUDA
 architectures, the CUDA_ARCH must be set to match the GPU architecture. This
@@ -230,7 +231,8 @@ If GERYON_NUMA_FISSION is defined at build time, LAMMPS will consider separate
 NUMA nodes on GPUs or accelerators as separate devices. For example, a 2-socket
 CPU would appear as two separate devices for OpenCL (and LAMMPS would require
 two MPI processes to use both sockets with the GPU library - each with its
-own device ID as output by ocl_get_devices).
+own device ID as output by ocl_get_devices). OpenCL version 1.2 or later is
+required.

 For a debug build, use "-DUCL_DEBUG -DGERYON_KERNEL_DUMP" and remove
 "-DUCL_NO_EXIT" and "-DMPI_GERYON" from the build options.