Commit Graph

67 Commits

Author SHA1 Message Date
c96ac858bf GPU Package: Adding JIT test for OpenCL prefetch support. 2023-03-07 21:43:19 -08:00
37f22c8627 Misc Improvements to GPU Package
- Optimizations for molecular systems
  - Improved kernel performance and greater CPU overlap
- Reduced GPU to CPU communications for discrete devices
- Switch classic Intel makefiles to use LLVM-based compilers
- Prefetch optimizations supported for OpenCL
- Optimized data repack for quaternions
2023-03-05 21:03:12 -08:00
2ccfe635ce Removed the outdated CUDA_PROXY flag, using CUDA_MPS_SUPPORT consistently in CMake and traditional builds 2023-03-01 16:38:50 -06:00
4244d2e6cd silence compiler warnings about unused parameters and variables 2023-01-19 08:56:54 -05:00
eddd3d6f25 Fixed a bug with extra being nullptr when _host_view is true: always allocate extra
(Note that BaseAmoeba has its own cast_extra_data() that doesn't know whether extra is allocated properly; it is allocated when _host_view is false, for example on dedicated GPUs)
2023-01-18 20:04:45 -06:00
9dc0369cee Attempted to resolve the address space change issue when casting for OpenCL 2.0 (ref: https://www.intel.com/content/www/us/en/developer/articles/technical/the-generic-address-space-in-opencl-20.html#06_address_space_casting) 2023-01-15 23:28:48 -06:00
8af77c690c Merge branch 'develop' into amoeba-gpu 2022-12-14 13:16:41 -06:00
80a141d9c8 silence compiler warnings 2022-11-01 03:38:08 -04:00
9d081a5916 more adjustments for bogus timer results on Intel OpenCL 2022-10-19 07:39:56 -04:00
f867adc541 GPU Package fix where timing disable could result in event/marker destruction before completion on accelerator during initialization. 2022-10-19 02:16:29 -04:00
00f46120c7 Removed max_cus() from Device, used device->gpu->cus() instead 2022-10-07 15:50:30 -05:00
5a98a38e24 GPU Package: Switching to parallel GPU initialization / JIT compilation. 2022-10-07 13:25:14 -07:00
f715f174bb GPU Package: Print OCL platform name to screen when multiple platforms 2022-10-06 21:40:42 -07:00
6b9e83fe20 Added timing for the induced dipole spreading part, computed the block size to ensure all the CUs are occupied by the fphi_uind and fphi_mpole kernels 2022-10-06 15:03:58 -05:00
2ef6a59c0a Merge branch 'develop' into amoeba-gpu 2022-10-01 00:38:24 -05:00
6e34d21b24 GPU Package: Switching back to timer disabling with multiple MPI tasks per GPU. Logic added to prevent mem leak. 2022-09-28 21:02:16 -07:00
a14f0cfd6c Merge branch 'amoeba' into amoeba-gpu, update the gpu pair styles with the base class 2022-06-28 12:54:27 -05:00
d4ea5ca49e more clang-tidy fixes after re-running it with added settings 2022-05-14 07:18:05 -04:00
f09556018b fix bugs reported by @jibril-b-coulibaly 2022-04-28 14:47:53 -04:00
531e553162 Merge branch 'amoeba' into amoeba-gpu 2022-04-22 16:10:24 -05:00
d6f7570d57 avoid redundant use of boolean literals 2022-04-10 20:47:31 -04:00
39b316729b use auto type when assigning from cast or using new 2022-04-10 18:16:36 -04:00
a17bdf5652 silence compiler warnings and avoid infinite recursion in aspherical pair styles 2022-02-11 21:06:16 -05:00
87b63f768f Only check for GPU double precision support if a GPU is present 2021-10-18 12:15:05 -04:00
afad3f42d5 Report only compatible GPU, i.e. no GPU if mixed/double precision is requested but the hardware does not support it 2021-10-13 21:15:16 -04:00
42dca75225 add check and suitable error message when fp64 is required but not available 2021-09-24 12:17:58 -04:00
17ba0d5804 possible workaround for some GPU package neighbor list issue 2021-09-22 21:47:32 -04:00
e6fb0e3bd8 small tweaks 2021-09-17 16:51:37 -04:00
353b3a2bb3 reformat for increased readability 2021-09-16 07:25:04 -04:00
2de482f825 Merge pull request #2911 from akohlmey/fix-gpu-package-issues
Fix minor GPU package issues for the stable release
2021-08-30 13:45:23 -04:00
39d8b239ff don't report bogus timings 2021-08-29 17:56:47 -04:00
89556f0bcb Override any OpenCL fast math JIT settings for born/coul/wolf{/cs}/gpu to resolve numerical deviations seen with some OpenCL implementations. 2021-08-28 17:01:58 -07:00
91317b2879 Added changes to Atom and Device classes for allocation of extra fields and SBBITS15 and NEIGHMASK15 2021-08-26 09:33:20 -05:00
a687868c69 finalize available GPU hardware introspection functions 2021-05-10 16:34:27 -04:00
fbdcfb2f72 preliminary interface to detect whether a viable GPU is present 2021-05-10 09:16:51 -04:00
299ad3b37d work around bogus device overhead info in OpenCL 2021-05-08 23:43:15 -04:00
717faa6515 correctly detect GPU package with CUDA api 2021-04-06 19:12:28 -04:00
45c782308c Fixing issue from recent GPU package update with OMP_NUM_THREADS env being overridden in GPU library.
Fixing race condition with OpenMP for GPU styles using torque (missed in regression tests due to the first fix)
Documenting GPU package option for setting the number of threads (consistent with USER-INTEL and USER-OMP).
2021-02-18 21:08:18 -08:00
e7e2d2323b Feb2021 GPU Package Update - GPU Package Files 2021-02-15 08:20:50 -08:00
56909e88b1 implement accelerator introspection for GPU package 2021-01-11 17:03:23 -05:00
f1ef7d85a8 T2345: Replace instances of NULL with nullptr
The following changes have been applied to src and lib folders:
regex replace: ([^"_])NULL ⇒ \1nullptr (8968 chgs in src, 1153 in lib)
Manually find/change: (void \*) nullptr ⇒ nullptr (1 case)
regex find: ".*?nullptr.*?"
  Manually changed ~14 cases back to "NULL" in src, ~2 in lib
  The regex finds a few false positives where nullptr appears between two
  strings in a function call
2020-09-12 09:34:38 -06:00
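The replacement steps listed in commit f1ef7d85a8 can be sketched in Python as below. This is only an illustration of the three patterns quoted in the message, not the tooling actually used for the commit; the function names `replace_null` and `string_literals_with_nullptr` are hypothetical.

```python
import re


def replace_null(source: str) -> str:
    """Apply the two automatic steps quoted in the commit message."""
    # Step 1: NULL -> nullptr unless the preceding character is '"' or '_'.
    # The pattern needs one preceding character, so a NULL at the very
    # start of the text is not touched.
    step1 = re.sub(r'([^"_])NULL', r'\1nullptr', source)
    # Step 2: drop the now redundant cast in "(void *) nullptr".
    return step1.replace('(void *) nullptr', 'nullptr')


def string_literals_with_nullptr(source: str) -> list:
    """Step 3: list candidates for manual review (may include false positives)."""
    return re.findall(r'".*?nullptr.*?"', source)


if __name__ == "__main__":
    print(replace_null('char *p = NULL;'))
    print(replace_null('f((void *) NULL);'))
    # nullptr between two string arguments is reported although it is not
    # inside a string literal: the false positive the commit mentions.
    print(string_literals_with_nullptr('f("a", nullptr, "b")'))
```

The last call shows why the third step needed manual review: the lazy pattern spans from the first string's opening quote to the second string's opening quote.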
66c5fa2abd Merge 'gpu_hip_port' into master 2020-01-28 20:35:08 +03:00
e832b5d50b make clang++ happy when trying to compile the GPU library 2019-07-12 15:42:16 -04:00
8c3d18520d add missing include needed on ppc64le 2019-06-26 10:45:31 -06:00
4a4297591e Did some more cleanups 2019-04-17 12:04:31 -05:00
1f43efc111 Cleaned up the changes in Device and the base class of the pair styles 2019-04-17 00:09:49 -05:00
c55009a0ac Enabled neighbor list build on the device with pair_style hybrid and hybrid/overlay 2019-04-16 23:30:25 -05:00
cd6b23d104 explicitly request OpenCL version 1.2 compatibility when compiling GPU package kernels for OpenCL 2019-03-22 09:50:31 -04:00
923ae041dc (temporary) workaround for memory leaks with OpenCL and MPI for upcoming stable release 2018-07-23 15:52:42 -04:00
de8176b4fc various minor OpenCL related fixes and improvements to the GPU package
- document previously undocumented OpenCL tune settings
- implement OpenCL platform selection through prefixing the device type with the platform id separated by a colon
- allow passing custom tune parameters through postfixing the device type with the 13 tuneable parameters separated by commas
- remove an extra clear() that would delete device properties structs and cause LAMMPS to output garbage strings
2018-07-20 14:41:54 -04:00
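The device-type syntax described in commit de8176b4fc (optional platform id prefix separated by a colon, optional postfix of 13 comma-separated tune parameters) could be parsed along the following lines. This is a sketch under assumptions: the function name `parse_device_spec`, the example type names, and keeping the parameters as plain strings are all illustrative and are not taken from the GPU package source.

```python
def parse_device_spec(spec: str):
    """Split a hypothetical '<platform>:<type>,<p1>,...,<p13>' string.

    Returns (platform_id or None, device_type, list of tune parameters).
    """
    platform = None
    if ':' in spec:
        # Optional platform id in front of the device type.
        prefix, spec = spec.split(':', 1)
        platform = int(prefix)
    fields = spec.split(',')
    device_type = fields[0]
    params = fields[1:]
    # The commit message mentions exactly 13 tuneable parameters.
    if params and len(params) != 13:
        raise ValueError(f"expected 13 tune parameters, got {len(params)}")
    return platform, device_type, params


if __name__ == "__main__":
    print(parse_device_spec("generic"))
    print(parse_device_spec("1:custom," + ",".join(str(i) for i in range(13))))
```

Rejecting a partial parameter list is a design choice of this sketch; the commit message does not say how the real code handles that case.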