6442e05988
even more define to static constexpr conversions
2024-01-25 02:17:28 -05:00
a6d178194e
use consistent names and capitalization in comments
2023-08-03 10:59:31 -04:00
d6412dc97b
Attempted to resolve issues with switching from acctyp4 to acctyp3 in tep, fieldp since the changes in PR #3675, noting some changes with Intel OCL PR #3663
2023-07-08 00:50:19 -05:00
d2faf86214
Merge branch 'develop' into bond-harmonic-restrain
2023-03-14 00:41:28 -04:00
17f39d9d2c
rename fix STORE/PERATOM to STORE/ATOM
2023-03-13 22:33:47 -04:00
37f22c8627
Misc Improvements to GPU Package
...
- Optimizations for molecular systems
- Improved kernel performance and greater CPU overlap
- Reduced GPU to CPU communications for discrete devices
- Switch classic Intel makefiles to use LLVM-based compilers
- Prefetch optimizations supported for OpenCL
- Optimized data repack for quaternions
2023-03-05 21:03:12 -08:00
722e583b59
use available introspection API to get accumulator data type. update name of flag.
2023-01-25 05:22:49 -05:00
e068b14969
make consistent and simplify
2023-01-25 02:56:05 -05:00
8786819993
use FFT_SCALAR more consistently to perhaps support single precision FFT some time
...
also, use "override" instead of virtual and add a forgotten virtual
2023-01-24 22:32:40 -05:00
617d70dd1c
Replaced MPI_Wtime() with platform::walltime(), put the low-level timing breakdown inside #if DEBUG_AMOEBA
2023-01-20 14:19:16 -06:00
b59ee8d16c
silence compiler warnings
2023-01-17 03:54:49 -05:00
28fbc2631b
Fixed another bug with ic_kspace being nullptr
2023-01-16 22:33:21 -06:00
b3e45c29ca
Removed whitespace
2023-01-16 10:30:03 -06:00
973b46a907
Attempted to resolve the memory access runtime errors when acquiring single and mixed precision arrays from the GPU lib
2023-01-16 10:12:42 -06:00
c9ae41246d
Ran the four make commands in the src folder: make fix-whitespace; make fix-homepage; make fix-errordocs; make fix-permissions
2023-01-15 16:05:36 -06:00
c21f2faa1f
Cleaned up debug statements and unused sections in the amoeba and hippo gpu styles
2023-01-14 20:02:36 -06:00
2f1f7ee0fa
Cleaned up code
2022-11-03 23:45:40 -05:00
6b9e83fe20
Added timing for the induced dipole spreading part, computed the block size to ensure all the CUs are occupied by the fphi_uind and fphi_mpole kernels
2022-10-06 15:03:58 -05:00
2ef6a59c0a
Merge branch 'develop' into amoeba-gpu
2022-10-01 00:38:24 -05:00
1d75ca3b20
Moved precompute() out of the terms in amoeba and hippo, to be invoked in the first term in a time step: multipole for amoeba and repulsion for hippo
2022-09-30 16:31:13 -05:00
e6d2582642
Updated fphi_mpole, renamed precompute_induce to precompute_kspace
2022-09-28 15:08:18 -05:00
785131932c
Added fphi_mpole in amoeba/gpu, fixed a bug in the kernel when indexing grid
2022-09-20 13:58:17 -05:00
caa66d904e
Cleaned up GPU lib functions
2022-09-18 15:54:12 -05:00
f9f777b099
Refactored precompute_induce to overlap data transfers with kernel launches
2022-09-18 15:09:26 -05:00
62ecf98cda
Enabled fphi_uind in hippo/gpu, really need to refactor hippo and amoeba in the GPU lib to remove kernel duplicates
2022-09-16 14:47:16 -05:00
880f20c285
Cleaned up kernels
2022-09-15 15:29:14 -05:00
cd3a00c2c4
Added timing breakdown for fphi_uind
2022-09-14 15:28:44 -05:00
17e54c9390
Updated the GPU API in the gpu pair style
2022-09-11 19:00:40 -05:00
363b6c51d0
Used local arrays and re-arranged for coalesced global memory writes
2022-09-10 02:31:39 -05:00
c58343b2e2
Cleaned up debugging code; still needs more refactoring and needs to be added to hippo
2022-09-09 13:50:41 -05:00
b72b71837e
Moved first_induce_iteration in induce() to the right place
2022-09-09 13:34:57 -05:00
4b8caac727
Made some progress with fphi_uind in the gpu pair style
2022-09-09 12:14:36 -05:00
21b7fb2fcf
Exposing fphi_uind to the gpu pair style, still keeping the part not ready though
2022-09-02 14:55:20 -05:00
b2d6df5bfb
Re-arranged some for loops in umutual1 for more cache-friendly memory access; made a placeholder for grid_uind in the GPU lib, though the FFT may not be heavy enough to warrant putting it on the device.
2022-08-25 23:18:13 -05:00
28dabb9687
Cleaned up unused variables in the amoeba kernels, made room for convolution gpu
2022-08-16 15:37:49 -05:00
f1112ab6b6
Working on the gpu kspace induce term: dipole spreading and/or fft calls
2022-08-15 14:28:46 -05:00
e980838ae2
Added timings for real-space and k-space portions for the terms
2022-08-02 16:45:06 -05:00
93784f35e3
Added ucl_erfc to the opencl, cuda and hip backends; reverted to using erfc instead of approximation to ensure double-precision matches
2022-07-25 15:34:44 -05:00
0c44bd1086
Rearranged the order of real-space and kspace part of ufield0c(), delayed device-host transfer from umutual2b() to overlap with kspace part
2022-07-08 14:45:31 -05:00
78d6df5ba9
Removed temporary arrays in hippo/gpu induce, flipped sign of the virial terms in torque2force in hippo/gpu
2022-07-06 11:17:08 -05:00
ee5afdc146
Updated all the gpu ready terms
2022-07-04 23:24:31 -05:00
5dab809522
Flipped force sign in polar_real, made sure that multipole_real is true for precompute() to be invoked; ubdirect2b() segfaults and needs work
2022-07-04 01:38:22 -05:00
f4900d131a
Working on the multipole term on the gpu side, incorrect virials
2022-07-01 16:26:25 -05:00
a14f0cfd6c
Merge branch 'amoeba' into amoeba-gpu, update the gpu pair styles with the base class
2022-06-28 12:54:27 -05:00
79fbbd4f33
Cleaned up the API of amoeba and hippo to remove unnecessary arguments
2021-10-04 14:40:58 -05:00
e0f91b96fe
Cleaned up and added necessary comments
2021-09-29 13:07:20 -05:00
b874feb127
Removed trailing spaces
2021-09-28 17:28:33 -05:00
d77d5b7f0a
Added classes for hippo/gpu, refactored BaseAmoeba and made room for the dispersion real-space term in hippo
2021-09-21 15:40:06 -05:00
a2fd784034
Added the dispersion real space term, which is for HIPPO.
2021-09-21 10:55:38 -05:00
0228867d8e
Added the dispersion real space kernel and transfer special coeffs to the device
2021-09-19 23:40:43 -05:00