Commit Graph

24 Commits

Author SHA1 Message Date
f9f777b099 Refactored precompute_induce to overlap data transfers with kernel launches 2022-09-18 15:09:26 -05:00
7f4efa380a Re-arranged memory allocation for cgrid_brick, some issues need to be fixed 2022-09-11 18:58:34 -05:00
4b8caac727 Made some progress with fphi_uind in the gpu pair style 2022-09-09 12:14:36 -05:00
21b7fb2fcf Exposing fphi_uind to the gpu pair style, still keeping the part not ready though 2022-09-02 14:55:20 -05:00
b2d6df5bfb Re-arranged some for loops in umutual1 to improve cache-friendly memory access; made placeholder for grid_uind on the GPU lib, maybe FFT is not that heavy to be put on the device. 2022-08-25 23:18:13 -05:00
f4a90c62c0 First attempt to port the forward FFT in the k-space induce term to the GPU, not working yet 2022-08-23 15:42:05 -05:00
28dabb9687 Cleaned up unused variables in the amoeba kernels, made room for convolution gpu 2022-08-16 15:37:49 -05:00
46b8b00a4f Working on fft on the device 2022-08-15 15:51:43 -05:00
0c44bd1086 Rearranged the order of real-space and kspace part of ufield0c(), delayed device-host transfer from umutual2b() to overlap with kspace part 2022-07-08 14:45:31 -05:00
79fbbd4f33 Cleaned up the API of amoeba and hippo to remove unncessary arguments 2021-10-04 14:40:58 -05:00
98a2b67292 Changed to the API of BaseAmoeba to reduce duplicates in hippo 2021-09-28 17:39:55 -05:00
d77d5b7f0a Added classes for hippo/gpu, refactored BaseAmoeba and made room for the dispersion real-space term in hippo 2021-09-21 15:40:06 -05:00
a2fd784034 Added the dispersion real space term, which is for HIPPO. 2021-09-21 10:55:38 -05:00
4e88cd158e Fixed bugs with _tep and _fieldp to allow mixed-precision builds, being defensive with acctyp for these variables 2021-09-20 11:38:50 -05:00
0228867d8e Added the dispersion real space kernel and transfer special coeffs to the device 2021-09-19 23:40:43 -05:00
1166845fcf Prepared data structure for the dispersion real-space term 2021-09-18 10:22:22 -05:00
2e6df83b9b Fixed bugs in the multipole real-space part on the GPU; separately multipole real and polar real work correctly (along with udirect2b and umutual2b), but
together they are conflicting due to the use of ans to copy forces back from device to host. The other 2 kernels (induce part) do not touch forces and energies.
2021-09-17 15:24:36 -05:00
003bebd31e Working on the multipole real-space term, not ready yet 2021-09-17 01:19:33 -05:00
98c1a0178c Refactored the API so that different off2 values are used for different kernels 2021-09-16 17:14:36 -05:00
a22923aee2 Added the API for the umutual kernel, needs work for storing the tdiptdip array 2021-09-09 17:22:09 -05:00
4e346c2de6 Refactored neighbor list builds and per-atom reallocation parts 2021-09-07 13:05:57 -05:00
785a794d39 Added and renamed API to make room for additional kernels (udirect2b only computes the field and fieldp, not accumulating forces, energies, nor virials) 2021-09-01 14:37:11 -05:00
07b60827c4 Working on the udirect2b kernel for the induce real space term, need to add the API for the GPU library 2021-09-01 12:30:41 -05:00
3825fee8e9 Added work on amoeba/gpu, some minor changes to PairAmoeba to allow function overriding in PairAmoebaGPU, added the package AMOEBA to cmake/CMakeLists.txt 2021-08-25 22:57:37 -05:00