880f20c285
Cleaned up kernels
2022-09-15 15:29:14 -05:00
9c4d3db558
Cleaned up and converted arrays to ucl_vector of numtyp4
2022-09-13 16:48:39 -05:00
31047b4a31
Removed mem alloc in precompute_induce, used buffer for packing, and switched to using ucl_vector
2022-09-13 12:53:48 -05:00
7f4efa380a
Re-arranged memory allocation for cgrid_brick, some issues need to be fixed
2022-09-11 18:58:34 -05:00
c58343b2e2
Cleaned up debugging code; needs more refactoring and still has to be added to hippo
2022-09-09 13:50:41 -05:00
b72b71837e
Moved first_induce_iteration in induce() to the right place
2022-09-09 13:34:57 -05:00
4b8caac727
Made some progress with fphi_uind in the gpu pair style
2022-09-09 12:14:36 -05:00
a0af9627e5
Fixed memory bugs with device array allocations
2022-09-06 16:19:17 -05:00
21b7fb2fcf
Exposing fphi_uind to the gpu pair style, still keeping the part not ready though
2022-09-02 14:55:20 -05:00
cad7e1b364
Moved fphi_uind up to BaseAmoeba
2022-09-02 10:18:59 -05:00
aac264f2e2
Working on the fphi_uind kernel and array allocations
2022-08-30 23:40:04 -05:00
c5c3c697df
Adding fphi_uind kernel, working on the arrays allocation
2022-08-29 00:13:30 -05:00
9e7bbad4d4
Working on fphi_uind in the GPU lib
2022-08-27 13:19:52 -05:00
b160460dcc
Added preprocessors to comment out cufft entirely for now
2022-08-26 12:55:46 -05:00
f4a90c62c0
First attempt to port the forward FFT in the k-space induce term to the GPU, not working yet
2022-08-23 15:42:05 -05:00
28dabb9687
Cleaned up unused variables in the amoeba kernels, made room for convolution gpu
2022-08-16 15:37:49 -05:00
46b8b00a4f
Working on fft on the device
2022-08-15 15:51:43 -05:00
538aa13693
Only transfer data that is needed for umutual2b; allowed convolution and kspace term umutual1 to be overridden by the gpu counterparts
2022-08-10 16:21:30 -05:00
66ee2bf989
Cleaned up
2022-07-14 11:01:30 -05:00
0c44bd1086
Rearranged the order of real-space and kspace part of ufield0c(), delayed device-host transfer from umutual2b() to overlap with kspace part
2022-07-08 14:45:31 -05:00
79fbbd4f33
Cleaned up the API of amoeba and hippo to remove unnecessary arguments
2021-10-04 14:40:58 -05:00
5a6426bf96
Only transfer data arrays that are needed in each kernel
2021-10-02 00:56:15 -05:00
f4d3d3a2b5
Gradually cleaned up and removed redundancy in amoeba and hippo
2021-10-02 00:09:53 -05:00
f126f785a4
Removed duplicates in the amoeba kernels
2021-10-01 10:19:17 -05:00
3328ac0df2
Attempted to remove some redundancy in data transfers in the amoeba kernels; keeping HIPPO independent of AMOEBA for now
2021-10-01 09:58:21 -05:00
01381b7f54
Fixed bugs in the repulsion kernel, now working correctly with the double precision mode
2021-09-29 11:57:25 -05:00
6286a119b3
Removed precompute() in hippo
2021-09-28 23:12:07 -05:00
98a2b67292
Changed the API of BaseAmoeba to reduce duplicates in hippo
2021-09-28 17:39:55 -05:00
b874feb127
Removed trailing spaces
2021-09-28 17:28:33 -05:00
f8bc091cb8
Kept working on the multipole real-space term of hippo
2021-09-25 13:17:06 -05:00
78ef0d631f
Working on the multipole real-space term of hippo
2021-09-25 12:25:34 -05:00
d77d5b7f0a
Added classes for hippo/gpu, refactored BaseAmoeba and made room for the dispersion real-space term in hippo
2021-09-21 15:40:06 -05:00
a2fd784034
Added the dispersion real-space term, which is for HIPPO
2021-09-21 10:55:38 -05:00
42034bd1c9
Fixed bugs for undefined tagint and ucl_powr ambiguity in kernels for OpenCL builds
2021-09-20 12:48:29 -05:00
5d801e985f
More cleanup
2021-09-17 23:24:23 -05:00
f5713a52b3
Added another kernel to accumulate forces, energies and virial on the device (similar to the tersoff kernels), since multiple kernels all add to those quantities; answers are now copied back to the host only in the last kernel of a time step; cleaned up debugging messages
2021-09-17 16:39:57 -05:00
2e6df83b9b
Fixed bugs in the multipole real-space part on the GPU; separately, multipole real and polar real work correctly (along with udirect2b and umutual2b), but together they conflict due to the use of ans to copy forces back from device to host. The other 2 kernels (induce part) do not touch forces and energies.
2021-09-17 15:24:36 -05:00
003bebd31e
Working on the multipole real-space term, not ready yet
2021-09-17 01:19:33 -05:00
98c1a0178c
Refactored the API so that different off2 values are used for different kernels
2021-09-16 17:14:36 -05:00
edd76733a1
Working on umutual2b, tdipdip are correct, but incorrect results for field and fieldp
2021-09-12 00:51:48 -05:00
94d6f7219c
Attempted to reduce the memory footprint of the per-atom arrays
2021-09-11 11:22:17 -05:00
c765861851
Cleaned up and re-arranged the functions to reflect the order of calling in a time step
2021-09-11 01:00:58 -05:00
7f5a82dc54
Switched to the short neighbor list implementation in the pre-10Feb21 version (the recent version enforces tpa = 1 for short nbor)
2021-09-11 00:34:43 -05:00
4ebe5833d3
Working on short nbor list for the amoeba kernels (based on what has been done with tersoff and ellipsoid; nbor dev_packed needs to be allocated properly)
2021-09-10 16:51:16 -05:00
efe0bf593f
Adding the umutual2b kernel, need to create another array for tdipdip on the GPU
2021-09-09 15:19:43 -05:00
6f6fd0999c
Both udirect2b and polar_real are working correctly on the GPU
2021-09-09 00:57:21 -05:00
8c5a116d30
Made dfield0c work to compute uind and uinp correctly; need to make sure they are correct for polar_real()
2021-09-08 16:43:33 -05:00
1c5d235f12
Working on the field and fieldp values from GPU back to the host for dfield0c
2021-09-07 16:15:08 -05:00
4e346c2de6
Refactored neighbor list builds and per-atom reallocation parts
2021-09-07 13:05:57 -05:00
7d69a870a4
Reverted the binsize function call from the GPU package in Atom; instead added atom_modify sort with a binsize to ensure matching virial values. Enabled the udirect2b kernel; need more work to override dfield0c, and for induce() to bypass reverse_comm() for field and fieldp (line amoeba_induce.cpp:111-112)
2021-09-03 13:43:22 -05:00