Commit Graph

40 Commits

Author SHA1 Message Date
4b8caac727 Made some progress with fphi_uind in the gpu pair style 2022-09-09 12:14:36 -05:00
a0af9627e5 Fixed memory bugs with device array allocations 2022-09-06 16:19:17 -05:00
cad7e1b364 Moved fphi_uind up to BaseAmoeba 2022-09-02 10:18:59 -05:00
aac264f2e2 Working on the fphi_uind kernel and array allocations 2022-08-30 23:40:04 -05:00
c5c3c697df Adding fphi_uind kernel, working on the arrays allocation 2022-08-29 00:13:30 -05:00
28dabb9687 Cleaned up unused variables in the amoeba kernels, made room for convolution gpu 2022-08-16 15:37:49 -05:00
aad4e417f9 Moved temp variables inside neighbor loops 2022-08-03 12:33:48 -05:00
a54f0b684d Moved temp variables inside the loop over neighbors 2022-08-03 10:56:52 -05:00
93784f35e3 Added ucl_erfc to the opencl, cuda and hip backends; reverted to using erfc instead of approximation to ensure double-precision matches 2022-07-25 15:34:44 -05:00
675c2d38a3 Flipped sign of forces and virial terms in the hippo kernels 2022-07-05 14:37:26 -05:00
5dab809522 Flipped force sign in polar_real, made sure that multipole_real is true for precompute() to be invoked, ubdirect2b() is segfault and needs work 2022-07-04 01:38:22 -05:00
f4900d131a Working on the multipole term on the gpu side, incorrect virials 2022-07-01 16:26:25 -05:00
0f0f6a51de Renamed sp_polar to sp_amoeba, and replaced special_wscale with special_hal for amoeba 2021-10-02 16:02:44 -05:00
3328ac0df2 Attempted to remove some redundancy in data transfers in the amoeba kernels; keeping HIPPO independent of AMOEBA for now 2021-10-01 09:58:21 -05:00
b874feb127 Removed trailing spaces 2021-09-28 17:28:33 -05:00
e80eea56ba Added udirect2b and umutual2b for hippo 2021-09-28 14:59:39 -05:00
bebef18495 Cleaned up and minor changes 2021-09-21 23:46:21 -05:00
a2fd784034 Added the dispersion real space term, which is for HIPPO. 2021-09-21 10:55:38 -05:00
42034bd1c9 Fixed bugs for undefined tagint and ucl_powr ambiguity in kernels for OpenCL builds 2021-09-20 12:48:29 -05:00
4e88cd158e Fixed bugs with _tep and _fieldp to allow mixed-precision builds, being defensive with acctyp for these variables 2021-09-20 11:38:50 -05:00
0228867d8e Added the dispersion real space kernel and transfer special coeffs to the device 2021-09-19 23:40:43 -05:00
1166845fcf Prepared data structure for the dispersion real-space term 2021-09-18 10:22:22 -05:00
78045d8f76 Cleaned up debugging stuffs and unused variables 2021-09-17 23:13:51 -05:00
f5713a52b3 Added another kernel to accumulate forces, energies and virial on the device (similar to the tersoff kernels) as multiple kernels all added to those quantities; also only copy answers back to the host in the last kernel in a time step; cleaned up debugging messages 2021-09-17 16:39:57 -05:00
2e6df83b9b Fixed bugs in the multipole real-space part on the GPU; separately multipole real and polar real work correctly (along with udirect2b and umutual2b), but
together they are conflicting due to the use of ans to copy forces back from device to host. The other 2 kernels (induce part) do not touch forces and energies.
2021-09-17 15:24:36 -05:00
d926705950 Short neighbor list for multipole real-space should be built with off2_mpole 2021-09-17 01:32:00 -05:00
003bebd31e Working on the multipole real-space term, not ready yet 2021-09-17 01:19:33 -05:00
bc665999d5 Fixed bugs with the umutual2b kernel, now the field and fieldp seems correct 2021-09-13 01:11:03 -05:00
edd76733a1 Working on umutual2b, tdipdip are correct, but incorrect results for field and fieldp 2021-09-12 00:51:48 -05:00
c765861851 Cleaned up and re-arranged the functions to reflect the order of calling in a time step 2021-09-11 01:00:58 -05:00
7f5a82dc54 Switched to the short neighbor list implementation in the pre-10Feb21 version (the recent version enforces tpa = 1 for short nbor) 2021-09-11 00:34:43 -05:00
4ebe5833d3 Working on short nbor list for the amoeba kernels (based on what has been done with tersoff and ellipsod, nbor dev_packed needs to be allocated properly) 2021-09-10 16:51:16 -05:00
a22923aee2 Added the API for the umutual kernel, needs work for storing the tdiptdip array 2021-09-09 17:22:09 -05:00
b654f293ee Working on the umutual2b kernel, the tdipdip values are computed on the fly for now, maybe a seprate neigh list as in the CPU version will be more efficient 2021-09-09 16:52:27 -05:00
efe0bf593f Adding the umutual2b kernel, need to create another array for tdipdip on the GPU 2021-09-09 15:19:43 -05:00
1c5d235f12 Working on the field and fieldp values from GPU back to the host for dfield0c 2021-09-07 16:15:08 -05:00
7d69a870a4 Reverted the binsize function call from the GPU package in Atom, instead added atom_modify sort with a binsize to ensure matching virial values, enabled the udirect2b kernel, need more work to override dfield0c, and induce() to bypass reverse_comm() for field and fieldp (line amoeba_induce.cpp:111-112) 2021-09-03 13:43:22 -05:00
785a794d39 Added and renamed API to make room for additional kernels (udirect2b only computes the field and fieldp, not accumulating forces, energies, nor virials) 2021-09-01 14:37:11 -05:00
07b60827c4 Working on the udirect2b kernel for the induce real space term, need to add the API for the GPU library 2021-09-01 12:30:41 -05:00
3825fee8e9 Added work on amoeba/gpu, some minor changes to PairAmoeba to allow function overriding in PairAmoebaGPU, added the package AMOEBA to cmake/CMakeLists.txt 2021-08-25 22:57:37 -05:00