Commit Graph

65 Commits

Author SHA1 Message Date
c9ae41246d Ran the four make commands in the src folder: make fix-whitespace; make fix-homepage; make fix-errordocs; make fix-permissions 2023-01-15 16:05:36 -06:00
959b9c220f Cleaned up unused member functions and hd_balancer calls 2022-11-07 15:49:37 -06:00
a3cc0e8432 Reverted the block size tuning, which caused bugs for low atom counts (will revisit later) 2022-11-04 13:45:59 -05:00
00f46120c7 Removed max_cus() from Device, used device->gpu->cus() instead 2022-10-07 15:50:30 -05:00
6b9e83fe20 Added timing for the induced dipole spreading part, computed the block size to ensure all the CUs are occupied by the fphi_uind and fphi_mpole kernels 2022-10-06 15:03:58 -05:00
2ef6a59c0a Merge branch 'develop' into amoeba-gpu 2022-10-01 00:38:24 -05:00
9a1f23a079 Cosmetic changes and cleanup 2022-09-30 17:32:25 -05:00
1d75ca3b20 Moved precompute() out of the terms in amoeba and hippo, to be invoked in the first term in a time step: multipole for amoeba and repulsion for hippo 2022-09-30 16:31:13 -05:00
e6d2582642 Updated fphi_mpole, renamed precompute_induce to precompute_kspace 2022-09-28 15:08:18 -05:00
785131932c Added fphi_mpole in amoeba/gpu, fixed a bug in the kernel when indexing grid 2022-09-20 13:58:17 -05:00
caa66d904e Cleaned up GPU lib functions 2022-09-18 15:54:12 -05:00
f9f777b099 Refactored precompute_induce to overlap data transfers with kernel launches 2022-09-18 15:09:26 -05:00
62ecf98cda Enabled fphi_uind in hippo/gpu, really need to refactor hippo and amoeba in the GPU lib to remove kernel duplicates 2022-09-16 14:47:16 -05:00
880f20c285 Cleaned up kernels 2022-09-15 15:29:14 -05:00
9c4d3db558 Cleaned up and converted arrays to ucl_vector of numtyp4 2022-09-13 16:48:39 -05:00
31047b4a31 Removed mem alloc in precompute_induce, used buffer for packing, and switched to using ucl_vector 2022-09-13 12:53:48 -05:00
7f4efa380a Re-arranged memory allocation for cgrid_brick, some issues need to be fixed 2022-09-11 18:58:34 -05:00
c58343b2e2 Cleaned up debugging code; needs more refactoring and still has to be added to hippo 2022-09-09 13:50:41 -05:00
b72b71837e Moved first_induce_iteration in induce() to the right place 2022-09-09 13:34:57 -05:00
4b8caac727 Made some progress with fphi_uind in the gpu pair style 2022-09-09 12:14:36 -05:00
a0af9627e5 Fixed memory bugs with device array allocations 2022-09-06 16:19:17 -05:00
21b7fb2fcf Exposing fphi_uind to the gpu pair style, still keeping the part not ready though 2022-09-02 14:55:20 -05:00
cad7e1b364 Moved fphi_uind up to BaseAmoeba 2022-09-02 10:18:59 -05:00
aac264f2e2 Working on the fphi_uind kernel and array allocations 2022-08-30 23:40:04 -05:00
c5c3c697df Adding fphi_uind kernel, working on the arrays allocation 2022-08-29 00:13:30 -05:00
9e7bbad4d4 Working on fphi_uind in the GPU lib 2022-08-27 13:19:52 -05:00
b160460dcc Added preprocessors to comment out cufft entirely for now 2022-08-26 12:55:46 -05:00
f4a90c62c0 First attempt to port the forward FFT in the k-space induce term to the GPU, not working yet 2022-08-23 15:42:05 -05:00
28dabb9687 Cleaned up unused variables in the amoeba kernels, made room for convolution gpu 2022-08-16 15:37:49 -05:00
46b8b00a4f Working on fft on the device 2022-08-15 15:51:43 -05:00
538aa13693 Only transfer data that is needed for umutual2b; allowed convolution and kspace term umutual1 to be overridden by the gpu counterparts 2022-08-10 16:21:30 -05:00
66ee2bf989 Cleaned up 2022-07-14 11:01:30 -05:00
0c44bd1086 Rearranged the order of real-space and kspace part of ufield0c(), delayed device-host transfer from umutual2b() to overlap with kspace part 2022-07-08 14:45:31 -05:00
79fbbd4f33 Cleaned up the API of amoeba and hippo to remove unnecessary arguments 2021-10-04 14:40:58 -05:00
5a6426bf96 Only transfer data arrays that are needed in each kernel 2021-10-02 00:56:15 -05:00
f4d3d3a2b5 Gradually cleaned up and removed redundancy in amoeba and hippo 2021-10-02 00:09:53 -05:00
f126f785a4 Removed duplicates in the amoeba kernels 2021-10-01 10:19:17 -05:00
3328ac0df2 Attempted to remove some redundancy in data transfers in the amoeba kernels; keeping HIPPO independent of AMOEBA for now 2021-10-01 09:58:21 -05:00
01381b7f54 Fixed bugs in the repulsion kernel, now working correctly with the double precision mode 2021-09-29 11:57:25 -05:00
6286a119b3 Removed precompute() in hippo 2021-09-28 23:12:07 -05:00
98a2b67292 Changed the API of BaseAmoeba to reduce duplicates in hippo 2021-09-28 17:39:55 -05:00
b874feb127 Removed trailing spaces 2021-09-28 17:28:33 -05:00
f8bc091cb8 Kept working on the multipole real-space term of hippo 2021-09-25 13:17:06 -05:00
78ef0d631f Working on the multipole real-space term of hippo 2021-09-25 12:25:34 -05:00
d77d5b7f0a Added classes for hippo/gpu, refactored BaseAmoeba and made room for the dispersion real-space term in hippo 2021-09-21 15:40:06 -05:00
a2fd784034 Added the dispersion real space term, which is for HIPPO. 2021-09-21 10:55:38 -05:00
42034bd1c9 Fixed bugs for undefined tagint and ucl_powr ambiguity in kernels for OpenCL builds 2021-09-20 12:48:29 -05:00
5d801e985f More cleanup 2021-09-17 23:24:23 -05:00
f5713a52b3 Added another kernel to accumulate forces, energies and virial on the device (similar to the tersoff kernels) as multiple kernels all added to those quantities; also only copy answers back to the host in the last kernel in a time step; cleaned up debugging messages 2021-09-17 16:39:57 -05:00
2e6df83b9b Fixed bugs in the multipole real-space part on the GPU; separately multipole real and polar real work correctly (along with udirect2b and umutual2b), but together they are conflicting due to the use of ans to copy forces back from device to host. The other 2 kernels (induce part) do not touch forces and energies. 2021-09-17 15:24:36 -05:00