lammps

Author	SHA1	Message	Date
Trung Nguyen	880f20c285	Cleaned up kernels	2022-09-15 15:29:14 -05:00
Trung Nguyen	9c4d3db558	Cleaned up and converted arrays to ucl_vector of numtyp4	2022-09-13 16:48:39 -05:00
Trung Nguyen	31047b4a31	Removed mem alloc in precompute_induce, used buffer for packing, and switched to using ucl_vector	2022-09-13 12:53:48 -05:00
Trung Nguyen	7f4efa380a	Re-arranged memory allocation for cgrid_brick, some issues need to be fixed	2022-09-11 18:58:34 -05:00
Trung Nguyen	c58343b2e2	Cleaned up debugging code, need more refactoring and add to hippo	2022-09-09 13:50:41 -05:00
Trung Nguyen	b72b71837e	Moved first_induce_iteration in induce() to the right place	2022-09-09 13:34:57 -05:00
Trung Nguyen	4b8caac727	Made some progress with fphi_uind in the gpu pair style	2022-09-09 12:14:36 -05:00
Trung Nguyen	a0af9627e5	Fixed memory bugs with device array allocations	2022-09-06 16:19:17 -05:00
Trung Nguyen	21b7fb2fcf	Exposing fphi_uind to the gpu pair style, still keeping the part not ready though	2022-09-02 14:55:20 -05:00
Trung Nguyen	cad7e1b364	Moved fphi_uind up to BaseAmoeba	2022-09-02 10:18:59 -05:00
Trung Nguyen	aac264f2e2	Working on the fphi_uind kernel and array allocations	2022-08-30 23:40:04 -05:00
Trung Nguyen	c5c3c697df	Adding fphi_uind kernel, working on the arrays allocation	2022-08-29 00:13:30 -05:00
Trung Nguyen	9e7bbad4d4	Working on fphi_uind in the GPU lib	2022-08-27 13:19:52 -05:00
Trung Nguyen	b160460dcc	Added preprocessors to comment out cufft entirely for now	2022-08-26 12:55:46 -05:00
Trung Nguyen	f4a90c62c0	First attempt to port the forward FFT in the k-space induce term to the GPU, not working yet	2022-08-23 15:42:05 -05:00
Trung Nguyen	28dabb9687	Cleaned up unused variables in the amoeba kernels, made room for convolution gpu	2022-08-16 15:37:49 -05:00
Trung Nguyen	46b8b00a4f	Working on fft on the device	2022-08-15 15:51:43 -05:00
Trung Nguyen	538aa13693	Only transfer data that is needed for umutual2b; allowed convolution and kspace term umutual1 to be overridden by the gpu counterparts	2022-08-10 16:21:30 -05:00
Trung Nguyen	66ee2bf989	Cleaned up	2022-07-14 11:01:30 -05:00
Trung Nguyen	0c44bd1086	Rearranged the order of real-space and kspace part of ufield0c(), delayed device-host transfer from umutual2b() to overlap with kspace part	2022-07-08 14:45:31 -05:00
Trung Nguyen	79fbbd4f33	Cleaned up the API of amoeba and hippo to remove unnecessary arguments	2021-10-04 14:40:58 -05:00
Trung Nguyen	5a6426bf96	Only transfer data arrays that are needed in each kernel	2021-10-02 00:56:15 -05:00
Trung Nguyen	f4d3d3a2b5	Gradually cleaned up and removed redundancy in amoeba and hippo	2021-10-02 00:09:53 -05:00
Trung Nguyen	f126f785a4	Removed duplicates in the amoeba kernels	2021-10-01 10:19:17 -05:00
Trung Nguyen	3328ac0df2	Attempted to remove some redundancy in data transfers in the amoeba kernels; keeping HIPPO independent of AMOEBA for now	2021-10-01 09:58:21 -05:00
Trung Nguyen	01381b7f54	Fixed bugs in the repulsion kernel, now working correctly with the double precision mode	2021-09-29 11:57:25 -05:00
Trung Nguyen	6286a119b3	Removed precompute() in hippo	2021-09-28 23:12:07 -05:00
Trung Nguyen	98a2b67292	Changed to the API of BaseAmoeba to reduce duplicates in hippo	2021-09-28 17:39:55 -05:00
Trung Nguyen	b874feb127	Removed trailing spaces	2021-09-28 17:28:33 -05:00
Trung Nguyen	f8bc091cb8	Kept working on the multipole real-space term of hippo	2021-09-25 13:17:06 -05:00
Trung Nguyen	78ef0d631f	Working on the multipole real-space term of hippo	2021-09-25 12:25:34 -05:00
Trung Nguyen	d77d5b7f0a	Added classes for hippo/gpu, refactored BaseAmoeba and made room for the dispersion real-space term in hippo	2021-09-21 15:40:06 -05:00
Trung Nguyen	a2fd784034	Added the dispersion real space term, which is for HIPPO.	2021-09-21 10:55:38 -05:00
Trung Nguyen	42034bd1c9	Fixed bugs for undefined tagint and ucl_powr ambiguity in kernels for OpenCL builds	2021-09-20 12:48:29 -05:00
Trung Nguyen	5d801e985f	More cleanup	2021-09-17 23:24:23 -05:00
Trung Nguyen	f5713a52b3	Added another kernel to accumulate forces, energies and virial on the device (similar to the tersoff kernels) as multiple kernels all added to those quantities; also only copy answers back to the host in the last kernel in a time step; cleaned up debugging messages	2021-09-17 16:39:57 -05:00
Trung Nguyen	2e6df83b9b	Fixed bugs in the multipole real-space part on the GPU; separately multipole real and polar real work correctly (along with udirect2b and umutual2b), but together they are conflicting due to the use of ans to copy forces back from device to host. The other 2 kernels (induce part) do not touch forces and energies.	2021-09-17 15:24:36 -05:00
Trung Nguyen	003bebd31e	Working on the multipole real-space term, not ready yet	2021-09-17 01:19:33 -05:00
Trung Nguyen	98c1a0178c	Refactored the API so that different off2 values are used for different kernels	2021-09-16 17:14:36 -05:00
Trung Nguyen	edd76733a1	Working on umutual2b, tdipdip are correct, but incorrect results for field and fieldp	2021-09-12 00:51:48 -05:00
Trung Nguyen	94d6f7219c	Attempted to reduce the memory footprint of the per-atom arrays	2021-09-11 11:22:17 -05:00
Trung Nguyen	c765861851	Cleaned up and re-arranged the functions to reflect the order of calling in a time step	2021-09-11 01:00:58 -05:00
Trung Nguyen	7f5a82dc54	Switched to the short neighbor list implementation in the pre-10Feb21 version (the recent version enforces tpa = 1 for short nbor)	2021-09-11 00:34:43 -05:00
Trung Nguyen	4ebe5833d3	Working on short nbor list for the amoeba kernels (based on what has been done with tersoff and ellipsoid, nbor dev_packed needs to be allocated properly)	2021-09-10 16:51:16 -05:00
Trung Nguyen	efe0bf593f	Adding the umutual2b kernel, need to create another array for tdipdip on the GPU	2021-09-09 15:19:43 -05:00
Trung Nguyen	6f6fd0999c	Both udirect2b and polar_real are working correctly on the GPU	2021-09-09 00:57:21 -05:00
Trung Nguyen	8c5a116d30	Made dfield0c work to compute uind and uinp correctly; need to make sure they are correct for polar_real()	2021-09-08 16:43:33 -05:00
Trung Nguyen	1c5d235f12	Working on the field and fieldp values from GPU back to the host for dfield0c	2021-09-07 16:15:08 -05:00
Trung Nguyen	4e346c2de6	Refactored neighbor list builds and per-atom reallocation parts	2021-09-07 13:05:57 -05:00
Trung Nguyen	7d69a870a4	Reverted the binsize function call from the GPU package in Atom, instead added atom_modify sort with a binsize to ensure matching virial values, enabled the udirect2b kernel, need more work to override dfield0c, and induce() to bypass reverse_comm() for field and fieldp (line amoeba_induce.cpp:111-112)	2021-09-03 13:43:22 -05:00

52 Commits