lammps

Author	SHA1	Message	Date
Trung Nguyen	a3cc0e8432	Reverted the block size tuning, which caused bugs for low atom counts (will revisit later)	2022-11-04 13:45:59 -05:00
Trung Nguyen	2f1f7ee0fa	Cleaned up code	2022-11-03 23:45:40 -05:00
Trung Nguyen	00f46120c7	Removed max_cus() from Device, used device->gpu->cus() instead	2022-10-07 15:50:30 -05:00
Trung Nguyen	6b9e83fe20	Added timing for the induced dipole spreading part, computed the block size to ensure all the CUs are occupied by the fphi_uind and fphi_mpole kernels	2022-10-06 15:03:58 -05:00
Trung Nguyen	2ef6a59c0a	Merge branch 'develop' into amoeba-gpu	2022-10-01 00:38:24 -05:00
Trung Nguyen	9a1f23a079	Cosmetic changes and cleanup	2022-09-30 17:32:25 -05:00
Trung Nguyen	1d75ca3b20	Moved precompute() out of the terms in amoeba and hippo, to be invoked in the first term in a time step: multipole for amoeba and repulsion for hippo	2022-09-30 16:31:13 -05:00
Axel Kohlmeyer	fb675028b9	whitespace	2022-09-29 02:42:11 -04:00
W. Michael Brown	71464d8314	GPU Package: Fixing logic in OpenCL backend that could result in unnecessary device allocations.	2022-09-28 22:30:09 -07:00
W. Michael Brown	6e34d21b24	GPU Package: Switching back to timer disabling with multiple MPI tasks per GPU. Logic added to prevent mem leak.	2022-09-28 21:02:16 -07:00
Trung Nguyen	e6d2582642	Updated fphi_mpole, renamed precompute_induce to precompute_kspace	2022-09-28 15:08:18 -05:00
ndtrung	166701f13a	Fixed missing commas in the argument list of the macros in amoeba and hippo cu files, added amoeba_convolution_gpu.cpp and .h to the source file list in GPU.cmake	2022-09-23 11:53:09 -05:00
Trung Nguyen	785131932c	Added fphi_mpole in amoeba/gpu, fixed a bug in the kernel when indexing grid	2022-09-20 13:58:17 -05:00
Trung Nguyen	356c46c913	Replaced mem allocation/deallocation inside moduli() with using member variables and mem resize if needed	2022-09-18 16:28:30 -05:00
Trung Nguyen	caa66d904e	Cleaned up GPU lib functions	2022-09-18 15:54:12 -05:00
Trung Nguyen	f9f777b099	Refactored precompute_induce to overlap data transfers with kernel launches	2022-09-18 15:09:26 -05:00
Trung Nguyen	62ecf98cda	Enabled fphi_uind in hippo/gpu, really need to refactor hippo and amoeba in the GPU lib to remove kernel duplicates	2022-09-16 14:47:16 -05:00
Trung Nguyen	880f20c285	Cleaned up kernels	2022-09-15 15:29:14 -05:00
Trung Nguyen	cd3a00c2c4	Added timing breakdown for fphi_uind	2022-09-14 15:28:44 -05:00
Trung Nguyen	9c4d3db558	Cleaned up and converted arrays to ucl_vector of numtyp4	2022-09-13 16:48:39 -05:00
Trung Nguyen	31047b4a31	Removed mem alloc in precompute_induce, used buffer for packing, and switched to using ucl_vector	2022-09-13 12:53:48 -05:00
Trung Nguyen	7f4efa380a	Re-arranged memory allocation for cgrid_brick, some issues need to be fixed	2022-09-11 18:58:34 -05:00
Trung Nguyen	5e59c95be4	Moved temp variables inside loops	2022-09-10 02:45:06 -05:00
Trung Nguyen	363b6c51d0	Used local arrays and re-arranged for coalesced global memory writes	2022-09-10 02:31:39 -05:00
Trung Nguyen	c58343b2e2	Cleaned up debugging code; needs more refactoring and needs to be added to hippo	2022-09-09 13:50:41 -05:00
Trung Nguyen	b72b71837e	Moved first_induce_iteration in induce() to the right place	2022-09-09 13:34:57 -05:00
Trung Nguyen	4b8caac727	Made some progress with fphi_uind in the gpu pair style	2022-09-09 12:14:36 -05:00
Axel Kohlmeyer	167abe9ce0	add preprocessor flags to select between the changed and the old code variant	2022-09-09 12:41:24 -04:00
Axel Kohlmeyer	0d2db984eb	Merge branch 'develop' into benmenadue/develop	2022-09-06 19:25:21 -04:00
Trung Nguyen	a0af9627e5	Fixed memory bugs with device array allocations	2022-09-06 16:19:17 -05:00
Ben Menadue	294a1c2168	Use primary context in CUDA GPU code. Since LAMMPS uses the low-level driver API of CUDA, it needs to ensure that it is in the correct context when invoking such functions. At the moment it creates and switches to its own context inside `UCL_Device::set` but then assumes that the driver is still in that context for subsequent calls into CUDA; if another part of the program uses a different context (such as the CUDA runtime using the "primary" context) this will cause failures inside LAMMPS. This patch changes the context creation to instead use the primary context for the requested device. While it's not perfect, in that it still doesn't ensure that it's in the correct context before making driver API calls, it at least allows it to work with libraries that use the runtime API.	2022-09-06 09:28:51 +10:00
Trung Nguyen	21b7fb2fcf	Exposing fphi_uind to the gpu pair style, still keeping the part not ready though	2022-09-02 14:55:20 -05:00
David Eberius	51a4819bfc	Fixed an illegal preprocessor issue.	2022-09-02 11:42:30 -04:00
Trung Nguyen	cad7e1b364	Moved fphi_uind up to BaseAmoeba	2022-09-02 10:18:59 -05:00
Trung Nguyen	aac264f2e2	Working on the fphi_uind kernel and array allocations	2022-08-30 23:40:04 -05:00
Trung Nguyen	c5c3c697df	Adding fphi_uind kernel, working on the arrays allocation	2022-08-29 00:13:30 -05:00
Trung Nguyen	9e7bbad4d4	Working on fphi_uind in the GPU lib	2022-08-27 13:19:52 -05:00
Trung Nguyen	b160460dcc	Added preprocessors to comment out cufft entirely for now	2022-08-26 12:55:46 -05:00
Trung Nguyen	b2d6df5bfb	Re-arranged some for loops in umutual1 for more cache-friendly memory access; made a placeholder for grid_uind in the GPU lib; the FFT may not be heavy enough to be worth putting on the device.	2022-08-25 23:18:13 -05:00
Vsevak	8d77c1daee	Merge remote-tracking branch 'origin/develop' into tip4p_cornercase	2022-08-25 17:58:17 +03:00
Trung Nguyen	f4a90c62c0	First attempt to port the forward FFT in the k-space induce term to the GPU, not working yet	2022-08-23 15:42:05 -05:00
Trung Nguyen	921796a15f	Cleaned up unused variables in the hippo kernels	2022-08-16 16:29:38 -05:00
Trung Nguyen	28dabb9687	Cleaned up unused variables in the amoeba kernels, made room for convolution gpu	2022-08-16 15:37:49 -05:00
Trung Nguyen	46b8b00a4f	Working on fft on the device	2022-08-15 15:51:43 -05:00
Trung Nguyen	538aa13693	Only transfer data that is needed for umutual2b; allowed convolution and kspace term umutual1 to be overridden by the gpu counterparts	2022-08-10 16:21:30 -05:00
Vsevak	baf3e614fb	Add comments for tip4p GPU kernels	2022-08-07 22:26:11 +03:00
Trung Nguyen	aad4e417f9	Moved temp variables inside neighbor loops	2022-08-03 12:33:48 -05:00
Trung Nguyen	a54f0b684d	Moved temp variables inside the loop over neighbors	2022-08-03 10:56:52 -05:00
Axel Kohlmeyer	5fee276348	add some GNU Make magic(tm) to Makefile.hip to adapt itself to OpenMPI and MPICH	2022-07-28 07:03:58 -04:00
Paulius Velesko	e7ffa7fae3	Add Makefile support for CHIP-SPV	2022-07-27 08:34:35 +00:00

512 Commits