537 Commits

SHA1 Message Date
4244d2e6cd silence compiler warnings about unused parameters and variables 2023-01-19 08:56:54 -05:00
eddd3d6f25 Fixed a bug with extra being nullptr when _host_view is true: always allocate extra (Note that BaseAmoeba has its own cast_extra_data() that doesn't know if extra is allocated properly, it is the case when _host_view is false for dedicated GPUs for example) 2023-01-18 20:04:45 -06:00
f86375c992 Attempted to ensure that extra gets allocated in exactly the same way as other added fields (charge, quat and vel) 2023-01-17 09:47:09 -06:00
71931d1d44 Cleaned up, and added missing zero timers for extra fields transfers 2023-01-17 09:39:03 -06:00
973b46a907 Attempted to resolve the memory access runtime errors when acquiring single and mixed precision arrays from the GPU lib 2023-01-16 10:12:42 -06:00
9dc0369cee Attempted to resolve the address space change issue when casting for OpenCL 2.0 (ref: https://www.intel.com/content/www/us/en/developer/articles/technical/the-generic-address-space-in-opencl-20.html#06_address_space_casting) 2023-01-15 23:28:48 -06:00
c9ae41246d Ran the four make commands in the src folder: make fix-whitespace; make fix-homepage; make fix-errordocs; make fix-permissions 2023-01-15 16:05:36 -06:00
212da7f109 Merge branch 'develop' into amoeba-gpu 2023-01-14 18:36:26 -06:00
5cbe303af4 Merge branch 'develop' into collected-small-changes 2023-01-04 07:28:03 -05:00
d9abc3fcc0 update CUDA Toolkit / GPU compatibility lists and GPU package compilation settings 2023-01-03 11:56:44 -05:00
396d577f40 port DPD exclusions corrections to GPU package 2023-01-02 12:04:10 -05:00
8af77c690c Merge branch 'develop' into amoeba-gpu 2022-12-14 13:16:41 -06:00
959b9c220f Cleaned up unused member functions and hd_balancer calls 2022-11-07 15:49:37 -06:00
a3cc0e8432 Reverted the block size tuning, which caused bugs for low atom counts (will revisit later) 2022-11-04 13:45:59 -05:00
2f1f7ee0fa Cleaned up code 2022-11-03 23:45:40 -05:00
e5a808fb8d apply correct platform selection for OpenCL context 2022-11-01 04:05:57 -04:00
80a141d9c8 silence compiler warnings 2022-11-01 03:38:08 -04:00
ad54268544 silence compiler warning 2022-10-19 14:31:21 -04:00
9d081a5916 more adjustments for bogus timer results on Intel OpenCL 2022-10-19 07:39:56 -04:00
f867adc541 GPU Package: Fix for case where disabling timing could result in event/marker destruction before completion on accelerator during initialization. 2022-10-19 02:16:29 -04:00
51c6eddd0d Fix to make the property list empty for command queues when timing disabled. 2022-10-19 02:15:39 -04:00
7c9666798e whitespace 2022-10-08 09:34:20 -04:00
7551c0a3ca GPU Package: Documenting some additional preprocessor flags, updating oneapi Makefile. 2022-10-07 22:44:21 -07:00
00f46120c7 Removed max_cus() from Device, used device->gpu->cus() instead 2022-10-07 15:50:30 -05:00
5a98a38e24 GPU Package: Switching to parallel GPU initialization / JIT compilation. 2022-10-07 13:25:14 -07:00
f715f174bb GPU Package: Print OCL platform name to screen when multiple platforms 2022-10-06 21:40:42 -07:00
a6a39d47e1 Fixing potential issues with automatic splitting of accelerators for NUMA. 2022-10-06 20:48:02 -07:00
e9f39f85d2 Fixing issue where shared main memory property only set for NVIDIA devices. 2022-10-06 20:05:33 -07:00
6b9e83fe20 Added timing for the induced dipole spreading part, computed the block size to ensure all the CUs are occupied by the fphi_uind and fphi_mpole kernels 2022-10-06 15:03:58 -05:00
2ef6a59c0a Merge branch 'develop' into amoeba-gpu 2022-10-01 00:38:24 -05:00
9a1f23a079 Cosmetic changes and cleanup 2022-09-30 17:32:25 -05:00
1d75ca3b20 Moved precompute() out of the terms in amoeba and hippo, to be invoked in the first term in a time step: multipole for amoeba and repulsion for hippo 2022-09-30 16:31:13 -05:00
fb675028b9 whitespace 2022-09-29 02:42:11 -04:00
71464d8314 GPU Package: Fixing logic in OpenCL backend that could result in unnecessary device allocations. 2022-09-28 22:30:09 -07:00
6e34d21b24 GPU Package: Switching back to timer disabling with multiple MPI tasks per GPU. Logic added to prevent mem leak. 2022-09-28 21:02:16 -07:00
e6d2582642 Updated fphi_mpole, renamed precompute_induce to precompute_kspace 2022-09-28 15:08:18 -05:00
166701f13a Fixed missing commas in the argument list of the macros in amoeba and hippo cu files, added amoeba_convolution_gpu.cpp and .h to the source file list in GPU.cmake 2022-09-23 11:53:09 -05:00
785131932c Added fphi_mpole in amoeba/gpu, fixed a bug in the kernel when indexing grid 2022-09-20 13:58:17 -05:00
356c46c913 Replaced mem allocation/deallocation inside moduli() with using member variables and mem resize if needed 2022-09-18 16:28:30 -05:00
caa66d904e Cleaned up GPU lib functions 2022-09-18 15:54:12 -05:00
f9f777b099 Refactored precompute_induce to overlap data transfers with kernel launches 2022-09-18 15:09:26 -05:00
62ecf98cda Enabled fphi_uind in hippo/gpu, really need to refactor hippo and amoeba in the GPU lib to remove kernel duplicates 2022-09-16 14:47:16 -05:00
880f20c285 Cleaned up kernels 2022-09-15 15:29:14 -05:00
cd3a00c2c4 Added timing breakdown for fphi_uind 2022-09-14 15:28:44 -05:00
9c4d3db558 Cleaned up and converted arrays to ucl_vector of numtyp4 2022-09-13 16:48:39 -05:00
31047b4a31 Removed mem alloc in precompute_induce, used buffer for packing, and switched to using ucl_vector 2022-09-13 12:53:48 -05:00
7f4efa380a Re-arranged memory allocation for cgrid_brick, some issues need to be fixed 2022-09-11 18:58:34 -05:00
5e59c95be4 Moved temp variables inside loops 2022-09-10 02:45:06 -05:00
363b6c51d0 Used local arrays and re-arranged for coalesced global memory writes 2022-09-10 02:31:39 -05:00
c58343b2e2 Cleaned up debugging code; needs more refactoring and needs to be added to hippo 2022-09-09 13:50:41 -05:00