From 6e17446f3815146131ed94adf58818ef182e389d Mon Sep 17 00:00:00 2001
From: Axel Kohlmeyer
Date: Sun, 5 Sep 2021 22:27:39 -0400
Subject: [PATCH] add section about parallelization in the OPENMP package

---
 doc/src/Developer_par_openmp.rst | 16 ++++++++++++++++
 doc/src/Developer_parallel.rst   |  7 +++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 doc/src/Developer_par_openmp.rst

diff --git a/doc/src/Developer_par_openmp.rst b/doc/src/Developer_par_openmp.rst
new file mode 100644
index 0000000000..3a2b4f5ff7
--- /dev/null
+++ b/doc/src/Developer_par_openmp.rst
@@ -0,0 +1,16 @@
+OpenMP Parallelism
+^^^^^^^^^^^^^^^^^^
+
+The styles in the INTEL, KOKKOS, and OPENMP package offer to use OpenMP
+thread parallelism to predominantly distribute loops over local data
+and thus follow an orthogonal parallelization strategy to the
+decomposition into spatial domains used by the :doc:`MPI partitioning
+`. For clarity, this section discusses only the
+implementation in the OPENMP package as it is the simplest. The INTEL
+and KOKKOS package offer additional options and are more complex since
+they support more features and different hardware like co-processors
+or GPUs.
+
+Avoiding data races
+-------------------
+
diff --git a/doc/src/Developer_parallel.rst b/doc/src/Developer_parallel.rst
index be3c5f3c0d..84b228280d 100644
--- a/doc/src/Developer_parallel.rst
+++ b/doc/src/Developer_parallel.rst
@@ -5,10 +5,12 @@ LAMMPS is from ground up designed to be running in parallel using the
 MPI standard with distributed data via domain decomposition. The
 parallelization has to be efficient to enable good strong scaling
 (= good speedup for the same system) and good weak scaling (= the
-computational cost of enlarging the system is linear with the system
+computational cost of enlarging the system is proportional to the system
 size). Additional parallelization using GPUs or OpenMP can then be
 applied within the sub-domain assigned to an MPI process. For clarity,
-most of the following illustrations show the 2d simulation case.
+most of the following illustrations show the 2d simulation case. The
+underlying algorithms in those cases, however, apply to both 2d and 3d
+cases equally well.
 
 .. toctree::
    :maxdepth: 1
@@ -17,3 +19,4 @@ most of the following illustrations show the 2d simulation case.
    Developer_par_comm
    Developer_par_neigh
    Developer_par_long
+   Developer_par_openmp
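
The loop-level strategy described in the new OpenMP section can be illustrated with a minimal sketch. It is not taken from the LAMMPS sources: the functions `scale_forces` and `sum_energy` are hypothetical, and `nlocal` is used here only as a local variable holding the number of entries owned by one MPI process. Under those assumptions, the sketch shows two common ways to keep threads from writing to the same memory location: disjoint per-index writes and an OpenMP reduction.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a LAMMPS function): scale per-atom values.
// Iteration i writes only to out[i], so threads never write to the
// same element and no synchronization is needed.
std::vector<double> scale_forces(const std::vector<double> &f, double factor)
{
  std::vector<double> out(f.size());
  const int nlocal = static_cast<int>(f.size());
#pragma omp parallel for
  for (int i = 0; i < nlocal; ++i) {
    out[i] = factor * f[i];
  }
  return out;
}

// Hypothetical helper (not a LAMMPS function): sum per-atom energies.
// All iterations update the same scalar, which would be a data race
// with a plain shared variable. The reduction clause gives each thread
// a private partial sum and combines the partial sums at the end.
double sum_energy(const std::vector<double> &e)
{
  double total = 0.0;
  const int nlocal = static_cast<int>(e.size());
#pragma omp parallel for reduction(+ : total)
  for (int i = 0; i < nlocal; ++i) {
    total += e[i];
  }
  return total;
}
```

When compiled with an OpenMP flag such as `-fopenmp` for GCC or Clang, the loops are split among threads. Without that flag the pragmas are ignored and the code runs serially. Note that a floating-point reduction may add terms in a different order from run to run, so results can differ in the last digits for general input.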