From 6e17446f3815146131ed94adf58818ef182e389d Mon Sep 17 00:00:00 2001
From: Axel Kohlmeyer <akohlmey@gmail.com>
Date: Sun, 5 Sep 2021 22:27:39 -0400
Subject: [PATCH] add section about parallelization in the OPENMP package

---
 doc/src/Developer_par_openmp.rst | 16 ++++++++++++++++
 doc/src/Developer_parallel.rst   |  7 +++++--
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 doc/src/Developer_par_openmp.rst

diff --git a/doc/src/Developer_par_openmp.rst b/doc/src/Developer_par_openmp.rst
new file mode 100644
index 0000000000..3a2b4f5ff7
--- /dev/null
+++ b/doc/src/Developer_par_openmp.rst
@@ -0,0 +1,16 @@
+OpenMP Parallelism
+^^^^^^^^^^^^^^^^^^
+
+The styles in the INTEL, KOKKOS, and OPENMP packages can use OpenMP
+thread parallelism, predominantly to distribute loops over local data
+across threads.  This strategy is orthogonal to the decomposition into
+spatial domains used by the :doc:`MPI partitioning
+<Developer_par_part>`.  For clarity, this section discusses only the
+implementation in the OPENMP package, as it is the simplest.  The INTEL
+and KOKKOS packages offer additional options and are more complex since
+they support more features and different hardware like co-processors
+or GPUs.
+
+Avoiding data races
+-------------------
+
diff --git a/doc/src/Developer_parallel.rst b/doc/src/Developer_parallel.rst
index be3c5f3c0d..84b228280d 100644
--- a/doc/src/Developer_parallel.rst
+++ b/doc/src/Developer_parallel.rst
@@ -5,10 +5,12 @@ LAMMPS is from ground up designed to be running in parallel using the
 MPI standard with distributed data via domain decomposition.  The
 parallelization has to be efficient to enable good strong scaling (=
 good speedup for the same system) and good weak scaling (= the
-computational cost of enlarging the system is linear with the system
+computational cost of enlarging the system is proportional to the system
 size).  Additional parallelization using GPUs or OpenMP can then be
 applied within the sub-domain assigned to an MPI process.  For clarity,
-most of the following illustrations show the 2d simulation case.
+most of the following illustrations show the 2d simulation case.  The
+underlying algorithms, however, apply equally well to both 2d and 3d
+simulations.
 
 .. toctree::
    :maxdepth: 1
@@ -17,3 +19,4 @@ most of the following illustrations show the 2d simulation case.
    Developer_par_comm
    Developer_par_neigh
    Developer_par_long
+   Developer_par_openmp