diff --git a/doc/src/compute_msd.rst b/doc/src/compute_msd.rst
index 02b5550093..6e89cc59c1 100644
--- a/doc/src/compute_msd.rst
+++ b/doc/src/compute_msd.rst
@@ -75,10 +75,11 @@ solids undergoing thermal motion.
 .. note::
 
    Initial coordinates are stored in "unwrapped" form, by using the
-   image flags associated with each atom.  See the :doc:`dump custom <dump>` command for a discussion of "unwrapped" coordinates.
-   See the Atoms section of the :doc:`read_data <read_data>` command for a
-   discussion of image flags and how they are set for each atom.  You can
-   reset the image flags (e.g. to 0) before invoking this compute by
+   image flags associated with each atom.  See the :doc:`dump custom
+   <dump>` command for a discussion of "unwrapped" coordinates.  See the
+   Atoms section of the :doc:`read_data <read_data>` command for a
+   discussion of image flags and how they are set for each atom.  You
+   can reset the image flags (e.g. to 0) before invoking this compute by
    using the :doc:`set image <set>` command.
 
 .. note::
@@ -108,7 +109,8 @@ distance\^2 :doc:`units <units>`.
 
 Restrictions
 """"""""""""
- none
+
+Compute *msd* cannot be used with a dynamic group.
 
 Related commands
 """"""""""""""""
diff --git a/doc/src/compute_msd_nongauss.rst b/doc/src/compute_msd_nongauss.rst
index 70f4505d7f..1658d26f93 100644
--- a/doc/src/compute_msd_nongauss.rst
+++ b/doc/src/compute_msd_nongauss.rst
@@ -74,8 +74,11 @@ the third is dimensionless.
 Restrictions
 """"""""""""
 
-This compute is part of the EXTRA-COMPUTE package.  It is only enabled if
-LAMMPS was built with that package.  See the :doc:`Build package <Build_package>` page for more info.
+Compute *msd/nongauss* cannot be used with a dynamic group.
+
+This compute is part of the EXTRA-COMPUTE package.  It is only enabled
+if LAMMPS was built with that package.  See the :doc:`Build package
+<Build_package>` page for more info.
 
 Related commands
 """"""""""""""""
diff --git a/doc/src/fix_acks2_reaxff.rst b/doc/src/fix_acks2_reaxff.rst
index b555f77a66..c8804497e7 100644
--- a/doc/src/fix_acks2_reaxff.rst
+++ b/doc/src/fix_acks2_reaxff.rst
@@ -19,6 +19,12 @@ Syntax
 * cutlo,cuthi = lo and hi cutoff for Taper radius
 * tolerance = precision to which charges will be equilibrated
 * params = reaxff or a filename
+* one or more keywords or keyword/value pairs may be appended
+
+  .. parsed-literal::
+
+     keyword = *maxiter*
+       *maxiter* N = limit the number of iterations to *N*
 
 Examples
 """"""""
@@ -26,7 +32,7 @@ Examples
 .. code-block:: LAMMPS
 
    fix 1 all acks2/reaxff 1 0.0 10.0 1.0e-6 reaxff
-   fix 1 all acks2/reaxff 1 0.0 10.0 1.0e-6 param.acks2
+   fix 1 all acks2/reaxff 1 0.0 10.0 1.0e-6 param.acks2 maxiter 500
 
 Description
 """""""""""
@@ -44,14 +50,14 @@ the charge equilibration performed by fix acks2/reaxff, see the
 
 The ACKS2 method minimizes the electrostatic energy of the system by
 adjusting the partial charge on individual atoms based on interactions
-with their neighbors. It requires some parameters for each atom type.
+with their neighbors.  It requires some parameters for each atom type.
 If the *params* setting above is the word "reaxff", then these are
 extracted from the :doc:`pair_style reaxff <pair_reaxff>` command and
 the ReaxFF force field file it reads in.  If a file name is specified
-for *params*\ , then the parameters are taken from the specified file
-and the file must contain one line for each atom type.  The latter form
-must be used when performing QeQ with a non-ReaxFF potential.  The lines
-should be formatted as follows:
+for *params*, then the parameters are taken from the specified file
+and the file must contain one line for each atom type.  The latter
+form must be used when performing QeQ with a non-ReaxFF potential.
+The lines should be formatted as follows:
 
 .. parsed-literal::
 
@@ -67,13 +73,25 @@ ReaxFF potential file, except that eta is defined here as twice the eta
 value in the ReaxFF file. Note that unlike the rest of LAMMPS, the units
 of this fix are hard-coded to be A, eV, and electronic charge.
 
-**Restart, fix_modify, output, run start/stop, minimize info:**
+The optional *maxiter* keyword allows changing the max number
+of iterations in the linear solver. The default value is 200.
+
+.. note::
+
+   In order to solve the self-consistent equations for electronegativity
+   equalization, LAMMPS imposes the additional constraint that all the
+   charges in the fix group must add up to zero.  The initial charge
+   assignments should also satisfy this constraint.  LAMMPS will print a
+   warning if that is not the case.
+
+Restart, fix_modify, output, run start/stop, minimize info
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
 No information about this fix is written to :doc:`binary restart files
-<restart>`.  No global scalar or vector or per-atom quantities are
-stored by this fix for access by various :doc:`output commands
-<Howto_output>`.  No parameter of this fix can be used with the
-*start/stop* keywords of the :doc:`run <run>` command.
+<restart>`.  This fix computes a global scalar (the number of
+iterations) for access by various :doc:`output commands <Howto_output>`.
+No parameter of this fix can be used with the *start/stop* keywords of
+the :doc:`run <run>` command.
 
 This fix is invoked during :doc:`energy minimization <minimize>`.
 
@@ -86,12 +104,12 @@ This fix is invoked during :doc:`energy minimization <minimize>`.
 Restrictions
 """"""""""""
 
-This fix is part of the REAXFF package.  It is only enabled if LAMMPS
-was built with that package.  See the :doc:`Build package
-<Build_package>` doc page for more info.
+This fix is part of the REAXFF package.  It is only enabled if
+LAMMPS was built with that package. See the :doc:`Build package
+<Build_package>` page for more info.
 
 This fix does not correctly handle interactions involving multiple
-periodic images of the same atom. Hence, it should not be used for
+periodic images of the same atom.  Hence, it should not be used for
 periodic cell dimensions less than 10 angstroms.
 
 This fix may be used in combination with :doc:`fix efield <fix_efield>`
@@ -105,7 +123,10 @@ Related commands
 
 :doc:`pair_style reaxff <pair_reaxff>`, :doc:`fix qeq/reaxff <fix_qeq_reaxff>`
 
-**Default:** none
+Default
+"""""""
+
+maxiter 200
 
 ----------
 
diff --git a/doc/src/fix_deform.rst b/doc/src/fix_deform.rst
index a21a3a6044..805bd84382 100644
--- a/doc/src/fix_deform.rst
+++ b/doc/src/fix_deform.rst
@@ -70,7 +70,7 @@ Syntax
 
        *remap* value = *x* or *v* or *none*
          x = remap coords of atoms in group into deforming box
-         v = remap velocities of all atoms when they cross periodic boundaries
+         v = remap velocities of atoms in group when they cross periodic boundaries
          none = no remapping of x or v
        *flip* value = *yes* or *no*
          allow or disallow box flips when it becomes highly skewed
diff --git a/doc/src/fix_move.rst b/doc/src/fix_move.rst
index 5d3a9de47f..ff1f8403df 100644
--- a/doc/src/fix_move.rst
+++ b/doc/src/fix_move.rst
@@ -12,7 +12,7 @@ Syntax
 
 * ID, group-ID are documented in :doc:`fix <fix>` command
 * move = style name of this fix command
-* style = *linear* or *wiggle* or *rotate* or *variable*
+* style = *linear* or *wiggle* or *rotate* or *transrot* or *variable*
 
   .. parsed-literal::
 
@@ -25,6 +25,11 @@ Syntax
          Px,Py,Pz = origin point of axis of rotation (distance units)
          Rx,Ry,Rz = axis of rotation vector
          period = period of rotation (time units)
+       *transrot* args = Vx Vy Vz Px Py Pz Rx Ry Rz period
+         Vx,Vy,Vz = components of velocity vector (velocity units)
+         Px,Py,Pz = origin point of axis of rotation (distance units)
+         Rx,Ry,Rz = axis of rotation vector
+         period = period of rotation (time units)
        *variable* args = v_dx v_dy v_dz v_vx v_vy v_vz
          v_dx,v_dy,v_dz = 3 variable names that calculate x,y,z displacement as function of time, any component can be specified as NULL
          v_vx,v_vy,v_vz = 3 variable names that calculate x,y,z velocity as function of time, any component can be specified as NULL
@@ -44,6 +49,7 @@ Examples
    fix 1 boundary move wiggle 3.0 0.0 0.0 1.0 units box
    fix 2 boundary move rotate 0.0 0.0 0.0 0.0 0.0 1.0 5.0
    fix 2 boundary move variable v_myx v_myy NULL v_VX v_VY NULL
+   fix 3 boundary move transrot 0.1 0.1 0.0 0.0 0.0 0.0 0.0 0.0 1.0 5.0 units box
 
 Description
 """""""""""
@@ -55,15 +61,17 @@ whose movement can influence nearby atoms.
 
 .. note::
 
-   The atoms affected by this fix should not normally be time
-   integrated by other fixes (e.g. :doc:`fix nve <fix_nve>`, :doc:`fix nvt <fix_nh>`), since that will change their positions and
-   velocities twice.
+   The atoms affected by this fix should not normally be time integrated
+   by other fixes (e.g. :doc:`fix nve <fix_nve>`, :doc:`fix nvt
+   <fix_nh>`), since that will change their positions and velocities
+   twice.
 
 .. note::
 
    As atoms move due to this fix, they will pass through periodic
    boundaries and be remapped to the other side of the simulation box,
-   just as they would during normal time integration (e.g. via the :doc:`fix nve <fix_nve>` command).  It is up to you to decide whether
+   just as they would during normal time integration (e.g. via the
+   :doc:`fix nve <fix_nve>` command).  It is up to you to decide whether
    periodic boundaries are appropriate with the kind of atom motion you
    are prescribing with this fix.
 
@@ -73,11 +81,11 @@ whose movement can influence nearby atoms.
    position at the time the fix is specified.  These initial coordinates
    are stored by the fix in "unwrapped" form, by using the image flags
    associated with each atom.  See the :doc:`dump custom <dump>` command
-   for a discussion of "unwrapped" coordinates.  See the Atoms section of
-   the :doc:`read_data <read_data>` command for a discussion of image flags
-   and how they are set for each atom.  You can reset the image flags
-   (e.g. to 0) before invoking this fix by using the :doc:`set image <set>`
-   command.
+   for a discussion of "unwrapped" coordinates.  See the Atoms section
+   of the :doc:`read_data <read_data>` command for a discussion of image
+   flags and how they are set for each atom.  You can reset the image
+   flags (e.g. to 0) before invoking this fix by using the :doc:`set
+   image <set>` command.
 
 ----------
 
@@ -118,13 +126,13 @@ notation as
 
 where *X0* = (x0,y0,z0) is their position at the time the fix is
 specified, *A* is the specified amplitude vector with components
-(Ax,Ay,Az), *omega* is 2 PI / *period*, and *delta* is the time
-elapsed since the fix was specified.  This style also sets the
-velocity of each atom to the time derivative of this expression.  If
-any of the amplitude components is specified as NULL, then the
-position and velocity of that component is time integrated the same as
-the :doc:`fix nve <fix_nve>` command would perform, using the
-corresponding force component on the atom.
+(Ax,Ay,Az), *omega* is 2 PI / *period*, and *delta* is the time elapsed
+since the fix was specified.  This style also sets the velocity of each
+atom to the time derivative of this expression.  If any of the amplitude
+components is specified as NULL, then the position and velocity of that
+component is time integrated the same as the :doc:`fix nve <fix_nve>`
+command would perform, using the corresponding force component on the
+atom.
 
 Note that the *wiggle* style is identical to using the *variable*
 style with :doc:`equal-style variables <variable>` that use the
@@ -139,21 +147,29 @@ swiggle() and cwiggle() functions.  E.g.
    variable v equal v_omega*($A-cwiggle(0.0,$A,$T))
    fix 1 boundary move variable v_x NULL NULL v_v NULL NULL
 
-The *rotate* style rotates atoms around a rotation axis *R* =
-(Rx,Ry,Rz) that goes through a point *P* = (Px,Py,Pz).  The *period* of
-the rotation is also specified.  The direction of rotation for the
-atoms around the rotation axis is consistent with the right-hand rule:
-if your right-hand thumb points along *R*, then your fingers wrap
-around the axis in the direction of rotation.
+The *rotate* style rotates atoms around a rotation axis *R* = (Rx,Ry,Rz)
+that goes through a point *P* = (Px,Py,Pz).  The *period* of the
+rotation is also specified.  The direction of rotation for the atoms
+around the rotation axis is consistent with the right-hand rule: if your
+right-hand thumb points along *R*, then your fingers wrap around the
+axis in the direction of rotation.
 
 This style also sets the velocity of each atom to (omega cross Rperp)
 where omega is its angular velocity around the rotation axis and Rperp
 is a perpendicular vector from the rotation axis to the atom.  If the
 defined :doc:`atom_style <atom_style>` assigns an angular velocity or
-angular momentum or orientation to each atom (:doc:`atom styles <atom_style>` sphere, ellipsoid, line, tri, body), then
+angular momentum or orientation to each atom (:doc:`atom styles
+<atom_style>` sphere, ellipsoid, line, tri, body), then
 those properties are also updated appropriately to correspond to the
 atom's motion and rotation over time.
 
+The *transrot* style combines the effects of *rotate* and *linear* so
+that it is possible to prescribe a rotating group of atoms that also
+moves at a constant velocity.  The arguments are for the translation
+first and then for the rotation.  Since the rotation affects all
+coordinate components, it is not possible to set any of the
+translation vector components to NULL.
+
 The *variable* style allows the position and velocity components of
 each atom to be set by formulas specified via the
 :doc:`variable <variable>` command.  Each of the 6 variables is
@@ -165,10 +181,10 @@ Each variable must be of either the *equal* or *atom* style.
 a function of the timestep as well as of other simulation values.
 *Atom*\ -style variables compute a numeric quantity for each atom, that
 can be a function per-atom quantities, such as the atom's position, as
-well as of the timestep and other simulation values.  Note that this
-fix stores the original coordinates of each atom (see note below) so
-that per-atom quantity can be used in an atom-style variable formula.
-See the :doc:`variable <variable>` command for details.
+well as of the timestep and other simulation values.  Note that this fix
+stores the original coordinates of each atom (see note below) so that
+per-atom quantity can be used in an atom-style variable formula.  See
+the :doc:`variable <variable>` command for details.
 
 The first 3 variables (v_dx,v_dy,v_dz) specified for the *variable*
 style are used to calculate a displacement from the atom's original
@@ -206,8 +222,9 @@ spacings can be different in x,y,z.
 Restart, fix_modify, output, run start/stop, minimize info
 """""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 
-This fix writes the original coordinates of moving atoms to :doc:`binary restart files <restart>`, as well as the initial timestep, so that
-the motion can be continuous in a restarted simulation.  See the
+This fix writes the original coordinates of moving atoms to :doc:`binary
+restart files <restart>`, as well as the initial timestep, so that the
+motion can be continuous in a restarted simulation.  See the
 :doc:`read_restart <read_restart>` command for info on how to re-specify
 a fix in an input script that reads a restart file, so that the
 operation of the fix continues in an uninterrupted fashion.
@@ -224,11 +241,12 @@ fix.
 
 This fix produces a per-atom array which can be accessed by various
 :doc:`output commands <Howto_output>`.  The number of columns for each
-atom is 3, and the columns store the original unwrapped x,y,z coords
-of each atom.  The per-atom values can be accessed on any timestep.
+atom is 3, and the columns store the original unwrapped x,y,z coords of
+each atom.  The per-atom values can be accessed on any timestep.
 
 No parameter of this fix can be used with the *start/stop* keywords of
-the :doc:`run <run>` command.  This fix is not invoked during :doc:`energy minimization <minimize>`.
+the :doc:`run <run>` command.  This fix is not invoked during
+:doc:`energy minimization <minimize>`.
 
 For :doc:`rRESPA time integration <run_style>`, this fix adjusts the
 position and velocity of atoms on the outermost rRESPA level.
diff --git a/doc/src/package.rst b/doc/src/package.rst
index a037c91999..0d13de9711 100644
--- a/doc/src/package.rst
+++ b/doc/src/package.rst
@@ -71,7 +71,7 @@ Syntax
            *no_affinity* values = none
        *kokkos* args = keyword value ...
          zero or more keyword/value pairs may be appended
-         keywords = *neigh* or *neigh/qeq* or *neigh/thread* or *newton* or *binsize* or *comm* or *comm/exchange* or *comm/forward* *pair/comm/forward* *fix/comm/forward* or *comm/reverse* or *gpu/aware* or *pair/only*
+         keywords = *neigh* or *neigh/qeq* or *neigh/thread* or *newton* or *binsize* or *comm* or *comm/exchange* or *comm/forward* *comm/pair/forward* *comm/fix/forward* or *comm/reverse* or *gpu/aware* or *pair/only*
            *neigh* value = *full* or *half*
              full = full neighbor list
              half = half neighbor list built in thread-safe manner
@@ -90,11 +90,11 @@ Syntax
            *binsize* value = size
              size = bin size for neighbor list construction (distance units)
            *comm* value = *no* or *host* or *device*
-             use value for comm/exchange and comm/forward and pair/comm/forward and fix/comm/forward and comm/reverse
+             use value for comm/exchange and comm/forward and comm/pair/forward and comm/fix/forward and comm/reverse
            *comm/exchange* value = *no* or *host* or *device*
            *comm/forward* value = *no* or *host* or *device*
-           *pair/comm/forward* value = *no* or *device*
-           *fix/comm/forward* value = *no* or *device*
+           *comm/pair/forward* value = *no* or *device*
+           *comm/fix/forward* value = *no* or *device*
            *comm/reverse* value = *no* or *host* or *device*
              no = perform communication pack/unpack in non-KOKKOS mode
              host = perform pack/unpack on host (e.g. with OpenMP threading)
@@ -498,8 +498,8 @@ because the GPU is faster at performing pairwise interactions, then this
 rule of thumb may give too large a binsize and the default should be
 overridden with a smaller value.
 
-The *comm* and *comm/exchange* and *comm/forward* and *pair/comm/forward*
-and *fix/comm/forward* and comm/reverse*
+The *comm* and *comm/exchange* and *comm/forward* and *comm/pair/forward*
+and *comm/fix/forward* and comm/reverse*
 keywords determine whether the host or device performs the packing and
 unpacking of data when communicating per-atom data between processors.
 "Exchange" communication happens only on timesteps that neighbor lists
@@ -520,8 +520,8 @@ packing/unpacking data for the communication. A value of *host* means to
 use the host, typically a multi-core CPU, and perform the
 packing/unpacking in parallel with threads. A value of *device* means to
 use the device, typically a GPU, to perform the packing/unpacking
-operation. If a value of *host* is used for the *pair/comm/forward* or
-*fix/comm/forward* keyword, it will be automatically be changed to *no*
+operation. If a value of *host* is used for the *comm/pair/forward* or
+*comm/fix/forward* keyword, it will be automatically be changed to *no*
 since these keywords don't support *host* mode.
 
 The optimal choice for these keywords depends on the input script and
diff --git a/src/EXTRA-COMPUTE/compute_msd_nongauss.cpp b/src/EXTRA-COMPUTE/compute_msd_nongauss.cpp
index 99c5cee715..0f68e03ca7 100644
--- a/src/EXTRA-COMPUTE/compute_msd_nongauss.cpp
+++ b/src/EXTRA-COMPUTE/compute_msd_nongauss.cpp
@@ -1,4 +1,3 @@
-// clang-format off
 /* ----------------------------------------------------------------------
    LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
    https://www.lammps.org/, Sandia National Laboratories
@@ -19,17 +18,17 @@
 #include "compute_msd_nongauss.h"
 
 #include "atom.h"
-#include "update.h"
-#include "group.h"
 #include "domain.h"
 #include "fix_store.h"
+#include "group.h"
+#include "update.h"
 
 using namespace LAMMPS_NS;
 
 /* ---------------------------------------------------------------------- */
 
 ComputeMSDNonGauss::ComputeMSDNonGauss(LAMMPS *lmp, int narg, char **arg) :
-  ComputeMSD(lmp, narg, arg)
+    ComputeMSD(lmp, narg, arg)
 {
   size_vector = 3;
 }
@@ -43,8 +42,10 @@ void ComputeMSDNonGauss::compute_vector()
   // cm = current center of mass
 
   double cm[3];
-  if (comflag) group->xcm(igroup,masstotal,cm);
-  else cm[0] = cm[1] = cm[2] = 0.0;
+  if (comflag)
+    group->xcm(igroup, masstotal, cm);
+  else
+    cm[0] = cm[1] = cm[2] = 0.0;
 
   // dx,dy,dz = displacement of atom from original position
   // original unwrapped position is stored by fix
@@ -63,8 +64,8 @@ void ComputeMSDNonGauss::compute_vector()
   double yprd = domain->yprd;
   double zprd = domain->zprd;
 
-  double dx,dy,dz;
-  int xbox,ybox,zbox;
+  double dx, dy, dz;
+  int xbox, ybox, zbox;
 
   double msd[2];
   msd[0] = msd[1] = 0.0;
@@ -75,11 +76,11 @@ void ComputeMSDNonGauss::compute_vector()
         xbox = (image[i] & IMGMASK) - IMGMAX;
         ybox = (image[i] >> IMGBITS & IMGMASK) - IMGMAX;
         zbox = (image[i] >> IMG2BITS) - IMGMAX;
-        dx = x[i][0] + xbox*xprd - cm[0] - xoriginal[i][0];
-        dy = x[i][1] + ybox*yprd - cm[1] - xoriginal[i][1];
-        dz = x[i][2] + zbox*zprd - cm[2] - xoriginal[i][2];
-        msd[0] += dx*dx + dy*dy + dz*dz;
-        msd[1] += (dx*dx + dy*dy + dz*dz)*(dx*dx + dy*dy + dz*dz);
+        dx = x[i][0] + xbox * xprd - cm[0] - xoriginal[i][0];
+        dy = x[i][1] + ybox * yprd - cm[1] - xoriginal[i][1];
+        dz = x[i][2] + zbox * zprd - cm[2] - xoriginal[i][2];
+        msd[0] += dx * dx + dy * dy + dz * dz;
+        msd[1] += (dx * dx + dy * dy + dz * dz) * (dx * dx + dy * dy + dz * dz);
       }
 
   } else {
@@ -88,19 +89,18 @@ void ComputeMSDNonGauss::compute_vector()
         xbox = (image[i] & IMGMASK) - IMGMAX;
         ybox = (image[i] >> IMGBITS & IMGMASK) - IMGMAX;
         zbox = (image[i] >> IMG2BITS) - IMGMAX;
-        dx = x[i][0] + h[0]*xbox + h[5]*ybox + h[4]*zbox -
-          cm[0] - xoriginal[i][0];
-        dy = x[i][1] + h[1]*ybox + h[3]*zbox - cm[1] - xoriginal[i][1];
-        dz = x[i][2] + h[2]*zbox - cm[2] - xoriginal[i][2];
-        msd[0] += dx*dx + dy*dy + dz*dz;
-        msd[1] += (dx*dx + dy*dy + dz*dz)*(dx*dx + dy*dy + dz*dz);
+        dx = x[i][0] + h[0] * xbox + h[5] * ybox + h[4] * zbox - cm[0] - xoriginal[i][0];
+        dy = x[i][1] + h[1] * ybox + h[3] * zbox - cm[1] - xoriginal[i][1];
+        dz = x[i][2] + h[2] * zbox - cm[2] - xoriginal[i][2];
+        msd[0] += dx * dx + dy * dy + dz * dz;
+        msd[1] += (dx * dx + dy * dy + dz * dz) * (dx * dx + dy * dy + dz * dz);
       }
   }
 
-  MPI_Allreduce(msd,vector,2,MPI_DOUBLE,MPI_SUM,world);
+  MPI_Allreduce(msd, vector, 2, MPI_DOUBLE, MPI_SUM, world);
   if (nmsd) {
     vector[0] /= nmsd;
     vector[1] /= nmsd;
-    vector[2] = (3*vector[1])/(5*vector[0]*vector[0]) - 1;
+    vector[2] = (3 * vector[1]) / (5 * vector[0] * vector[0]) - 1;
   }
 }
diff --git a/src/KOKKOS/kokkos_type.h b/src/KOKKOS/kokkos_type.h
index 22620f0d3d..2926ba1d36 100644
--- a/src/KOKKOS/kokkos_type.h
+++ b/src/KOKKOS/kokkos_type.h
@@ -1214,6 +1214,12 @@ struct params_lj_coul {
   F_FLOAT cut_ljsq,cut_coulsq,lj1,lj2,lj3,lj4,offset;
 };
 
+// ReaxFF
+
+struct alignas(4 * sizeof(int)) reax_int4 {
+  int i0, i1, i2, i3;
+};
+
 // Pair SNAP
 
 #define SNAP_KOKKOS_REAL double
diff --git a/src/KOKKOS/pair_reaxff_kokkos.cpp b/src/KOKKOS/pair_reaxff_kokkos.cpp
index fd73bd7c4c..841d9044a8 100644
--- a/src/KOKKOS/pair_reaxff_kokkos.cpp
+++ b/src/KOKKOS/pair_reaxff_kokkos.cpp
@@ -13,7 +13,8 @@
 ------------------------------------------------------------------------- */
 
 /* ----------------------------------------------------------------------
-   Contributing authors: Ray Shan (SNL), Stan Moore (SNL)
+   Contributing authors: Ray Shan (SNL), Stan Moore (SNL),
+     Evan Weinberg (NVIDIA)
 
    Nicholas Curtis (AMD), Leopold Grinberd (AMD), and Gina Sitaraman (AMD):
      - Reduced math overhead: enabled specialized calls (e.g., cbrt for a
@@ -77,6 +78,12 @@ PairReaxFFKokkos<DeviceType>::PairReaxFFKokkos(LAMMPS *lmp) : PairReaxFF(lmp)
   k_error_flag = DAT::tdual_int_scalar("pair:error_flag");
   k_nbuf_local = DAT::tdual_int_scalar("pair:nbuf_local");
 
+  d_torsion_pack = t_reax_int4_2d("reaxff:torsion_pack",1,2);
+  d_angular_pack = t_reax_int4_2d("reaxff:angular_pack",1,2);
+
+  k_count_angular_torsion = DAT::tdual_int_1d("PairReaxFF::count_angular_torsion",2);
+  d_count_angular_torsion = k_count_angular_torsion.template view<DeviceType>();
+
   if (execution_space == Host) list_blocking_flag = 1;
 }
 
@@ -935,18 +942,49 @@ void PairReaxFFKokkos<DeviceType>::compute(int eflag_in, int vflag_in)
   pvector[3] = 0.0;
   ev_all.evdwl += ev.ereax[0] + ev.ereax[1] + ev.ereax[2];
 
+  int count_angular = 0;
+  int count_torsion = 0;
+
+  auto& h_count_angular_torsion = k_count_angular_torsion.h_view;
+  h_count_angular_torsion(0) = 0;
+  h_count_angular_torsion(1) = 0;
+  k_count_angular_torsion.template modify<LMPHostType>();
+  k_count_angular_torsion.template sync<DeviceType>();
+
+  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxCountAngularTorsion<false> >(0,inum),*this);
+
+  k_count_angular_torsion.template modify<DeviceType>();
+  k_count_angular_torsion.template sync<LMPHostType>();
+  count_angular = h_count_angular_torsion(0);
+  count_torsion = h_count_angular_torsion(1);
+
+  if (count_angular > d_angular_pack.extent(0))
+    d_angular_pack = t_reax_int4_2d("reaxff:angular_pack",(int)(count_angular * 1.1),2);
+  if (count_torsion > d_torsion_pack.extent(0))
+    d_torsion_pack = t_reax_int4_2d("reaxff:torsion_pack",(int)(count_torsion * 1.1),2);
+
+  // need to zero to re-count
+  h_count_angular_torsion(0) = 0;
+  h_count_angular_torsion(1) = 0;
+  k_count_angular_torsion.template modify<LMPHostType>();
+  k_count_angular_torsion.template sync<DeviceType>();
+
+  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxCountAngularTorsion<true>>(0,inum),*this);
+
+  // no need to re-sync count_angular, count_torsion
+
   // Angular
   if (neighflag == HALF) {
     if (evflag)
-      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngular<HALF,1>>(0,inum),*this,ev);
+      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngularPreprocessed<HALF,1>>(0,count_angular),*this,ev);
     else
-      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngular<HALF,0>>(0,inum),*this);
+      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngularPreprocessed<HALF,0>>(0,count_angular),*this);
     ev_all += ev;
   } else { //if (neighflag == HALFTHREAD) {
     if (evflag)
-      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngular<HALFTHREAD,1>>(0,inum),*this,ev);
+      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngularPreprocessed<HALFTHREAD,1>>(0,count_angular),*this,ev);
     else
-      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngular<HALFTHREAD,0>>(0,inum),*this);
+      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeAngularPreprocessed<HALFTHREAD,0>>(0,count_angular),*this);
     ev_all += ev;
   }
   pvector[4] = ev.ereax[3];
@@ -954,42 +992,20 @@ void PairReaxFFKokkos<DeviceType>::compute(int eflag_in, int vflag_in)
   pvector[6] = ev.ereax[5];
   ev_all.evdwl += ev.ereax[3] + ev.ereax[4] + ev.ereax[5];
 
-  if (inum > counters.extent(0)) {
-    // HIP backend note: use the "hipHostMallocNonCoherent" flag if/when
-    //  it is exposed in Kokkos for HIP pinned memory allocations
-    counters = t_hostpinned_int_1d("ReaxFF::counters", inum);
-    counters_jj_min = t_hostpinned_int_1d("ReaxFF::counters_jj_min", inum);
-    counters_jj_max = t_hostpinned_int_1d("ReaxFF::counters_jj_max", inum);
-    counters_kk_min = t_hostpinned_int_1d("ReaxFF::counters_kk_min", inum);
-    counters_kk_max = t_hostpinned_int_1d("ReaxFF::counters_kk_max", inum);
-  }
-
-  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsionPreview>(0,inum),*this);
-  Kokkos::fence();
-
-  // Compress the counters list ; could be accomplished on device with parallel scan
-  int nnz = 0;
-  for (int i = 0; i < inum; ++i) {
-    if (counters[i] > 0) {
-      counters[nnz] = i;
-      nnz++;
-    }
-  }
-
+  // Torsion
   if (neighflag == HALF) {
     if (evflag)
-      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsion<HALF,1>>(0,nnz),*this,ev);
+      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsionPreprocessed<HALF,1>>(0,count_torsion),*this,ev);
     else
-      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsion<HALF,0>>(0,nnz),*this);
+      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsionPreprocessed<HALF,0>>(0,count_torsion),*this);
     ev_all += ev;
-  } else {
+  } else { //if (neighflag == HALFTHREAD) {
     if (evflag)
-      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsion<HALFTHREAD,1>>(0,nnz),*this,ev);
+      Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsionPreprocessed<HALFTHREAD,1>>(0,count_torsion),*this,ev);
     else
-      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsion<HALFTHREAD,0>>(0,nnz),*this);
+      Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagPairReaxComputeTorsionPreprocessed<HALFTHREAD,0>>(0,count_torsion),*this);
     ev_all += ev;
   }
-
   pvector[8] = ev.ereax[6];
   pvector[9] = ev.ereax[7];
   ev_all.evdwl += ev.ereax[6] + ev.ereax[7];
@@ -1514,13 +1530,8 @@ void PairReaxFFKokkos<DeviceType>::allocate_array()
   d_BO_pi = typename AT::t_ffloat_2d_dl("reaxff/kk:BO_pi",nmax,maxbo);
   d_BO_pi2 = typename AT::t_ffloat_2d_dl("reaxff/kk:BO_pi2",nmax,maxbo);
 
-  d_dln_BOp_pix = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pix",nmax,maxbo);
-  d_dln_BOp_piy = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_piy",nmax,maxbo);
-  d_dln_BOp_piz = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_piz",nmax,maxbo);
-
-  d_dln_BOp_pi2x = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pi2x",nmax,maxbo);
-  d_dln_BOp_pi2y = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pi2y",nmax,maxbo);
-  d_dln_BOp_pi2z = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pi2z",nmax,maxbo);
+  d_dln_BOp_pi = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pi",nmax,maxbo);
+  d_dln_BOp_pi2 = typename AT::t_ffloat_2d_dl("reaxff/kk:d_dln_BOp_pi2",nmax,maxbo);
 
   d_C1dbo = typename AT::t_ffloat_2d_dl("reaxff/kk:d_C1dbo",nmax,maxbo);
   d_C2dbo = typename AT::t_ffloat_2d_dl("reaxff/kk:d_C2dbo",nmax,maxbo);
@@ -1536,9 +1547,7 @@ void PairReaxFFKokkos<DeviceType>::allocate_array()
   d_C3dbopi2 = typename AT::t_ffloat_2d_dl("reaxff/kk:d_C3dbopi2",nmax,maxbo);
   d_C4dbopi2 = typename AT::t_ffloat_2d_dl("reaxff/kk:d_C4dbopi2",nmax,maxbo);
 
-  d_dBOpx = typename AT::t_ffloat_2d_dl("reaxff/kk:dBOpx",nmax,maxbo);
-  d_dBOpy = typename AT::t_ffloat_2d_dl("reaxff/kk:dBOpy",nmax,maxbo);
-  d_dBOpz = typename AT::t_ffloat_2d_dl("reaxff/kk:dBOpz",nmax,maxbo);
+  d_dBOp = typename AT::t_ffloat_2d_dl("reaxff/kk:dBOp",nmax,maxbo);
 
   d_dDeltap_self = typename AT::t_ffloat_2d_dl("reaxff/kk:dDeltap_self",nmax,3);
   d_Deltap_boc = typename AT::t_ffloat_1d("reaxff/kk:Deltap_boc",nmax);
@@ -1561,6 +1570,9 @@ void PairReaxFFKokkos<DeviceType>::allocate_array()
   d_abo = typename AT::t_ffloat_2d("reaxff/kk:abo",nmax,maxbo);
   d_neighid = typename AT::t_tagint_2d("reaxff/kk:neighid",nmax,maxbo);
   d_numneigh_bonds = typename AT::t_int_1d("reaxff/kk:numneigh_bonds",nmax);
+
+  // ComputeAngular intermediates
+  d_angular_intermediates = typename AT::t_ffloat_2d("reaxff/kk:angular_intermediates",nmax,4);
 }
 
 /* ---------------------------------------------------------------------- */
@@ -1582,13 +1594,8 @@ void PairReaxFFKokkos<DeviceType>::deallocate_array()
   d_BO_pi = typename AT::t_ffloat_2d_dl();
   d_BO_pi2 = typename AT::t_ffloat_2d_dl();
 
-  d_dln_BOp_pix = typename AT::t_ffloat_2d_dl();
-  d_dln_BOp_piy = typename AT::t_ffloat_2d_dl();
-  d_dln_BOp_piz = typename AT::t_ffloat_2d_dl();
-
-  d_dln_BOp_pi2x = typename AT::t_ffloat_2d_dl();
-  d_dln_BOp_pi2y = typename AT::t_ffloat_2d_dl();
-  d_dln_BOp_pi2z = typename AT::t_ffloat_2d_dl();
+  d_dln_BOp_pi = typename AT::t_ffloat_2d_dl();
+  d_dln_BOp_pi2 = typename AT::t_ffloat_2d_dl();
 
   d_C1dbo = typename AT::t_ffloat_2d_dl();
   d_C2dbo = typename AT::t_ffloat_2d_dl();
@@ -1604,9 +1611,7 @@ void PairReaxFFKokkos<DeviceType>::deallocate_array()
   d_C3dbopi2 = typename AT::t_ffloat_2d_dl();
   d_C4dbopi2 = typename AT::t_ffloat_2d_dl();
 
-  d_dBOpx = typename AT::t_ffloat_2d_dl();
-  d_dBOpy = typename AT::t_ffloat_2d_dl();
-  d_dBOpz = typename AT::t_ffloat_2d_dl();
+  d_dBOp = typename AT::t_ffloat_2d_dl();
 
   d_dDeltap_self = typename AT::t_ffloat_2d_dl();
   d_Deltap_boc = typename AT::t_ffloat_1d();
@@ -1629,6 +1634,9 @@ void PairReaxFFKokkos<DeviceType>::deallocate_array()
   d_abo = typename AT::t_ffloat_2d();
   d_neighid = typename AT::t_tagint_2d();
   d_numneigh_bonds = typename AT::t_int_1d();
+
+  // ComputeAngular intermediates
+  d_angular_intermediates = typename AT::t_ffloat_2d();
 }
 
 /* ---------------------------------------------------------------------- */
@@ -1665,8 +1673,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlocking<
   const int itype = type(i);
   const int jnum = d_numneigh[i];
 
-  const int three = 3;
-  F_FLOAT C12, C34, C56, BO_s, BO_pi, BO_pi2, BO, delij[three], dBOp_i[three], dln_BOp_pi_i[three], dln_BOp_pi2_i[three];
+  F_FLOAT C12, C34, C56, BO_s, BO_pi, BO_pi2, BO, delij[3], dBOp_i[3];
   F_FLOAT dDeltap_self_i[3] = {0.0,0.0,0.0};
   F_FLOAT total_bo_i = 0.0;
 
@@ -1675,7 +1682,6 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlocking<
   const int bo_first_i = d_bo_first[i];
 
   int ihb = -1;
-  int jhb = -1;
 
   int hb_first_i;
   if (cut_hbsq > 0.0) {
@@ -1729,101 +1735,41 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlocking<
       const F_FLOAT rsq = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
 
       // hbond list
-      if (i < nlocal && cut_hbsq > 0.0 && (ihb == 1 || ihb == 2) && rsq <= cut_hbsq) {
-        jhb = paramssing(jtype).p_hbond;
-        if (ihb == 1 && jhb == 2) {
-          if (NEIGHFLAG == HALF) {
-            j_index = hb_first_i + d_hb_num[i];
-            d_hb_num[i]++;
-          } else
-            j_index = hb_first_i + Kokkos::atomic_fetch_add(&d_hb_num[i],1);
-
-          const int jj_index = j_index - hb_first_i;
-
-          if (jj_index >= maxhb)
-            d_resize_hb() = MAX(d_resize_hb(),jj_index+1);
-          else
-            d_hb_list[j_index] = j;
-        } else if (j < nlocal && ihb == 2 && jhb == 1) {
-          if (NEIGHFLAG == HALF) {
-            i_index = d_hb_first[j] + d_hb_num[j];
-           d_hb_num[j]++;
-          } else
-            i_index = d_hb_first[j] + Kokkos::atomic_fetch_add(&d_hb_num[j],1);
-
-          const int ii_index = i_index - d_hb_first[j];
-
-          if (ii_index >= maxhb)
-            d_resize_hb() = MAX(d_resize_hb(),ii_index+1);
-          else
-            d_hb_list[i_index] = i;
-        }
-      }
+      build_hb_list<NEIGHFLAG>(rsq, i, hb_first_i, ihb, j, jtype);
 
       if (rsq > cut_bosq) continue;
 
-       // bond_list
+      // bond_list
       const F_FLOAT rij = sqrt(rsq);
-      const F_FLOAT p_bo1 = paramstwbp(itype,jtype).p_bo1;
       const F_FLOAT p_bo2 = paramstwbp(itype,jtype).p_bo2;
-      const F_FLOAT p_bo3 = paramstwbp(itype,jtype).p_bo3;
       const F_FLOAT p_bo4 = paramstwbp(itype,jtype).p_bo4;
-      const F_FLOAT p_bo5 = paramstwbp(itype,jtype).p_bo5;
       const F_FLOAT p_bo6 = paramstwbp(itype,jtype).p_bo6;
-      const F_FLOAT r_s = paramstwbp(itype,jtype).r_s;
-      const F_FLOAT r_pi = paramstwbp(itype,jtype).r_pi;
-      const F_FLOAT r_pi2 = paramstwbp(itype,jtype).r_pi2;
 
-      if (paramssing(itype).r_s > 0.0  && paramssing(jtype).r_s > 0.0) {
-        C12 = p_bo1 * ((p_bo2 != 0) ? (pow(rij/r_s,p_bo2)) : 1.0);
-        BO_s = (1.0+bo_cut)*exp(C12);
-      } else BO_s = C12 = 0.0;
-
-      if (paramssing(itype).r_pi > 0.0  && paramssing(jtype).r_pi > 0.0) {
-        C34 = p_bo3 * ((p_bo4 != 0) ? (pow(rij/r_pi,p_bo4)) : 1.0);
-        BO_pi = exp(C34);
-      } else BO_pi = C34 = 0.0;
-
-      if (paramssing(itype).r_pi2 > 0.0  && paramssing(jtype).r_pi2 > 0.0) {
-        C56 = p_bo5 * ((p_bo6 != 0) ? (pow(rij/r_pi2,p_bo6)) : 1.0);
-        BO_pi2 = exp(C56);
-      } else BO_pi2 = C56 = 0.0;
+      // returns BO_*, C** by reference
+      compute_bo(rij, itype, jtype, p_bo2, p_bo4, p_bo6,
+        BO_s, BO_pi, BO_pi2, C12, C34, C56);
 
       BO = BO_s + BO_pi + BO_pi2;
       if (BO < bo_cut) continue;
 
-      if (NEIGHFLAG == HALF) {
-        j_index = bo_first_i + d_bo_num[i];
-        i_index = d_bo_first[j] + d_bo_num[j];
-        d_bo_num[i]++;
-        d_bo_num[j]++;
-      } else {
-        j_index = bo_first_i + Kokkos::atomic_fetch_add(&d_bo_num[i],1);
-        i_index = d_bo_first[j] + Kokkos::atomic_fetch_add(&d_bo_num[j],1);
-      }
-
-      const int jj_index = j_index - bo_first_i;
-      const int ii_index = i_index - d_bo_first[j];
-
-      if (jj_index >= maxbo || ii_index >= maxbo) {
-        const int max_val = MAX(ii_index+1,jj_index+1);
-        d_resize_bo() = MAX(d_resize_bo(),max_val);
-      } else {
-        d_bo_list[j_index] = j;
-        d_bo_list[i_index] = i;
+      int ii_index = -1;
+      int jj_index = -1;
+      if (build_bo_list<NEIGHFLAG>(bo_first_i, i, j, i_index, j_index, ii_index, jj_index)) {
 
         // from BondOrder1
 
         d_BO(i,jj_index) = BO;
         d_BO_s(i,jj_index) = BO_s;
-        d_BO_pi(i,jj_index) = BO_pi;
-        d_BO_pi2(i,jj_index) = BO_pi2;
 
         d_BO(j,ii_index) = BO;
         d_BO_s(j,ii_index) = BO_s;
+
         d_BO_pi(j,ii_index) = BO_pi;
         d_BO_pi2(j,ii_index) = BO_pi2;
 
+        d_BO_pi(i,jj_index) = BO_pi;
+        d_BO_pi2(i,jj_index) = BO_pi2;
+
         F_FLOAT Cln_BOp_s = p_bo2 * C12 / rij / rij;
         F_FLOAT Cln_BOp_pi = p_bo4 * C34 / rij / rij;
         F_FLOAT Cln_BOp_pi2 = p_bo6 * C56 / rij / rij;
@@ -1831,42 +1777,24 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlocking<
         if (nlocal == 0)
           Cln_BOp_s = Cln_BOp_pi = Cln_BOp_pi2 = 0.0;
 
-        for (int d = 0; d < 3; d++) dln_BOp_pi_i[d] = -(BO_pi*Cln_BOp_pi)*delij[d];
-        for (int d = 0; d < 3; d++) dln_BOp_pi2_i[d] = -(BO_pi2*Cln_BOp_pi2)*delij[d];
         for (int d = 0; d < 3; d++) dBOp_i[d] = -(BO_s*Cln_BOp_s+BO_pi*Cln_BOp_pi+BO_pi2*Cln_BOp_pi2)*delij[d];
         for (int d = 0; d < 3; d++) dDeltap_self_i[d] += dBOp_i[d];
         for (int d = 0; d < 3; d++) a_dDeltap_self(j,d) += -dBOp_i[d];
 
-        d_dln_BOp_pix(i,jj_index) = dln_BOp_pi_i[0];
-        d_dln_BOp_piy(i,jj_index) = dln_BOp_pi_i[1];
-        d_dln_BOp_piz(i,jj_index) = dln_BOp_pi_i[2];
+        d_dln_BOp_pi(i,jj_index) = -(BO_pi*Cln_BOp_pi);
+        d_dln_BOp_pi(j,ii_index) = -(BO_pi*Cln_BOp_pi);
 
-        d_dln_BOp_pix(j,ii_index) = -dln_BOp_pi_i[0];
-        d_dln_BOp_piy(j,ii_index) = -dln_BOp_pi_i[1];
-        d_dln_BOp_piz(j,ii_index) = -dln_BOp_pi_i[2];
+        d_dln_BOp_pi2(i,jj_index) = -(BO_pi2*Cln_BOp_pi2);
+        d_dln_BOp_pi2(j,ii_index) = -(BO_pi2*Cln_BOp_pi2);
 
-        d_dln_BOp_pi2x(i,jj_index) = dln_BOp_pi2_i[0];
-        d_dln_BOp_pi2y(i,jj_index) = dln_BOp_pi2_i[1];
-        d_dln_BOp_pi2z(i,jj_index) = dln_BOp_pi2_i[2];
-
-        d_dln_BOp_pi2x(j,ii_index) = -dln_BOp_pi2_i[0];
-        d_dln_BOp_pi2y(j,ii_index) = -dln_BOp_pi2_i[1];
-        d_dln_BOp_pi2z(j,ii_index) = -dln_BOp_pi2_i[2];
-
-        d_dBOpx(i,jj_index) = dBOp_i[0];
-        d_dBOpy(i,jj_index) = dBOp_i[1];
-        d_dBOpz(i,jj_index) = dBOp_i[2];
-
-        d_dBOpx(j,ii_index) = -dBOp_i[0];
-        d_dBOpy(j,ii_index) = -dBOp_i[1];
-        d_dBOpz(j,ii_index) = -dBOp_i[2];
-
-        d_BO(i,jj_index) -= bo_cut;
-        d_BO(j,ii_index) -= bo_cut;
-        d_BO_s(i,jj_index) -= bo_cut;
-        d_BO_s(j,ii_index) -= bo_cut;
-        total_bo_i += d_BO(i,jj_index);
-        a_total_bo[j] += d_BO(j,ii_index);
+        d_dBOp(i,jj_index) = -(BO_s*Cln_BOp_s+BO_pi*Cln_BOp_pi+BO_pi2*Cln_BOp_pi2);
+        d_dBOp(j,ii_index) = -(BO_s*Cln_BOp_s+BO_pi*Cln_BOp_pi+BO_pi2*Cln_BOp_pi2);
+        d_BO(i,jj_index) = BO - bo_cut;
+        d_BO(j,ii_index) = BO - bo_cut;
+        d_BO_s(i,jj_index) = BO_s - bo_cut;
+        d_BO_s(j,ii_index) = BO_s - bo_cut;
+        total_bo_i += (BO - bo_cut);
+        a_total_bo[j] += (BO - bo_cut);
       }
     }
   }
@@ -1899,7 +1827,6 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlockingP
   const int bo_first_i = d_bo_first[i];
 
   int ihb = -1;
-  int jhb = -1;
 
   int hb_first_i;
   if (cut_hbsq > 0.0) {
@@ -1954,89 +1881,26 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfBlockingP
       const F_FLOAT rsq = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
 
       // hbond list
-      if (i < nlocal && cut_hbsq > 0.0 && (ihb == 1 || ihb == 2) && rsq <= cut_hbsq) {
-        jhb = paramssing(jtype).p_hbond;
-        if (ihb == 1 && jhb == 2) {
-          if (NEIGHFLAG == HALF) {
-            j_index = hb_first_i + d_hb_num[i];
-            d_hb_num[i]++;
-          } else
-            j_index = hb_first_i + Kokkos::atomic_fetch_add(&d_hb_num[i],1);
-
-          const int jj_index = j_index - hb_first_i;
-
-          if (jj_index >= maxhb)
-            d_resize_hb() = MAX(d_resize_hb(),jj_index+1);
-          else
-            d_hb_list[j_index] = j;
-        } else if (j < nlocal && ihb == 2 && jhb == 1) {
-          if (NEIGHFLAG == HALF) {
-            i_index = d_hb_first[j] + d_hb_num[j];
-           d_hb_num[j]++;
-          } else
-            i_index = d_hb_first[j] + Kokkos::atomic_fetch_add(&d_hb_num[j],1);
-
-          const int ii_index = i_index - d_hb_first[j];
-
-          if (ii_index >= maxhb)
-            d_resize_hb() = MAX(d_resize_hb(),ii_index+1);
-          else
-            d_hb_list[i_index] = i;
-        }
-      }
+      build_hb_list<NEIGHFLAG>(rsq, i, hb_first_i, ihb, j, jtype);
 
       if (rsq > cut_bosq) continue;
 
-       // bond_list
+      // bond_list
       const F_FLOAT rij = sqrt(rsq);
-      const F_FLOAT p_bo1 = paramstwbp(itype,jtype).p_bo1;
       const F_FLOAT p_bo2 = paramstwbp(itype,jtype).p_bo2;
-      const F_FLOAT p_bo3 = paramstwbp(itype,jtype).p_bo3;
       const F_FLOAT p_bo4 = paramstwbp(itype,jtype).p_bo4;
-      const F_FLOAT p_bo5 = paramstwbp(itype,jtype).p_bo5;
       const F_FLOAT p_bo6 = paramstwbp(itype,jtype).p_bo6;
-      const F_FLOAT r_s = paramstwbp(itype,jtype).r_s;
-      const F_FLOAT r_pi = paramstwbp(itype,jtype).r_pi;
-      const F_FLOAT r_pi2 = paramstwbp(itype,jtype).r_pi2;
 
-      if (paramssing(itype).r_s > 0.0  && paramssing(jtype).r_s > 0.0) {
-        C12 = p_bo1 * ((p_bo2 != 0) ? (pow(rij/r_s,p_bo2)) : 1.0);
-        BO_s = (1.0+bo_cut)*exp(C12);
-      } else BO_s = C12 = 0.0;
-
-      if (paramssing(itype).r_pi > 0.0  && paramssing(jtype).r_pi > 0.0) {
-        C34 = p_bo3 * ((p_bo4 != 0) ? (pow(rij/r_pi,p_bo4)) : 1.0);
-        BO_pi = exp(C34);
-      } else BO_pi = C34 = 0.0;
-
-      if (paramssing(itype).r_pi2 > 0.0  && paramssing(jtype).r_pi2 > 0.0) {
-        C56 = p_bo5 * ((p_bo6 != 0) ? (pow(rij/r_pi2,p_bo6)) : 1.0);
-        BO_pi2 = exp(C56);
-      } else BO_pi2 = C56 = 0.0;
+      // returns BO_*, C** by reference
+      compute_bo(rij, itype, jtype, p_bo2, p_bo4, p_bo6,
+        BO_s, BO_pi, BO_pi2, C12, C34, C56);
 
       BO = BO_s + BO_pi + BO_pi2;
       if (BO < bo_cut) continue;
 
-      if (NEIGHFLAG == HALF) {
-        j_index = bo_first_i + d_bo_num[i];
-        i_index = d_bo_first[j] + d_bo_num[j];
-        d_bo_num[i]++;
-        d_bo_num[j]++;
-      } else {
-        j_index = bo_first_i + Kokkos::atomic_fetch_add(&d_bo_num[i],1);
-        i_index = d_bo_first[j] + Kokkos::atomic_fetch_add(&d_bo_num[j],1);
-      }
-
-      const int jj_index = j_index - bo_first_i;
-      const int ii_index = i_index - d_bo_first[j];
-
-      if (jj_index >= maxbo || ii_index >= maxbo) {
-        const int max_val = MAX(ii_index+1,jj_index+1);
-        d_resize_bo() = MAX(d_resize_bo(),max_val);
-      } else {
-        d_bo_list[j_index] = j;
-        d_bo_list[i_index] = i;
-      }
+      int ii_index = -1;
+      int jj_index = -1;
+      build_bo_list<NEIGHFLAG>(bo_first_i, i, j, i_index, j_index, ii_index, jj_index);
     }
   }
 }
@@ -2062,7 +1926,6 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfPreview<N
   const int bo_first_i = d_bo_first[i];
 
   int ihb = -1;
-  int jhb = -1;
 
   int hb_first_i;
   if (cut_hbsq > 0.0) {
@@ -2087,90 +1950,106 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsHalfPreview<N
     const F_FLOAT rsq = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
 
     // hbond list
-    if (i < nlocal && cut_hbsq > 0.0 && (ihb == 1 || ihb == 2) && rsq <= cut_hbsq) {
-      jhb = paramssing(jtype).p_hbond;
-      if (ihb == 1 && jhb == 2) {
-        if (NEIGHFLAG == HALF) {
-          j_index = hb_first_i + d_hb_num[i];
-          d_hb_num[i]++;
-        } else
-          j_index = hb_first_i + Kokkos::atomic_fetch_add(&d_hb_num[i],1);
-
-        const int jj_index = j_index - hb_first_i;
-
-        if (jj_index >= maxhb)
-          d_resize_hb() = MAX(d_resize_hb(),jj_index+1);
-        else
-          d_hb_list[j_index] = j;
-      } else if (j < nlocal && ihb == 2 && jhb == 1) {
-        if (NEIGHFLAG == HALF) {
-          i_index = d_hb_first[j] + d_hb_num[j];
-          d_hb_num[j]++;
-        } else
-          i_index = d_hb_first[j] + Kokkos::atomic_fetch_add(&d_hb_num[j],1);
-
-        const int ii_index = i_index - d_hb_first[j];
-
-        if (ii_index >= maxhb)
-          d_resize_hb() = MAX(d_resize_hb(),ii_index+1);
-        else
-          d_hb_list[i_index] = i;
-      }
-    }
+    build_hb_list<NEIGHFLAG>(rsq, i, hb_first_i, ihb, j, jtype);
 
     if (rsq > cut_bosq) continue;
 
     // bond_list
     const F_FLOAT rij = sqrt(rsq);
-    const F_FLOAT p_bo1 = paramstwbp(itype,jtype).p_bo1;
     const F_FLOAT p_bo2 = paramstwbp(itype,jtype).p_bo2;
-    const F_FLOAT p_bo3 = paramstwbp(itype,jtype).p_bo3;
     const F_FLOAT p_bo4 = paramstwbp(itype,jtype).p_bo4;
-    const F_FLOAT p_bo5 = paramstwbp(itype,jtype).p_bo5;
     const F_FLOAT p_bo6 = paramstwbp(itype,jtype).p_bo6;
-    const F_FLOAT r_s = paramstwbp(itype,jtype).r_s;
-    const F_FLOAT r_pi = paramstwbp(itype,jtype).r_pi;
-    const F_FLOAT r_pi2 = paramstwbp(itype,jtype).r_pi2;
 
-    if (paramssing(itype).r_s > 0.0  && paramssing(jtype).r_s > 0.0) {
-      C12 = p_bo1 * ((p_bo2 != 0) ? (pow(rij/r_s,p_bo2)) : 1.0);
-      BO_s = (1.0+bo_cut)*exp(C12);
-    } else BO_s = C12 = 0.0;
-
-    if (paramssing(itype).r_pi > 0.0  && paramssing(jtype).r_pi > 0.0) {
-      C34 = p_bo3 * ((p_bo4 != 0) ? (pow(rij/r_pi,p_bo4)) : 1.0);
-      BO_pi = exp(C34);
-    } else BO_pi = C34 = 0.0;
-
-    if (paramssing(itype).r_pi2 > 0.0  && paramssing(jtype).r_pi2 > 0.0) {
-      C56 = p_bo5 * ((p_bo6 != 0) ? (pow(rij/r_pi2,p_bo6)) : 1.0);
-      BO_pi2 = exp(C56);
-    } else BO_pi2 = C56 = 0.0;
+    // returns BO_*, C** by reference
+    compute_bo(rij, itype, jtype, p_bo2, p_bo4, p_bo6,
+      BO_s, BO_pi, BO_pi2, C12, C34, C56);
 
     BO = BO_s + BO_pi + BO_pi2;
     if (BO < bo_cut) continue;
 
-    if (NEIGHFLAG == HALF) {
-      j_index = bo_first_i + d_bo_num[i];
-      i_index = d_bo_first[j] + d_bo_num[j];
-      d_bo_num[i]++;
-      d_bo_num[j]++;
-    } else {
-      j_index = bo_first_i + Kokkos::atomic_fetch_add(&d_bo_num[i],1);
-      i_index = d_bo_first[j] + Kokkos::atomic_fetch_add(&d_bo_num[j],1);
-    }
+    int ii_index = -1;
+    int jj_index = -1;
 
-    const int jj_index = j_index - bo_first_i;
-    const int ii_index = i_index - d_bo_first[j];
+    build_bo_list<NEIGHFLAG>(bo_first_i, i, j, i_index, j_index, ii_index, jj_index);
+  }
+}
 
-    if (jj_index >= maxbo || ii_index >= maxbo) {
-      const int max_val = MAX(ii_index+1,jj_index+1);
-      d_resize_bo() = MAX(d_resize_bo(),max_val);
-    } else {
-      d_bo_list[j_index] = j;
-      d_bo_list[i_index] = i;
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+template<int NEIGHFLAG>
+KOKKOS_INLINE_FUNCTION
+void PairReaxFFKokkos<DeviceType>::build_hb_list(F_FLOAT rsq, int i, int hb_first_i, int ihb, int j, int jtype) const {
+
+  int i_index, j_index;
+  int jhb = -1;
+  if (i < nlocal && cut_hbsq > 0.0 && (ihb == 1 || ihb == 2) && rsq <= cut_hbsq) {
+    jhb = paramssing(jtype).p_hbond;
+    if (ihb == 1 && jhb == 2) {
+      if (NEIGHFLAG == HALF) {
+        j_index = hb_first_i + d_hb_num[i];
+        d_hb_num[i]++;
+      } else
+        j_index = hb_first_i + Kokkos::atomic_fetch_add(&d_hb_num[i],1);
+
+      const int jj_index = j_index - hb_first_i;
+
+      if (jj_index >= maxhb)
+        d_resize_hb() = MAX(d_resize_hb(),jj_index+1);
+      else
+        d_hb_list[j_index] = j;
+    } else if (j < nlocal && ihb == 2 && jhb == 1) {
+      if (NEIGHFLAG == HALF) {
+        i_index = d_hb_first[j] + d_hb_num[j];
+        d_hb_num[j]++;
+      } else
+        i_index = d_hb_first[j] + Kokkos::atomic_fetch_add(&d_hb_num[j],1);
+
+      const int ii_index = i_index - d_hb_first[j];
+
+      if (ii_index >= maxhb)
+        d_resize_hb() = MAX(d_resize_hb(),ii_index+1);
+      else
+        d_hb_list[i_index] = i;
     }
   }
+
+}
+
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+template<int NEIGHFLAG>
+KOKKOS_INLINE_FUNCTION
+bool PairReaxFFKokkos<DeviceType>::build_bo_list(int bo_first_i, int i, int j, int i_index, int j_index, int& ii_index, int& jj_index) const {
+
+  if (NEIGHFLAG == HALF) {
+    j_index = bo_first_i + d_bo_num[i];
+    i_index = d_bo_first[j] + d_bo_num[j];
+    d_bo_num[i]++;
+    d_bo_num[j]++;
+  } else {
+    j_index = bo_first_i + Kokkos::atomic_fetch_add(&d_bo_num[i],1);
+    i_index = d_bo_first[j] + Kokkos::atomic_fetch_add(&d_bo_num[j],1);
+  }
+
+  jj_index = j_index - bo_first_i;
+  ii_index = i_index - d_bo_first[j];
+
+  bool set_dB_flag = true;
+
+  if (jj_index >= maxbo || ii_index >= maxbo) {
+    const int max_val = MAX(ii_index+1,jj_index+1);
+    d_resize_bo() = MAX(d_resize_bo(),max_val);
+    set_dB_flag = false;
+  } else {
+    d_bo_list[j_index] = j;
+    d_bo_list[i_index] = i;
+    set_dB_flag = true;
+  }
+
+  return set_dB_flag;
+
 }
 
 /* ---------------------------------------------------------------------- */
@@ -2185,7 +2064,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsFull, const i
   const X_FLOAT ztmp = x(i,2);
   const int itype = type(i);
 
-  F_FLOAT C12, C34, C56, BO_s, BO_pi, BO_pi2, BO, delij[3], dBOp_i[3], dln_BOp_pi_i[3], dln_BOp_pi2_i[3];
+  F_FLOAT C12, C34, C56, BO_s, BO_pi, BO_pi2, BO, delij[3], dBOp_i[3];
   F_FLOAT dDeltap_self_i[3] = {0.0,0.0,0.0};
   F_FLOAT total_bo_i = 0.0;
 
@@ -2202,32 +2081,15 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsFull, const i
     const F_FLOAT rsq = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
     const F_FLOAT rsq_inv = 1.0 / rsq;
 
-     // bond_list
+    // bond_list
     const F_FLOAT rij = sqrt(rsq);
-    const F_FLOAT p_bo1 = paramstwbp(itype,jtype).p_bo1;
     const F_FLOAT p_bo2 = paramstwbp(itype,jtype).p_bo2;
-    const F_FLOAT p_bo3 = paramstwbp(itype,jtype).p_bo3;
     const F_FLOAT p_bo4 = paramstwbp(itype,jtype).p_bo4;
-    const F_FLOAT p_bo5 = paramstwbp(itype,jtype).p_bo5;
     const F_FLOAT p_bo6 = paramstwbp(itype,jtype).p_bo6;
-    const F_FLOAT r_s = paramstwbp(itype,jtype).r_s;
-    const F_FLOAT r_pi = paramstwbp(itype,jtype).r_pi;
-    const F_FLOAT r_pi2 = paramstwbp(itype,jtype).r_pi2;
 
-    if (paramssing(itype).r_s > 0.0  && paramssing(jtype).r_s > 0.0) {
-      C12 = p_bo1 * ((p_bo2 != 0) ? (pow(rij/r_s,p_bo2)) : 1.0);
-      BO_s = (1.0+bo_cut)*exp(C12);
-    } else BO_s = C12 = 0.0;
-
-    if (paramssing(itype).r_pi > 0.0  && paramssing(jtype).r_pi > 0.0) {
-      C34 = p_bo3 * ((p_bo4 != 0) ? (pow(rij/r_pi,p_bo4)) : 1.0);
-      BO_pi = exp(C34);
-    } else BO_pi = C34 = 0.0;
-
-    if (paramssing(itype).r_pi2 > 0.0  && paramssing(jtype).r_pi2 > 0.0) {
-      C56 = p_bo5 * ((p_bo6 != 0) ? (pow(rij/r_pi2,p_bo6)) : 1.0);
-      BO_pi2 = exp(C56);
-    } else BO_pi2 = C56 = 0.0;
+    // returns BO_*, C** by reference
+    compute_bo(rij, itype, jtype, p_bo2, p_bo4, p_bo6,
+      BO_s, BO_pi, BO_pi2, C12, C34, C56);
 
     BO = BO_s + BO_pi + BO_pi2;
 
@@ -2245,26 +2107,17 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsFull, const i
     if (nlocal == 0)
       Cln_BOp_s = Cln_BOp_pi = Cln_BOp_pi2 = 0.0;
 
-    for (int d = 0; d < 3; d++) dln_BOp_pi_i[d] = -(BO_pi*Cln_BOp_pi)*delij[d];
-    for (int d = 0; d < 3; d++) dln_BOp_pi2_i[d] = -(BO_pi2*Cln_BOp_pi2)*delij[d];
     for (int d = 0; d < 3; d++) dBOp_i[d] = -(BO_s*Cln_BOp_s+BO_pi*Cln_BOp_pi+BO_pi2*Cln_BOp_pi2)*delij[d];
     for (int d = 0; d < 3; d++) dDeltap_self_i[d] += dBOp_i[d];
 
-    d_dln_BOp_pix(i,j_index) = dln_BOp_pi_i[0];
-    d_dln_BOp_piy(i,j_index) = dln_BOp_pi_i[1];
-    d_dln_BOp_piz(i,j_index) = dln_BOp_pi_i[2];
 
-    d_dln_BOp_pi2x(i,j_index) = dln_BOp_pi2_i[0];
-    d_dln_BOp_pi2y(i,j_index) = dln_BOp_pi2_i[1];
-    d_dln_BOp_pi2z(i,j_index) = dln_BOp_pi2_i[2];
+    d_dln_BOp_pi(i,j_index) = -(BO_pi*Cln_BOp_pi);
+    d_dln_BOp_pi2(i,j_index) = -(BO_pi2*Cln_BOp_pi2);
+    d_dBOp(i,j_index) = -(BO_s*Cln_BOp_s+BO_pi*Cln_BOp_pi+BO_pi2*Cln_BOp_pi2);
 
-    d_dBOpx(i,j_index) = dBOp_i[0];
-    d_dBOpy(i,j_index) = dBOp_i[1];
-    d_dBOpz(i,j_index) = dBOp_i[2];
-
-    d_BO(i,j_index) -= bo_cut;
-    d_BO_s(i,j_index) -= bo_cut;
-    total_bo_i += d_BO(i,j_index);
+    d_BO(i,j_index) = BO - bo_cut;
+    d_BO_s(i,j_index) = BO_s - bo_cut;
+    total_bo_i += (BO - bo_cut);
   }
 
   for (int d = 0; d < 3; d++)
@@ -2275,6 +2128,37 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBuildListsFull, const i
 
 /* ---------------------------------------------------------------------- */
 
+template<class DeviceType>
+KOKKOS_INLINE_FUNCTION
+void PairReaxFFKokkos<DeviceType>::compute_bo(F_FLOAT rij, int itype, int jtype, F_FLOAT p_bo2, F_FLOAT p_bo4, F_FLOAT p_bo6,
+  F_FLOAT& BO_s, F_FLOAT& BO_pi, F_FLOAT& BO_pi2, F_FLOAT& C12, F_FLOAT& C34, F_FLOAT& C56) const {
+
+  const F_FLOAT p_bo1 = paramstwbp(itype,jtype).p_bo1;
+  const F_FLOAT p_bo3 = paramstwbp(itype,jtype).p_bo3;
+  const F_FLOAT p_bo5 = paramstwbp(itype,jtype).p_bo5;
+  const F_FLOAT r_s = paramstwbp(itype,jtype).r_s;
+  const F_FLOAT r_pi = paramstwbp(itype,jtype).r_pi;
+  const F_FLOAT r_pi2 = paramstwbp(itype,jtype).r_pi2;
+
+  if (paramssing(itype).r_s > 0.0  && paramssing(jtype).r_s > 0.0) {
+    C12 = p_bo1 * ((p_bo2 != 0) ? (pow(rij/r_s,p_bo2)) : 1.0);
+    BO_s = (1.0+bo_cut)*exp(C12);
+  } else BO_s = C12 = 0.0;
+
+  if (paramssing(itype).r_pi > 0.0  && paramssing(jtype).r_pi > 0.0) {
+    C34 = p_bo3 * ((p_bo4 != 0) ? (pow(rij/r_pi,p_bo4)) : 1.0);
+    BO_pi = exp(C34);
+  } else BO_pi = C34 = 0.0;
+
+  if (paramssing(itype).r_pi2 > 0.0  && paramssing(jtype).r_pi2 > 0.0) {
+    C56 = p_bo5 * ((p_bo6 != 0) ? (pow(rij/r_pi2,p_bo6)) : 1.0);
+    BO_pi2 = exp(C56);
+  } else BO_pi2 = C56 = 0.0;
+
+}
+
+/* ---------------------------------------------------------------------- */
+
 template<class DeviceType>
 KOKKOS_INLINE_FUNCTION
 void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxBondOrder1, const int &ii) const {
@@ -2643,79 +2527,83 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeMulti2<NEIGHFLAG
 /* ---------------------------------------------------------------------- */
 
 template<class DeviceType>
-template<int NEIGHFLAG, int EVFLAG>
+template<bool POPULATE>
 KOKKOS_INLINE_FUNCTION
-void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLAG,EVFLAG>, const int &ii, EV_FLOAT_REAX& ev) const {
-
-  const auto v_f = ScatterViewHelper<NeedDup_v<NEIGHFLAG,DeviceType>,decltype(dup_f),decltype(ndup_f)>::get(dup_f,ndup_f);
-  const auto a_f = v_f.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
-  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value> > a_Cdbo = d_Cdbo;
-
-  const auto v_CdDelta = ScatterViewHelper<NeedDup_v<NEIGHFLAG,DeviceType>,decltype(dup_CdDelta),decltype(ndup_CdDelta)>::get(dup_CdDelta,ndup_CdDelta);
-  const auto a_CdDelta = v_CdDelta.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
+void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxCountAngularTorsion<POPULATE>, const int &ii) const {
 
   const int i = d_ilist[ii];
   const int itype = type(i);
-  const X_FLOAT xtmp = x(i,0);
-  const X_FLOAT ytmp = x(i,1);
-  const X_FLOAT ztmp = x(i,2);
-
-  F_FLOAT temp, temp_bo_jt, pBOjt7;
-  F_FLOAT p_val1, p_val2, p_val3, p_val4, p_val5;
-  F_FLOAT p_val6, p_val7, p_val8, p_val9, p_val10;
-  F_FLOAT p_pen1, p_pen2, p_pen3, p_pen4;
-  F_FLOAT p_coa1, p_coa2, p_coa3, p_coa4;
-  F_FLOAT trm8, expval6, expval7, expval2theta, expval12theta, exp3ij, exp3jk;
-  F_FLOAT exp_pen2ij, exp_pen2jk, exp_pen3, exp_pen4, trm_pen34, exp_coa2;
-  F_FLOAT dSBO1, dSBO2, SBO, SBO2, CSBO2, SBOp, prod_SBO, vlpadj;
-  F_FLOAT CEval1, CEval2, CEval3, CEval4, CEval5, CEval6, CEval7, CEval8;
-  F_FLOAT CEpen1, CEpen2, CEpen3;
-  F_FLOAT e_ang, e_coa, e_pen;
-  F_FLOAT CEcoa1, CEcoa2, CEcoa3, CEcoa4, CEcoa5;
-  F_FLOAT Cf7ij, Cf7jk, Cf8j, Cf9j;
-  F_FLOAT f7_ij, f7_jk, f8_Dj, f9_Dj;
-  F_FLOAT Ctheta_0, theta_0, theta_00, theta, cos_theta, sin_theta;
-  F_FLOAT BOA_ij, BOA_ik, rij, bo_ij, bo_ik;
-  F_FLOAT dcos_theta_di[3], dcos_theta_dj[3], dcos_theta_dk[3];
-  F_FLOAT eng_tmp, fi_tmp[3], fj_tmp[3], fk_tmp[3];
-  F_FLOAT delij[3], delik[3], delji[3], delki[3];
-
-  p_val6 = gp[14];
-  p_val8 = gp[33];
-  p_val9 = gp[16];
-  p_val10 = gp[17];
-
-  p_pen2 = gp[19];
-  p_pen3 = gp[20];
-  p_pen4 = gp[21];
-
-  p_coa2 = gp[2];
-  p_coa3 = gp[38];
-  p_coa4 = gp[30];
-
-  p_val3 = paramssing(itype).p_val3;
-  p_val5 = paramssing(itype).p_val5;
 
   const int j_start = d_bo_first[i];
   const int j_end = j_start + d_bo_num[i];
 
+  if (POPULATE) {
+    // Computes and stores SBO2, CSBO2, dSBO1, dSBO2
+    compute_angular_sbo(i, itype, j_start, j_end);
+  }
+
+  // Angular
+
+  // Count buffer size for `i`
+  int location_angular = 0; // dummy declaration
+  int count_angular = preprocess_angular<false>(i, itype, j_start, j_end, location_angular);
+  location_angular = Kokkos::atomic_fetch_add(&d_count_angular_torsion(0), count_angular);
+
+  if (POPULATE) {
+    // Fill buffer for `i`
+    preprocess_angular<true>(i, itype, j_start, j_end, location_angular);
+  }
+
+  // Torsion
+
+  const tagint itag = tag(i);
+  const X_FLOAT xtmp = x(i,0);
+  const X_FLOAT ytmp = x(i,1);
+  const X_FLOAT ztmp = x(i,2);
+
+  // Count buffer size for `i`
+  int location_torsion = 0; // dummy declaration
+  int count_torsion = preprocess_torsion<false>(i, itype, itag, xtmp, ytmp, ztmp, j_start, j_end, location_torsion);
+  location_torsion = Kokkos::atomic_fetch_add(&d_count_angular_torsion(1), count_torsion);
+
+  if (POPULATE) {
+    // Fill buffer for `i`
+    preprocess_torsion<true>(i, itype, itag, xtmp, ytmp, ztmp, j_start, j_end, location_torsion);
+  }
+
+}
+
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+KOKKOS_INLINE_FUNCTION
+void PairReaxFFKokkos<DeviceType>::compute_angular_sbo(int i, int itype, int j_start, int j_end) const {
+
+  F_FLOAT SBO2, CSBO2, dSBO1, dSBO2;
+
+  const F_FLOAT p_val8 = gp[33];
+  const F_FLOAT p_val9 = gp[16];
+
   const F_FLOAT Delta_val = d_total_bo[i] - paramssing(itype).valency_val;
 
-  SBOp = 0.0, prod_SBO = 1.0;
+  F_FLOAT SBOp = 0.0;
+  F_FLOAT prod_SBO = 1.0;
 
   for (int jj = j_start; jj < j_end; jj++) {
     int j = d_bo_list[jj];
     j &= NEIGHMASK;
     const int j_index = jj - j_start;
-    bo_ij = d_BO(i,j_index);
+    const F_FLOAT bo_ij = d_BO(i,j_index);
 
     SBOp += (d_BO_pi(i,j_index) + d_BO_pi2(i,j_index));
-    temp = SQR(bo_ij);
+    F_FLOAT temp = SQR(bo_ij);
     temp *= temp;
     temp *= temp;
     prod_SBO *= exp(-temp);
   }
 
+  F_FLOAT vlpadj;
+
   const F_FLOAT Delta_e = d_total_bo[i] - paramssing(itype).valency_e;
   const F_FLOAT vlpex = Delta_e - 2.0 * (int)(Delta_e/2.0);
   const F_FLOAT explp1 = exp(-gp[15] * SQR(2.0 + vlpex));
@@ -2728,51 +2616,51 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
     dSBO2 = (prod_SBO - 1.0) * (1.0 - p_val8 * d_dDelta_lp[i]);
   }
 
-  SBO = SBOp + (1.0 - prod_SBO) * (-d_Delta_boc[i] - p_val8 * vlpadj);
+  const F_FLOAT SBO = SBOp + (1.0 - prod_SBO) * (-d_Delta_boc[i] - p_val8 * vlpadj);
   dSBO1 = -8.0 * prod_SBO * (d_Delta_boc[i] + p_val8 * vlpadj);
 
   if (SBO <= 0.0) {
     SBO2 = 0.0;
     CSBO2 = 0.0;
   } else if (SBO > 0.0 && SBO <= 1.0) {
-    CSBO2 = pow(SBO, p_val9 - 1.0);
-    SBO2 = CSBO2*SBO;
-    CSBO2 = p_val9 * CSBO2;
+    SBO2 = pow(SBO, p_val9);
+    CSBO2 = p_val9 * pow(SBO, p_val9 - 1.0);
   } else if (SBO > 1.0 && SBO < 2.0) {
-    CSBO2 = pow(2.0 - SBO, p_val9 - 1.0);
-    SBO2 = 2.0 - CSBO2*(2.0 - SBO);
-    CSBO2 = p_val9 * CSBO2;
+    SBO2 = 2.0 - pow(2.0-SBO, p_val9);
+    CSBO2 = p_val9 * pow(2.0 - SBO, p_val9 - 1.0);
   } else {
     SBO2 = 2.0;
     CSBO2 = 0.0;
   }
-  expval6 = exp(p_val6 * d_Delta_boc[i]);
 
-  F_FLOAT CdDelta_i = 0.0;
-  F_FLOAT fitmp[3],fjtmp[3];
-  for (int j = 0; j < 3; j++) fitmp[j] = 0.0;
+  d_angular_intermediates(i,0) = SBO2;
+  d_angular_intermediates(i,1) = CSBO2;
+  d_angular_intermediates(i,2) = dSBO1;
+  d_angular_intermediates(i,3) = dSBO2;
+
+}
+
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+template<bool POPULATE>
+KOKKOS_INLINE_FUNCTION
+int PairReaxFFKokkos<DeviceType>::preprocess_angular(int i, int itype, int j_start, int j_end, int location_angular) const {
+
+  int count_angular = 0;
 
   for (int jj = j_start; jj < j_end; jj++) {
     int j = d_bo_list[jj];
     j &= NEIGHMASK;
     const int j_index = jj - j_start;
-    delij[0] = x(j,0) - xtmp;
-    delij[1] = x(j,1) - ytmp;
-    delij[2] = x(j,2) - ztmp;
-    const F_FLOAT rsqij = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
-    rij = sqrt(rsqij);
-    bo_ij = d_BO(i,j_index);
-    const int i_index = maxbo+j_index;
+    const F_FLOAT bo_ij = d_BO(i,j_index);
 
-    BOA_ij = bo_ij - thb_cut;
-    if (BOA_ij <= 0.0) continue;
+    if (bo_ij <= thb_cut) continue;
     if (i >= nlocal && j >= nlocal) continue;
 
+    const int i_index = maxbo + j_index;
     const int jtype = type(j);
 
-    F_FLOAT CdDelta_j = 0.0;
-    for (int k = 0; k < 3; k++) fjtmp[k] = 0.0;
-
     for (int kk = jj+1; kk < j_end; kk++) {
     //for (int kk = j_start; kk < j_end; kk++) {
       int k = d_bo_list[kk];
@@ -2780,6 +2668,217 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       if (k == j) continue;
 
       const int k_index = kk - j_start;
+      const F_FLOAT bo_ik = d_BO(i,k_index);
+
+      if (bo_ij <= thb_cut || bo_ik <= thb_cut || bo_ij * bo_ik <= thb_cutsq) continue;
+
+      const int ktype = type(k);
+
+      F_FLOAT p_val1 = paramsthbp(jtype,itype,ktype).p_val1;
+
+      if (fabs(p_val1) <= 0.001) continue;
+
+      if (POPULATE) {
+        reax_int4 pack;
+
+        // First pack stores i, j, k, and j_start
+        pack.i0 = i;
+        pack.i1 = j;
+        pack.i2 = k;
+        pack.i3 = j_start;
+        d_angular_pack(location_angular, 0) = pack;
+
+        // Second pack stores i_index, j_index, k_index, and j_end
+        pack.i0 = i_index;
+        pack.i1 = j_index;
+        pack.i2 = k_index;
+        pack.i3 = j_end;
+        d_angular_pack(location_angular, 1) = pack;
+
+        location_angular++;
+      } else {
+        count_angular++;
+      }
+    }
+  }
+
+  return count_angular;
+}
+
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+template<bool POPULATE>
+KOKKOS_INLINE_FUNCTION
+int PairReaxFFKokkos<DeviceType>::preprocess_torsion(int i, int itype, int itag,
+  F_FLOAT xtmp, F_FLOAT ytmp, F_FLOAT ztmp, int j_start, int j_end, int location_torsion) const {
+
+  // in reaxff_torsion_angles: j = i, k = j, i = k;
+
+  int count_torsion = 0;
+
+  for (int jj = j_start; jj < j_end; jj++) {
+    int j = d_bo_list[jj];
+    j &= NEIGHMASK;
+    const tagint jtag = tag(j);
+    const int jtype = type(j);
+    const int j_index = jj - j_start;
+
+    // skip half of the interactions
+    if (itag > jtag) {
+      if ((itag+jtag) % 2 == 0) continue;
+    } else if (itag < jtag) {
+      if ((itag+jtag) % 2 == 1) continue;
+    } else {
+      if (x(j,2)  < ztmp) continue;
+      if (x(j,2) == ztmp && x(j,1)  < ytmp) continue;
+      if (x(j,2) == ztmp && x(j,1) == ytmp && x(j,0) < xtmp) continue;
+    }
+
+    const F_FLOAT bo_ij = d_BO(i,j_index);
+    if (bo_ij < thb_cut) continue;
+
+    const int l_start = d_bo_first[j];
+    const int l_end = l_start + d_bo_num[j];
+
+    for (int kk = j_start; kk < j_end; kk++) {
+      int k = d_bo_list[kk];
+      k &= NEIGHMASK;
+      if (k == j) continue;
+      const int k_index = kk - j_start;
+
+      const F_FLOAT bo_ik = d_BO(i,k_index);
+      if (bo_ik < thb_cut) continue;
+
+      for (int ll = l_start; ll < l_end; ll++) {
+        int l = d_bo_list[ll];
+        l &= NEIGHMASK;
+        if (l == i) continue;
+        const int l_index = ll - l_start;
+
+        const F_FLOAT bo_jl = d_BO(j,l_index);
+        if (l == k || bo_jl < thb_cut || bo_ij*bo_ik*bo_jl < thb_cut) continue;
+
+        if (POPULATE) {
+          reax_int4 pack;
+
+          pack.i0 = i;
+          pack.i1 = j;
+          pack.i2 = k;
+          pack.i3 = l;
+          d_torsion_pack(location_torsion, 0) = pack;
+
+          pack.i0 = 0; // no i_index
+          pack.i1 = j_index;
+          pack.i2 = k_index;
+          pack.i3 = l_index;
+          d_torsion_pack(location_torsion, 1) = pack;
+
+          location_torsion++;
+        } else {
+          count_torsion++;
+        }
+      }
+    }
+  }
+
+  return count_torsion;
+}
+
+/* ---------------------------------------------------------------------- */
+
+template<class DeviceType>
+template<int NEIGHFLAG, int EVFLAG>
+KOKKOS_INLINE_FUNCTION
+void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngularPreprocessed<NEIGHFLAG,EVFLAG>, const int &apack, EV_FLOAT_REAX& ev) const {
+
+  auto v_f = ScatterViewHelper<typename NeedDup<NEIGHFLAG,DeviceType>::value,decltype(dup_f),decltype(ndup_f)>::get(dup_f,ndup_f);
+  auto a_f = v_f.template access<typename AtomicDup<NEIGHFLAG,DeviceType>::value>();
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,typename KKDevice<DeviceType>::value,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbo = d_Cdbo;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,typename KKDevice<DeviceType>::value,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbopi = d_Cdbopi;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,typename KKDevice<DeviceType>::value,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbopi2 = d_Cdbopi2;
+
+  auto v_CdDelta = ScatterViewHelper<typename NeedDup<NEIGHFLAG,DeviceType>::value,decltype(dup_CdDelta),decltype(ndup_CdDelta)>::get(dup_CdDelta,ndup_CdDelta);
+  auto a_CdDelta = v_CdDelta.template access<typename AtomicDup<NEIGHFLAG,DeviceType>::value>();
+
+  F_FLOAT temp, temp_bo_jt, pBOjt7;
+  F_FLOAT p_val1, p_val2, p_val3, p_val4, p_val5;
+  F_FLOAT p_val6, p_val7, p_val10;
+  F_FLOAT p_pen1, p_pen2, p_pen3, p_pen4;
+  F_FLOAT p_coa1, p_coa2, p_coa3, p_coa4;
+  F_FLOAT trm8, expval6, expval7, expval2theta, expval12theta, exp3ij, exp3jk;
+  F_FLOAT exp_pen2ij, exp_pen2jk, exp_pen3, exp_pen4, trm_pen34, exp_coa2;
+  F_FLOAT dSBO1, dSBO2, SBO2, CSBO2;
+  F_FLOAT CEval1, CEval2, CEval3, CEval4, CEval5, CEval6, CEval7, CEval8;
+  F_FLOAT CEpen1, CEpen2, CEpen3;
+  F_FLOAT e_ang, e_coa, e_pen;
+  F_FLOAT CEcoa1, CEcoa2, CEcoa3, CEcoa4, CEcoa5;
+  F_FLOAT Cf7ij, Cf7jk, Cf8j, Cf9j;
+  F_FLOAT f7_ij, f7_jk, f8_Dj, f9_Dj;
+  F_FLOAT Ctheta_0, theta_0, theta_00, theta, cos_theta, sin_theta;
+  F_FLOAT BOA_ij, BOA_ik, rij, bo_ij, bo_ik;
+  F_FLOAT dcos_theta_di[3], dcos_theta_dj[3], dcos_theta_dk[3];
+  F_FLOAT eng_tmp, fi_tmp[3], fj_tmp[3], fk_tmp[3];
+  F_FLOAT delij[3], delik[3], delji[3], delki[3];
+
+  p_val6 = gp[14];
+  p_val10 = gp[17];
+
+  p_pen2 = gp[19];
+  p_pen3 = gp[20];
+  p_pen4 = gp[21];
+
+  p_coa2 = gp[2];
+  p_coa3 = gp[38];
+  p_coa4 = gp[30];
+
+  reax_int4 pack = d_angular_pack(apack,0);
+  const int i = pack.i0;
+  const int j = pack.i1;
+  const int k = pack.i2;
+  const int j_start = pack.i3;
+
+  pack = d_angular_pack(apack, 1);
+  const int i_index = pack.i0;
+  const int j_index = pack.i1;
+  const int k_index = pack.i2;
+  const int j_end = pack.i3;
+
+  const int itype = type(i);
+  const X_FLOAT xtmp = x(i,0);
+  const X_FLOAT ytmp = x(i,1);
+  const X_FLOAT ztmp = x(i,2);
+
+  p_val3 = paramssing(itype).p_val3;
+  p_val5 = paramssing(itype).p_val5;
+
+  const F_FLOAT Delta_val = d_total_bo[i] - paramssing(itype).valency_val;
+
+  SBO2 = d_angular_intermediates(i, 0);
+  CSBO2 = d_angular_intermediates(i, 1);
+  dSBO1 = d_angular_intermediates(i, 2);
+  dSBO2 = d_angular_intermediates(i, 3);
+
+  expval6 = exp(p_val6 * d_Delta_boc[i]);
+
+  F_FLOAT CdDelta_i = 0.0;
+  F_FLOAT fitmp[3],fjtmp[3];
+  for (int j = 0; j < 3; j++) fitmp[j] = 0.0;
+
+    delij[0] = x(j,0) - xtmp;
+    delij[1] = x(j,1) - ytmp;
+    delij[2] = x(j,2) - ztmp;
+    const F_FLOAT rsqij = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
+    rij = sqrt(rsqij);
+    bo_ij = d_BO(i,j_index);
+
+    BOA_ij = bo_ij - thb_cut;
+
+    const int jtype = type(j);
+
+    F_FLOAT CdDelta_j = 0.0;
+    for (int k = 0; k < 3; k++) fjtmp[k] = 0.0;
+
       delik[0] = x(k,0) - xtmp;
       delik[1] = x(k,1) - ytmp;
       delik[2] = x(k,2) - ztmp;
@@ -2788,8 +2887,6 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       bo_ik = d_BO(i,k_index);
       BOA_ik   = bo_ik - thb_cut;
 
-      if (BOA_ik <= 0.0 || bo_ij <= thb_cut || bo_ik <= thb_cut || bo_ij * bo_ik <= thb_cutsq) continue;
-
       const int ktype = type(k);
 
       // theta and derivatives
@@ -2812,8 +2909,6 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       if (sin_theta < 1.0e-5) sin_theta = 1.0e-5;
       p_val1 = paramsthbp(jtype,itype,ktype).p_val1;
 
-      if (fabs(p_val1) <= 0.001) continue;
-
       // ANGLE ENERGY
 
       p_val1 = paramsthbp(jtype,itype,ktype).p_val1;
@@ -2822,16 +2917,12 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       p_val7 = paramsthbp(jtype,itype,ktype).p_val7;
       theta_00 = paramsthbp(jtype,itype,ktype).theta_00;
 
-      F_FLOAT tmp_var;
-      tmp_var = pow(BOA_ij, p_val4 - 1.0);
-      exp3ij = exp(-p_val3 * tmp_var * BOA_ij);
+      exp3ij = exp(-p_val3 * pow(BOA_ij, p_val4));
       f7_ij = 1.0 - exp3ij;
-      Cf7ij = p_val3 * p_val4 * tmp_var * exp3ij;
-
-      tmp_var =  pow(BOA_ik, p_val4 - 1.0);
-      exp3jk = exp(-p_val3 * tmp_var * BOA_ik);
+      Cf7ij = p_val3 * p_val4 * pow(BOA_ij, p_val4 - 1.0) * exp3ij;
+      exp3jk = exp(-p_val3 * pow(BOA_ik, p_val4));
       f7_jk = 1.0 - exp3jk;
-      Cf7jk = p_val3 * p_val4 * tmp_var * exp3jk;
+      Cf7jk = p_val3 * p_val4 * pow(BOA_ik, p_val4 - 1.0) * exp3jk;
       expval7 = exp(-p_val7 * d_Delta_boc[i]);
       trm8 = 1.0 + expval6 + expval7;
       f8_Dj = p_val5 - ((p_val5 - 1.0) * (2.0 + expval6) / trm8);
@@ -2857,7 +2948,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       CEval8 = -CEval4 / sin_theta;
 
       e_ang = f7_ij * f7_jk * f8_Dj * expval12theta;
-      if (EVFLAG && eflag_global) ev.ereax[3] += e_ang;
+      if (eflag) ev.ereax[3] += e_ang;
 
       // Penalty energy
 
@@ -2873,7 +2964,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
        (-p_pen3 * exp_pen3 + p_pen4 * exp_pen4))/(trm_pen34*trm_pen34);
 
       e_pen = p_pen1 * f9_Dj * exp_pen2ij * exp_pen2jk;
-      if (EVFLAG && eflag_global) ev.ereax[4] += e_pen;
+      if (eflag) ev.ereax[4] += e_pen;
 
       CEpen1 = e_pen * Cf9j / f9_Dj;
       temp   = -2.0 * p_pen2 * e_pen;
@@ -2896,7 +2987,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
       CEcoa4 = -2 * p_coa3 * (d_total_bo[j]-BOA_ij) * e_coa;
       CEcoa5 = -2 * p_coa3 * (d_total_bo[k]-BOA_ik) * e_coa;
 
-      if (EVFLAG && eflag_global) ev.ereax[5] += e_coa;
+      if (eflag) ev.ereax[5] += e_coa;
 
       // Forces
 
@@ -2919,8 +3010,8 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
         pBOjt7 = temp * temp * temp_bo_jt;
 
         a_Cdbo(i,l_index) += (CEval6 * pBOjt7);
-        d_Cdbopi(i,l_index) += CEval5;
-        d_Cdbopi2(i,l_index) += CEval5;
+        a_Cdbopi(i,l_index) += CEval5;
+        a_Cdbopi2(i,l_index) += CEval5;
       }
 
       for (int d = 0; d < 3; d++) fi_tmp[d] = CEval8 * dcos_theta_di[d];
@@ -2940,10 +3031,8 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
         if (vflag_either) this->template v_tally3<NEIGHFLAG>(ev,i,j,k,fj_tmp,fk_tmp,delji,delki);
       }
 
-    }
     a_CdDelta[j] += CdDelta_j;
     for (int d = 0; d < 3; d++) a_f(j,d) += fjtmp[d];
-  }
   a_CdDelta[i] += CdDelta_i;
   for (int d = 0; d < 3; d++) a_f(i,d) += fitmp[d];
 }
@@ -2951,84 +3040,10 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLA
 template<class DeviceType>
 template<int NEIGHFLAG, int EVFLAG>
 KOKKOS_INLINE_FUNCTION
-void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngular<NEIGHFLAG,EVFLAG>, const int &ii) const {
+void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeAngularPreprocessed<NEIGHFLAG,EVFLAG>, const int &apack) const {
   EV_FLOAT_REAX ev;
-  this->template operator()<NEIGHFLAG,EVFLAG>(TagPairReaxComputeAngular<NEIGHFLAG,EVFLAG>(), ii, ev);
-}
+  this->template operator()<NEIGHFLAG,EVFLAG>(TagPairReaxComputeAngularPreprocessed<NEIGHFLAG,EVFLAG>(), apack, ev);
 
-/* ---------------------------------------------------------------------- */
-
-template<class DeviceType>
-KOKKOS_INLINE_FUNCTION
-void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsionPreview, const int &ii) const {
-
-  F_FLOAT  bo_ij, bo_ik;
-  int counter = 0;
-
-  const int i = d_ilist[ii];
-  const int itype = type(i);
-  const tagint itag = tag(i);
-  const X_FLOAT xtmp = x(i,0);
-  const X_FLOAT ytmp = x(i,1);
-  const X_FLOAT ztmp = x(i,2);
-
-  const int j_start = d_bo_first[i];
-  const int j_end = j_start + d_bo_num[i];
-
-  int jj_min =  j_end+1;
-  int jj_max =  j_start-1;
-  int kk_min =  j_end+1;
-  int kk_max =  j_start-1;
-
-  for (int jj = j_start; jj < j_end; jj++) {
-
-   // j_counter1++;
-
-    int j = d_bo_list[jj];
-    j &= NEIGHMASK;
-    const tagint jtag = tag(j);
-    const int j_index = jj - j_start;
-
-    // skip half of the interactions
-    if (itag > jtag) {
-      if ((itag+jtag) % 2 == 0) continue;
-    } else if (itag < jtag) {
-      if ((itag+jtag) % 2 == 1) continue;
-    } else {
-      if (x(j,2)  < ztmp) continue;
-      if (x(j,2) == ztmp && x(j,1)  < ytmp) continue;
-      if (x(j,2) == ztmp && x(j,1) == ytmp && x(j,0) < xtmp) continue;
-    }
-
-    bo_ij = d_BO(i,j_index);
-    if (bo_ij < thb_cut) continue;
-
-    const int l_start = d_bo_first[j];
-    const int l_end = l_start + d_bo_num[j];
-
-    for (int kk = j_start; kk < j_end; kk++) {
-      int k = d_bo_list[kk];
-      k &= NEIGHMASK;
-      if (k == j) continue;
-      const int k_index = kk - j_start;
-      bo_ik = d_BO(i,k_index);
-      if (bo_ik < thb_cut) continue;
-
-      counter++;
-      jj_min = jj < jj_min ? jj : jj_min;
-      jj_max = jj >= jj_max ? (jj+1) : jj_max;
-      kk_min = kk < kk_min ? kk : kk_min;
-      kk_max = kk >= kk_max ? (kk+1) : kk_max;
-    }
-
-  }
-  counters[ii] = counter;
-  if (counter > 0) {
-    counters_jj_min[ii] = jj_min;
-    counters_jj_max[ii] = jj_max;
-    counters_kk_min[ii] = kk_min;
-    counters_kk_max[ii] = kk_max;
-  }
 }
 
 /* ---------------------------------------------------------------------- */
@@ -3036,19 +3051,16 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsionPreview,
 template<class DeviceType>
 template<int NEIGHFLAG, int EVFLAG>
 KOKKOS_INLINE_FUNCTION
-void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsion<NEIGHFLAG,EVFLAG>,  const int &iii, EV_FLOAT_REAX& ev) const {
+void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsionPreprocessed<NEIGHFLAG,EVFLAG>, const int &tpack, EV_FLOAT_REAX& ev) const {
 
-  constexpr int blocksize = PairReaxFFKokkos<DeviceType>::compute_torsion_blocksize;
+  auto v_f = ScatterViewHelper<typename NeedDup<NEIGHFLAG,DeviceType>::value,decltype(dup_f),decltype(ndup_f)>::get(dup_f,ndup_f);
+  auto a_f = v_f.template access<typename AtomicDup<NEIGHFLAG,DeviceType>::value>();
 
-  const int ii = counters[iii];
-
-  const auto v_f = ScatterViewHelper<NeedDup_v<NEIGHFLAG,DeviceType>,decltype(dup_f),decltype(ndup_f)>::get(dup_f,ndup_f);
-  const auto a_f = v_f.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
-
-  const auto v_CdDelta = ScatterViewHelper<NeedDup_v<NEIGHFLAG,DeviceType>,decltype(dup_CdDelta),decltype(ndup_CdDelta)>::get(dup_CdDelta,ndup_CdDelta);
-  const auto a_CdDelta = v_CdDelta.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
-  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbo = d_Cdbo;
-  //auto a_Cdbo = dup_Cdbo.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
+  auto v_CdDelta = ScatterViewHelper<typename NeedDup<NEIGHFLAG,DeviceType>::value,decltype(dup_CdDelta),decltype(ndup_CdDelta)>::get(dup_CdDelta,ndup_CdDelta);
+  auto a_CdDelta = v_CdDelta.template access<typename AtomicDup<NEIGHFLAG,DeviceType>::value>();
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,typename KKDevice<DeviceType>::value,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbo = d_Cdbo;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,typename KKDevice<DeviceType>::value,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbopi = d_Cdbopi;
+  //auto a_Cdbo = dup_Cdbo.template access<typename AtomicDup<NEIGHFLAG,DeviceType>::value>();
 
   // in reaxff_torsion_angles: j = i, k = j, i = k;
 
@@ -3064,8 +3076,7 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsion<NEIGHFLA
   F_FLOAT e_tor, e_con, eng_tmp;
 
   F_FLOAT delij[3], delik[3], deljl[3], dellk[3], delil[3], delkl[3];
-  F_FLOAT fi_tmp[3], fj_tmp[3], fk_tmp[3], fl_tmp[3];
-  F_FLOAT dcos_omega_di[3], dcos_omega_dj[3], dcos_omega_dk[3], dcos_omega_dl[3];
+  F_FLOAT fi_tmp[3], fj_tmp[3], fk_tmp[3];
   F_FLOAT dcos_ijk_di[3], dcos_ijk_dj[3], dcos_ijk_dk[3], dcos_jil_di[3], dcos_jil_dj[3], dcos_jil_dk[3];
 
   F_FLOAT p_tor2 = gp[23];
@@ -3073,400 +3084,312 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsion<NEIGHFLA
   F_FLOAT p_tor4 = gp[25];
   F_FLOAT p_cot2 = gp[27];
 
-  const int i = d_ilist[ii];
-  const int jj_start = counters_jj_min[ii];
-  const int jj_stop = counters_jj_max[ii];
-  const int kk_start = counters_kk_min[ii];
-  const int kk_stop = counters_kk_max[ii];
+  reax_int4 pack = d_torsion_pack(tpack,0);
+  const int i = pack.i0;
+  const int j = pack.i1;
+  const int k = pack.i2;
+  const int l = pack.i3;
+
+  pack = d_torsion_pack(tpack, 1);
+  //const int i = pack.i0;
+  const int j_index = pack.i1;
+  const int k_index = pack.i2;
+  const int l_index = pack.i3;
 
   const int itype = type(i);
-  const tagint itag = tag(i);
   const X_FLOAT xtmp = x(i,0);
   const X_FLOAT ytmp = x(i,1);
   const X_FLOAT ztmp = x(i,2);
   Delta_i = d_Delta_boc[i];
 
-  const int j_start = d_bo_first[i];
-  const int j_end = j_start + d_bo_num[i];
+    const int jtype = type(j);
 
-  F_FLOAT fitmp[3], fjtmp[3], fktmp[3];
-  for(int j = 0; j < 3; j++) fitmp[j] = 0.0;
-  F_FLOAT CdDelta_i = 0.0;
+    bo_ij = d_BO(i,j_index);
 
-  int nnz_jj;
-  blocking_t selected_jj[blocksize];
-  int jj_current = jj_start;
+    delij[0] = x(j,0) - xtmp;
+    delij[1] = x(j,1) - ytmp;
+    delij[2] = x(j,2) - ztmp;
+    const F_FLOAT rsqij = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
+    const F_FLOAT rij = sqrt(rsqij);
 
-  while (jj_current < jj_stop) {
+    BOA_ij = bo_ij - thb_cut;
+    Delta_j = d_Delta_boc[j];
+    exp_tor2_ij = exp(-p_tor2 * BOA_ij);
+    exp_cot2_ij = exp(-p_cot2 * SQR(BOA_ij - 1.5));
+    exp_tor3_DiDj = exp(-p_tor3 * (Delta_i + Delta_j));
+    exp_tor4_DiDj = exp(p_tor4  * (Delta_i + Delta_j));
+    exp_tor34_inv = 1.0 / (1.0 + exp_tor3_DiDj + exp_tor4_DiDj);
+    f11_DiDj = (2.0 + exp_tor3_DiDj) * exp_tor34_inv;
 
-    nnz_jj=0;
-    while (nnz_jj < blocksize) {
-      int jj = jj_current;
-      int j = d_bo_list[jj];
-      j &= NEIGHMASK;
-      const tagint jtag = tag(j);
-      const int j_index = jj - j_start;
-      bool continue_flag = false;
+      const int ktype = type(k);
 
-      // skip half of the interactions
-      if (itag > jtag) {
-        if ((itag+jtag) % 2 == 0) continue_flag = true;
-      } else if (itag < jtag) {
-        if ((itag+jtag) % 2 == 1) continue_flag = true;
-      } else {
-        if (x(j,2)  < ztmp) continue_flag = true;
-        else if (x(j,2) == ztmp && x(j,1)  < ytmp) continue_flag = true;
-        else if (x(j,2) == ztmp && x(j,1) == ytmp && x(j,0) < xtmp) continue_flag = true;
+      bo_ik = d_BO(i,k_index);
+
+      BOA_ik = bo_ik - thb_cut;
+      for (int d = 0; d < 3; d ++) delik[d] = x(k,d) - x(i,d);
+      const F_FLOAT rsqik = delik[0]*delik[0] + delik[1]*delik[1] + delik[2]*delik[2];
+      const F_FLOAT rik = sqrt(rsqik);
+
+      cos_ijk = (delij[0]*delik[0]+delij[1]*delik[1]+delij[2]*delik[2])/(rij*rik);
+      if (cos_ijk > 1.0) cos_ijk  = 1.0;
+      if (cos_ijk < -1.0) cos_ijk  = -1.0;
+      theta_ijk = acos(cos_ijk);
+
+      // dcos_ijk
+      const F_FLOAT inv_dists = 1.0 / (rij * rik);
+      const F_FLOAT cos_ijk_tmp = cos_ijk / ((rij*rik)*(rij*rik));
+
+      for (int d = 0; d < 3; d++) {
+        dcos_ijk_di[d] = -(delik[d] + delij[d]) * inv_dists + cos_ijk_tmp * (rsqik * delij[d] + rsqij * delik[d]);
+        dcos_ijk_dj[d] = delik[d] * inv_dists - cos_ijk_tmp * rsqik * delij[d];
+        dcos_ijk_dk[d] = delij[d] * inv_dists - cos_ijk_tmp * rsqij * delik[d];
       }
 
-      bo_ij = d_BO(i,j_index);
-      if (bo_ij < thb_cut) continue_flag = true;
+      sin_ijk = sin(theta_ijk);
+      if (sin_ijk >= 0 && sin_ijk <= 1e-10)
+        tan_ijk_i = cos_ijk / 1e-10;
+      else if (sin_ijk <= 0 && sin_ijk >= -1e-10)
+        tan_ijk_i = -cos_ijk / 1e-10;
+      else tan_ijk_i = cos_ijk / sin_ijk;
 
-      if (!continue_flag) {
-         selected_jj[nnz_jj] = jj_current-jj_start;
-         nnz_jj++;
-      }
-      jj_current++;
-      if (jj_current == jj_stop) break;
-    }
+      exp_tor2_ik = exp(-p_tor2 * BOA_ik);
+      exp_cot2_ik = exp(-p_cot2 * SQR(BOA_ik -1.5));
 
-    for (int jj_inner = 0; jj_inner < nnz_jj; jj_inner++) {
-      const int jj = jj_start + selected_jj[jj_inner];
-      int j = d_bo_list[jj];
-      j &= NEIGHMASK;
-      const tagint jtag = tag(j);
-      const int jtype = type(j);
-      const int j_index = jj - j_start;
-      bo_ij = d_BO(i,j_index);
+        const int ltype = type(l);
 
+        bo_jl = d_BO(j,l_index);
 
-      delij[0] = x(j,0) - xtmp;
-      delij[1] = x(j,1) - ytmp;
-      delij[2] = x(j,2) - ztmp;
-      const F_FLOAT rsqij = delij[0]*delij[0] + delij[1]*delij[1] + delij[2]*delij[2];
-      const F_FLOAT rij = sqrt(rsqij);
+        for (int d = 0; d < 3; d ++) deljl[d] = x(l,d) - x(j,d);
+        const F_FLOAT rsqjl = deljl[0]*deljl[0] + deljl[1]*deljl[1] + deljl[2]*deljl[2];
+        const F_FLOAT rjl = sqrt(rsqjl);
+        BOA_jl = bo_jl - thb_cut;
 
-      BOA_ij = bo_ij - thb_cut;
-      Delta_j = d_Delta_boc[j];
-      exp_tor2_ij = exp(-p_tor2 * BOA_ij);
-      exp_cot2_ij = exp(-p_cot2 * SQR(BOA_ij - 1.5));
-      exp_tor3_DiDj = exp(-p_tor3 * (Delta_i + Delta_j));
-      exp_tor4_DiDj = exp(p_tor4  * (Delta_i + Delta_j));
-      exp_tor34_inv = 1.0 / (1.0 + exp_tor3_DiDj + exp_tor4_DiDj);
-      f11_DiDj = (2.0 + exp_tor3_DiDj) * exp_tor34_inv;
+        cos_jil = -(delij[0]*deljl[0]+delij[1]*deljl[1]+delij[2]*deljl[2])/(rij*rjl);
+        if (cos_jil > 1.0) cos_jil  = 1.0;
+        if (cos_jil < -1.0) cos_jil  = -1.0;
+        theta_jil = acos(cos_jil);
 
-      const int l_start = d_bo_first[j];
-      const int l_end = l_start + d_bo_num[j];
+        // dcos_jil
+        const F_FLOAT inv_distjl = 1.0 / (rij * rjl);
+        const F_FLOAT cos_jil_tmp = cos_jil / ((rij*rjl)*(rij*rjl));
 
-      for(int k = 0; k < 3; k++) fjtmp[k] = 0.0;
-      F_FLOAT CdDelta_j = 0.0;
-
-      int nnz_kk;
-      int selected_kk[blocksize];
-      int kk_current = kk_start;
-
-      while (kk_current < kk_stop) {
-        nnz_kk=0;
-        while (nnz_kk < blocksize) {
-          int kk = kk_current;
-          int k = d_bo_list[kk];
-          k &= NEIGHMASK;
-          bool continue_flag = false;
-
-          if (k == j)
-            continue_flag = true;
-          else{
-            const int k_index = kk - j_start;
-            bo_ik = d_BO(i,k_index);
-            if (bo_ik < thb_cut) continue_flag = true;
-          }
-
-          if (!continue_flag) {
-            selected_kk[nnz_kk] = kk_current-kk_start;
-            nnz_kk++;
-          }
-          kk_current++;
-          if (kk_current == kk_stop) break;
+        for (int d = 0; d < 3; d++) {
+          dcos_jil_di[d] = deljl[d] * inv_distjl - cos_jil_tmp * rsqjl * -delij[d];
+          dcos_jil_dj[d] = (-deljl[d] + delij[d]) * inv_distjl - cos_jil_tmp * (rsqjl * delij[d] + rsqij * -deljl[d]);
+          dcos_jil_dk[d] = -delij[d] * inv_distjl - cos_jil_tmp * rsqij * deljl[d];
         }
 
-        for (int kk_inner = 0; kk_inner < nnz_kk; kk_inner++) {
-          const int kk = kk_start + selected_kk[kk_inner];
-          int k = d_bo_list[kk];
-          k &= NEIGHMASK;
-          const int ktype = type(k);
-          const int k_index = kk - j_start;
-          bo_ik = d_BO(i,k_index);
+        sin_jil = sin(theta_jil);
+        if (sin_jil >= 0 && sin_jil <= 1e-10)
+          tan_jil_i = cos_jil / 1e-10;
+        else if (sin_jil <= 0 && sin_jil >= -1e-10)
+          tan_jil_i = -cos_jil / 1e-10;
+        else tan_jil_i = cos_jil / sin_jil;
 
-          BOA_ik = bo_ik - thb_cut;
-          for (int d = 0; d < 3; d ++) delik[d] = x(k,d) - x(i,d);
-          const F_FLOAT rsqik = delik[0]*delik[0] + delik[1]*delik[1] + delik[2]*delik[2];
-          const F_FLOAT rik = sqrt(rsqik);
+        for (int d = 0; d < 3; d ++) dellk[d] = x(k,d) - x(l,d);
+        const F_FLOAT rsqlk = dellk[0]*dellk[0] + dellk[1]*dellk[1] + dellk[2]*dellk[2];
+        const F_FLOAT rlk = sqrt(rsqlk);
 
-          cos_ijk = (delij[0]*delik[0]+delij[1]*delik[1]+delij[2]*delik[2])/(rij*rik);
-          if (cos_ijk > 1.0) cos_ijk  = 1.0;
-          else if (cos_ijk < -1.0) cos_ijk  = -1.0;
-          theta_ijk = acos(cos_ijk);
+        F_FLOAT unnorm_cos_omega, unnorm_sin_omega, omega;
+        F_FLOAT htra, htrb, htrc, hthd, hthe, hnra, hnrc, hnhd, hnhe;
+        F_FLOAT arg, poem, tel;
+        F_FLOAT cross_ij_jl[3];
 
-          // dcos_ijk
-          const F_FLOAT inv_dists = 1.0 / (rij * rik);
-          const F_FLOAT cos_ijk_tmp = cos_ijk *inv_dists * inv_dists;
+        // omega
 
-          for(int d = 0; d < 3; d++) {
-            dcos_ijk_di[d] = -(delik[d] + delij[d]) * inv_dists + cos_ijk_tmp * (rsqik * delij[d] + rsqij * delik[d]);
-            dcos_ijk_dj[d] = delik[d] * inv_dists - cos_ijk_tmp * rsqik * delij[d];
-            dcos_ijk_dk[d] = delij[d] * inv_dists - cos_ijk_tmp * rsqij * delik[d];
-          }
+        F_FLOAT dot_ij_jk = -(delij[0]*delik[0]+delij[1]*delik[1]+delij[2]*delik[2]);
+        F_FLOAT dot_ij_lj = delij[0]*deljl[0]+delij[1]*deljl[1]+delij[2]*deljl[2];
+        F_FLOAT dot_ik_jl = delik[0]*deljl[0]+delik[1]*deljl[1]+delik[2]*deljl[2];
+        unnorm_cos_omega = dot_ij_jk * dot_ij_lj + rsqij * dot_ik_jl;
 
-          sin_ijk = sin(theta_ijk);
-          if (sin_ijk >= 0 && sin_ijk <= 1e-10)
-            tan_ijk_i = cos_ijk / 1e-10;
-          else if (sin_ijk <= 0 && sin_ijk >= -1e-10)
-            tan_ijk_i = -cos_ijk / 1e-10;
-          else tan_ijk_i = cos_ijk / sin_ijk;
+        cross_ij_jl[0] = delij[1]*deljl[2] - delij[2]*deljl[1];
+        cross_ij_jl[1] = delij[2]*deljl[0] - delij[0]*deljl[2];
+        cross_ij_jl[2] = delij[0]*deljl[1] - delij[1]*deljl[0];
 
-          exp_tor2_ik = exp(-p_tor2 * BOA_ik);
-          exp_cot2_ik = exp(-p_cot2 * SQR(BOA_ik -1.5));
+        unnorm_sin_omega = -rij*(delik[0]*cross_ij_jl[0]+delik[1]*cross_ij_jl[1]+delik[2]*cross_ij_jl[2]);
+        omega = atan2(unnorm_sin_omega, unnorm_cos_omega);
 
-          for(int l = 0; l < 3; l++) fktmp[l] = 0.0;
+        htra = rik + cos_ijk * (rjl * cos_jil - rij);
+        htrb = rij - rik * cos_ijk - rjl * cos_jil;
+        htrc = rjl + cos_jil * (rik * cos_ijk - rij);
+        hthd = rik * sin_ijk * (rij - rjl * cos_jil);
+        hthe = rjl * sin_jil * (rij - rik * cos_ijk);
+        hnra = rjl * sin_ijk * sin_jil;
+        hnrc = rik * sin_ijk * sin_jil;
+        hnhd = rik * rjl * cos_ijk * sin_jil;
+        hnhe = rik * rjl * sin_ijk * cos_jil;
 
-          for (int ll = l_start; ll < l_end; ll++) {
-            int l = d_bo_list[ll];
-            l &= NEIGHMASK;
-            if (l == i) continue;
-            const int ltype = type(l);
-            const int l_index = ll - l_start;
+        poem = 2.0 * rik * rjl * sin_ijk * sin_jil;
+        if (poem < 1e-20) poem = 1e-20;
 
-            bo_jl = d_BO(j,l_index);
-            if (l == k || bo_jl < thb_cut || bo_ij*bo_ik*bo_jl < thb_cut) continue;
+        tel = SQR(rik) + SQR(rij) + SQR(rjl) - SQR(rlk) -
+              2.0 * (rik * rij * cos_ijk - rik * rjl * cos_ijk * cos_jil + rij * rjl * cos_jil);
 
-            for (int d = 0; d < 3; d ++) deljl[d] = x(l,d) - x(j,d);
-            const F_FLOAT rsqjl = deljl[0]*deljl[0] + deljl[1]*deljl[1] + deljl[2]*deljl[2];
-            const F_FLOAT rjl = sqrt(rsqjl);
-            BOA_jl = bo_jl - thb_cut;
+        F_FLOAT inv_poem = 1.0 / poem;
 
-            cos_jil = -(delij[0]*deljl[0]+delij[1]*deljl[1]+delij[2]*deljl[2])/(rij*rjl);
-            if (cos_jil > 1.0) cos_jil  = 1.0;
-            else if (cos_jil < -1.0) cos_jil  = -1.0;
-            theta_jil = acos(cos_jil);
+        arg = tel * inv_poem;
+        if (arg >  1.0) arg =  1.0;
+        if (arg < -1.0) arg = -1.0;
 
-            // dcos_jil
-            const F_FLOAT inv_distjl = 1.0 / (rij * rjl);
-            const F_FLOAT cos_jil_tmp = cos_jil / ((rij*rjl)*(rij*rjl));
+        F_FLOAT sin_ijk_rnd = sin_ijk;
+        F_FLOAT sin_jil_rnd = sin_jil;
 
-            for(int d = 0; d < 3; d++) {
-              dcos_jil_di[d] = deljl[d] * inv_distjl - cos_jil_tmp * rsqjl * -delij[d];
-              dcos_jil_dj[d] = (-deljl[d] + delij[d]) * inv_distjl - cos_jil_tmp * (rsqjl * delij[d] + rsqij * -deljl[d]);
-              dcos_jil_dk[d] = -delij[d] * inv_distjl - cos_jil_tmp * rsqij * deljl[d];
-            }
+        if (sin_ijk >= 0 && sin_ijk <= 1e-10) sin_ijk_rnd = 1e-10;
+        else if (sin_ijk <= 0 && sin_ijk >= -1e-10) sin_ijk_rnd = -1e-10;
+        if (sin_jil >= 0 && sin_jil <= 1e-10) sin_jil_rnd = 1e-10;
+        else if (sin_jil <= 0 && sin_jil >= -1e-10) sin_jil_rnd = -1e-10;
 
-            sin_jil = sin(theta_jil);
-            if (sin_jil >= 0 && sin_jil <= 1e-10)
-              tan_jil_i = cos_jil / 1e-10;
-            else if (sin_jil <= 0 && sin_jil >= -1e-10)
-              tan_jil_i = -cos_jil / 1e-10;
-            else tan_jil_i = cos_jil / sin_jil;
+        cos_omega = cos(omega);
+        cos2omega = cos(2. * omega);
+        cos3omega = cos(3. * omega);
 
-            for (int d = 0; d < 3; d ++) dellk[d] = x(k,d) - x(l,d);
-            const F_FLOAT rsqlk = dellk[0]*dellk[0] + dellk[1]*dellk[1] + dellk[2]*dellk[2];
-            const F_FLOAT rlk = sqrt(rsqlk);
+        // torsion energy
 
-            F_FLOAT unnorm_cos_omega, unnorm_sin_omega, omega;
-            F_FLOAT htra, htrb, htrc, hthd, hthe, hnra, hnrc, hnhd, hnhe;
-            F_FLOAT arg, poem, tel;
-            F_FLOAT cross_ij_jl[3];
+        p_tor1 = paramsfbp(ktype,itype,jtype,ltype).p_tor1;
+        p_cot1 = paramsfbp(ktype,itype,jtype,ltype).p_cot1;
+        V1 = paramsfbp(ktype,itype,jtype,ltype).V1;
+        V2 = paramsfbp(ktype,itype,jtype,ltype).V2;
+        V3 = paramsfbp(ktype,itype,jtype,ltype).V3;
 
-            // omega
+        exp_tor1 = exp(p_tor1 * SQR(2.0 - d_BO_pi(i,j_index) - f11_DiDj));
+        exp_tor2_jl = exp(-p_tor2 * BOA_jl);
+        exp_cot2_jl = exp(-p_cot2 * SQR(BOA_jl - 1.5));
+        fn10 = (1.0 - exp_tor2_ik) * (1.0 - exp_tor2_ij) * (1.0 - exp_tor2_jl);
 
-            F_FLOAT dot_ij_jk = -(delij[0]*delik[0]+delij[1]*delik[1]+delij[2]*delik[2]);
-            F_FLOAT dot_ij_lj = delij[0]*deljl[0]+delij[1]*deljl[1]+delij[2]*deljl[2];
-            F_FLOAT dot_ik_jl = delik[0]*deljl[0]+delik[1]*deljl[1]+delik[2]*deljl[2];
-            unnorm_cos_omega = dot_ij_jk * dot_ij_lj + rsqij * dot_ik_jl;
+        CV = 0.5 * (V1 * (1.0 + cos_omega) + V2 * exp_tor1 * (1.0 - cos2omega) + V3 * (1.0 + cos3omega));
 
-            cross_ij_jl[0] = delij[1]*deljl[2] - delij[2]*deljl[1];
-            cross_ij_jl[1] = delij[2]*deljl[0] - delij[0]*deljl[2];
-            cross_ij_jl[2] = delij[0]*deljl[1] - delij[1]*deljl[0];
+        e_tor = fn10 * sin_ijk * sin_jil * CV;
+        if (eflag) ev.ereax[6] += e_tor;
 
-            unnorm_sin_omega = -rij*(delik[0]*cross_ij_jl[0]+delik[1]*cross_ij_jl[1]+delik[2]*cross_ij_jl[2]);
-            omega = atan2(unnorm_sin_omega, unnorm_cos_omega);
+        dfn11 = (-p_tor3 * exp_tor3_DiDj + (p_tor3 * exp_tor3_DiDj - p_tor4 * exp_tor4_DiDj) *
+                (2.0 + exp_tor3_DiDj) * exp_tor34_inv) * exp_tor34_inv;
 
-            htra = rik + cos_ijk * (rjl * cos_jil - rij);
-            htrb = rij - rik * cos_ijk - rjl * cos_jil;
-            htrc = rjl + cos_jil * (rik * cos_ijk - rij);
-            hthd = rik * sin_ijk * (rij - rjl * cos_jil);
-            hthe = rjl * sin_jil * (rij - rik * cos_ijk);
-            hnra = rjl * sin_ijk * sin_jil;
-            hnrc = rik * sin_ijk * sin_jil;
-            hnhd = rik * rjl * cos_ijk * sin_jil;
-            hnhe = rik * rjl * sin_ijk * cos_jil;
+        CEtors1 = sin_ijk * sin_jil * CV;
 
-            poem = 2.0 * rik * rjl * sin_ijk * sin_jil;
-            if (poem < 1e-20) poem = 1e-20;
+        CEtors2 = -fn10 * 2.0 * p_tor1 * V2 * exp_tor1 * (2.0 - d_BO_pi(i,j_index) - f11_DiDj) *
+                  (1.0 - SQR(cos_omega)) * sin_ijk * sin_jil;
+        CEtors3 = CEtors2 * dfn11;
 
-            tel = SQR(rik) + SQR(rij) + SQR(rjl) - SQR(rlk) -
-                  2.0 * (rik * rij * cos_ijk - rik * rjl * cos_ijk * cos_jil + rij * rjl * cos_jil);
+        CEtors4 = CEtors1 * p_tor2 * exp_tor2_ik * (1.0 - exp_tor2_ij) * (1.0 - exp_tor2_jl);
+        CEtors5 = CEtors1 * p_tor2 * (1.0 - exp_tor2_ik) * exp_tor2_ij * (1.0 - exp_tor2_jl);
+        CEtors6 = CEtors1 * p_tor2 * (1.0 - exp_tor2_ik) * (1.0 - exp_tor2_ij) * exp_tor2_jl;
 
-            arg = tel / poem;
-            if (arg >  1.0) arg =  1.0;
-            else if (arg < -1.0) arg = -1.0;
+        cmn = -fn10 * CV;
+        CEtors7 = cmn * sin_jil * tan_ijk_i;
+        CEtors8 = cmn * sin_ijk * tan_jil_i;
 
-            F_FLOAT sin_ijk_rnd = sin_ijk;
-            F_FLOAT sin_jil_rnd = sin_jil;
+        CEtors9 = fn10 * sin_ijk * sin_jil *
+          (0.5 * V1 - 2.0 * V2 * exp_tor1 * cos_omega + 1.5 * V3 * (cos2omega + 2.0 * SQR(cos_omega)));
 
-            if (sin_ijk >= 0 && sin_ijk <= 1e-10) sin_ijk_rnd = 1e-10;
-            else if (sin_ijk <= 0 && sin_ijk >= -1e-10) sin_ijk_rnd = -1e-10;
-            if (sin_jil >= 0 && sin_jil <= 1e-10) sin_jil_rnd = 1e-10;
-            else if (sin_jil <= 0 && sin_jil >= -1e-10) sin_jil_rnd = -1e-10;
+        // 4-body conjugation energy
 
-            // dcos_omega_di
-            for (int d = 0; d < 3; d++) dcos_omega_dk[d] = ((htra-arg*hnra)/rik) * delik[d] - dellk[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dk[d] += (hthd-arg*hnhd)/sin_ijk_rnd * -dcos_ijk_dk[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dk[d] *= 2.0/poem;
+        fn12 = exp_cot2_ik * exp_cot2_ij * exp_cot2_jl;
+        e_con = p_cot1 * fn12 * (1.0 + (SQR(cos_omega) - 1.0) * sin_ijk * sin_jil);
+        if (eflag) ev.ereax[7] += e_con;
 
-            // dcos_omega_dj
-            for (int d = 0; d < 3; d++) dcos_omega_di[d] = -((htra-arg*hnra)/rik) * delik[d] - htrb/rij * delij[d];
-            for (int d = 0; d < 3; d++) dcos_omega_di[d] += -(hthd-arg*hnhd)/sin_ijk_rnd * dcos_ijk_di[d];
-            for (int d = 0; d < 3; d++) dcos_omega_di[d] += -(hthe-arg*hnhe)/sin_jil_rnd * dcos_jil_di[d];
-            for (int d = 0; d < 3; d++) dcos_omega_di[d] *= 2.0/poem;
+        Cconj = -2.0 * fn12 * p_cot1 * p_cot2 * (1.0 + (SQR(cos_omega) - 1.0) * sin_ijk * sin_jil);
 
-            // dcos_omega_dk
-            for (int d = 0; d < 3; d++) dcos_omega_dj[d] = -((htrc-arg*hnrc)/rjl) * deljl[d] + htrb/rij * delij[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dj[d] += -(hthd-arg*hnhd)/sin_ijk_rnd * dcos_ijk_dj[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dj[d] += -(hthe-arg*hnhe)/sin_jil_rnd * dcos_jil_dj[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dj[d] *= 2.0/poem;
+        CEconj1 = Cconj * (BOA_ik - 1.5e0);
+        CEconj2 = Cconj * (BOA_ij - 1.5e0);
+        CEconj3 = Cconj * (BOA_jl - 1.5e0);
 
-            // dcos_omega_dl
-            for (int d = 0; d < 3; d++) dcos_omega_dl[d] = ((htrc-arg*hnrc)/rjl) * deljl[d] + dellk[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dl[d] += (hthe-arg*hnhe)/sin_jil_rnd * -dcos_jil_dk[d];
-            for (int d = 0; d < 3; d++) dcos_omega_dl[d] *= 2.0/poem;
+        CEconj4 = -p_cot1 * fn12 * (SQR(cos_omega) - 1.0) * sin_jil * tan_ijk_i;
+        CEconj5 = -p_cot1 * fn12 * (SQR(cos_omega) - 1.0) * sin_ijk * tan_jil_i;
+        CEconj6 = 2.0 * p_cot1 * fn12 * cos_omega * sin_ijk * sin_jil;
 
-            cos_omega = cos(omega);
-            cos2omega = cos(2. * omega);
-            cos3omega = cos(3. * omega);
+        // forces
 
-            // torsion energy
+        // contribution to bond order
 
-            p_tor1 = paramsfbp(ktype,itype,jtype,ltype).p_tor1;
-            p_cot1 = paramsfbp(ktype,itype,jtype,ltype).p_cot1;
-            V1 = paramsfbp(ktype,itype,jtype,ltype).V1;
-            V2 = paramsfbp(ktype,itype,jtype,ltype).V2;
-            V3 = paramsfbp(ktype,itype,jtype,ltype).V3;
+        a_Cdbopi(i,j_index) += CEtors2;
 
-            exp_tor1 = exp(p_tor1 * SQR(2.0 - d_BO_pi(i,j_index) - f11_DiDj));
-            exp_tor2_jl = exp(-p_tor2 * BOA_jl);
-            exp_cot2_jl = exp(-p_cot2 * SQR(BOA_jl - 1.5));
-            fn10 = (1.0 - exp_tor2_ik) * (1.0 - exp_tor2_ij) * (1.0 - exp_tor2_jl);
+        a_CdDelta[j] += CEtors3;
+        a_CdDelta[i] += CEtors3;
 
-            CV = 0.5 * (V1 * (1.0 + cos_omega) + V2 * exp_tor1 * (1.0 - cos2omega) + V3 * (1.0 + cos3omega));
+        a_Cdbo(i,k_index) += CEtors4 + CEconj1;
+        a_Cdbo(i,j_index) += CEtors5 + CEconj2;
+        a_Cdbo(j,l_index) += CEtors6 + CEconj3; // trouble
 
-            e_tor = fn10 * sin_ijk * sin_jil * CV;
-            if (EVFLAG && eflag_global) ev.ereax[6] += e_tor;
+        const F_FLOAT coeff74 = CEtors7 + CEconj4;
+        const F_FLOAT coeff85 = CEtors8 + CEconj5;
+        const F_FLOAT coeff96 = CEtors9 + CEconj6;
 
-            dfn11 = (-p_tor3 * exp_tor3_DiDj + (p_tor3 * exp_tor3_DiDj - p_tor4 * exp_tor4_DiDj) *
-                    (2.0 + exp_tor3_DiDj) * exp_tor34_inv) * exp_tor34_inv;
+        const F_FLOAT inv_rij = 1.0 / rij;
+        const F_FLOAT inv_rik = 1.0 / rik;
+        const F_FLOAT inv_rjl = 1.0 / rjl;
+        const F_FLOAT inv_sin_ijk_rnd = 1.0 / sin_ijk_rnd;
+        const F_FLOAT inv_sin_jil_rnd = 1.0 / sin_jil_rnd;
 
-            CEtors1 = sin_ijk * sin_jil * CV;
+        #pragma unroll
+        for (int d = 0; d < 3; d++) {
+          // dcos_omega_di
+          F_FLOAT dcos_omega_dk = ((htra-arg*hnra) * inv_rik) * delik[d] - dellk[d];
+          dcos_omega_dk += (hthd-arg*hnhd) * inv_sin_ijk_rnd * -dcos_ijk_dk[d];
+          dcos_omega_dk *= 2.0 * inv_poem;
 
-            CEtors2 = -fn10 * 2.0 * p_tor1 * V2 * exp_tor1 * (2.0 - d_BO_pi(i,j_index) - f11_DiDj) *
-                      (1.0 - SQR(cos_omega)) * sin_ijk * sin_jil;
-            CEtors3 = CEtors2 * dfn11;
+          // dcos_omega_dj
+          F_FLOAT dcos_omega_di = -((htra-arg*hnra) * inv_rik) * delik[d] - htrb * inv_rij * delij[d];
+          dcos_omega_di += -(hthd-arg*hnhd) * inv_sin_ijk_rnd * dcos_ijk_di[d];
+          dcos_omega_di += -(hthe-arg*hnhe) * inv_sin_jil_rnd * dcos_jil_di[d];
+          dcos_omega_di *= 2.0 * inv_poem;
 
-            CEtors4 = CEtors1 * p_tor2 * exp_tor2_ik * (1.0 - exp_tor2_ij) * (1.0 - exp_tor2_jl);
-            CEtors5 = CEtors1 * p_tor2 * (1.0 - exp_tor2_ik) * exp_tor2_ij * (1.0 - exp_tor2_jl);
-            CEtors6 = CEtors1 * p_tor2 * (1.0 - exp_tor2_ik) * (1.0 - exp_tor2_ij) * exp_tor2_jl;
+          // dcos_omega_dk
+          F_FLOAT dcos_omega_dj = -((htrc-arg*hnrc) * inv_rjl) * deljl[d] + htrb * inv_rij * delij[d];
+          dcos_omega_dj += -(hthd-arg*hnhd) * inv_sin_ijk_rnd * dcos_ijk_dj[d];
+          dcos_omega_dj += -(hthe-arg*hnhe) * inv_sin_jil_rnd * dcos_jil_dj[d];
+          dcos_omega_dj *= 2.0 * inv_poem;
 
-            cmn = -fn10 * CV;
-            CEtors7 = cmn * sin_jil * tan_ijk_i;
-            CEtors8 = cmn * sin_ijk * tan_jil_i;
+          // dcos_omega_dl
+          F_FLOAT dcos_omega_dl = ((htrc-arg*hnrc) * inv_rjl) * deljl[d] + dellk[d];
+          dcos_omega_dl += (hthe-arg*hnhe) * inv_sin_jil_rnd * -dcos_jil_dk[d];
+          dcos_omega_dl *= 2.0 * inv_poem;
 
-            CEtors9 = fn10 * sin_ijk * sin_jil *
-              (0.5 * V1 - 2.0 * V2 * exp_tor1 * cos_omega + 1.5 * V3 * (cos2omega + 2.0 * SQR(cos_omega)));
+          // dcos_theta_ijk
+          fi_tmp[d] = (coeff74) * dcos_ijk_di[d];
+          fj_tmp[d] = (coeff74) * dcos_ijk_dj[d];
+          fk_tmp[d] = (coeff74) * dcos_ijk_dk[d];
 
-            // 4-body conjugation energy
+          // dcos_theta_jil
+          fi_tmp[d] += (coeff85) * dcos_jil_di[d];
+          fj_tmp[d] += (coeff85) * dcos_jil_dj[d];
+          F_FLOAT fl_tmp =  (coeff85) * dcos_jil_dk[d];
 
-            fn12 = exp_cot2_ik * exp_cot2_ij * exp_cot2_jl;
-            e_con = p_cot1 * fn12 * (1.0 + (SQR(cos_omega) - 1.0) * sin_ijk * sin_jil);
-            if (EVFLAG && eflag_global) ev.ereax[7] += e_con;
+          // dcos_omega
+          fi_tmp[d] += (coeff96) * dcos_omega_di;
+          fj_tmp[d] += (coeff96) * dcos_omega_dj;
+          fk_tmp[d] += (coeff96) * dcos_omega_dk;
+          fl_tmp += (coeff96) * dcos_omega_dl;
 
-            Cconj = -2.0 * fn12 * p_cot1 * p_cot2 * (1.0 + (SQR(cos_omega) - 1.0) * sin_ijk * sin_jil);
-
-            CEconj1 = Cconj * (BOA_ik - 1.5e0);
-            CEconj2 = Cconj * (BOA_ij - 1.5e0);
-            CEconj3 = Cconj * (BOA_jl - 1.5e0);
-
-            CEconj4 = -p_cot1 * fn12 * (SQR(cos_omega) - 1.0) * sin_jil * tan_ijk_i;
-            CEconj5 = -p_cot1 * fn12 * (SQR(cos_omega) - 1.0) * sin_ijk * tan_jil_i;
-            CEconj6 = 2.0 * p_cot1 * fn12 * cos_omega * sin_ijk * sin_jil;
-
-            // forces
-
-            // contribution to bond order
-
-            d_Cdbopi(i,j_index) += CEtors2;
-            CdDelta_i += CEtors3;
-            CdDelta_j += CEtors3;
-
-            a_Cdbo(i,k_index) += CEtors4 + CEconj1;
-            a_Cdbo(i,j_index) += CEtors5 + CEconj2;
-            a_Cdbo(j,l_index) += CEtors6 + CEconj3; // trouble
-
-            // dcos_theta_ijk
-            const F_FLOAT coeff74 = CEtors7 + CEconj4;
-            for (int d = 0; d < 3; d++) fi_tmp[d] = (coeff74) * dcos_ijk_di[d];
-            for (int d = 0; d < 3; d++) fj_tmp[d] = (coeff74) * dcos_ijk_dj[d];
-            for (int d = 0; d < 3; d++) fk_tmp[d] = (coeff74) * dcos_ijk_dk[d];
-
-            const F_FLOAT coeff85 = CEtors8 + CEconj5;
-            // dcos_theta_jil
-            for (int d = 0; d < 3; d++) fi_tmp[d] += (coeff85) * dcos_jil_di[d];
-            for (int d = 0; d < 3; d++) fj_tmp[d] += (coeff85) * dcos_jil_dj[d];
-            for (int d = 0; d < 3; d++) fl_tmp[d] =  (coeff85) * dcos_jil_dk[d];
-
-            // dcos_omega
-            const F_FLOAT coeff96 = CEtors9 + CEconj6;
-            for (int d = 0; d < 3; d++) fi_tmp[d] += (coeff96) * dcos_omega_di[d];
-            for (int d = 0; d < 3; d++) fj_tmp[d] += (coeff96) * dcos_omega_dj[d];
-            for (int d = 0; d < 3; d++) fk_tmp[d] += (coeff96) * dcos_omega_dk[d];
-            for (int d = 0; d < 3; d++) fl_tmp[d] += (coeff96) * dcos_omega_dl[d];
-
-            // total forces
-
-            for (int d = 0; d < 3; d++) fitmp[d] -= fi_tmp[d];
-            for (int d = 0; d < 3; d++) fjtmp[d] -= fj_tmp[d];
-            for (int d = 0; d < 3; d++) fktmp[d] -= fk_tmp[d];
-            for (int d = 0; d < 3; d++) a_f(l,d) -= fl_tmp[d];
-
-            // per-atom energy/virial tally
-
-            if (EVFLAG) {
-              eng_tmp = e_tor + e_con;
-              //if (eflag_atom) this->template ev_tally<NEIGHFLAG>(ev,i,j,eng_tmp,0.0,0.0,0.0,0.0);
-              if (eflag_atom) this->template e_tally<NEIGHFLAG>(ev,i,j,eng_tmp);
-              if (vflag_either) {
-                  for (int d = 0; d < 3; d ++) delil[d] = x(l,d) - x(i,d);
-                  for (int d = 0; d < 3; d ++) delkl[d] = x(l,d) - x(k,d);
-                  this->template v_tally4<NEIGHFLAG>(ev,k,i,j,l,fk_tmp,fi_tmp,fj_tmp,delkl,delil,deljl);
-              }
-            }
-          }
-
-          for (int d = 0; d < 3; d++) a_f(k,d) += fktmp[d];
+          // total forces
+          a_f(i,d) -= fi_tmp[d];
+          a_f(j,d) -= fj_tmp[d];
+          a_f(k,d) -= fk_tmp[d];
+          a_f(l,d) -= fl_tmp;
         }
-      }
-      a_CdDelta[j] += CdDelta_j;
-      for (int d = 0; d < 3; d++) a_f(j,d) += fjtmp[d];
-    }
-  }
-  a_CdDelta[i] += CdDelta_i;
-  for (int d = 0; d < 3; d++) a_f(i,d) += fitmp[d];
+
+        // per-atom energy/virial tally
+
+        if (EVFLAG) {
+          eng_tmp = e_tor + e_con;
+          //if (eflag_atom) this->template ev_tally<NEIGHFLAG>(ev,i,j,eng_tmp,0.0,0.0,0.0,0.0);
+          if (eflag_atom) this->template e_tally<NEIGHFLAG>(ev,i,j,eng_tmp);
+          if (vflag_either) {
+              for (int d = 0; d < 3; d ++) delil[d] = x(l,d) - x(i,d);
+              for (int d = 0; d < 3; d ++) delkl[d] = x(l,d) - x(k,d);
+              this->template v_tally4<NEIGHFLAG>(ev,k,i,j,l,fk_tmp,fi_tmp,fj_tmp,delkl,delil,deljl);
+          }
+        }
+
+
 }
 
 template<class DeviceType>
 template<int NEIGHFLAG, int EVFLAG>
 KOKKOS_INLINE_FUNCTION
-void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsion<NEIGHFLAG,EVFLAG>, const int &ii) const {
-
+void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeTorsionPreprocessed<NEIGHFLAG,EVFLAG>, const int &tpack) const {
   EV_FLOAT_REAX ev;
-  this->template operator()<NEIGHFLAG,EVFLAG>(TagPairReaxComputeTorsion<NEIGHFLAG,EVFLAG>(), ii, ev);
+  this->template operator()<NEIGHFLAG,EVFLAG>(TagPairReaxComputeTorsionPreprocessed<NEIGHFLAG,EVFLAG>(), tpack, ev);
+
 }
 
 /* ---------------------------------------------------------------------- */
@@ -3626,9 +3549,9 @@ template<int NEIGHFLAG>
 KOKKOS_INLINE_FUNCTION
 void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxUpdateBond<NEIGHFLAG>, const int &ii) const {
 
-  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value> > a_Cdbo = d_Cdbo;
-  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value> > a_Cdbopi = d_Cdbopi;
-  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value> > a_Cdbopi2 = d_Cdbopi2;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbo = d_Cdbo;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbopi = d_Cdbopi;
+  Kokkos::View<F_FLOAT**, typename DAT::t_ffloat_2d_dl::array_layout,KKDeviceType,Kokkos::MemoryTraits<AtomicF<NEIGHFLAG>::value>> a_Cdbopi2 = d_Cdbopi2;
   //auto a_Cdbo = dup_Cdbo.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
   //auto a_Cdbopi = dup_Cdbopi.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
   //auto a_Cdbopi2 = dup_Cdbopi2.template access<AtomicDup_v<NEIGHFLAG,DeviceType>>();
@@ -3870,17 +3793,14 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeBond2<NEIGHFLAG,
 
     F_FLOAT temp[3];
 
-    dln_BOp_pi[0] = d_dln_BOp_pix(i,j_index);
-    dln_BOp_pi[1] = d_dln_BOp_piy(i,j_index);
-    dln_BOp_pi[2] = d_dln_BOp_piz(i,j_index);
+    F_FLOAT d_dln_BOp_pi_local = d_dln_BOp_pi(i,j_index);
+    for (int d = 0; d < 3; d++) dln_BOp_pi[d] = d_dln_BOp_pi_local * delij[d];
 
-    dln_BOp_pi2[0] = d_dln_BOp_pi2x(i,j_index);
-    dln_BOp_pi2[1] = d_dln_BOp_pi2y(i,j_index);
-    dln_BOp_pi2[2] = d_dln_BOp_pi2z(i,j_index);
+    F_FLOAT d_dln_BOp_pi2_local = d_dln_BOp_pi2(i,j_index);
+    for (int d = 0; d < 3; d++) dln_BOp_pi2[d] = d_dln_BOp_pi2_local * delij[d];
 
-    dBOp_i[0] = d_dBOpx(i,j_index);
-    dBOp_i[1] = d_dBOpy(i,j_index);
-    dBOp_i[2] = d_dBOpz(i,j_index);
+    F_FLOAT d_dBOp_local = d_dBOp(i,j_index);
+    for (int d = 0; d < 3; d++) dBOp_i[d] = d_dBOp_local * delij[d];
 
     // forces on i
     for (int d = 0; d < 3; d++) temp[d] =  coef_C1dbo * dBOp_i[d];
@@ -3927,9 +3847,13 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeBond2<NEIGHFLAG,
       k &= NEIGHMASK;
       const int k_index = kk - j_start;
 
-      dBOp_k[0] = d_dBOpx(i,k_index);
-      dBOp_k[1] = d_dBOpy(i,k_index);
-      dBOp_k[2] = d_dBOpz(i,k_index);
+      delik[0] = x(k,0) - xtmp;
+      delik[1] = x(k,1) - ytmp;
+      delik[2] = x(k,2) - ztmp;
+
+      d_dBOp_local = d_dBOp(i,k_index);
+      for (int d = 0; d < 3; d++) dBOp_k[d] = d_dBOp_local * delik[d];
+
       const F_FLOAT coef_all = -coef_C2dbo - coef_C2dDelta - coef_C3dbopi - coef_C3dbopi2;
       for (int d = 0; d < 3; d++) temp[d] = coef_all * dBOp_k[d];
 
@@ -3953,9 +3877,10 @@ void PairReaxFFKokkos<DeviceType>::operator()(TagPairReaxComputeBond2<NEIGHFLAG,
       k &= NEIGHMASK;
       const int k_index = kk - k_start;
 
-      dBOp_k[0] = d_dBOpx(j,k_index);
-      dBOp_k[1] = d_dBOpy(j,k_index);
-      dBOp_k[2] = d_dBOpz(j,k_index);
+      for (int d = 0; d < 3; d++) deljk[d] = x(k,d) - x(j,d);
+
+      d_dBOp_local = d_dBOp(j,k_index);
+      for (int d = 0; d < 3; d++) dBOp_k[d] = d_dBOp_local * deljk[d];
       const F_FLOAT coef_all = -coef_C3dbo - coef_C3dDelta - coef_C4dbopi - coef_C4dbopi2;
       for (int d = 0; d < 3; d++) temp[d] = coef_all * dBOp_k[d];
 
diff --git a/src/KOKKOS/pair_reaxff_kokkos.h b/src/KOKKOS/pair_reaxff_kokkos.h
index 34faabfd1e..e0da0d8819 100644
--- a/src/KOKKOS/pair_reaxff_kokkos.h
+++ b/src/KOKKOS/pair_reaxff_kokkos.h
@@ -90,13 +90,13 @@ struct TagPairReaxComputeMulti1{};
 template<int NEIGHFLAG, int EFLAG>
 struct TagPairReaxComputeMulti2{};
 
+template<bool POPULATE>
+struct TagPairReaxCountAngularTorsion{};
 template<int NEIGHFLAG, int EVFLAG>
-struct TagPairReaxComputeAngular{};
-
-struct TagPairReaxComputeTorsionPreview{};
+struct TagPairReaxComputeAngularPreprocessed{};
 
 template<int NEIGHFLAG, int EVFLAG>
-struct TagPairReaxComputeTorsion{};
+struct TagPairReaxComputeTorsionPreprocessed{};
 
 template<int NEIGHFLAG, int EVFLAG>
 struct TagPairReaxComputeHydrogen{};
@@ -120,7 +120,7 @@ class PairReaxFFKokkos : public PairReaxFF {
   // "Blocking" factors to reduce thread divergence within some kernels
   using blocking_t = unsigned short int;
 
-  // "PairReaxFFComputeTorsionBlocking"
+  // "PairReaxFFComputeTorsion"
   static constexpr int compute_torsion_blocksize = 8;
 
   // "PairReaxBuildListsHalfBlocking"
@@ -176,9 +176,28 @@ class PairReaxFFKokkos : public PairReaxFF {
   KOKKOS_INLINE_FUNCTION
   void operator()(TagPairReaxBuildListsHalfPreview<NEIGHFLAG>, const int&) const;
 
+  // Isolated function that builds the hbond list, reused across
+  // TagPairReaxBuildListsHalfBlocking, HalfBlockingPreview, HalfPreview
+  template<int NEIGHFLAG>
+  KOKKOS_INLINE_FUNCTION
+  void build_hb_list(F_FLOAT, int, int, int, int, int) const;
+
+  // Isolated function that builds the bond order list, reused across
+  // TagPairReaxBuildListsHalfBlocking, HalfBlockingPreview, HalfPreview
+  // Returns if we need to populate d_d* functions or not
+  template<int NEIGHFLAG>
+  KOKKOS_INLINE_FUNCTION
+  bool build_bo_list(int, int, int, int, int, int&, int&) const;
+
   KOKKOS_INLINE_FUNCTION
   void operator()(TagPairReaxBuildListsFull, const int&) const;
 
+  // Isolated function that computes bond order parameters
+  // Returns BO_s, BO_pi, BO_pi2, C12, C34, C56 by reference
+  KOKKOS_INLINE_FUNCTION
+  void compute_bo(F_FLOAT, int, int, F_FLOAT, F_FLOAT, F_FLOAT,
+    F_FLOAT&, F_FLOAT&, F_FLOAT&, F_FLOAT&, F_FLOAT&, F_FLOAT&) const;
+
   KOKKOS_INLINE_FUNCTION
   void operator()(TagPairReaxZero, const int&) const;
 
@@ -222,24 +241,39 @@ class PairReaxFFKokkos : public PairReaxFF {
   KOKKOS_INLINE_FUNCTION
   void operator()(TagPairReaxComputeMulti2<NEIGHFLAG,EFLAG>, const int&) const;
 
-  template<int NEIGHFLAG, int EVFLAG>
+  template<bool POPULATE>
   KOKKOS_INLINE_FUNCTION
-  void operator()(TagPairReaxComputeAngular<NEIGHFLAG,EVFLAG>, const int&, EV_FLOAT_REAX&) const;
+  void operator()(TagPairReaxCountAngularTorsion<POPULATE>, const int&) const;
+
+  // Abstraction for computing SBSO2, CSBO2, dSBO1, dsBO2
+  KOKKOS_INLINE_FUNCTION
+  void compute_angular_sbo(int, int, int, int) const;
+
+  // Abstraction for counting and populating angular intermediates
+  template<bool POPULATE>
+  KOKKOS_INLINE_FUNCTION
+  int preprocess_angular(int, int, int, int, int) const;
+
+  // Abstraction for counting and populating torsion intermediated
+  template<bool POPULATE>
+  KOKKOS_INLINE_FUNCTION
+  int preprocess_torsion(int, int, int, F_FLOAT, F_FLOAT, F_FLOAT, int, int, int) const;
 
   template<int NEIGHFLAG, int EVFLAG>
   KOKKOS_INLINE_FUNCTION
-  void operator()(TagPairReaxComputeAngular<NEIGHFLAG,EVFLAG>, const int&) const;
-
-  KOKKOS_INLINE_FUNCTION
-  void operator()(TagPairReaxComputeTorsionPreview, const int&) const;
+  void operator()(TagPairReaxComputeAngularPreprocessed<NEIGHFLAG,EVFLAG>, const int&, EV_FLOAT_REAX&) const;
 
   template<int NEIGHFLAG, int EVFLAG>
   KOKKOS_INLINE_FUNCTION
-  void operator()(TagPairReaxComputeTorsion<NEIGHFLAG,EVFLAG>, const int&, EV_FLOAT_REAX&) const;
+  void operator()(TagPairReaxComputeAngularPreprocessed<NEIGHFLAG,EVFLAG>, const int&) const;
 
   template<int NEIGHFLAG, int EVFLAG>
   KOKKOS_INLINE_FUNCTION
-  void operator()(TagPairReaxComputeTorsion<NEIGHFLAG,EVFLAG>, const int&) const;
+  void operator()(TagPairReaxComputeTorsionPreprocessed<NEIGHFLAG,EVFLAG>, const int&, EV_FLOAT_REAX&) const;
+
+  template<int NEIGHFLAG, int EVFLAG>
+  KOKKOS_INLINE_FUNCTION
+  void operator()(TagPairReaxComputeTorsionPreprocessed<NEIGHFLAG,EVFLAG>, const int&) const;
 
   template<int NEIGHFLAG, int EVFLAG>
   KOKKOS_INLINE_FUNCTION
@@ -395,9 +429,8 @@ class PairReaxFFKokkos : public PairReaxFF {
 
   typename AT::t_float_1d d_bo_rij, d_hb_rsq, d_Deltap, d_Deltap_boc, d_total_bo, d_s;
   typename AT::t_float_1d d_Delta, d_Delta_boc, d_Delta_lp, d_dDelta_lp, d_Delta_lp_temp, d_CdDelta;
-  typename AT::t_ffloat_2d_dl d_BO, d_BO_s, d_BO_pi, d_BO_pi2, d_dBOp;
-  typename AT::t_ffloat_2d_dl d_dln_BOp_pix, d_dln_BOp_piy, d_dln_BOp_piz;
-  typename AT::t_ffloat_2d_dl d_dln_BOp_pi2x, d_dln_BOp_pi2y, d_dln_BOp_pi2z;
+  typename AT::t_ffloat_2d_dl d_BO, d_BO_s, d_BO_pi, d_BO_pi2;
+  typename AT::t_ffloat_2d_dl d_dln_BOp_pi, d_dln_BOp_pi2;
   typename AT::t_ffloat_2d_dl d_C1dbo, d_C2dbo, d_C3dbo;
   typename AT::t_ffloat_2d_dl d_C1dbopi, d_C2dbopi, d_C3dbopi, d_C4dbopi;
   typename AT::t_ffloat_2d_dl d_C1dbopi2, d_C2dbopi2, d_C3dbopi2, d_C4dbopi2;
@@ -447,7 +480,7 @@ class PairReaxFFKokkos : public PairReaxFF {
   typename AT::t_int_scalar d_resize_bo, d_resize_hb;
 
   typename AT::t_ffloat_2d_dl d_sum_ovun;
-  typename AT::t_ffloat_2d_dl d_dBOpx, d_dBOpy, d_dBOpz;
+  typename AT::t_ffloat_2d_dl d_dBOp;
 
   int neighflag, newton_pair, maxnumneigh, maxhb, maxbo;
   int nlocal,nn,NN,eflag,vflag,acks2_flag;
@@ -480,15 +513,15 @@ class PairReaxFFKokkos : public PairReaxFF {
   typename AT::t_ffloat_1d d_buf;
   DAT::tdual_int_scalar k_nbuf_local;
 
-  // for fast ComputeTorsion preprocessor kernel
-  typedef Kokkos::View<int*, LMPPinnedHostType> t_hostpinned_int_1d;
+  typedef Kokkos::View<reax_int4**, LMPDeviceType::array_layout, DeviceType> t_reax_int4_2d;
+
+  t_reax_int4_2d d_angular_pack, d_torsion_pack;
+
+  typename AT::t_ffloat_2d d_angular_intermediates;
+
+  typename AT::tdual_int_1d k_count_angular_torsion;
+  typename AT::t_int_1d d_count_angular_torsion;
 
-  int inum_store;
-  t_hostpinned_int_1d counters;
-  t_hostpinned_int_1d counters_jj_min;
-  t_hostpinned_int_1d counters_jj_max;
-  t_hostpinned_int_1d counters_kk_min;
-  t_hostpinned_int_1d counters_kk_max;
 };
 
 template <class DeviceType>
diff --git a/src/KOKKOS/pair_snap_kokkos_impl.h b/src/KOKKOS/pair_snap_kokkos_impl.h
index 957ae953a7..6cf40b31a5 100644
--- a/src/KOKKOS/pair_snap_kokkos_impl.h
+++ b/src/KOKKOS/pair_snap_kokkos_impl.h
@@ -102,13 +102,15 @@ void PairSNAPKokkos<DeviceType, real_type, vector_length>::init_style()
   if (force->newton_pair == 0)
     error->all(FLERR,"Pair style SNAP requires newton pair on");
 
-  // adjust neighbor list request for KOKKOS
+  // neighbor list request for KOKKOS
+
+  neighflag = lmp->kokkos->neighflag;
 
   auto request = neighbor->add_request(this, NeighConst::REQ_FULL);
   request->set_kokkos_host(std::is_same<DeviceType,LMPHostType>::value &&
                            !std::is_same<DeviceType,LMPDeviceType>::value);
   request->set_kokkos_device(std::is_same<DeviceType,LMPDeviceType>::value);
-  if (lmp->kokkos->neighflag == FULL)
+  if (neighflag == FULL)
     error->all(FLERR,"Must use half neighbor list style with pair snap/kk");
 }
 
diff --git a/src/compute_msd.cpp b/src/compute_msd.cpp
index 229b965761..bff9dffd87 100644
--- a/src/compute_msd.cpp
+++ b/src/compute_msd.cpp
@@ -1,4 +1,3 @@
-// clang-format off
 /* ----------------------------------------------------------------------
    LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
    https://www.lammps.org/, Sandia National Laboratories
@@ -28,11 +27,9 @@ using namespace LAMMPS_NS;
 
 /* ---------------------------------------------------------------------- */
 
-ComputeMSD::ComputeMSD(LAMMPS *lmp, int narg, char **arg) :
-  Compute(lmp, narg, arg),
-  id_fix(nullptr)
+ComputeMSD::ComputeMSD(LAMMPS *lmp, int narg, char **arg) : Compute(lmp, narg, arg), id_fix(nullptr)
 {
-  if (narg < 3) error->all(FLERR,"Illegal compute msd command");
+  if (narg < 3) error->all(FLERR, "Illegal compute msd command");
 
   vector_flag = 1;
   size_vector = 4;
@@ -47,28 +44,33 @@ ComputeMSD::ComputeMSD(LAMMPS *lmp, int narg, char **arg) :
 
   int iarg = 3;
   while (iarg < narg) {
-    if (strcmp(arg[iarg],"com") == 0) {
-      if (iarg+2 > narg) error->all(FLERR,"Illegal compute msd command");
-      comflag = utils::logical(FLERR,arg[iarg+1],false,lmp);
+    if (strcmp(arg[iarg], "com") == 0) {
+      if (iarg + 2 > narg) error->all(FLERR, "Illegal compute msd command");
+      comflag = utils::logical(FLERR, arg[iarg + 1], false, lmp);
       iarg += 2;
-    } else if (strcmp(arg[iarg],"average") == 0) {
-      if (iarg+2 > narg) error->all(FLERR,"Illegal compute msd command");
-      avflag = utils::logical(FLERR,arg[iarg+1],false,lmp);
+    } else if (strcmp(arg[iarg], "average") == 0) {
+      if (iarg + 2 > narg) error->all(FLERR, "Illegal compute msd command");
+      avflag = utils::logical(FLERR, arg[iarg + 1], false, lmp);
       iarg += 2;
-    } else error->all(FLERR,"Illegal compute msd command");
+    } else
+      error->all(FLERR, "Illegal compute msd command");
   }
 
+  if (group->dynamic[igroup])
+    error->all(FLERR, "Compute {} is not compatible with dynamic groups", style);
+
   // create a new fix STORE style for reference positions
   // id = compute-ID + COMPUTE_STORE, fix group = compute group
 
   id_fix = utils::strdup(id + std::string("_COMPUTE_STORE"));
-  fix = (FixStore *) modify->add_fix(fmt::format("{} {} STORE peratom 1 3",
-                                                 id_fix, group->names[igroup]));
+  fix = (FixStore *) modify->add_fix(
+      fmt::format("{} {} STORE peratom 1 3", id_fix, group->names[igroup]));
 
   // calculate xu,yu,zu for fix store array
   // skip if reset from restart file
 
-  if (fix->restart_reset) fix->restart_reset = 0;
+  if (fix->restart_reset)
+    fix->restart_reset = 0;
   else {
     double **xoriginal = fix->astore;
 
@@ -78,15 +80,17 @@ ComputeMSD::ComputeMSD(LAMMPS *lmp, int narg, char **arg) :
     int nlocal = atom->nlocal;
 
     for (int i = 0; i < nlocal; i++)
-      if (mask[i] & groupbit) domain->unmap(x[i],image[i],xoriginal[i]);
-      else xoriginal[i][0] = xoriginal[i][1] = xoriginal[i][2] = 0.0;
+      if (mask[i] & groupbit)
+        domain->unmap(x[i], image[i], xoriginal[i]);
+      else
+        xoriginal[i][0] = xoriginal[i][1] = xoriginal[i][2] = 0.0;
 
     // adjust for COM if requested
 
     if (comflag) {
       double cm[3];
       masstotal = group->mass(igroup);
-      group->xcm(igroup,masstotal,cm);
+      group->xcm(igroup, masstotal, cm);
       for (int i = 0; i < nlocal; i++)
         if (mask[i] & groupbit) {
           xoriginal[i][0] -= cm[0];
@@ -124,7 +128,7 @@ void ComputeMSD::init()
   // set fix which stores reference atom coords
 
   fix = (FixStore *) modify->get_fix_by_id(id_fix);
-  if (!fix) error->all(FLERR,"Could not find compute msd fix with ID {}", id_fix);
+  if (!fix) error->all(FLERR, "Could not find compute msd fix with ID {}", id_fix);
 
   // nmsd = # of atoms in group
 
@@ -141,8 +145,10 @@ void ComputeMSD::compute_vector()
   // cm = current center of mass
 
   double cm[3];
-  if (comflag) group->xcm(igroup,masstotal,cm);
-  else cm[0] = cm[1] = cm[2] = 0.0;
+  if (comflag)
+    group->xcm(igroup, masstotal, cm);
+  else
+    cm[0] = cm[1] = cm[2] = 0.0;
 
   // dx,dy,dz = displacement of atom from reference position
   // reference unwrapped position is stored by fix
@@ -161,8 +167,8 @@ void ComputeMSD::compute_vector()
   double yprd = domain->yprd;
   double zprd = domain->zprd;
 
-  double dx,dy,dz;
-  int xbox,ybox,zbox;
+  double dx, dy, dz;
+  int xbox, ybox, zbox;
 
   double msd[4];
   msd[0] = msd[1] = msd[2] = msd[3] = 0.0;
@@ -174,36 +180,34 @@ void ComputeMSD::compute_vector()
   double navfac;
   if (avflag) {
     naverage++;
-    navfac = 1.0/(naverage+1);
+    navfac = 1.0 / (naverage + 1);
   }
 
-
   if (domain->triclinic == 0) {
     for (int i = 0; i < nlocal; i++)
       if (mask[i] & groupbit) {
         xbox = (image[i] & IMGMASK) - IMGMAX;
         ybox = (image[i] >> IMGBITS & IMGMASK) - IMGMAX;
         zbox = (image[i] >> IMG2BITS) - IMGMAX;
-        xtmp = x[i][0] + xbox*xprd - cm[0];
-        ytmp = x[i][1] + ybox*yprd - cm[1];
-        ztmp = x[i][2] + zbox*zprd - cm[2];
+        xtmp = x[i][0] + xbox * xprd - cm[0];
+        ytmp = x[i][1] + ybox * yprd - cm[1];
+        ztmp = x[i][2] + zbox * zprd - cm[2];
 
         // use running average position for reference if requested
 
         if (avflag) {
-          xoriginal[i][0] = (xoriginal[i][0]*naverage + xtmp)*navfac;
-          xoriginal[i][1] = (xoriginal[i][1]*naverage + ytmp)*navfac;
-          xoriginal[i][2] = (xoriginal[i][2]*naverage + ztmp)*navfac;
+          xoriginal[i][0] = (xoriginal[i][0] * naverage + xtmp) * navfac;
+          xoriginal[i][1] = (xoriginal[i][1] * naverage + ytmp) * navfac;
+          xoriginal[i][2] = (xoriginal[i][2] * naverage + ztmp) * navfac;
         }
 
         dx = xtmp - xoriginal[i][0];
         dy = ytmp - xoriginal[i][1];
         dz = ztmp - xoriginal[i][2];
-        msd[0] += dx*dx;
-        msd[1] += dy*dy;
-        msd[2] += dz*dz;
-        msd[3] += dx*dx + dy*dy + dz*dz;
-
+        msd[0] += dx * dx;
+        msd[1] += dy * dy;
+        msd[2] += dz * dz;
+        msd[3] += dx * dx + dy * dy + dz * dz;
       }
   } else {
     for (int i = 0; i < nlocal; i++)
@@ -211,29 +215,29 @@ void ComputeMSD::compute_vector()
         xbox = (image[i] & IMGMASK) - IMGMAX;
         ybox = (image[i] >> IMGBITS & IMGMASK) - IMGMAX;
         zbox = (image[i] >> IMG2BITS) - IMGMAX;
-        xtmp = x[i][0] + h[0]*xbox + h[5]*ybox + h[4]*zbox - cm[0];
-        ytmp = x[i][1] + h[1]*ybox + h[3]*zbox - cm[1];
-        ztmp = x[i][2] + h[2]*zbox - cm[2];
+        xtmp = x[i][0] + h[0] * xbox + h[5] * ybox + h[4] * zbox - cm[0];
+        ytmp = x[i][1] + h[1] * ybox + h[3] * zbox - cm[1];
+        ztmp = x[i][2] + h[2] * zbox - cm[2];
 
         // use running average position for reference if requested
 
         if (avflag) {
-          xoriginal[i][0] = (xoriginal[i][0]*naverage + xtmp)*navfac;
-          xoriginal[i][1] = (xoriginal[i][0]*naverage + xtmp)*navfac;
-          xoriginal[i][2] = (xoriginal[i][0]*naverage + xtmp)*navfac;
+          xoriginal[i][0] = (xoriginal[i][0] * naverage + xtmp) * navfac;
+          xoriginal[i][1] = (xoriginal[i][0] * naverage + xtmp) * navfac;
+          xoriginal[i][2] = (xoriginal[i][0] * naverage + xtmp) * navfac;
         }
 
         dx = xtmp - xoriginal[i][0];
         dy = ytmp - xoriginal[i][1];
         dz = ztmp - xoriginal[i][2];
-        msd[0] += dx*dx;
-        msd[1] += dy*dy;
-        msd[2] += dz*dz;
-        msd[3] += dx*dx + dy*dy + dz*dz;
+        msd[0] += dx * dx;
+        msd[1] += dy * dy;
+        msd[2] += dz * dz;
+        msd[3] += dx * dx + dy * dy + dz * dz;
       }
   }
 
-  MPI_Allreduce(msd,vector,4,MPI_DOUBLE,MPI_SUM,world);
+  MPI_Allreduce(msd, vector, 4, MPI_DOUBLE, MPI_SUM, world);
   if (nmsd) {
     vector[0] /= nmsd;
     vector[1] /= nmsd;
diff --git a/src/fix_move.cpp b/src/fix_move.cpp
index f7bc4d3640..756292d06b 100644
--- a/src/fix_move.cpp
+++ b/src/fix_move.cpp
@@ -1,4 +1,3 @@
-// clang-format off
 /* ----------------------------------------------------------------------
    LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
    https://www.lammps.org/, Sandia National Laboratories
@@ -40,21 +39,19 @@ using namespace LAMMPS_NS;
 using namespace FixConst;
 using namespace MathConst;
 
-enum{LINEAR,WIGGLE,ROTATE,VARIABLE};
-enum{EQUAL,ATOM};
+enum { LINEAR, WIGGLE, ROTATE, VARIABLE, TRANSROT };
+enum { EQUAL, ATOM };
 
-#define INERTIA 0.2          // moment of inertia prefactor for ellipsoid
+#define INERTIA 0.2    // moment of inertia prefactor for ellipsoid
 
 /* ---------------------------------------------------------------------- */
 
 FixMove::FixMove(LAMMPS *lmp, int narg, char **arg) :
-  Fix(lmp, narg, arg),
-  xvarstr(nullptr), yvarstr(nullptr), zvarstr(nullptr), vxvarstr(nullptr),
-  vyvarstr(nullptr), vzvarstr(nullptr),
-  xoriginal(nullptr), toriginal(nullptr), qoriginal(nullptr),
-  displace(nullptr), velocity(nullptr)
+    Fix(lmp, narg, arg), xvarstr(nullptr), yvarstr(nullptr), zvarstr(nullptr), vxvarstr(nullptr),
+    vyvarstr(nullptr), vzvarstr(nullptr), xoriginal(nullptr), toriginal(nullptr),
+    qoriginal(nullptr), displace(nullptr), velocity(nullptr)
 {
-  if (narg < 4) error->all(FLERR,"Illegal fix move command");
+  if (narg < 4) error->all(FLERR, "Illegal fix move command");
 
   restart_global = 1;
   restart_peratom = 1;
@@ -71,125 +68,162 @@ FixMove::FixMove(LAMMPS *lmp, int narg, char **arg) :
 
   int iarg = 0;
 
-  if (strcmp(arg[3],"linear") == 0) {
-    if (narg < 7) error->all(FLERR,"Illegal fix move command");
+  if (strcmp(arg[3], "linear") == 0) {
+    if (narg < 7) error->all(FLERR, "Illegal fix move command");
     iarg = 7;
     mstyle = LINEAR;
-    if (strcmp(arg[4],"NULL") == 0) vxflag = 0;
+    if (strcmp(arg[4], "NULL") == 0)
+      vxflag = 0;
     else {
       vxflag = 1;
-      vx = utils::numeric(FLERR,arg[4],false,lmp);
+      vx = utils::numeric(FLERR, arg[4], false, lmp);
     }
-    if (strcmp(arg[5],"NULL") == 0) vyflag = 0;
+    if (strcmp(arg[5], "NULL") == 0)
+      vyflag = 0;
     else {
       vyflag = 1;
-      vy = utils::numeric(FLERR,arg[5],false,lmp);
+      vy = utils::numeric(FLERR, arg[5], false, lmp);
     }
-    if (strcmp(arg[6],"NULL") == 0) vzflag = 0;
+    if (strcmp(arg[6], "NULL") == 0)
+      vzflag = 0;
     else {
       vzflag = 1;
-      vz = utils::numeric(FLERR,arg[6],false,lmp);
+      vz = utils::numeric(FLERR, arg[6], false, lmp);
     }
 
-  } else if (strcmp(arg[3],"wiggle") == 0) {
-    if (narg < 8) error->all(FLERR,"Illegal fix move command");
+  } else if (strcmp(arg[3], "wiggle") == 0) {
+    if (narg < 8) error->all(FLERR, "Illegal fix move command");
     iarg = 8;
     mstyle = WIGGLE;
-    if (strcmp(arg[4],"NULL") == 0) axflag = 0;
+    if (strcmp(arg[4], "NULL") == 0)
+      axflag = 0;
     else {
       axflag = 1;
-      ax = utils::numeric(FLERR,arg[4],false,lmp);
+      ax = utils::numeric(FLERR, arg[4], false, lmp);
     }
-    if (strcmp(arg[5],"NULL") == 0) ayflag = 0;
+    if (strcmp(arg[5], "NULL") == 0)
+      ayflag = 0;
     else {
       ayflag = 1;
-      ay = utils::numeric(FLERR,arg[5],false,lmp);
+      ay = utils::numeric(FLERR, arg[5], false, lmp);
     }
-    if (strcmp(arg[6],"NULL") == 0) azflag = 0;
+    if (strcmp(arg[6], "NULL") == 0)
+      azflag = 0;
     else {
       azflag = 1;
-      az = utils::numeric(FLERR,arg[6],false,lmp);
+      az = utils::numeric(FLERR, arg[6], false, lmp);
     }
-    period = utils::numeric(FLERR,arg[7],false,lmp);
-    if (period <= 0.0) error->all(FLERR,"Illegal fix move command");
+    period = utils::numeric(FLERR, arg[7], false, lmp);
+    if (period <= 0.0) error->all(FLERR, "Illegal fix move command");
 
-  } else if (strcmp(arg[3],"rotate") == 0) {
-    if (narg < 11) error->all(FLERR,"Illegal fix move command");
+  } else if (strcmp(arg[3], "rotate") == 0) {
+    if (narg < 11) error->all(FLERR, "Illegal fix move command");
     iarg = 11;
     mstyle = ROTATE;
-    point[0] = utils::numeric(FLERR,arg[4],false,lmp);
-    point[1] = utils::numeric(FLERR,arg[5],false,lmp);
-    point[2] = utils::numeric(FLERR,arg[6],false,lmp);
-    axis[0] = utils::numeric(FLERR,arg[7],false,lmp);
-    axis[1] = utils::numeric(FLERR,arg[8],false,lmp);
-    axis[2] = utils::numeric(FLERR,arg[9],false,lmp);
-    period = utils::numeric(FLERR,arg[10],false,lmp);
-    if (period <= 0.0) error->all(FLERR,"Illegal fix move command");
+    point[0] = utils::numeric(FLERR, arg[4], false, lmp);
+    point[1] = utils::numeric(FLERR, arg[5], false, lmp);
+    point[2] = utils::numeric(FLERR, arg[6], false, lmp);
+    axis[0] = utils::numeric(FLERR, arg[7], false, lmp);
+    axis[1] = utils::numeric(FLERR, arg[8], false, lmp);
+    axis[2] = utils::numeric(FLERR, arg[9], false, lmp);
+    period = utils::numeric(FLERR, arg[10], false, lmp);
+    if (period <= 0.0) error->all(FLERR, "Illegal fix move command");
 
-  } else if (strcmp(arg[3],"variable") == 0) {
-    if (narg < 10) error->all(FLERR,"Illegal fix move command");
+  } else if (strcmp(arg[3], "transrot") == 0) {
+    if (narg < 11) error->all(FLERR, "Illegal fix move command");
+    iarg = 14;
+    mstyle = TRANSROT;
+    vxflag = vyflag = vzflag = 1;
+    vx = utils::numeric(FLERR, arg[4], false, lmp);
+    vy = utils::numeric(FLERR, arg[5], false, lmp);
+    vz = utils::numeric(FLERR, arg[6], false, lmp);
+    point[0] = utils::numeric(FLERR, arg[7], false, lmp);
+    point[1] = utils::numeric(FLERR, arg[8], false, lmp);
+    point[2] = utils::numeric(FLERR, arg[9], false, lmp);
+    axis[0] = utils::numeric(FLERR, arg[10], false, lmp);
+    axis[1] = utils::numeric(FLERR, arg[11], false, lmp);
+    axis[2] = utils::numeric(FLERR, arg[12], false, lmp);
+    period = utils::numeric(FLERR, arg[13], false, lmp);
+    if (period <= 0.0) error->all(FLERR, "Illegal fix move command");
+
+  } else if (strcmp(arg[3], "variable") == 0) {
+    if (narg < 10) error->all(FLERR, "Illegal fix move command");
     iarg = 10;
     mstyle = VARIABLE;
-    if (strcmp(arg[4],"NULL") == 0) xvarstr = nullptr;
-    else if (utils::strmatch(arg[4],"^v_")) {
-      xvarstr = utils::strdup(arg[4]+2);
-    } else error->all(FLERR,"Illegal fix move command");
-    if (strcmp(arg[5],"NULL") == 0) yvarstr = nullptr;
-    else if (utils::strmatch(arg[5],"^v_")) {
-      yvarstr = utils::strdup(arg[5]+2);
-    } else error->all(FLERR,"Illegal fix move command");
-    if (strcmp(arg[6],"NULL") == 0) zvarstr = nullptr;
-    else if (utils::strmatch(arg[6],"^v_")) {
-      zvarstr = utils::strdup(arg[6]+2);
-    } else error->all(FLERR,"Illegal fix move command");
-    if (strcmp(arg[7],"NULL") == 0) vxvarstr = nullptr;
-    else if (utils::strmatch(arg[7],"^v_")) {
-      vxvarstr = utils::strdup(arg[7]+2);
-    } else error->all(FLERR,"Illegal fix move command");
-    if (strcmp(arg[8],"NULL") == 0) vyvarstr = nullptr;
-    else if (utils::strmatch(arg[8],"^v_")) {
-      vyvarstr = utils::strdup(arg[8]+2);
-    } else error->all(FLERR,"Illegal fix move command");
-    if (strcmp(arg[9],"NULL") == 0) vzvarstr = nullptr;
-    else if (utils::strmatch(arg[9],"^v_")) {
-      vzvarstr = utils::strdup(arg[9]+2);
-    } else error->all(FLERR,"Illegal fix move command");
+    if (strcmp(arg[4], "NULL") == 0)
+      xvarstr = nullptr;
+    else if (utils::strmatch(arg[4], "^v_")) {
+      xvarstr = utils::strdup(arg[4] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
+    if (strcmp(arg[5], "NULL") == 0)
+      yvarstr = nullptr;
+    else if (utils::strmatch(arg[5], "^v_")) {
+      yvarstr = utils::strdup(arg[5] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
+    if (strcmp(arg[6], "NULL") == 0)
+      zvarstr = nullptr;
+    else if (utils::strmatch(arg[6], "^v_")) {
+      zvarstr = utils::strdup(arg[6] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
+    if (strcmp(arg[7], "NULL") == 0)
+      vxvarstr = nullptr;
+    else if (utils::strmatch(arg[7], "^v_")) {
+      vxvarstr = utils::strdup(arg[7] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
+    if (strcmp(arg[8], "NULL") == 0)
+      vyvarstr = nullptr;
+    else if (utils::strmatch(arg[8], "^v_")) {
+      vyvarstr = utils::strdup(arg[8] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
+    if (strcmp(arg[9], "NULL") == 0)
+      vzvarstr = nullptr;
+    else if (utils::strmatch(arg[9], "^v_")) {
+      vzvarstr = utils::strdup(arg[9] + 2);
+    } else
+      error->all(FLERR, "Illegal fix move command");
 
-  } else error->all(FLERR,"Illegal fix move command");
+  } else
+    error->all(FLERR, "Illegal fix move command");
 
   // optional args
 
   int scaleflag = 1;
 
   while (iarg < narg) {
-    if (strcmp(arg[iarg],"units") == 0) {
-      if (iarg+2 > narg) error->all(FLERR,"Illegal fix move command");
-      if (strcmp(arg[iarg+1],"box") == 0) scaleflag = 0;
-      else if (strcmp(arg[iarg+1],"lattice") == 0) scaleflag = 1;
-      else error->all(FLERR,"Illegal fix move command");
+    if (strcmp(arg[iarg], "units") == 0) {
+      if (iarg + 2 > narg) error->all(FLERR, "Illegal fix move command");
+      if (strcmp(arg[iarg + 1], "box") == 0)
+        scaleflag = 0;
+      else if (strcmp(arg[iarg + 1], "lattice") == 0)
+        scaleflag = 1;
+      else
+        error->all(FLERR, "Illegal fix move command");
       iarg += 2;
-    } else error->all(FLERR,"Illegal fix move command");
+    } else
+      error->all(FLERR, "Illegal fix move command");
   }
 
   // error checks and warnings
 
   if (domain->dimension == 2) {
-    if (mstyle == LINEAR && vzflag && vz != 0.0)
-      error->all(FLERR,"Fix move cannot set linear z motion for 2d problem");
+    if (((mstyle == LINEAR) || (mstyle == TRANSROT)) && vzflag && (vz != 0.0))
+      error->all(FLERR, "Fix move cannot set linear z motion for 2d problem");
     if (mstyle == WIGGLE && azflag && az != 0.0)
-      error->all(FLERR,"Fix move cannot set wiggle z motion for 2d problem");
-    if (mstyle == ROTATE && (axis[0] != 0.0 || axis[1] != 0.0))
-      error->all(FLERR,
-                 "Fix move cannot rotate around non z-axis for 2d problem");
+      error->all(FLERR, "Fix move cannot set wiggle z motion for 2d problem");
+    if (((mstyle == ROTATE) || (mstyle == TRANSROT)) && (axis[0] != 0.0 || axis[1] != 0.0))
+      error->all(FLERR, "Fix move cannot rotate around non z-axis for 2d problem");
     if (mstyle == VARIABLE && (zvarstr || vzvarstr))
-      error->all(FLERR,
-                 "Fix move cannot define z or vz variable for 2d problem");
+      error->all(FLERR, "Fix move cannot define z or vz variable for 2d problem");
   }
 
   // setup scaling and apply scaling factors to velocity & amplitude
 
-  if ((mstyle == LINEAR || mstyle == WIGGLE || mstyle == ROTATE) &&
-      scaleflag) {
+  if ((mstyle != VARIABLE) && scaleflag) {
     double xscale = domain->lattice->xlattice;
     double yscale = domain->lattice->ylattice;
     double zscale = domain->lattice->zlattice;
@@ -206,22 +240,29 @@ FixMove::FixMove(LAMMPS *lmp, int narg, char **arg) :
       point[0] *= xscale;
       point[1] *= yscale;
       point[2] *= zscale;
+    } else if (mstyle == TRANSROT) {
+      vx *= xscale;
+      vy *= yscale;
+      vz *= zscale;
+      point[0] *= xscale;
+      point[1] *= yscale;
+      point[2] *= zscale;
     }
   }
 
   // set omega_rotate from period
 
-  if (mstyle == WIGGLE || mstyle == ROTATE) omega_rotate = MY_2PI / period;
+  if ((mstyle == WIGGLE) || (mstyle == ROTATE) || (mstyle == TRANSROT))
+    omega_rotate = MY_2PI / period;
 
   // runit = unit vector along rotation axis
 
-  if (mstyle == ROTATE) {
-    double len = sqrt(axis[0]*axis[0] + axis[1]*axis[1] + axis[2]*axis[2]);
-    if (len == 0.0)
-      error->all(FLERR,"Zero length rotation vector with fix move");
-    runit[0] = axis[0]/len;
-    runit[1] = axis[1]/len;
-    runit[2] = axis[2]/len;
+  if ((mstyle == ROTATE) || (mstyle == TRANSROT)) {
+    double len = sqrt(axis[0] * axis[0] + axis[1] * axis[1] + axis[2] * axis[2]);
+    if (len == 0.0) error->all(FLERR, "Zero length rotation vector with fix move");
+    runit[0] = axis[0] / len;
+    runit[1] = axis[1] / len;
+    runit[2] = axis[2] / len;
   }
 
   // set flags for extra attributes particles may store
@@ -273,15 +314,18 @@ FixMove::FixMove(LAMMPS *lmp, int narg, char **arg) :
   int nlocal = atom->nlocal;
 
   for (int i = 0; i < nlocal; i++) {
-    if (mask[i] & groupbit) domain->unmap(x[i],image[i],xoriginal[i]);
-    else xoriginal[i][0] = xoriginal[i][1] = xoriginal[i][2] = 0.0;
+    if (mask[i] & groupbit)
+      domain->unmap(x[i], image[i], xoriginal[i]);
+    else
+      xoriginal[i][0] = xoriginal[i][1] = xoriginal[i][2] = 0.0;
   }
 
   if (theta_flag) {
     for (int i = 0; i < nlocal; i++) {
       if ((mask[i] & groupbit) && line[i] >= 0)
         toriginal[i] = avec_line->bonus[line[i]].theta;
-      else toriginal[i] = 0.0;
+      else
+        toriginal[i] = 0.0;
     }
   }
 
@@ -302,8 +346,8 @@ FixMove::FixMove(LAMMPS *lmp, int narg, char **arg) :
         qoriginal[i][1] = quat[1];
         qoriginal[i][2] = quat[2];
         qoriginal[i][3] = quat[3];
-      } else qoriginal[i][0] = qoriginal[i][1] =
-               qoriginal[i][2] = qoriginal[i][3] = 0.0;
+      } else
+        qoriginal[i][0] = qoriginal[i][1] = qoriginal[i][2] = qoriginal[i][3] = 0.0;
     }
   }
 
@@ -325,8 +369,8 @@ FixMove::~FixMove()
 {
   // unregister callbacks to this fix from Atom class
 
-  atom->delete_callback(id,Atom::GROW);
-  atom->delete_callback(id,Atom::RESTART);
+  atom->delete_callback(id, Atom::GROW);
+  atom->delete_callback(id, Atom::RESTART);
 
   // delete locally stored arrays
 
@@ -336,12 +380,12 @@ FixMove::~FixMove()
   memory->destroy(displace);
   memory->destroy(velocity);
 
-  delete [] xvarstr;
-  delete [] yvarstr;
-  delete [] zvarstr;
-  delete [] vxvarstr;
-  delete [] vyvarstr;
-  delete [] vzvarstr;
+  delete[] xvarstr;
+  delete[] yvarstr;
+  delete[] zvarstr;
+  delete[] vxvarstr;
+  delete[] vyvarstr;
+  delete[] vzvarstr;
 }
 
 /* ---------------------------------------------------------------------- */
@@ -371,51 +415,63 @@ void FixMove::init()
   if (mstyle == VARIABLE) {
     if (xvarstr) {
       xvar = input->variable->find(xvarstr);
-      if (xvar < 0) error->all(FLERR,
-                               "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(xvar)) xvarstyle = EQUAL;
-      else if (input->variable->atomstyle(xvar)) xvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (xvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(xvar))
+        xvarstyle = EQUAL;
+      else if (input->variable->atomstyle(xvar))
+        xvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
     if (yvarstr) {
       yvar = input->variable->find(yvarstr);
-      if (yvar < 0) error->all(FLERR,
-                               "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(yvar)) yvarstyle = EQUAL;
-      else if (input->variable->atomstyle(yvar)) yvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (yvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(yvar))
+        yvarstyle = EQUAL;
+      else if (input->variable->atomstyle(yvar))
+        yvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
     if (zvarstr) {
       zvar = input->variable->find(zvarstr);
-      if (zvar < 0) error->all(FLERR,
-                               "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(zvar)) zvarstyle = EQUAL;
-      else if (input->variable->atomstyle(zvar)) zvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (zvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(zvar))
+        zvarstyle = EQUAL;
+      else if (input->variable->atomstyle(zvar))
+        zvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
     if (vxvarstr) {
       vxvar = input->variable->find(vxvarstr);
-      if (vxvar < 0) error->all(FLERR,
-                                "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(vxvar)) vxvarstyle = EQUAL;
-      else if (input->variable->atomstyle(vxvar)) vxvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (vxvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(vxvar))
+        vxvarstyle = EQUAL;
+      else if (input->variable->atomstyle(vxvar))
+        vxvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
     if (vyvarstr) {
       vyvar = input->variable->find(vyvarstr);
-      if (vyvar < 0) error->all(FLERR,
-                                "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(vyvar)) vyvarstyle = EQUAL;
-      else if (input->variable->atomstyle(vyvar)) vyvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (vyvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(vyvar))
+        vyvarstyle = EQUAL;
+      else if (input->variable->atomstyle(vyvar))
+        vyvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
     if (vzvarstr) {
       vzvar = input->variable->find(vzvarstr);
-      if (vzvar < 0) error->all(FLERR,
-                                "Variable name for fix move does not exist");
-      if (input->variable->equalstyle(vzvar)) vzvarstyle = EQUAL;
-      else if (input->variable->atomstyle(vzvar)) vzvarstyle = ATOM;
-      else error->all(FLERR,"Variable for fix move is invalid style");
+      if (vzvar < 0) error->all(FLERR, "Variable name for fix move does not exist");
+      if (input->variable->equalstyle(vzvar))
+        vzvarstyle = EQUAL;
+      else if (input->variable->atomstyle(vzvar))
+        vzvarstyle = ATOM;
+      else
+        error->all(FLERR, "Variable for fix move is invalid style");
     }
 
     if (xvarstr && xvarstyle == ATOM) displaceflag = 1;
@@ -429,12 +485,16 @@ void FixMove::init()
   maxatom = atom->nmax;
   memory->destroy(displace);
   memory->destroy(velocity);
-  if (displaceflag) memory->create(displace,maxatom,3,"move:displace");
-  else displace = nullptr;
-  if (velocityflag) memory->create(velocity,maxatom,3,"move:velocity");
-  else velocity = nullptr;
+  if (displaceflag)
+    memory->create(displace, maxatom, 3, "move:displace");
+  else
+    displace = nullptr;
+  if (velocityflag)
+    memory->create(velocity, maxatom, 3, "move:velocity");
+  else
+    velocity = nullptr;
 
-  if (utils::strmatch(update->integrate_style,"^respa"))
+  if (utils::strmatch(update->integrate_style, "^respa"))
     nlevels_respa = ((Respa *) update->integrate)->nlevels;
 }
 
@@ -445,11 +505,11 @@ void FixMove::init()
 void FixMove::initial_integrate(int /*vflag*/)
 {
   int flag;
-  double ddotr,dx,dy,dz;
-  double dtfm,theta_new;
-  double xold[3],a[3],b[3],c[3],d[3],disp[3],w[3],ex[3],ey[3],ez[3];
-  double inertia_ellipsoid[3],qrotate[4];
-  double *quat,*inertia,*shape;
+  double ddotr, dx, dy, dz;
+  double dtfm, theta_new;
+  double xold[3], a[3], b[3], c[3], d[3], disp[3], w[3], ex[3], ey[3], ez[3];
+  double inertia_ellipsoid[3], qrotate[4];
+  double *quat, *inertia, *shape;
 
   double delta = (update->ntimestep - time_origin) * dt;
 
@@ -481,7 +541,7 @@ void FixMove::initial_integrate(int /*vflag*/)
 
         if (vxflag) {
           v[i][0] = vx;
-          x[i][0] = xoriginal[i][0] + vx*delta;
+          x[i][0] = xoriginal[i][0] + vx * delta;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][0] += dtfm * f[i][0];
@@ -494,7 +554,7 @@ void FixMove::initial_integrate(int /*vflag*/)
 
         if (vyflag) {
           v[i][1] = vy;
-          x[i][1] = xoriginal[i][1] + vy*delta;
+          x[i][1] = xoriginal[i][1] + vy * delta;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][1] += dtfm * f[i][1];
@@ -507,7 +567,7 @@ void FixMove::initial_integrate(int /*vflag*/)
 
         if (vzflag) {
           v[i][2] = vz;
-          x[i][2] = xoriginal[i][2] + vz*delta;
+          x[i][2] = xoriginal[i][2] + vz * delta;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][2] += dtfm * f[i][2];
@@ -518,7 +578,7 @@ void FixMove::initial_integrate(int /*vflag*/)
           x[i][2] += dtv * v[i][2];
         }
 
-        domain->remap_near(x[i],xold);
+        domain->remap_near(x[i], xold);
       }
     }
 
@@ -536,8 +596,8 @@ void FixMove::initial_integrate(int /*vflag*/)
         xold[2] = x[i][2];
 
         if (axflag) {
-          v[i][0] = ax*omega_rotate*cosine;
-          x[i][0] = xoriginal[i][0] + ax*sine;
+          v[i][0] = ax * omega_rotate * cosine;
+          x[i][0] = xoriginal[i][0] + ax * sine;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][0] += dtfm * f[i][0];
@@ -549,8 +609,8 @@ void FixMove::initial_integrate(int /*vflag*/)
         }
 
         if (ayflag) {
-          v[i][1] = ay*omega_rotate*cosine;
-          x[i][1] = xoriginal[i][1] + ay*sine;
+          v[i][1] = ay * omega_rotate * cosine;
+          x[i][1] = xoriginal[i][1] + ay * sine;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][1] += dtfm * f[i][1];
@@ -562,8 +622,8 @@ void FixMove::initial_integrate(int /*vflag*/)
         }
 
         if (azflag) {
-          v[i][2] = az*omega_rotate*cosine;
-          x[i][2] = xoriginal[i][2] + az*sine;
+          v[i][2] = az * omega_rotate * cosine;
+          x[i][2] = xoriginal[i][2] + az * sine;
         } else if (rmass) {
           dtfm = dtf / rmass[i];
           v[i][2] += dtfm * f[i][2];
@@ -574,7 +634,7 @@ void FixMove::initial_integrate(int /*vflag*/)
           x[i][2] += dtv * v[i][2];
         }
 
-        domain->remap_near(x[i],xold);
+        domain->remap_near(x[i], xold);
       }
     }
 
@@ -597,12 +657,12 @@ void FixMove::initial_integrate(int /*vflag*/)
     double cosine = cos(arg);
     double sine = sin(arg);
 
-    double qcosine = cos(0.5*arg);
-    double qsine = sin(0.5*arg);
+    double qcosine = cos(0.5 * arg);
+    double qsine = sin(0.5 * arg);
     qrotate[0] = qcosine;
-    qrotate[1] = runit[0]*qsine;
-    qrotate[2] = runit[1]*qsine;
-    qrotate[3] = runit[2]*qsine;
+    qrotate[1] = runit[0] * qsine;
+    qrotate[2] = runit[1] * qsine;
+    qrotate[3] = runit[2] * qsine;
 
     for (int i = 0; i < nlocal; i++) {
       if (mask[i] & groupbit) {
@@ -613,26 +673,26 @@ void FixMove::initial_integrate(int /*vflag*/)
         d[0] = xoriginal[i][0] - point[0];
         d[1] = xoriginal[i][1] - point[1];
         d[2] = xoriginal[i][2] - point[2];
-        ddotr = d[0]*runit[0] + d[1]*runit[1] + d[2]*runit[2];
-        c[0] = ddotr*runit[0];
-        c[1] = ddotr*runit[1];
-        c[2] = ddotr*runit[2];
+        ddotr = d[0] * runit[0] + d[1] * runit[1] + d[2] * runit[2];
+        c[0] = ddotr * runit[0];
+        c[1] = ddotr * runit[1];
+        c[2] = ddotr * runit[2];
         a[0] = d[0] - c[0];
         a[1] = d[1] - c[1];
         a[2] = d[2] - c[2];
-        b[0] = runit[1]*a[2] - runit[2]*a[1];
-        b[1] = runit[2]*a[0] - runit[0]*a[2];
-        b[2] = runit[0]*a[1] - runit[1]*a[0];
-        disp[0] = a[0]*cosine  + b[0]*sine;
-        disp[1] = a[1]*cosine  + b[1]*sine;
-        disp[2] = a[2]*cosine  + b[2]*sine;
+        b[0] = runit[1] * a[2] - runit[2] * a[1];
+        b[1] = runit[2] * a[0] - runit[0] * a[2];
+        b[2] = runit[0] * a[1] - runit[1] * a[0];
+        disp[0] = a[0] * cosine + b[0] * sine;
+        disp[1] = a[1] * cosine + b[1] * sine;
+        disp[2] = a[2] * cosine + b[2] * sine;
 
         x[i][0] = point[0] + c[0] + disp[0];
         x[i][1] = point[1] + c[1] + disp[1];
         x[i][2] = point[2] + c[2] + disp[2];
-        v[i][0] = omega_rotate * (runit[1]*disp[2] - runit[2]*disp[1]);
-        v[i][1] = omega_rotate * (runit[2]*disp[0] - runit[0]*disp[2]);
-        v[i][2] = omega_rotate * (runit[0]*disp[1] - runit[1]*disp[0]);
+        v[i][0] = omega_rotate * (runit[1] * disp[2] - runit[2] * disp[1]);
+        v[i][1] = omega_rotate * (runit[2] * disp[0] - runit[0] * disp[2]);
+        v[i][2] = omega_rotate * (runit[0] * disp[1] - runit[1] * disp[0]);
 
         // set any extra attributes affected by rotation
 
@@ -646,9 +706,9 @@ void FixMove::initial_integrate(int /*vflag*/)
             if (line_flag && line[i] >= 0.0) flag = 1;
             if (tri_flag && tri[i] >= 0.0) flag = 1;
             if (flag) {
-              omega[i][0] = omega_rotate*runit[0];
-              omega[i][1] = omega_rotate*runit[1];
-              omega[i][2] = omega_rotate*runit[2];
+              omega[i][0] = omega_rotate * runit[0];
+              omega[i][1] = omega_rotate * runit[1];
+              omega[i][2] = omega_rotate * runit[2];
             }
           }
 
@@ -660,11 +720,11 @@ void FixMove::initial_integrate(int /*vflag*/)
               quat = avec_ellipsoid->bonus[ellipsoid[i]].quat;
               shape = avec_ellipsoid->bonus[ellipsoid[i]].shape;
               inertia_ellipsoid[0] =
-                INERTIA*rmass[i] * (shape[1]*shape[1]+shape[2]*shape[2]);
+                  INERTIA * rmass[i] * (shape[1] * shape[1] + shape[2] * shape[2]);
               inertia_ellipsoid[1] =
-                INERTIA*rmass[i] * (shape[0]*shape[0]+shape[2]*shape[2]);
+                  INERTIA * rmass[i] * (shape[0] * shape[0] + shape[2] * shape[2]);
               inertia_ellipsoid[2] =
-                INERTIA*rmass[i] * (shape[0]*shape[0]+shape[1]*shape[1]);
+                  INERTIA * rmass[i] * (shape[0] * shape[0] + shape[1] * shape[1]);
               inertia = inertia_ellipsoid;
             } else if (tri_flag && tri[i] >= 0) {
               quat = avec_tri->bonus[tri[i]].quat;
@@ -674,18 +734,18 @@ void FixMove::initial_integrate(int /*vflag*/)
               inertia = avec_body->bonus[body[i]].inertia;
             }
             if (quat) {
-              w[0] = omega_rotate*runit[0];
-              w[1] = omega_rotate*runit[1];
-              w[2] = omega_rotate*runit[2];
-              MathExtra::q_to_exyz(quat,ex,ey,ez);
-              MathExtra::omega_to_angmom(w,ex,ey,ez,inertia,angmom[i]);
+              w[0] = omega_rotate * runit[0];
+              w[1] = omega_rotate * runit[1];
+              w[2] = omega_rotate * runit[2];
+              MathExtra::q_to_exyz(quat, ex, ey, ez);
+              MathExtra::omega_to_angmom(w, ex, ey, ez, inertia, angmom[i]);
             }
           }
 
           // theta for lines
 
           if (theta_flag && line[i] >= 0.0) {
-            theta_new = fmod(toriginal[i]+arg,MY_2PI);
+            theta_new = fmod(toriginal[i] + arg, MY_2PI);
             avec_line->bonus[atom->line[i]].theta = theta_new;
           }
 
@@ -699,11 +759,149 @@ void FixMove::initial_integrate(int /*vflag*/)
               quat = avec_tri->bonus[tri[i]].quat;
             else if (body_flag && body[i] >= 0)
               quat = avec_body->bonus[body[i]].quat;
-            if (quat) MathExtra::quatquat(qrotate,qoriginal[i],quat);
+            if (quat) MathExtra::quatquat(qrotate, qoriginal[i], quat);
           }
         }
 
-        domain->remap_near(x[i],xold);
+        domain->remap_near(x[i], xold);
+      }
+    }
+
+    // for rotate by right-hand rule around omega:
+    // P = point = vector = point of rotation
+    // R = vector = axis of rotation
+    // w = omega of rotation (from period)
+    // X0 = xoriginal = initial coord of atom
+    // R0 = runit = unit vector for R
+    // D = X0 - P = vector from P to X0
+    // C = (D dot R0) R0 = projection of atom coord onto R line
+    // A = D - C = vector from R line to X0
+    // B = R0 cross A = vector perp to A in plane of rotation
+    // A,B define plane of circular rotation around R line
+    // X = P + C + A cos(w*dt) + B sin(w*dt)
+    // V = w R0 cross (A cos(w*dt) + B sin(w*dt))
+    //
+    // add translation after rotation
+
+  } else if (mstyle == TRANSROT) {
+    double arg = omega_rotate * delta;
+    double cosine = cos(arg);
+    double sine = sin(arg);
+
+    double qcosine = cos(0.5 * arg);
+    double qsine = sin(0.5 * arg);
+    qrotate[0] = qcosine;
+    qrotate[1] = runit[0] * qsine;
+    qrotate[2] = runit[1] * qsine;
+    qrotate[3] = runit[2] * qsine;
+
+    for (int i = 0; i < nlocal; i++) {
+      if (mask[i] & groupbit) {
+        xold[0] = x[i][0];
+        xold[1] = x[i][1];
+        xold[2] = x[i][2];
+
+        d[0] = xoriginal[i][0] - point[0];
+        d[1] = xoriginal[i][1] - point[1];
+        d[2] = xoriginal[i][2] - point[2];
+        ddotr = d[0] * runit[0] + d[1] * runit[1] + d[2] * runit[2];
+        c[0] = ddotr * runit[0];
+        c[1] = ddotr * runit[1];
+        c[2] = ddotr * runit[2];
+        a[0] = d[0] - c[0];
+        a[1] = d[1] - c[1];
+        a[2] = d[2] - c[2];
+        b[0] = runit[1] * a[2] - runit[2] * a[1];
+        b[1] = runit[2] * a[0] - runit[0] * a[2];
+        b[2] = runit[0] * a[1] - runit[1] * a[0];
+        disp[0] = a[0] * cosine + b[0] * sine;
+        disp[1] = a[1] * cosine + b[1] * sine;
+        disp[2] = a[2] * cosine + b[2] * sine;
+
+        x[i][0] = point[0] + c[0] + disp[0];
+        x[i][1] = point[1] + c[1] + disp[1];
+        x[i][2] = point[2] + c[2] + disp[2];
+        v[i][0] = omega_rotate * (runit[1] * disp[2] - runit[2] * disp[1]);
+        v[i][1] = omega_rotate * (runit[2] * disp[0] - runit[0] * disp[2]);
+        v[i][2] = omega_rotate * (runit[0] * disp[1] - runit[1] * disp[0]);
+
+        // set any extra attributes affected by rotation
+
+        if (extra_flag) {
+
+          // omega for spheres, lines, tris
+
+          if (omega_flag) {
+            flag = 0;
+            if (radius_flag && radius[i] > 0.0) flag = 1;
+            if (line_flag && line[i] >= 0.0) flag = 1;
+            if (tri_flag && tri[i] >= 0.0) flag = 1;
+            if (flag) {
+              omega[i][0] = omega_rotate * runit[0];
+              omega[i][1] = omega_rotate * runit[1];
+              omega[i][2] = omega_rotate * runit[2];
+            }
+          }
+
+          // angmom for ellipsoids, tris, and bodies
+
+          if (angmom_flag) {
+            quat = inertia = nullptr;
+            if (ellipsoid_flag && ellipsoid[i] >= 0) {
+              quat = avec_ellipsoid->bonus[ellipsoid[i]].quat;
+              shape = avec_ellipsoid->bonus[ellipsoid[i]].shape;
+              inertia_ellipsoid[0] =
+                  INERTIA * rmass[i] * (shape[1] * shape[1] + shape[2] * shape[2]);
+              inertia_ellipsoid[1] =
+                  INERTIA * rmass[i] * (shape[0] * shape[0] + shape[2] * shape[2]);
+              inertia_ellipsoid[2] =
+                  INERTIA * rmass[i] * (shape[0] * shape[0] + shape[1] * shape[1]);
+              inertia = inertia_ellipsoid;
+            } else if (tri_flag && tri[i] >= 0) {
+              quat = avec_tri->bonus[tri[i]].quat;
+              inertia = avec_tri->bonus[tri[i]].inertia;
+            } else if (body_flag && body[i] >= 0) {
+              quat = avec_body->bonus[body[i]].quat;
+              inertia = avec_body->bonus[body[i]].inertia;
+            }
+            if (quat) {
+              w[0] = omega_rotate * runit[0];
+              w[1] = omega_rotate * runit[1];
+              w[2] = omega_rotate * runit[2];
+              MathExtra::q_to_exyz(quat, ex, ey, ez);
+              MathExtra::omega_to_angmom(w, ex, ey, ez, inertia, angmom[i]);
+            }
+          }
+
+          // theta for lines
+
+          if (theta_flag && line[i] >= 0.0) {
+            theta_new = fmod(toriginal[i] + arg, MY_2PI);
+            avec_line->bonus[atom->line[i]].theta = theta_new;
+          }
+
+          // quats for ellipsoids, tris, and bodies
+
+          if (quat_flag) {
+            quat = nullptr;
+            if (ellipsoid_flag && ellipsoid[i] >= 0)
+              quat = avec_ellipsoid->bonus[ellipsoid[i]].quat;
+            else if (tri_flag && tri[i] >= 0)
+              quat = avec_tri->bonus[tri[i]].quat;
+            else if (body_flag && body[i] >= 0)
+              quat = avec_body->bonus[body[i]].quat;
+            if (quat) MathExtra::quatquat(qrotate, qoriginal[i], quat);
+          }
+        }
+
+        v[i][0] += vx;
+        x[i][0] += vx * delta;
+        v[i][1] += vy;
+        x[i][1] += vy * delta;
+        v[i][2] += vz;
+        x[i][2] += vz * delta;
+
+        domain->remap_near(x[i], xold);
       }
     }
 
@@ -720,11 +918,11 @@ void FixMove::initial_integrate(int /*vflag*/)
       maxatom = atom->nmax;
       if (displaceflag) {
         memory->destroy(displace);
-        memory->create(displace,maxatom,3,"move:displace");
+        memory->create(displace, maxatom, 3, "move:displace");
       }
       if (velocityflag) {
         memory->destroy(velocity);
-        memory->create(velocity,maxatom,3,"move:velocity");
+        memory->create(velocity, maxatom, 3, "move:velocity");
       }
     }
 
@@ -733,28 +931,40 @@ void FixMove::initial_integrate(int /*vflag*/)
     modify->clearstep_compute();
 
     if (xvarstr) {
-      if (xvarstyle == EQUAL) dx = input->variable->compute_equal(xvar);
-      else input->variable->compute_atom(xvar,igroup,&displace[0][0],3,0);
+      if (xvarstyle == EQUAL)
+        dx = input->variable->compute_equal(xvar);
+      else
+        input->variable->compute_atom(xvar, igroup, &displace[0][0], 3, 0);
     }
     if (yvarstr) {
-      if (yvarstyle == EQUAL) dy = input->variable->compute_equal(yvar);
-      else input->variable->compute_atom(yvar,igroup,&displace[0][1],3,0);
+      if (yvarstyle == EQUAL)
+        dy = input->variable->compute_equal(yvar);
+      else
+        input->variable->compute_atom(yvar, igroup, &displace[0][1], 3, 0);
     }
     if (zvarstr) {
-      if (zvarstyle == EQUAL) dz = input->variable->compute_equal(zvar);
-      else input->variable->compute_atom(zvar,igroup,&displace[0][2],3,0);
+      if (zvarstyle == EQUAL)
+        dz = input->variable->compute_equal(zvar);
+      else
+        input->variable->compute_atom(zvar, igroup, &displace[0][2], 3, 0);
     }
     if (vxvarstr) {
-      if (vxvarstyle == EQUAL) vx = input->variable->compute_equal(vxvar);
-      else input->variable->compute_atom(vxvar,igroup,&velocity[0][0],3,0);
+      if (vxvarstyle == EQUAL)
+        vx = input->variable->compute_equal(vxvar);
+      else
+        input->variable->compute_atom(vxvar, igroup, &velocity[0][0], 3, 0);
     }
     if (vyvarstr) {
-      if (vyvarstyle == EQUAL) vy = input->variable->compute_equal(vyvar);
-      else input->variable->compute_atom(vyvar,igroup,&velocity[0][1],3,0);
+      if (vyvarstyle == EQUAL)
+        vy = input->variable->compute_equal(vyvar);
+      else
+        input->variable->compute_atom(vyvar, igroup, &velocity[0][1], 3, 0);
     }
     if (vzvarstr) {
-      if (vzvarstyle == EQUAL) vz = input->variable->compute_equal(vzvar);
-      else input->variable->compute_atom(vzvar,igroup,&velocity[0][2],3,0);
+      if (vzvarstyle == EQUAL)
+        vz = input->variable->compute_equal(vzvar);
+      else
+        input->variable->compute_atom(vzvar, igroup, &velocity[0][2], 3, 0);
     }
 
     modify->addstep_compute(update->ntimestep + 1);
@@ -768,16 +978,24 @@ void FixMove::initial_integrate(int /*vflag*/)
         xold[2] = x[i][2];
 
         if (xvarstr && vxvarstr) {
-          if (vxvarstyle == EQUAL) v[i][0] = vx;
-          else v[i][0] = velocity[i][0];
-          if (xvarstyle == EQUAL) x[i][0] = xoriginal[i][0] + dx;
-          else x[i][0] = xoriginal[i][0] + displace[i][0];
+          if (vxvarstyle == EQUAL)
+            v[i][0] = vx;
+          else
+            v[i][0] = velocity[i][0];
+          if (xvarstyle == EQUAL)
+            x[i][0] = xoriginal[i][0] + dx;
+          else
+            x[i][0] = xoriginal[i][0] + displace[i][0];
         } else if (xvarstr) {
-          if (xvarstyle == EQUAL) x[i][0] = xoriginal[i][0] + dx;
-          else x[i][0] = xoriginal[i][0] + displace[i][0];
+          if (xvarstyle == EQUAL)
+            x[i][0] = xoriginal[i][0] + dx;
+          else
+            x[i][0] = xoriginal[i][0] + displace[i][0];
         } else if (vxvarstr) {
-          if (vxvarstyle == EQUAL) v[i][0] = vx;
-          else v[i][0] = velocity[i][0];
+          if (vxvarstyle == EQUAL)
+            v[i][0] = vx;
+          else
+            v[i][0] = velocity[i][0];
           x[i][0] += dtv * v[i][0];
         } else {
           if (rmass) {
@@ -791,16 +1009,24 @@ void FixMove::initial_integrate(int /*vflag*/)
         }
 
         if (yvarstr && vyvarstr) {
-          if (vyvarstyle == EQUAL) v[i][1] = vy;
-          else v[i][1] = velocity[i][1];
-          if (yvarstyle == EQUAL) x[i][1] = xoriginal[i][1] + dy;
-          else x[i][1] = xoriginal[i][1] + displace[i][1];
+          if (vyvarstyle == EQUAL)
+            v[i][1] = vy;
+          else
+            v[i][1] = velocity[i][1];
+          if (yvarstyle == EQUAL)
+            x[i][1] = xoriginal[i][1] + dy;
+          else
+            x[i][1] = xoriginal[i][1] + displace[i][1];
         } else if (yvarstr) {
-          if (yvarstyle == EQUAL) x[i][1] = xoriginal[i][1] + dy;
-          else x[i][1] = xoriginal[i][1] + displace[i][1];
+          if (yvarstyle == EQUAL)
+            x[i][1] = xoriginal[i][1] + dy;
+          else
+            x[i][1] = xoriginal[i][1] + displace[i][1];
         } else if (vyvarstr) {
-          if (vyvarstyle == EQUAL) v[i][1] = vy;
-          else v[i][1] = velocity[i][1];
+          if (vyvarstyle == EQUAL)
+            v[i][1] = vy;
+          else
+            v[i][1] = velocity[i][1];
           x[i][1] += dtv * v[i][1];
         } else {
           if (rmass) {
@@ -814,16 +1040,24 @@ void FixMove::initial_integrate(int /*vflag*/)
         }
 
         if (zvarstr && vzvarstr) {
-          if (vzvarstyle == EQUAL) v[i][2] = vz;
-          else v[i][2] = velocity[i][2];
-          if (zvarstyle == EQUAL) x[i][2] = xoriginal[i][2] + dz;
-          else x[i][2] = xoriginal[i][2] + displace[i][2];
+          if (vzvarstyle == EQUAL)
+            v[i][2] = vz;
+          else
+            v[i][2] = velocity[i][2];
+          if (zvarstyle == EQUAL)
+            x[i][2] = xoriginal[i][2] + dz;
+          else
+            x[i][2] = xoriginal[i][2] + displace[i][2];
         } else if (zvarstr) {
-          if (zvarstyle == EQUAL) x[i][2] = xoriginal[i][2] + dz;
-          else x[i][2] = xoriginal[i][2] + displace[i][2];
+          if (zvarstyle == EQUAL)
+            x[i][2] = xoriginal[i][2] + dz;
+          else
+            x[i][2] = xoriginal[i][2] + displace[i][2];
         } else if (vzvarstr) {
-          if (vzvarstyle == EQUAL) v[i][2] = vz;
-          else v[i][2] = velocity[i][2];
+          if (vzvarstyle == EQUAL)
+            v[i][2] = vz;
+          else
+            v[i][2] = velocity[i][2];
           x[i][2] += dtv * v[i][2];
         } else {
           if (rmass) {
@@ -836,7 +1070,7 @@ void FixMove::initial_integrate(int /*vflag*/)
           x[i][2] += dtv * v[i][2];
         }
 
-        domain->remap_near(x[i],xold);
+        domain->remap_near(x[i], xold);
       }
     }
   }
@@ -851,22 +1085,40 @@ void FixMove::final_integrate()
   double dtfm;
 
   int xflag = 1;
-  if (mstyle == LINEAR && vxflag) xflag = 0;
-  else if (mstyle == WIGGLE && axflag) xflag = 0;
-  else if (mstyle == ROTATE) xflag = 0;
-  else if (mstyle == VARIABLE && (xvarstr || vxvarstr)) xflag = 0;
+  if (mstyle == LINEAR && vxflag)
+    xflag = 0;
+  else if (mstyle == WIGGLE && axflag)
+    xflag = 0;
+  else if (mstyle == ROTATE)
+    xflag = 0;
+  else if (mstyle == TRANSROT)
+    xflag = 0;
+  else if (mstyle == VARIABLE && (xvarstr || vxvarstr))
+    xflag = 0;
 
   int yflag = 1;
-  if (mstyle == LINEAR && vyflag) yflag = 0;
-  else if (mstyle == WIGGLE && ayflag) yflag = 0;
-  else if (mstyle == ROTATE) yflag = 0;
-  else if (mstyle == VARIABLE && (yvarstr || vyvarstr)) yflag = 0;
+  if (mstyle == LINEAR && vyflag)
+    yflag = 0;
+  else if (mstyle == WIGGLE && ayflag)
+    yflag = 0;
+  else if (mstyle == ROTATE)
+    yflag = 0;
+  else if (mstyle == TRANSROT)
+    yflag = 0;
+  else if (mstyle == VARIABLE && (yvarstr || vyvarstr))
+    yflag = 0;
 
   int zflag = 1;
-  if (mstyle == LINEAR && vzflag) zflag = 0;
-  else if (mstyle == WIGGLE && azflag) zflag = 0;
-  else if (mstyle == ROTATE) zflag = 0;
-  else if (mstyle == VARIABLE && (zvarstr || vzvarstr)) zflag = 0;
+  if (mstyle == LINEAR && vzflag)
+    zflag = 0;
+  else if (mstyle == WIGGLE && azflag)
+    zflag = 0;
+  else if (mstyle == ROTATE)
+    zflag = 0;
+  else if (mstyle == TRANSROT)
+    zflag = 0;
+  else if (mstyle == VARIABLE && (zvarstr || vzvarstr))
+    zflag = 0;
 
   double **v = atom->v;
   double **f = atom->f;
@@ -918,14 +1170,14 @@ void FixMove::initial_integrate_respa(int vflag, int ilevel, int /*iloop*/)
   // outermost level - update v and x
   // all other levels - nothing
 
-  if (ilevel == nlevels_respa-1) initial_integrate(vflag);
+  if (ilevel == nlevels_respa - 1) initial_integrate(vflag);
 }
 
 /* ---------------------------------------------------------------------- */
 
 void FixMove::final_integrate_respa(int ilevel, int /*iloop*/)
 {
-  if (ilevel == nlevels_respa-1) final_integrate();
+  if (ilevel == nlevels_respa - 1) final_integrate();
 }
 
 /* ----------------------------------------------------------------------
@@ -934,11 +1186,11 @@ void FixMove::final_integrate_respa(int ilevel, int /*iloop*/)
 
 double FixMove::memory_usage()
 {
-  double bytes = (double)atom->nmax*3 * sizeof(double);
-  if (theta_flag) bytes += (double)atom->nmax * sizeof(double);
-  if (quat_flag) bytes += (double)atom->nmax*4 * sizeof(double);
-  if (displaceflag) bytes += (double)atom->nmax*3 * sizeof(double);
-  if (velocityflag) bytes += (double)atom->nmax*3 * sizeof(double);
+  double bytes = (double) atom->nmax * 3 * sizeof(double);
+  if (theta_flag) bytes += (double) atom->nmax * sizeof(double);
+  if (quat_flag) bytes += (double) atom->nmax * 4 * sizeof(double);
+  if (displaceflag) bytes += (double) atom->nmax * 3 * sizeof(double);
+  if (velocityflag) bytes += (double) atom->nmax * 3 * sizeof(double);
   return bytes;
 }
 
@@ -954,8 +1206,8 @@ void FixMove::write_restart(FILE *fp)
 
   if (comm->me == 0) {
     int size = n * sizeof(double);
-    fwrite(&size,sizeof(int),1,fp);
-    fwrite(list,sizeof(double),n,fp);
+    fwrite(&size, sizeof(int), 1, fp);
+    fwrite(list, sizeof(double), n, fp);
   }
 }
 
@@ -968,7 +1220,7 @@ void FixMove::restart(char *buf)
   int n = 0;
   double *list = (double *) buf;
 
-  time_origin = static_cast<int> (list[n++]);
+  time_origin = static_cast<int>(list[n++]);
 }
 
 /* ----------------------------------------------------------------------
@@ -977,9 +1229,9 @@ void FixMove::restart(char *buf)
 
 void FixMove::grow_arrays(int nmax)
 {
-  memory->grow(xoriginal,nmax,3,"move:xoriginal");
-  if (theta_flag) memory->grow(toriginal,nmax,"move:toriginal");
-  if (quat_flag) memory->grow(qoriginal,nmax,4,"move:qoriginal");
+  memory->grow(xoriginal, nmax, 3, "move:xoriginal");
+  if (theta_flag) memory->grow(toriginal, nmax, "move:toriginal");
+  if (quat_flag) memory->grow(qoriginal, nmax, 4, "move:qoriginal");
   array_atom = xoriginal;
 }
 
@@ -1028,50 +1280,51 @@ void FixMove::set_arrays(int i)
   // current time still equal fix creation time
 
   if (update->ntimestep == time_origin) {
-    domain->unmap(x[i],image[i],xoriginal[i]);
+    domain->unmap(x[i], image[i], xoriginal[i]);
     return;
   }
 
   // backup particle to time_origin
 
-  if (mstyle == VARIABLE)
-    error->all(FLERR,"Cannot add atoms to fix move variable");
+  if (mstyle == VARIABLE) error->all(FLERR, "Cannot add atoms to fix move variable");
 
-  domain->unmap(x[i],image[i],xoriginal[i]);
+  domain->unmap(x[i], image[i], xoriginal[i]);
   double delta = (update->ntimestep - time_origin) * update->dt;
 
   if (mstyle == LINEAR) {
     if (vxflag) xoriginal[i][0] -= vx * delta;
     if (vyflag) xoriginal[i][1] -= vy * delta;
     if (vzflag) xoriginal[i][2] -= vz * delta;
+
   } else if (mstyle == WIGGLE) {
     double arg = omega_rotate * delta;
     double sine = sin(arg);
-    if (axflag) xoriginal[i][0] -= ax*sine;
-    if (ayflag) xoriginal[i][1] -= ay*sine;
-    if (azflag) xoriginal[i][2] -= az*sine;
+    if (axflag) xoriginal[i][0] -= ax * sine;
+    if (ayflag) xoriginal[i][1] -= ay * sine;
+    if (azflag) xoriginal[i][2] -= az * sine;
+
   } else if (mstyle == ROTATE) {
-    double a[3],b[3],c[3],d[3],disp[3],ddotr;
-    double arg = - omega_rotate * delta;
+    double a[3], b[3], c[3], d[3], disp[3], ddotr;
+    double arg = -omega_rotate * delta;
     double sine = sin(arg);
     double cosine = cos(arg);
     d[0] = x[i][0] - point[0];
     d[1] = x[i][1] - point[1];
     d[2] = x[i][2] - point[2];
-    ddotr = d[0]*runit[0] + d[1]*runit[1] + d[2]*runit[2];
-    c[0] = ddotr*runit[0];
-    c[1] = ddotr*runit[1];
-    c[2] = ddotr*runit[2];
+    ddotr = d[0] * runit[0] + d[1] * runit[1] + d[2] * runit[2];
+    c[0] = ddotr * runit[0];
+    c[1] = ddotr * runit[1];
+    c[2] = ddotr * runit[2];
 
     a[0] = d[0] - c[0];
     a[1] = d[1] - c[1];
     a[2] = d[2] - c[2];
-    b[0] = runit[1]*a[2] - runit[2]*a[1];
-    b[1] = runit[2]*a[0] - runit[0]*a[2];
-    b[2] = runit[0]*a[1] - runit[1]*a[0];
-    disp[0] = a[0]*cosine  + b[0]*sine;
-    disp[1] = a[1]*cosine  + b[1]*sine;
-    disp[2] = a[2]*cosine  + b[2]*sine;
+    b[0] = runit[1] * a[2] - runit[2] * a[1];
+    b[1] = runit[2] * a[0] - runit[0] * a[2];
+    b[2] = runit[0] * a[1] - runit[1] * a[0];
+    disp[0] = a[0] * cosine + b[0] * sine;
+    disp[1] = a[1] * cosine + b[1] * sine;
+    disp[2] = a[2] * cosine + b[2] * sine;
 
     xoriginal[i][0] = point[0] + c[0] + disp[0];
     xoriginal[i][1] = point[1] + c[1] + disp[1];
@@ -1085,7 +1338,64 @@ void FixMove::set_arrays(int i)
 
       if (theta_flag && line[i] >= 0.0) {
         theta = avec_line->bonus[atom->line[i]].theta;
-        toriginal[i] = theta - 0.0;  // NOTE: edit this line
+        toriginal[i] = theta - 0.0;    // NOTE: edit this line
+      }
+
+      // quats for ellipsoids, tris, and bodies
+
+      if (quat_flag) {
+        quat = nullptr;
+        if (ellipsoid_flag && ellipsoid[i] >= 0)
+          quat = avec_ellipsoid->bonus[ellipsoid[i]].quat;
+        else if (tri_flag && tri[i] >= 0)
+          quat = avec_tri->bonus[tri[i]].quat;
+        else if (body_flag && body[i] >= 0)
+          quat = avec_body->bonus[body[i]].quat;
+        if (quat) {
+          // qoriginal = f(quat,-delta);   // NOTE: edit this line
+        }
+      }
+    }
+    xoriginal[i][0] -= vx * delta;
+    xoriginal[i][1] -= vy * delta;
+    xoriginal[i][2] -= vz * delta;
+
+  } else if (mstyle == TRANSROT) {
+    double a[3], b[3], c[3], d[3], disp[3], ddotr;
+    double arg = -omega_rotate * delta;
+    double sine = sin(arg);
+    double cosine = cos(arg);
+    d[0] = x[i][0] - point[0];
+    d[1] = x[i][1] - point[1];
+    d[2] = x[i][2] - point[2];
+    ddotr = d[0] * runit[0] + d[1] * runit[1] + d[2] * runit[2];
+    c[0] = ddotr * runit[0];
+    c[1] = ddotr * runit[1];
+    c[2] = ddotr * runit[2];
+
+    a[0] = d[0] - c[0];
+    a[1] = d[1] - c[1];
+    a[2] = d[2] - c[2];
+    b[0] = runit[1] * a[2] - runit[2] * a[1];
+    b[1] = runit[2] * a[0] - runit[0] * a[2];
+    b[2] = runit[0] * a[1] - runit[1] * a[0];
+    disp[0] = a[0] * cosine + b[0] * sine;
+    disp[1] = a[1] * cosine + b[1] * sine;
+    disp[2] = a[2] * cosine + b[2] * sine;
+
+    xoriginal[i][0] = point[0] + c[0] + disp[0];
+    xoriginal[i][1] = point[1] + c[1] + disp[1];
+    xoriginal[i][2] = point[2] + c[2] + disp[2];
+
+    // set theta and quat extra attributes affected by rotation
+
+    if (extra_flag) {
+
+      // theta for lines
+
+      if (theta_flag && line[i] >= 0.0) {
+        theta = avec_line->bonus[atom->line[i]].theta;
+        toriginal[i] = theta - 0.0;    // NOTE: edit this line
       }
 
       // quats for ellipsoids, tris, and bodies
@@ -1180,7 +1490,7 @@ void FixMove::unpack_restart(int nlocal, int nth)
   // unpack the Nth first values this way because other fixes pack them
 
   int m = 0;
-  for (int i = 0; i < nth; i++) m += static_cast<int> (extra[nlocal][m]);
+  for (int i = 0; i < nth; i++) m += static_cast<int>(extra[nlocal][m]);
   m++;
 
   xoriginal[nlocal][0] = extra[nlocal][m++];
@@ -1217,5 +1527,5 @@ int FixMove::size_restart(int /*nlocal*/)
 
 void FixMove::reset_dt()
 {
-  error->all(FLERR,"Resetting timestep size is not allowed with fix move");
+  error->all(FLERR, "Resetting timestep size is not allowed with fix move");
 }
diff --git a/unittest/force-styles/tests/fix-timestep-move_linear.yaml b/unittest/force-styles/tests/fix-timestep-move_linear.yaml
new file mode 100644
index 0000000000..6abef11272
--- /dev/null
+++ b/unittest/force-styles/tests/fix-timestep-move_linear.yaml
@@ -0,0 +1,74 @@
+---
+lammps_version: 24 Mar 2022
+date_generated: Sun Apr  3 03:06:24 2022
+epsilon: 2e-14
+skip_tests:
+prerequisites: ! |
+  atom full
+  fix move
+pre_commands: ! ""
+post_commands: ! |
+  fix test solute move linear 1.0 1.0 1.0
+input_file: in.fourmol
+natoms: 29
+run_pos: ! |2
+    1  1.7200631633077317e+00  4.4726588069312836e+00  1.8279913975585156e+00
+    2  2.3019708395540222e+00  4.9515239068888608e+00  1.1431026442709245e+00
+    3  1.3056462211944140e+00  3.2440473127136711e+00  1.3776619853110796e+00
+    4  4.2283858353148673e-01  3.4915333140468068e+00  7.5128731549594785e-01
+    5  1.1049823864064074e+00  2.9356812874307137e+00  2.4022773187148436e+00
+    6  2.2941260793770599e+00  2.2271928265665291e+00  7.1569059321421302e-01
+    7  2.3401987106287963e+00  1.9908722649924213e+00 -4.6331132243045614e-01
+    8  3.1641187171852803e+00  1.5162469404461476e+00  1.3234017623263132e+00
+    9  3.3777459838125838e+00  1.7463366133047700e+00  2.2687764473032632e+00
+   10  4.0185283555536984e+00  5.7160331534826425e-01  1.0326647272886698e+00
+   11  3.7929780509347664e+00  1.2895245923125742e-02  1.1593733568143261e-01
+   12  5.0030247876861225e+00  1.5107668003242725e+00  3.8113414684627522e-01
+   13  6.0447273787895934e+00  1.0986800145255375e+00  3.6155527316791636e-01
+   14  4.6033152817257079e+00  1.5921023849403642e+00 -6.5544135388230629e-01
+   15  4.9756315249791303e+00  2.5633426972296931e+00  7.5623492454009922e-01
+   16  4.6517554244980310e+00 -3.9571104249784383e-01  2.0329083359991782e+00
+   17  4.2309964792710639e+00 -1.0229189433193842e-01  3.1491948328949437e+00
+   18  2.1384791188033843e+00  3.0177261773770208e+00 -3.5160827596876225e+00
+   19  1.5349125211132961e+00  2.6315969880333707e+00 -4.2472859440220647e+00
+   20  2.7641167828863153e+00  3.6833419064000221e+00 -3.9380850623312638e+00
+   21  4.9064454390208301e+00 -4.0751205255383196e+00 -3.6215576073601046e+00
+   22  4.3687453488627543e+00 -4.2054270536772504e+00 -4.4651491269372565e+00
+   23  5.7374928154769504e+00 -3.5763355905184966e+00 -3.8820297194230728e+00
+   24  2.0684115301174013e+00  3.1518221747664397e+00  3.1554242678474576e+00
+   25  1.2998381073113014e+00  3.2755513587518097e+00  2.5092990173114837e+00
+   26  2.5807438597688113e+00  4.0120175892854135e+00  3.2133398379059099e+00
+   27 -1.9613581876744359e+00 -4.3556300596085160e+00  2.1101467673534788e+00
+   28 -2.7406520384725965e+00 -4.0207251278130975e+00  1.5828689861678511e+00
+   29 -1.3108232656499081e+00 -3.5992986322410760e+00  2.2680459788743503e+00
+run_vel: ! |2
+    1  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    2  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    3  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    4  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    5  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    6  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    7  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    8  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+    9  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   10  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   11  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   12  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   13  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   14  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   15  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   16  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   17  1.0000000000000000e+00  1.0000000000000000e+00  1.0000000000000000e+00
+   18 -6.0936815808025862e-04 -9.3774557532468582e-04 -3.3558072507805731e-04
+   19 -6.9919768291957119e-04 -3.6060777270430031e-03  4.2833405289822791e-03
+   20  4.7777805013736515e-03  5.1003745845520452e-03  1.8002873923729241e-03
+   21 -9.5568188553430398e-04  1.6594630943762931e-04 -1.8199788009966615e-04
+   22 -3.3137518957653462e-03 -2.8683968287936054e-03  3.6384389958326871e-03
+   23  2.4209481134686401e-04 -4.5457709985051130e-03  2.7663581642115042e-03
+   24  2.5447450568861086e-04  4.8412447786110117e-04 -4.8021914527341357e-04
+   25  4.3722771097312743e-03 -4.5184411669545515e-03  2.5200952006556795e-03
+   26 -1.9250110555001179e-03 -3.0342169883610837e-03  3.5062814567984532e-03
+   27 -2.6510179146429716e-04  3.6306203629019116e-04 -5.6235585400647747e-04
+   28 -2.3068708109787484e-04 -8.5663070212203200e-04  2.1302563179109169e-03
+   29 -2.5054744388303732e-03 -1.6773997805290820e-04  2.8436699761004796e-03
+...
diff --git a/unittest/force-styles/tests/fix-timestep-move_rotate.yaml b/unittest/force-styles/tests/fix-timestep-move_rotate.yaml
new file mode 100644
index 0000000000..b4ac0113eb
--- /dev/null
+++ b/unittest/force-styles/tests/fix-timestep-move_rotate.yaml
@@ -0,0 +1,74 @@
+---
+lammps_version: 24 Mar 2022
+date_generated: Sun Apr  3 03:06:24 2022
+epsilon: 2e-14
+skip_tests:
+prerequisites: ! |
+  atom full
+  fix move
+pre_commands: ! ""
+post_commands: ! |
+  fix test solute move rotate -0.18 1.1 -1.8 0.1 0.5 1.0 1.0
+input_file: in.fourmol
+natoms: 29
+run_pos: ! |2
+    1 -2.7993683669226810e-01  2.4726588069312840e+00 -1.7200860244148450e-01
+    2  3.0197083955402271e-01  2.9515239068888608e+00 -8.5689735572907566e-01
+    3 -6.9435377880558635e-01  1.2440473127136713e+00 -6.2233801468892025e-01
+    4 -1.5771614164685133e+00  1.4915333140468072e+00 -1.2487126845040524e+00
+    5 -8.9501761359359322e-01  9.3568128743071388e-01  4.0227731871484329e-01
+    6  2.9412607937705959e-01  2.2719282656652884e-01 -1.2843094067857868e+00
+    7  3.4019871062879570e-01 -9.1277350075789077e-03 -2.4633113224304561e+00
+    8  1.1641187171852796e+00 -4.8375305955385284e-01 -6.7659823767368665e-01
+    9  1.3777459838125827e+00 -2.5366338669523070e-01  2.6877644730326344e-01
+   10  2.0185283555536979e+00 -1.4283966846517369e+00 -9.6733527271132957e-01
+   11  1.7929780509347653e+00 -1.9871047540768751e+00 -1.8840626643185665e+00
+   12  3.0030247876861216e+00 -4.8923319967572887e-01 -1.6188658531537241e+00
+   13  4.0447273787895925e+00 -9.0131998547446424e-01 -1.6384447268320823e+00
+   14  2.6033152817257070e+00 -4.0789761505963706e-01 -2.6554413538823054e+00
+   15  2.9756315249791299e+00  5.6334269722969144e-01 -1.2437650754599001e+00
+   16  2.6517554244980293e+00 -2.3957110424978456e+00  3.2908335999179084e-02
+   17  2.2309964792710621e+00 -2.1022918943319393e+00  1.1491948328949442e+00
+   18  2.1384791188033843e+00  3.0177261773770208e+00 -3.5160827596876225e+00
+   19  1.5349125211132961e+00  2.6315969880333707e+00 -4.2472859440220647e+00
+   20  2.7641167828863153e+00  3.6833419064000221e+00 -3.9380850623312638e+00
+   21  4.9064454390208301e+00 -4.0751205255383196e+00 -3.6215576073601046e+00
+   22  4.3687453488627543e+00 -4.2054270536772504e+00 -4.4651491269372565e+00
+   23  5.7374928154769504e+00 -3.5763355905184966e+00 -3.8820297194230728e+00
+   24  2.0684115301174013e+00  3.1518221747664397e+00  3.1554242678474576e+00
+   25  1.2998381073113014e+00  3.2755513587518097e+00  2.5092990173114837e+00
+   26  2.5807438597688113e+00  4.0120175892854135e+00  3.2133398379059099e+00
+   27 -1.9613581876744359e+00 -4.3556300596085160e+00  2.1101467673534788e+00
+   28 -2.7406520384725965e+00 -4.0207251278130975e+00  1.5828689861678511e+00
+   29 -1.3108232656499081e+00 -3.5992986322410760e+00  2.2680459788743503e+00
+run_vel: ! |2
+    1 -3.1271203016537847e+00 -1.4706662994868196e+00  1.0480451799087882e+00
+    2 -7.7244053214512371e+00  2.1699325352629488e+00 -3.1252573548635093e-01
+    3  2.4896794778365781e+00 -3.5382955941894649e+00  1.5201798493110743e+00
+    4 -6.4869308833104955e-01 -8.1292033418525875e+00  4.1294709797593985e+00
+    5  7.0834056891986288e+00 -5.2350417705319234e+00  1.9091803163460987e+00
+    6  6.3288343933688544e+00  2.3652656077349734e+00 -1.8155162432043723e+00
+    7  4.3519048787150751e+00  3.2831045265140744e+00 -2.0767427511285446e+00
+    8  1.2009192080844477e+01  6.8948882185832456e+00 -4.6483633173760710e+00
+    9  1.3367134079812072e+01  7.5614941211688285e+00 -5.1174604685656222e+00
+   10  1.6483139579874326e+01  1.1840191844735013e+01 -7.5684098803549391e+00
+   11  1.7044817849532322e+01  1.1090811320459508e+01 -7.2498874451829867e+00
+   12  9.4026927142316001e+00  1.7715611543038719e+01 -9.7980750429425196e+00
+   13  1.1654554684130250e+01  2.3557507710224332e+01 -1.2944209323525191e+01
+   14  6.0462972823495269e+00  1.6058458836709743e+01 -8.6338591465898240e+00
+   15  4.5607069302660133e+00  1.7352314681858498e+01 -9.1322280339558510e+00
+   16  2.4697122832464643e+01  1.4824797394989091e+01 -9.8821109807410092e+00
+   17  2.6178917667758348e+01  1.1844754445814953e+01 -8.5402689896833106e+00
+   18 -6.0936815808025862e-04 -9.3774557532468582e-04 -3.3558072507805731e-04
+   19 -6.9919768291957119e-04 -3.6060777270430031e-03  4.2833405289822791e-03
+   20  4.7777805013736515e-03  5.1003745845520452e-03  1.8002873923729241e-03
+   21 -9.5568188553430398e-04  1.6594630943762931e-04 -1.8199788009966615e-04
+   22 -3.3137518957653462e-03 -2.8683968287936054e-03  3.6384389958326871e-03
+   23  2.4209481134686401e-04 -4.5457709985051130e-03  2.7663581642115042e-03
+   24  2.5447450568861086e-04  4.8412447786110117e-04 -4.8021914527341357e-04
+   25  4.3722771097312743e-03 -4.5184411669545515e-03  2.5200952006556795e-03
+   26 -1.9250110555001179e-03 -3.0342169883610837e-03  3.5062814567984532e-03
+   27 -2.6510179146429716e-04  3.6306203629019116e-04 -5.6235585400647747e-04
+   28 -2.3068708109787484e-04 -8.5663070212203200e-04  2.1302563179109169e-03
+   29 -2.5054744388303732e-03 -1.6773997805290820e-04  2.8436699761004796e-03
+...
diff --git a/unittest/force-styles/tests/fix-timestep-move_transrot.yaml b/unittest/force-styles/tests/fix-timestep-move_transrot.yaml
new file mode 100644
index 0000000000..8f6629c5be
--- /dev/null
+++ b/unittest/force-styles/tests/fix-timestep-move_transrot.yaml
@@ -0,0 +1,74 @@
+---
+lammps_version: 24 Mar 2022
+date_generated: Sun Apr  3 03:07:42 2022
+epsilon: 2e-14
+skip_tests:
+prerequisites: ! |
+  atom full
+  fix move
+pre_commands: ! ""
+post_commands: ! |
+  fix test solute move transrot 0.1 0.2 -0.1 -0.18 1.1 -1.8 0.1 0.5 1.0 1.0
+input_file: in.fourmol
+natoms: 29
+run_pos: ! |2
+    1 -7.9936836692268087e-02  2.8726588069312839e+00 -3.7200860244148448e-01
+    2  5.0197083955402277e-01  3.3515239068888607e+00 -1.0568973557290757e+00
+    3 -4.9435377880558634e-01  1.6440473127136714e+00 -8.2233801468892032e-01
+    4 -1.3771614164685133e+00  1.8915333140468071e+00 -1.4487126845040523e+00
+    5 -6.9501761359359326e-01  1.3356812874307140e+00  2.0227731871484328e-01
+    6  4.9412607937705960e-01  6.2719282656652886e-01 -1.4843094067857867e+00
+    7  5.4019871062879576e-01  3.9087226499242111e-01 -2.6633113224304563e+00
+    8  1.3641187171852796e+00 -8.3753059553852816e-02 -8.7659823767368672e-01
+    9  1.5777459838125827e+00  1.4633661330476933e-01  6.8776447303263433e-02
+   10  2.2185283555536981e+00 -1.0283966846517369e+00 -1.1673352727113295e+00
+   11  1.9929780509347652e+00 -1.5871047540768752e+00 -2.0840626643185667e+00
+   12  3.2030247876861218e+00 -8.9233199675728847e-02 -1.8188658531537241e+00
+   13  4.2447273787895927e+00 -5.0131998547446421e-01 -1.8384447268320823e+00
+   14  2.8033152817257072e+00 -7.8976150596370420e-03 -2.8554413538823056e+00
+   15  3.1756315249791300e+00  9.6334269722969146e-01 -1.4437650754599001e+00
+   16  2.8517554244980294e+00 -1.9957110424978457e+00 -1.6709166400082093e-01
+   17  2.4309964792710623e+00 -1.7022918943319394e+00  9.4919483289494422e-01
+   18  2.1384791188033843e+00  3.0177261773770208e+00 -3.5160827596876225e+00
+   19  1.5349125211132961e+00  2.6315969880333707e+00 -4.2472859440220647e+00
+   20  2.7641167828863153e+00  3.6833419064000221e+00 -3.9380850623312638e+00
+   21  4.9064454390208301e+00 -4.0751205255383196e+00 -3.6215576073601046e+00
+   22  4.3687453488627543e+00 -4.2054270536772504e+00 -4.4651491269372565e+00
+   23  5.7374928154769504e+00 -3.5763355905184966e+00 -3.8820297194230728e+00
+   24  2.0684115301174013e+00  3.1518221747664397e+00  3.1554242678474576e+00
+   25  1.2998381073113014e+00  3.2755513587518097e+00  2.5092990173114837e+00
+   26  2.5807438597688113e+00  4.0120175892854135e+00  3.2133398379059099e+00
+   27 -1.9613581876744359e+00 -4.3556300596085160e+00  2.1101467673534788e+00
+   28 -2.7406520384725965e+00 -4.0207251278130975e+00  1.5828689861678511e+00
+   29 -1.3108232656499081e+00 -3.5992986322410760e+00  2.2680459788743503e+00
+run_vel: ! |2
+    1 -3.0271203016537847e+00 -1.2706662994868196e+00  9.4804517990878823e-01
+    2 -7.6244053214512375e+00  2.3699325352629490e+00 -4.1252573548635096e-01
+    3  2.5896794778365781e+00 -3.3382955941894648e+00  1.4201798493110742e+00
+    4 -5.4869308833104957e-01 -7.9292033418525874e+00  4.0294709797593988e+00
+    5  7.1834056891986284e+00 -5.0350417705319233e+00  1.8091803163460987e+00
+    6  6.4288343933688541e+00  2.5652656077349736e+00 -1.9155162432043724e+00
+    7  4.4519048787150748e+00  3.4831045265140745e+00 -2.1767427511285447e+00
+    8  1.2109192080844476e+01  7.0948882185832458e+00 -4.7483633173760706e+00
+    9  1.3467134079812071e+01  7.7614941211688286e+00 -5.2174604685656218e+00
+   10  1.6583139579874327e+01  1.2040191844735013e+01 -7.6684098803549388e+00
+   11  1.7144817849532323e+01  1.1290811320459508e+01 -7.3498874451829863e+00
+   12  9.5026927142315998e+00  1.7915611543038718e+01 -9.8980750429425193e+00
+   13  1.1754554684130250e+01  2.3757507710224331e+01 -1.3044209323525191e+01
+   14  6.1462972823495265e+00  1.6258458836709742e+01 -8.7338591465898237e+00
+   15  4.6607069302660129e+00  1.7552314681858498e+01 -9.2322280339558507e+00
+   16  2.4797122832464645e+01  1.5024797394989090e+01 -9.9821109807410089e+00
+   17  2.6278917667758350e+01  1.2044754445814952e+01 -8.6402689896833103e+00
+   18 -6.0936815808025862e-04 -9.3774557532468582e-04 -3.3558072507805731e-04
+   19 -6.9919768291957119e-04 -3.6060777270430031e-03  4.2833405289822791e-03
+   20  4.7777805013736515e-03  5.1003745845520452e-03  1.8002873923729241e-03
+   21 -9.5568188553430398e-04  1.6594630943762931e-04 -1.8199788009966615e-04
+   22 -3.3137518957653462e-03 -2.8683968287936054e-03  3.6384389958326871e-03
+   23  2.4209481134686401e-04 -4.5457709985051130e-03  2.7663581642115042e-03
+   24  2.5447450568861086e-04  4.8412447786110117e-04 -4.8021914527341357e-04
+   25  4.3722771097312743e-03 -4.5184411669545515e-03  2.5200952006556795e-03
+   26 -1.9250110555001179e-03 -3.0342169883610837e-03  3.5062814567984532e-03
+   27 -2.6510179146429716e-04  3.6306203629019116e-04 -5.6235585400647747e-04
+   28 -2.3068708109787484e-04 -8.5663070212203200e-04  2.1302563179109169e-03
+   29 -2.5054744388303732e-03 -1.6773997805290820e-04  2.8436699761004796e-03
+...
diff --git a/unittest/force-styles/tests/fix-timestep-move_variable.yaml b/unittest/force-styles/tests/fix-timestep-move_variable.yaml
new file mode 100644
index 0000000000..22accc9d3c
--- /dev/null
+++ b/unittest/force-styles/tests/fix-timestep-move_variable.yaml
@@ -0,0 +1,80 @@
+---
+lammps_version: 24 Mar 2022
+date_generated: Sun Apr  3 03:16:48 2022
+epsilon: 2e-14
+skip_tests:
+prerequisites: ! |
+  atom full
+  fix move
+pre_commands: ! |
+  variable vx equal 1.0
+  variable x  equal vdisplace(0.0,v_vx)
+  variable vy equal 0.0
+  variable y  equal 0.0
+  variable vz equal 0.0
+  variable z  equal 0.0
+post_commands: ! |
+  fix test solute move variable v_x v_y v_z v_vx v_vy v_vz
+input_file: in.fourmol
+natoms: 29
+run_pos: ! |2
+    1  1.7200631633077317e+00  2.4726588069312840e+00 -1.7200860244148433e-01
+    2  2.3019708395540222e+00  2.9515239068888608e+00 -8.5689735572907566e-01
+    3  1.3056462211944140e+00  1.2440473127136711e+00 -6.2233801468892025e-01
+    4  4.2283858353148673e-01  1.4915333140468066e+00 -1.2487126845040522e+00
+    5  1.1049823864064074e+00  9.3568128743071344e-01  4.0227731871484346e-01
+    6  2.2941260793770599e+00  2.2719282656652909e-01 -1.2843094067857870e+00
+    7  2.3401987106287963e+00 -9.1277350075786561e-03 -2.4633113224304561e+00
+    8  3.1641187171852803e+00 -4.8375305955385234e-01 -6.7659823767368688e-01
+    9  3.3777459838125838e+00 -2.5366338669522998e-01  2.6877644730326306e-01
+   10  4.0185283555536984e+00 -1.4283966846517357e+00 -9.6733527271133024e-01
+   11  3.7929780509347664e+00 -1.9871047540768743e+00 -1.8840626643185674e+00
+   12  5.0030247876861225e+00 -4.8923319967572748e-01 -1.6188658531537248e+00
+   13  6.0447273787895934e+00 -9.0131998547446246e-01 -1.6384447268320836e+00
+   14  4.6033152817257079e+00 -4.0789761505963579e-01 -2.6554413538823063e+00
+   15  4.9756315249791303e+00  5.6334269722969288e-01 -1.2437650754599008e+00
+   16  4.6517554244980310e+00 -2.3957110424978438e+00  3.2908335999178327e-02
+   17  4.2309964792710639e+00 -2.1022918943319384e+00  1.1491948328949437e+00
+   18  2.1384791188033843e+00  3.0177261773770208e+00 -3.5160827596876225e+00
+   19  1.5349125211132961e+00  2.6315969880333707e+00 -4.2472859440220647e+00
+   20  2.7641167828863153e+00  3.6833419064000221e+00 -3.9380850623312638e+00
+   21  4.9064454390208301e+00 -4.0751205255383196e+00 -3.6215576073601046e+00
+   22  4.3687453488627543e+00 -4.2054270536772504e+00 -4.4651491269372565e+00
+   23  5.7374928154769504e+00 -3.5763355905184966e+00 -3.8820297194230728e+00
+   24  2.0684115301174013e+00  3.1518221747664397e+00  3.1554242678474576e+00
+   25  1.2998381073113014e+00  3.2755513587518097e+00  2.5092990173114837e+00
+   26  2.5807438597688113e+00  4.0120175892854135e+00  3.2133398379059099e+00
+   27 -1.9613581876744359e+00 -4.3556300596085160e+00  2.1101467673534788e+00
+   28 -2.7406520384725965e+00 -4.0207251278130975e+00  1.5828689861678511e+00
+   29 -1.3108232656499081e+00 -3.5992986322410760e+00  2.2680459788743503e+00
+run_vel: ! |2
+    1  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    2  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    3  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    4  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    5  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    6  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    7  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    8  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+    9  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   10  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   11  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   12  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   13  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   14  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   15  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   16  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   17  1.0000000000000000e+00  0.0000000000000000e+00  0.0000000000000000e+00
+   18 -6.0936815808025862e-04 -9.3774557532468582e-04 -3.3558072507805731e-04
+   19 -6.9919768291957119e-04 -3.6060777270430031e-03  4.2833405289822791e-03
+   20  4.7777805013736515e-03  5.1003745845520452e-03  1.8002873923729241e-03
+   21 -9.5568188553430398e-04  1.6594630943762931e-04 -1.8199788009966615e-04
+   22 -3.3137518957653462e-03 -2.8683968287936054e-03  3.6384389958326871e-03
+   23  2.4209481134686401e-04 -4.5457709985051130e-03  2.7663581642115042e-03
+   24  2.5447450568861086e-04  4.8412447786110117e-04 -4.8021914527341357e-04
+   25  4.3722771097312743e-03 -4.5184411669545515e-03  2.5200952006556795e-03
+   26 -1.9250110555001179e-03 -3.0342169883610837e-03  3.5062814567984532e-03
+   27 -2.6510179146429716e-04  3.6306203629019116e-04 -5.6235585400647747e-04
+   28 -2.3068708109787484e-04 -8.5663070212203200e-04  2.1302563179109169e-03
+   29 -2.5054744388303732e-03 -1.6773997805290820e-04  2.8436699761004796e-03
+...
diff --git a/unittest/force-styles/tests/fix-timestep-move_wiggle.yaml b/unittest/force-styles/tests/fix-timestep-move_wiggle.yaml
new file mode 100644
index 0000000000..c07f0634d6
--- /dev/null
+++ b/unittest/force-styles/tests/fix-timestep-move_wiggle.yaml
@@ -0,0 +1,74 @@
+---
+lammps_version: 24 Mar 2022
+date_generated: Sun Apr  3 03:06:24 2022
+epsilon: 2e-14
+skip_tests:
+prerequisites: ! |
+  atom full
+  fix move
+pre_commands: ! ""
+post_commands: ! |
+  fix test solute move wiggle 1.0 0.5 -1.0 1.0
+input_file: in.fourmol
+natoms: 29
+run_pos: ! |2
+    1 -2.7993683669226882e-01  2.4726588069312836e+00 -1.7200860244148383e-01
+    2  3.0197083955402154e-01  2.9515239068888603e+00 -8.5689735572907522e-01
+    3 -6.9435377880558646e-01  1.2440473127136709e+00 -6.2233801468891981e-01
+    4 -1.5771614164685137e+00  1.4915333140468063e+00 -1.2487126845040517e+00
+    5 -8.9501761359359300e-01  9.3568128743071322e-01  4.0227731871484396e-01
+    6  2.9412607937705959e-01  2.2719282656652884e-01 -1.2843094067857865e+00
+    7  3.4019871062879559e-01 -9.1277350075789007e-03 -2.4633113224304557e+00
+    8  1.1641187171852800e+00 -4.8375305955385256e-01 -6.7659823767368643e-01
+    9  1.3777459838125834e+00 -2.5366338669523020e-01  2.6877644730326355e-01
+   10  2.0185283555536984e+00 -1.4283966846517360e+00 -9.6733527271132980e-01
+   11  1.7929780509347661e+00 -1.9871047540768745e+00 -1.8840626643185669e+00
+   12  3.0030247876861220e+00 -4.8923319967572770e-01 -1.6188658531537243e+00
+   13  4.0447273787895925e+00 -9.0131998547446268e-01 -1.6384447268320832e+00
+   14  2.6033152817257070e+00 -4.0789761505963601e-01 -2.6554413538823058e+00
+   15  2.9756315249791299e+00  5.6334269722969266e-01 -1.2437650754599003e+00
+   16  2.6517554244980301e+00 -2.3957110424978443e+00  3.2908335999178820e-02
+   17  2.2309964792710635e+00 -2.1022918943319389e+00  1.1491948328949442e+00
+   18  2.1384791188033843e+00  3.0177261773770208e+00 -3.5160827596876225e+00
+   19  1.5349125211132961e+00  2.6315969880333707e+00 -4.2472859440220647e+00
+   20  2.7641167828863153e+00  3.6833419064000221e+00 -3.9380850623312638e+00
+   21  4.9064454390208301e+00 -4.0751205255383196e+00 -3.6215576073601046e+00
+   22  4.3687453488627543e+00 -4.2054270536772504e+00 -4.4651491269372565e+00
+   23  5.7374928154769504e+00 -3.5763355905184966e+00 -3.8820297194230728e+00
+   24  2.0684115301174013e+00  3.1518221747664397e+00  3.1554242678474576e+00
+   25  1.2998381073113014e+00  3.2755513587518097e+00  2.5092990173114837e+00
+   26  2.5807438597688113e+00  4.0120175892854135e+00  3.2133398379059099e+00
+   27 -1.9613581876744359e+00 -4.3556300596085160e+00  2.1101467673534788e+00
+   28 -2.7406520384725965e+00 -4.0207251278130975e+00  1.5828689861678511e+00
+   29 -1.3108232656499081e+00 -3.5992986322410760e+00  2.2680459788743503e+00
+run_vel: ! |2
+    1  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    2  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    3  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    4  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    5  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    6  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    7  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    8  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+    9  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   10  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   11  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   12  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   13  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   14  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   15  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   16  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   17  6.2831853071795862e+00  3.1415926535897931e+00 -6.2831853071795862e+00
+   18 -6.0936815808025862e-04 -9.3774557532468582e-04 -3.3558072507805731e-04
+   19 -6.9919768291957119e-04 -3.6060777270430031e-03  4.2833405289822791e-03
+   20  4.7777805013736515e-03  5.1003745845520452e-03  1.8002873923729241e-03
+   21 -9.5568188553430398e-04  1.6594630943762931e-04 -1.8199788009966615e-04
+   22 -3.3137518957653462e-03 -2.8683968287936054e-03  3.6384389958326871e-03
+   23  2.4209481134686401e-04 -4.5457709985051130e-03  2.7663581642115042e-03
+   24  2.5447450568861086e-04  4.8412447786110117e-04 -4.8021914527341357e-04
+   25  4.3722771097312743e-03 -4.5184411669545515e-03  2.5200952006556795e-03
+   26 -1.9250110555001179e-03 -3.0342169883610837e-03  3.5062814567984532e-03
+   27 -2.6510179146429716e-04  3.6306203629019116e-04 -5.6235585400647747e-04
+   28 -2.3068708109787484e-04 -8.5663070212203200e-04  2.1302563179109169e-03
+   29 -2.5054744388303732e-03 -1.6773997805290820e-04  2.8436699761004796e-03
+...