Compare commits


70 Commits

Author SHA1 Message Date
f6c76e04b8 patch 16Mar18 2018-03-19 08:26:58 -06:00
3befd4b603 Merge pull request #843 from akohlmey/whitespace-cleanup
Whitespace cleanup for stable release
2018-03-16 14:44:30 -06:00
e9ac8ba01e cleanup embedded or trailing tabs 2018-03-16 13:21:54 -04:00
59dbb49cf9 remove trailing whitespace 2018-03-16 12:37:27 -04:00
ee862d8bf5 replace leading tabs 2018-03-16 12:34:33 -04:00
fc3de22c17 Merge pull request #841 from stanmoore1/compiler_warnings
Fix compiler warnings
2018-03-16 09:26:59 -06:00
ab914a9220 Merge pull request #840 from akohlmey/collected-small-fixes
Collected small fixes for stable release
2018-03-16 09:25:59 -06:00
7c300eebd5 Merge pull request #837 from akohlmey/reaxff-bugfix-from-scm
reaxff corrected bond order bugfix
2018-03-16 09:25:38 -06:00
94a923191a more whitespace cleanup 2018-03-15 22:02:02 -04:00
7d2ada9d80 whitespace cleanup 2018-03-15 21:57:45 -04:00
15a9600569 Fix compiler warnings 2018-03-14 13:27:03 -06:00
d62534665f correct potential out-of-bounds memory access 2018-03-14 12:11:58 -04:00
d00908ea3e whitespace cleanup 2018-03-13 23:02:55 -04:00
6965307250 print warning when "compress yes" is ignored with delete_atoms 2018-03-13 22:58:39 -04:00
d9c6278844 Merge pull request #838 from zozo123/replaced-strcmpi-with-strncmpi-to-limit-number-of-chars-compared
Tools/Matlab: Allow reading LAMMPS output fields
2018-03-12 16:39:35 -06:00
821b18641d kokkos version of reaxff corrected bond order bugfix from Tomáš Trnka trnka@scm.com posted on lammps-users 2018-03-12 16:58:03 -04:00
ce4ffe5933 Merge pull request #833 from stanmoore1/kk_update_2.6
Update Kokkos library to v2.6.00
2018-03-12 13:51:33 -06:00
9c3296aad2 Tools/Matlab: Allow reading LAMMPS output fields
Some output fields have attributes attached on the same
line, e.g. "ITEM: BOX BOUNDS pp pp pp". This patch replaces all
strcmpi calls with strncmpi in order to limit the number of
characters compared with LAMMPS output.

Signed-off-by: Yossi Eliaz <eliaz123@gmail.com>
2018-03-12 13:45:13 -05:00
b2c8c40204 reaxff corrected bond order bugfix from Tomáš Trnka trnka@scm.com posted on lammps-users 2018-03-12 12:15:37 -04:00
25c46593ee protect OpenMP header include with ifdefs 2018-03-12 11:56:54 -04:00
35abbab966 Merge pull request #835 from junghans/fix_python
lammps.py: inconsistent use of tabs and spaces in indentation
2018-03-09 08:42:15 -07:00
d358e886c5 Merge pull request #834 from akohlmey/new-reax-logs
provide new reference outputs for various reaxff examples
2018-03-09 08:41:44 -07:00
62d446668c lammps.py: inconsistent use of tabs and spaces in indentation 2018-03-08 16:23:44 -07:00
fcfbdb13ab provide new reference outputs for various reaxff examples 2018-03-08 18:10:28 -05:00
39786b1740 Update Kokkos library to r2.6.00 2018-03-08 10:57:08 -07:00
0c4c002f34 patch 8Mar18 2018-03-08 08:19:46 -07:00
bad1cdde78 Merge pull request #831 from lammps/mpi4py-version
allow for mpi4py version 2 or 3 in Python wrapper
2018-03-07 14:01:22 -07:00
626ca25d05 Merge pull request #830 from akohlmey/more-fixes-for-stable
More small fixes for stable release
2018-03-07 14:00:32 -07:00
a1bb877d55 correct commented out MPI examples 2018-03-07 11:17:03 -05:00
63c0a35fab remove code that has no effect 2018-03-07 11:12:08 -05:00
812572ea97 update dependencies for colvars library 2018-03-07 10:57:56 -05:00
f91c36878c remove dead code 2018-03-07 10:57:07 -05:00
fd1edaf04f allow for mpi4py version 2 or 3 in Python wrapper 2018-03-07 08:52:46 -07:00
47e2ca6eb2 apply bugfix to reaxff taper function as described in issue #828 2018-03-07 09:52:14 -05:00
8d6fbd9829 Merge pull request #829 from lammps/restartinfo
add restartinfo=0 to manybody file it was missing from
2018-03-06 15:17:05 -07:00
070e85b44b add restartinfo=0 to manybody file it was missing from 2018-03-06 12:45:40 -07:00
3e535633e6 Merge pull request #827 from akohlmey/fixes-for-stable
Fixes for stable release
2018-03-06 12:33:33 -07:00
64779eb576 documentation update for MEAM to clarify the I,J,K indices in the MEAM parameter file 2018-03-06 13:21:34 -05:00
1ca928b331 dead code removal 2018-03-05 20:33:19 -05:00
a1bdea1dd8 avoid division by zero for pair styles meam and meam/c 2018-03-05 14:03:10 -05:00
45555b017d Merge pull request #728 from danicholson/cluster-fragment-aggregate-fixes
Cluster/fragment/aggregate bugfixes
2018-03-02 15:52:26 -07:00
54f58faab5 Merge pull request #822 from andeplane/gcmc_mpi_error
Added error if gcmc is used with molecules on more than one processor
2018-03-02 14:41:03 -07:00
22b6764304 Merge pull request #819 from stanmoore1/package_installed
Add make package-installed command
2018-03-02 14:40:36 -07:00
39a09d3a54 Merge pull request #814 from stanmoore1/kk_snap_workaround
Workaround issue in pair_snap_kokkos
2018-03-02 14:40:20 -07:00
812a45451a Merge pull request #816 from giacomofiorin/colvars-update-2018-02-23
Collected fixes and updates to Colvars library
2018-03-02 13:15:56 -07:00
0666607ceb Merge pull request #815 from akohlmey/collected-small-fixes
Collected small cleanups, fixes, and enhancements
2018-03-02 13:15:35 -07:00
d18ba3b188 Added error if gcmc is used with molecules on more than one processor 2018-03-02 11:23:34 -08:00
b1d3b56a17 apply bugfix reported in issue #820 2018-03-02 04:33:13 -05:00
8d0fdb17a6 Add make package-installed command 2018-03-01 10:39:06 -07:00
eadac15466 avoid multiple calls to delete [] on the same pointer.
thanks to @ExHP for pointing out this issue
2018-02-28 14:02:16 +01:00
58e01a9eee plug memory leak in pair style lj/class2/coul/long with coulomb tables 2018-02-25 14:03:07 +01:00
5fb2979da7 allow dynamic groups for some standard walls interacting with point particles 2018-02-24 13:50:42 -05:00
948f4783aa ring communication *is* called with outbuf set to NULL, so don't error out on that. 2018-02-24 17:17:45 +01:00
fb6e7e8aea add sanity checks for ring communication
we do not call memcpy() unless nbytes != 0 and the source/target pointers are not NULL
we error out on illegal combinations of nbytes and inbuf/outbuf
2018-02-24 16:41:10 +01:00
bba4bd1489 support offsets for molecule IDs (if available) in read_data, similar to atom IDs
suggested by Felipe Perez in https://sourceforge.net/p/lammps/mailman/message/36236631/
2018-02-23 18:02:05 -05:00
4a875dc67d Workaround for compiler bug in gcc v4.9.3, manifesting in KOKKOS SNAP 2018-02-23 09:01:34 -07:00
f3cf407a21 Collected fixes and updates to Colvars library
This commit includes several fixes to moving restraints; also added is support
for runtime integration of 2D and 3D PMFs from ABF.

Mostly changes to existing member functions, with a few additions in classes not
directly accessible by LAMMPS.  Also removed are calls to std::pow(), replaced
by a copy of MathSpecial::powint().

Relevant commits in Colvars repository:

7307b5c 2017-12-14 Doc improvements [Giacomo Fiorin]
7f86f37 2017-12-14 Allow K-changing restraints computing accumulated work; fix staged-k TI estimator [Giacomo Fiorin]
7c1c175 2017-12-14 Fix 1D ABF trying to do pABF [Jérôme Hénin]
b94aa7e 2017-11-16 Unify PMF output for 1D, 2D and 3D in ABF [Jérôme Hénin]
771a88f 2017-11-15 Poisson integration for all BC in 2d and 3d [Jérôme Hénin]
6af4d60 2017-12-01 Print message when issuing cv delete in VMD [Giacomo Fiorin]
4413972 2017-11-30 Check for homogeneous colvar to set it periodic [Jérôme Hénin]
95fe4b2 2017-11-06 Allow abf_integrate to start in bin with 1 sample [Jérôme Hénin]
06eea27 2017-10-23 Shorten a few constructs by using the power function [Giacomo Fiorin]
3165dfb 2017-10-20 Move includes of colvarproxy.h from headers to files [Giacomo Fiorin]
32a867b 2017-10-20 Add optimized powint function from LAMMPS headers [Giacomo Fiorin]
3ad070a 2017-10-20 Remove some unused includes, isolate calls to std::pow() [Giacomo Fiorin]
0aaf540 2017-10-20 Replace all calls to std::pow() where the exponent is not an integer [Giacomo Fiorin]
2018-02-23 08:34:53 -05:00
0003bb6766 merge capture regions, so the library interface code can be compiled with exceptions 2018-02-23 14:20:39 +01:00
523978b4c7 dead code and uninitialized variables detected by clang 2018-02-23 12:04:15 +01:00
939b1b2d05 Workaround issue in pair_snap_kokkos_impl 2018-02-22 14:27:23 -07:00
77efd3dfb3 Merge pull request #813 from akohlmey/correct-neighbor-build
Make default argument for virtual method Neighbor::build() explicit
2018-02-22 08:48:06 -07:00
feb9f29fad Merge pull request #812 from akohlmey/correct-integrate-setup
Make default argument for pure method Integrate::setup() explicit
2018-02-22 08:47:45 -07:00
99d5957a01 make default argument of virtual function Neighbor::build() explicit 2018-02-22 08:42:36 -05:00
65acd233ce forgot to remove one default argument on a method derived from Integrate::setup() 2018-02-22 08:13:54 -05:00
cf3887c5e0 default arguments on polymorphic/pure methods can lead to unexpected overloading in derived classes
argument for Integrate::setup() made explicit
2018-02-22 07:53:58 -05:00
e55c90cc44 Moved rerun bug fix to individual affected styles 2017-11-14 14:01:07 -05:00
751465aad3 Merge branch 'master' into cluster-fragment-aggregate-fixes 2017-11-13 14:32:26 -05:00
a085ee0c55 Always build occasional lists on first step 2017-11-13 04:53:16 -05:00
c16b7a3273 Multiple run fix for cluster/aggregate computes 2017-11-12 15:57:53 -05:00
858065029d Reverse communication compute fragment/aggregate 2017-11-12 15:57:02 -05:00
1270 changed files with 31370 additions and 23749 deletions

View File

@@ -1,7 +1,7 @@
<!-- HTML_ONLY -->
<HEAD>
<TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="22 Feb 2018 version">
<META NAME="docnumber" CONTENT="16 Mar 2018 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
</HEAD>
@@ -21,7 +21,7 @@
<H1></H1>
LAMMPS Documentation :c,h3
22 Feb 2018 version :c,h4
16 Mar 2018 version :c,h4
Version info: :h4

View File

@@ -803,6 +803,10 @@ currently installed. For those that are installed, it will list any
files that are different in the src directory and package
sub-directory.
Typing "make package-installed" or "make pi" will list which packages are
currently installed, without listing the status of packages that are not
installed.
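A usage sketch of the short forms described above (assumed to be run from the LAMMPS src directory):
make pi     # same as "make package-installed": list only installed packages
make pu     # same as "make package-update": refresh src files of installed packages :pre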
Typing "make package-update" or "make pu" will overwrite src files
with files from the package sub-directories if the package is
installed. It should be used after a patch has been applied, since

View File

@@ -77,7 +77,7 @@ See the "pair_coeff"_pair_coeff.html doc page for alternate ways
to specify the path for the potential files.
As an example, the potentials/library.meam file has generic MEAM
settings for a variety of elements. The potentials/sic.meam file has
settings for a variety of elements. The potentials/SiC.meam file has
specific parameter settings for a Si and C alloy system. If your
LAMMPS simulation has 4 atom types and you want the 1st 3 to be Si,
and the 4th to be C, you would use the following pair_coeff command:
@@ -105,6 +105,15 @@ This can be used when a {meam} potential is used as part of the
{hybrid} pair style. The NULL values are placeholders for atom types
that will be used with other potentials.
NOTE: If the 2nd filename is NULL, the element names between the two
filenames can appear in any order, e.g. "Si C" or "C Si" in the
example above. However, if the 2nd filename is not NULL (as in the
example above), it contains settings that are Fortran-indexed for the
elements that precede it. Thus you need to ensure you list the
elements between the filenames in an order consistent with how the
values in the 2nd filename are indexed. See details below on the
syntax for settings in the 2nd file.
The MEAM library file provided with LAMMPS has the name
potentials/library.meam. It is the "meamf" file used by other MD
codes. Aside from blank and comment lines (start with #) which can
@@ -164,7 +173,15 @@ keyword(I) = value
keyword(I,J) = value
keyword(I,J,K) = value :pre
The recognized keywords are as follows:
The indices I, J, K correspond to the elements selected from the
MEAM library file, numbered in the order in which those elements
were selected, starting from 1. Thus for the example given below
pair_coeff * * library.meam Si C sic.meam Si Si Si C :pre
an index of 1 would refer to Si and an index of 2 to C.
The recognized keywords for the parameter file are as follows:
Ec, alpha, rho0, delta, lattce, attrac, repuls, nn2, Cmin, Cmax, rc, delr,
augt1, gsmooth_factor, re
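To make the indexing concrete, a hedged illustration (the numeric value is invented): with the pair_coeff command above, index 1 maps to Si and index 2 to C, so a parameter file line
Cmin(1,1,2) = 2.0 :pre
would assign that Cmin value to the element triple (Si, Si, C).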

View File

@@ -15,10 +15,11 @@ read_data file keyword args ... :pre
file = name of data file to read in :ulb,l
zero or more keyword/arg pairs may be appended :l
keyword = {add} or {offset} or {shift} or {extra/atom/types} or {extra/bond/types} or {extra/angle/types} or {extra/dihedral/types} or {extra/improper/types} or {extra/bond/per/atom} or {extra/angle/per/atom} or {extra/dihedral/per/atom} or {extra/improper/per/atom} or {group} or {nocoeff} or {fix} :l
{add} arg = {append} or {Nstart} or {merge}
append = add new atoms with IDs appended to current IDs
Nstart = add new atoms with IDs starting with Nstart
merge = add new atoms with their IDs unchanged
{add} arg = {append} or {IDoffset} or {IDoffset MOLoffset} or {merge}
append = add new atoms with atom IDs appended to current IDs
IDoffset = add new atoms with atom IDs having IDoffset added
MOLoffset = add new atoms with molecule IDs having MOLoffset added (only when molecule IDs are enabled)
merge = add new atoms with their atom IDs (and molecule IDs) unchanged
{offset} args = toff boff aoff doff ioff
toff = offset to add to atom types
boff = offset to add to bond types
@@ -120,20 +121,26 @@ boundary, then the atoms may become far apart if the box size grows.
This will separate the atoms in the bond, which can lead to "lost"
bond atoms or bad dynamics.
The three choices for the {add} argument affect how the IDs of atoms
in the data file are treated. If {append} is specified, atoms in the
data file are added to the current system, with their atom IDs reset
so that an atomID = M in the data file becomes atomID = N+M, where N
is the largest atom ID in the current system. This rule is applied to
all occurrences of atom IDs in the data file, e.g. in the Velocity or
Bonds section. If {Nstart} is specified, it is a numeric
value, e.g. 1000, so that an atomID = M in the data file
becomes atomID = 1000+M. If {merge} is specified, the data file atoms
The three choices for the {add} argument affect how the atom IDs and
molecule IDs of atoms in the data file are treated. If {append} is
specified, atoms in the data file are added to the current system,
with their atom IDs reset so that an atomID = M in the data file
becomes atomID = N+M, where N is the largest atom ID in the current
system. This rule is applied to all occurrences of atom IDs in the
data file, e.g. in the Velocity or Bonds section. This is also done
for molecule IDs, if the atom style supports molecule IDs or they
are enabled via fix property/atom. If {IDoffset} is specified, it is
a numeric value, e.g. 1000, so that an atomID = M in the data file
becomes atomID = 1000+M. For systems with molecule IDs enabled,
another numeric argument {MOLoffset} is required, representing the
equivalent offset for molecule IDs.
If {merge} is specified, the data file atoms
are added to the current system without changing their IDs. They are
assumed to merge (without duplication) with the currently defined
atoms. It is up to you to ensure there are no multiply defined atom
IDs, as LAMMPS only performs an incomplete check that this is the case
by ensuring the resulting max atomID >= the number of atoms.
by ensuring the resulting max atomID >= the number of atoms. For
molecule IDs, no check is done at all.
The {offset} and {shift} keywords can only be used if the {add}
keyword is also specified.
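As an illustrative sketch of the {IDoffset MOLoffset} form described above (the file name and all numeric values are invented):
read_data data.extra add 1000 50 offset 2 1 1 1 1 :pre
This reads data.extra into the current system, adds 1000 to every atom ID and 50 to every molecule ID (assuming molecule IDs are enabled), and offsets atom types by 2 and bond/angle/dihedral/improper types by 1 each.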

View File

@@ -1,70 +0,0 @@
LAMMPS (5 Oct 2016)
# REAX potential for Nitroamines system
# .....
units real
atom_style charge
read_data data.AB
orthogonal box = (0 0 0) to (25 25 25)
1 by 2 by 2 MPI processor grid
reading atoms ...
104 atoms
pair_style reax/c lmp_control
pair_coeff * * ffield.reax.AB H B N
Reading potential file ffield.reax.AB with DATE: 2011-02-18
neighbor 2 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 param.qeq
fix 3 all temp/berendsen 500.0 500.0 100.0
timestep 0.25
#dump 1 all atom 30 dump.reax.ab
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 12.622 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -8505.1816 0 -8505.1816 -673.36566
3000 496.56561 -8405.3755 0 -8252.9182 472.58916
Loop time of 7.23109 on 4 procs for 3000 steps with 104 atoms
Performance: 8.961 ns/day, 2.678 hours/ns, 414.875 timesteps/s
99.4% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 5.705 | 5.7262 | 5.7504 | 0.7 | 79.19
Neigh | 0.14367 | 0.15976 | 0.16805 | 2.4 | 2.21
Comm | 0.053353 | 0.077311 | 0.097821 | 5.7 | 1.07
Output | 1.812e-05 | 1.9848e-05 | 2.408e-05 | 0.1 | 0.00
Modify | 1.2559 | 1.2647 | 1.2818 | 0.9 | 17.49
Other | | 0.003126 | | | 0.04
Nlocal: 26 ave 35 max 13 min
Histogram: 1 0 0 0 0 1 0 0 1 1
Nghost: 421 ave 450 max 377 min
Histogram: 1 0 0 0 0 1 0 0 1 1
Neighs: 847.25 ave 1149 max 444 min
Histogram: 1 0 0 0 1 0 0 0 1 1
Total # of neighbors = 3389
Ave neighs/atom = 32.5865
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:07

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for Nitroamines system
# .....
@@ -28,43 +29,53 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 18.4119 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 19.3 | 19.3 | 19.3 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -8505.1816 0 -8505.1816 -673.36566
3000 499.30579 -8405.1387 0 -8251.8401 -94.844317
Loop time of 12.5114 on 1 procs for 3000 steps with 104 atoms
3000 478.18595 -8398.4168 0 -8251.6025 1452.6935
Loop time of 14.3573 on 1 procs for 3000 steps with 104 atoms
Performance: 5.179 ns/day, 4.634 hours/ns, 239.782 timesteps/s
99.3% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 4.513 ns/day, 5.318 hours/ns, 208.952 timesteps/s
96.6% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 11.137 | 11.137 | 11.137 | 0.0 | 89.01
Neigh | 0.29816 | 0.29816 | 0.29816 | 0.0 | 2.38
Comm | 0.016993 | 0.016993 | 0.016993 | 0.0 | 0.14
Output | 1.1921e-05 | 1.1921e-05 | 1.1921e-05 | 0.0 | 0.00
Modify | 1.0552 | 1.0552 | 1.0552 | 0.0 | 8.43
Other | | 0.004142 | | | 0.03
Pair | 12.709 | 12.709 | 12.709 | 0.0 | 88.52
Neigh | 0.36804 | 0.36804 | 0.36804 | 0.0 | 2.56
Comm | 0.022419 | 0.022419 | 0.022419 | 0.0 | 0.16
Output | 2.8133e-05 | 2.8133e-05 | 2.8133e-05 | 0.0 | 0.00
Modify | 1.2513 | 1.2513 | 1.2513 | 0.0 | 8.72
Other | | 0.006263 | | | 0.04
Nlocal: 104 ave 104 max 104 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 694 ave 694 max 694 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 2927 ave 2927 max 2927 min
Neighs: 2866 ave 2866 max 2866 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 2927
Ave neighs/atom = 28.1442
Total # of neighbors = 2866
Ave neighs/atom = 27.5577
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:12
Total wall time: 0:00:14

View File

@@ -0,0 +1,81 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for Nitroamines system
# .....
units real
atom_style charge
read_data data.AB
orthogonal box = (0 0 0) to (25 25 25)
1 by 2 by 2 MPI processor grid
reading atoms ...
104 atoms
pair_style reax/c lmp_control
pair_coeff * * ffield.reax.AB H B N
Reading potential file ffield.reax.AB with DATE: 2011-02-18
neighbor 2 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 param.qeq
fix 3 all temp/berendsen 500.0 500.0 100.0
timestep 0.25
#dump 1 all atom 30 dump.reax.ab
run 3000
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 12.38 | 13.22 | 13.64 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -8505.1816 0 -8505.1816 -673.36566
3000 555.17702 -8426.5541 0 -8256.1017 219.26856
Loop time of 9.03521 on 4 procs for 3000 steps with 104 atoms
Performance: 7.172 ns/day, 3.346 hours/ns, 332.034 timesteps/s
94.6% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 7.0347 | 7.0652 | 7.1049 | 1.0 | 78.20
Neigh | 0.18481 | 0.20727 | 0.22108 | 3.0 | 2.29
Comm | 0.075175 | 0.11496 | 0.14517 | 7.4 | 1.27
Output | 2.2888e-05 | 2.569e-05 | 3.1948e-05 | 0.0 | 0.00
Modify | 1.6286 | 1.6421 | 1.6649 | 1.1 | 18.17
Other | | 0.005646 | | | 0.06
Nlocal: 26 ave 35 max 13 min
Histogram: 1 0 0 0 0 1 0 0 1 1
Nghost: 420.25 ave 454 max 370 min
Histogram: 1 0 0 0 0 1 0 0 1 1
Neighs: 862.5 ave 1178 max 444 min
Histogram: 1 0 0 0 1 0 0 0 1 1
Total # of neighbors = 3450
Ave neighs/atom = 33.1731
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:09

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for AuO system
# .....
@@ -28,30 +29,40 @@ timestep 0.25
run 100
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 4 5
Memory usage per processor = 144.382 Mbytes
binsize = 6, bins = 5 4 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 157.6 | 157.6 | 157.6 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -72201.743 0 -72201.743 -166.1947
100 69.043346 -72076.31 0 -71878.943 22702.308
Loop time of 17.7559 on 1 procs for 100 steps with 960 atoms
0 0 -72201.743 0 -72201.743 -166.19482
100 69.043331 -72076.309 0 -71878.942 22702.89
Loop time of 18.4369 on 1 procs for 100 steps with 960 atoms
Performance: 0.122 ns/day, 197.288 hours/ns, 5.632 timesteps/s
99.8% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 0.117 ns/day, 204.854 hours/ns, 5.424 timesteps/s
98.7% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 15.102 | 15.102 | 15.102 | 0.0 | 85.05
Neigh | 0.49358 | 0.49358 | 0.49358 | 0.0 | 2.78
Comm | 0.0067561 | 0.0067561 | 0.0067561 | 0.0 | 0.04
Output | 1.502e-05 | 1.502e-05 | 1.502e-05 | 0.0 | 0.00
Modify | 2.1525 | 2.1525 | 2.1525 | 0.0 | 12.12
Other | | 0.001267 | | | 0.01
Pair | 15.373 | 15.373 | 15.373 | 0.0 | 83.38
Neigh | 0.58774 | 0.58774 | 0.58774 | 0.0 | 3.19
Comm | 0.0079026 | 0.0079026 | 0.0079026 | 0.0 | 0.04
Output | 3.171e-05 | 3.171e-05 | 3.171e-05 | 0.0 | 0.00
Modify | 2.4665 | 2.4665 | 2.4665 | 0.0 | 13.38
Other | | 0.001366 | | | 0.01
Nlocal: 960 ave 960 max 960 min
Histogram: 1 0 0 0 0 0 0 0 0 0

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for AuO system
# .....
@@ -28,30 +29,40 @@ timestep 0.25
run 100
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 4 5
Memory usage per processor = 80.1039 Mbytes
binsize = 6, bins = 5 4 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 87.17 | 87.17 | 87.17 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -72201.743 0 -72201.743 -166.20356
100 69.043372 -72076.31 0 -71878.943 22701.855
Loop time of 7.66838 on 4 procs for 100 steps with 960 atoms
0 0 -72201.743 0 -72201.743 -166.2027
100 69.043379 -72076.31 0 -71878.943 22701.771
Loop time of 8.44797 on 4 procs for 100 steps with 960 atoms
Performance: 0.282 ns/day, 85.204 hours/ns, 13.041 timesteps/s
99.7% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 0.256 ns/day, 93.866 hours/ns, 11.837 timesteps/s
96.5% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 6.7833 | 6.7864 | 6.7951 | 0.2 | 88.50
Neigh | 0.2412 | 0.24206 | 0.24396 | 0.2 | 3.16
Comm | 0.010402 | 0.019419 | 0.022561 | 3.7 | 0.25
Output | 2.0981e-05 | 2.3007e-05 | 2.9087e-05 | 0.1 | 0.00
Modify | 0.61733 | 0.61964 | 0.62064 | 0.2 | 8.08
Other | | 0.0007888 | | | 0.01
Pair | 7.3702 | 7.3757 | 7.3879 | 0.3 | 87.31
Neigh | 0.28875 | 0.29449 | 0.29747 | 0.6 | 3.49
Comm | 0.015008 | 0.027055 | 0.032681 | 4.3 | 0.32
Output | 2.4319e-05 | 2.8551e-05 | 3.8624e-05 | 0.0 | 0.00
Modify | 0.74721 | 0.74985 | 0.75539 | 0.4 | 8.88
Other | | 0.0008975 | | | 0.01
Nlocal: 240 ave 240 max 240 min
Histogram: 4 0 0 0 0 0 0 0 0 0
@@ -67,4 +78,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:07
Total wall time: 0:00:08

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for CHO system
# .....
@@ -28,30 +29,40 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 17.7936 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 18.68 | 18.68 | 18.68 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10226.557 0 -10226.557 -106.09789
3000 548.72503 -10170.457 0 -10000.349 34.314945
Loop time of 11.5678 on 1 procs for 3000 steps with 105 atoms
0 0 -10226.557 0 -10226.557 -106.09755
3000 548.5116 -10170.389 0 -10000.348 40.372297
Loop time of 12.6046 on 1 procs for 3000 steps with 105 atoms
Performance: 5.602 ns/day, 4.284 hours/ns, 259.340 timesteps/s
99.3% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 5.141 ns/day, 4.668 hours/ns, 238.008 timesteps/s
98.9% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 10.111 | 10.111 | 10.111 | 0.0 | 87.41
Neigh | 0.27992 | 0.27992 | 0.27992 | 0.0 | 2.42
Comm | 0.01603 | 0.01603 | 0.01603 | 0.0 | 0.14
Output | 1.2159e-05 | 1.2159e-05 | 1.2159e-05 | 0.0 | 0.00
Modify | 1.1563 | 1.1563 | 1.1563 | 0.0 | 10.00
Other | | 0.004084 | | | 0.04
Pair | 10.931 | 10.931 | 10.931 | 0.0 | 86.72
Neigh | 0.33107 | 0.33107 | 0.33107 | 0.0 | 2.63
Comm | 0.017975 | 0.017975 | 0.017975 | 0.0 | 0.14
Output | 2.0742e-05 | 2.0742e-05 | 2.0742e-05 | 0.0 | 0.00
Modify | 1.3197 | 1.3197 | 1.3197 | 0.0 | 10.47
Other | | 0.005059 | | | 0.04
Nlocal: 105 ave 105 max 105 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@@ -67,4 +78,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:11
Total wall time: 0:00:12

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for CHO system
# .....
@@ -28,30 +29,40 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 12.9938 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 11.75 | 12.85 | 13.81 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10226.557 0 -10226.557 -106.0974
3000 547.91377 -10170.194 0 -10000.338 61.118402
Loop time of 6.51546 on 4 procs for 3000 steps with 105 atoms
0 0 -10226.557 0 -10226.557 -106.09745
3000 548.30567 -10170.323 0 -10000.346 47.794514
Loop time of 7.42367 on 4 procs for 3000 steps with 105 atoms
Performance: 9.946 ns/day, 2.413 hours/ns, 460.443 timesteps/s
99.1% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 8.729 ns/day, 2.750 hours/ns, 404.113 timesteps/s
97.7% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 4.9869 | 5.0615 | 5.1246 | 2.3 | 77.68
Neigh | 0.12213 | 0.14723 | 0.17304 | 5.5 | 2.26
Comm | 0.05189 | 0.11582 | 0.18932 | 15.4 | 1.78
Output | 1.812e-05 | 2.0564e-05 | 2.5988e-05 | 0.1 | 0.00
Modify | 1.1626 | 1.1878 | 1.2122 | 1.9 | 18.23
Other | | 0.003059 | | | 0.05
Pair | 5.3058 | 5.4086 | 5.4922 | 3.1 | 72.86
Neigh | 0.14791 | 0.17866 | 0.2106 | 6.5 | 2.41
Comm | 0.080185 | 0.16666 | 0.26933 | 17.7 | 2.24
Output | 2.5988e-05 | 2.8491e-05 | 3.4571e-05 | 0.0 | 0.00
Modify | 1.6364 | 1.6658 | 1.6941 | 2.0 | 22.44
Other | | 0.003964 | | | 0.05
Nlocal: 26.25 ave 45 max 6 min
Histogram: 1 0 1 0 0 0 0 0 1 1
@@ -67,4 +78,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:06
Total wall time: 0:00:07

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for Nitroamines system
# .....
@@ -29,13 +30,23 @@ thermo 1
dump 4 all xyz 5000 dumpnpt.xyz
run 10
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 28 27 17
Memory usage per processor = 440.212 Mbytes
binsize = 6, bins = 28 27 17
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 470 | 470 | 470 Mbytes
Step Temp E_pair TotEng Press
0 0 -808525.04 -808525.04 58194.694
1 4.9935726 -808803.89 -808546.69 58205.825
@@ -48,20 +59,20 @@ Step Temp E_pair TotEng Press
8 320.17692 -826387.27 -809896.43 58886.877
9 404.17073 -831129.48 -810312.5 59064.551
10 497.02486 -836425.19 -810825.72 59260.714
Loop time of 20.3094 on 1 procs for 10 steps with 17280 atoms
Loop time of 21.5054 on 1 procs for 10 steps with 17280 atoms
Performance: 0.009 ns/day, 2820.746 hours/ns, 0.492 timesteps/s
99.9% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 0.008 ns/day, 2986.857 hours/ns, 0.465 timesteps/s
98.8% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 18.124 | 18.124 | 18.124 | 0.0 | 89.24
Neigh | 0.072459 | 0.072459 | 0.072459 | 0.0 | 0.36
Comm | 0.00077629 | 0.00077629 | 0.00077629 | 0.0 | 0.00
Output | 0.00075412 | 0.00075412 | 0.00075412 | 0.0 | 0.00
Modify | 2.1109 | 2.1109 | 2.1109 | 0.0 | 10.39
Other | | 0.0005426 | | | 0.00
Pair | 19.008 | 19.008 | 19.008 | 0.0 | 88.39
Neigh | 0.084401 | 0.084401 | 0.084401 | 0.0 | 0.39
Comm | 0.00080419 | 0.00080419 | 0.00080419 | 0.0 | 0.00
Output | 0.00095367 | 0.00095367 | 0.00095367 | 0.0 | 0.00
Modify | 2.4109 | 2.4109 | 2.4109 | 0.0 | 11.21
Other | | 0.0004592 | | | 0.00
Nlocal: 17280 ave 17280 max 17280 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@@ -85,7 +96,7 @@ timestep 0.2
#dump 6 all custom 5000 dumpidtype.dat id type x y z
run 10
Memory usage per processor = 440.212 Mbytes
Per MPI rank memory allocation (min/avg/max) = 470 | 470 | 470 Mbytes
Step Temp E_pair TotEng Press
10 497.02486 -836425.19 -810825.72 59260.714
11 601.65141 -841814.22 -810825.91 59489.422
@@ -98,20 +109,20 @@ Step Temp E_pair TotEng Press
18 1623.072 -894534.04 -810937.04 61739.541
19 1812.1865 -904337.99 -811000.57 62200.561
20 2011.5899 -915379.19 -811771.41 63361.151
Loop time of 20.3051 on 1 procs for 10 steps with 17280 atoms
Loop time of 21.362 on 1 procs for 10 steps with 17280 atoms
Performance: 0.009 ns/day, 2820.155 hours/ns, 0.492 timesteps/s
99.9% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 0.008 ns/day, 2966.945 hours/ns, 0.468 timesteps/s
98.9% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 18.008 | 18.008 | 18.008 | 0.0 | 88.69
Neigh | 0.069963 | 0.069963 | 0.069963 | 0.0 | 0.34
Comm | 0.00077033 | 0.00077033 | 0.00077033 | 0.0 | 0.00
Output | 0.00077224 | 0.00077224 | 0.00077224 | 0.0 | 0.00
Modify | 2.225 | 2.225 | 2.225 | 0.0 | 10.96
Other | | 0.0005276 | | | 0.00
Pair | 18.793 | 18.793 | 18.793 | 0.0 | 87.97
Neigh | 0.077047 | 0.077047 | 0.077047 | 0.0 | 0.36
Comm | 0.00080276 | 0.00080276 | 0.00080276 | 0.0 | 0.00
Output | 0.0010097 | 0.0010097 | 0.0010097 | 0.0 | 0.00
Modify | 2.4897 | 2.4897 | 2.4897 | 0.0 | 11.65
Other | | 0.0004568 | | | 0.00
Nlocal: 17280 ave 17280 max 17280 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@@ -127,4 +138,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:45
Total wall time: 0:00:47

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for Nitroamines system
# .....
@@ -29,13 +30,23 @@ thermo 1
dump 4 all xyz 5000 dumpnpt.xyz
run 10
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 28 27 17
Memory usage per processor = 140.018 Mbytes
binsize = 6, bins = 28 27 17
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 149.3 | 149.3 | 149.3 Mbytes
Step Temp E_pair TotEng Press
0 0 -808525.04 -808525.04 58194.694
1 4.9935726 -808803.89 -808546.69 58205.825
@@ -48,20 +59,20 @@ Step Temp E_pair TotEng Press
8 320.17692 -826387.27 -809896.43 58886.877
9 404.17073 -831129.48 -810312.5 59064.551
10 497.02486 -836425.19 -810825.72 59260.714
Loop time of 5.47494 on 4 procs for 10 steps with 17280 atoms
Loop time of 6.02109 on 4 procs for 10 steps with 17280 atoms
Performance: 0.032 ns/day, 760.408 hours/ns, 1.827 timesteps/s
99.9% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 0.029 ns/day, 836.262 hours/ns, 1.661 timesteps/s
99.0% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 4.5958 | 4.7748 | 4.8852 | 5.4 | 87.21
Neigh | 0.021961 | 0.022104 | 0.022431 | 0.1 | 0.40
Comm | 0.0077388 | 0.11804 | 0.29694 | 34.2 | 2.16
Output | 0.00047708 | 0.00051123 | 0.0005939 | 0.2 | 0.01
Modify | 0.55906 | 0.55927 | 0.55946 | 0.0 | 10.22
Other | | 0.0002034 | | | 0.00
Pair | 4.9482 | 5.1186 | 5.3113 | 7.4 | 85.01
Neigh | 0.024811 | 0.025702 | 0.027556 | 0.7 | 0.43
Comm | 0.0027421 | 0.19541 | 0.36565 | 38.1 | 3.25
Output | 0.00053239 | 0.00057119 | 0.00067186 | 0.0 | 0.01
Modify | 0.67876 | 0.68059 | 0.68165 | 0.1 | 11.30
Other | | 0.0001779 | | | 0.00
Nlocal: 4320 ave 4320 max 4320 min
Histogram: 4 0 0 0 0 0 0 0 0 0
@@ -85,7 +96,7 @@ timestep 0.2
#dump 6 all custom 5000 dumpidtype.dat id type x y z
run 10
Memory usage per processor = 140.018 Mbytes
Per MPI rank memory allocation (min/avg/max) = 149.3 | 149.3 | 149.3 Mbytes
Step Temp E_pair TotEng Press
10 497.02486 -836425.19 -810825.72 59260.714
11 601.65141 -841814.22 -810825.91 59489.422
@@ -98,20 +109,20 @@ Step Temp E_pair TotEng Press
18 1623.072 -894534.04 -810937.04 61739.541
19 1812.1865 -904337.99 -811000.57 62200.561
20 2011.5899 -915379.19 -811771.41 63361.151
Loop time of 5.49026 on 4 procs for 10 steps with 17280 atoms
Loop time of 6.08805 on 4 procs for 10 steps with 17280 atoms
Performance: 0.031 ns/day, 762.536 hours/ns, 1.821 timesteps/s
99.9% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 0.028 ns/day, 845.563 hours/ns, 1.643 timesteps/s
99.2% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 4.5657 | 4.7603 | 4.8596 | 5.4 | 86.70
Neigh | 0.021023 | 0.021468 | 0.022176 | 0.3 | 0.39
Comm | 0.016467 | 0.1157 | 0.31031 | 34.7 | 2.11
Output | 0.00047684 | 0.00050694 | 0.00059295 | 0.2 | 0.01
Modify | 0.59135 | 0.59207 | 0.59251 | 0.1 | 10.78
Other | | 0.0001938 | | | 0.00
Pair | 4.9124 | 5.1008 | 5.3405 | 8.3 | 83.78
Neigh | 0.023652 | 0.024473 | 0.025996 | 0.6 | 0.40
Comm | 0.0020971 | 0.24171 | 0.43023 | 38.0 | 3.97
Output | 0.00056076 | 0.00060701 | 0.00072312 | 0.0 | 0.01
Modify | 0.71869 | 0.72023 | 0.72107 | 0.1 | 11.83
Other | | 0.0001827 | | | 0.00
Nlocal: 4320 ave 4320 max 4320 min
Histogram: 4 0 0 0 0 0 0 0 0 0
@@ -127,4 +138,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:12
Total wall time: 0:00:13

View File

@@ -1,6 +1,12 @@
# Pure HNS crystal, ReaxFF tests for benchmarking LAMMPS
# See README for more info
variable x index 2
variable y index 2
variable z index 2
variable t index 100
units real
atom_style charge
atom_modify sort 100 0.0 # optional
@@ -24,7 +30,7 @@ timestep 0.1
thermo_style custom step temp pe press evdwl ecoul vol
thermo_modify norm yes
thermo 100
thermo 10
velocity all create 300.0 41279 loop geom
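Because {x}, {y}, {z}, and {t} are {index}-style variables, they can be overridden from the command line instead of editing the script, e.g. (executable and input file names here are placeholders):
lmp -in in.script -var x 4 -var y 4 -var z 4 -var t 1000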

View File

@@ -0,0 +1,115 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# Pure HNS crystal, ReaxFF tests for benchmarking LAMMPS
# See README for more info
variable x index 2
variable y index 2
variable z index 2
variable t index 100
units real
atom_style charge
atom_modify sort 100 0.0 # optional
dimension 3
boundary p p p
box tilt large
read_data data.hns-equil
triclinic box = (0 0 0) to (22.326 11.1412 13.779) with tilt (0 -5.02603 0)
1 by 1 by 1 MPI processor grid
reading atoms ...
304 atoms
reading velocities ...
304 velocities
replicate $x $y $z bbox
replicate 2 $y $z bbox
replicate 2 2 $z bbox
replicate 2 2 2 bbox
triclinic box = (0 0 0) to (44.652 22.2824 27.5579) with tilt (0 -10.0521 0)
1 by 1 by 1 MPI processor grid
2432 atoms
Time spent = 0.000789404 secs
pair_style reax/c NULL
pair_coeff * * ffield.reax.hns C H O N
compute reax all pair reax/c
neighbor 1.0 bin
neigh_modify every 20 delay 0 check no
timestep 0.1
thermo_style custom step temp pe press evdwl ecoul vol
thermo_modify norm yes
thermo 10
velocity all create 300.0 41279 loop geom
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 reax/c
run 100
Neighbor list info ...
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 11
ghost atom cutoff = 11
binsize = 5.5, bins = 10 5 6
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 262.4 | 262.4 | 262.4 Mbytes
Step Temp PotEng Press E_vdwl E_coul Volume
0 300 -113.27833 437.52103 -111.57687 -1.7014647 27418.867
10 299.87174 -113.27778 2033.6337 -111.57645 -1.7013325 27418.867
20 300.81718 -113.28046 4817.5889 -111.57931 -1.7011463 27418.867
30 301.8622 -113.28323 8303.0039 -111.58237 -1.7008608 27418.867
40 302.4646 -113.28493 10519.459 -111.58446 -1.700467 27418.867
50 300.79064 -113.27989 10402.291 -111.57987 -1.7000218 27418.867
60 296.11534 -113.26599 7929.1348 -111.5664 -1.6995929 27418.867
70 291.73354 -113.25289 5071.5459 -111.5537 -1.6991916 27418.867
80 292.189 -113.25399 5667.0962 -111.55519 -1.6987993 27418.867
90 298.40792 -113.27253 7513.3806 -111.57409 -1.6984403 27418.867
100 303.58246 -113.28809 10017.879 -111.58991 -1.698177 27418.867
Loop time of 59.5461 on 1 procs for 100 steps with 2432 atoms
Performance: 0.015 ns/day, 1654.060 hours/ns, 1.679 timesteps/s
97.0% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 49.922 | 49.922 | 49.922 | 0.0 | 83.84
Neigh | 0.53154 | 0.53154 | 0.53154 | 0.0 | 0.89
Comm | 0.011399 | 0.011399 | 0.011399 | 0.0 | 0.02
Output | 0.00064397 | 0.00064397 | 0.00064397 | 0.0 | 0.00
Modify | 9.0782 | 9.0782 | 9.0782 | 0.0 | 15.25
Other | | 0.002116 | | | 0.00
Nlocal: 2432 ave 2432 max 2432 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 10687 ave 10687 max 10687 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 823977 ave 823977 max 823977 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 823977
Ave neighs/atom = 338.806
Neighbor list builds = 5
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:01:00

View File

@@ -0,0 +1,115 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# Pure HNS crystal, ReaxFF tests for benchmarking LAMMPS
# See README for more info
variable x index 2
variable y index 2
variable z index 2
variable t index 100
units real
atom_style charge
atom_modify sort 100 0.0 # optional
dimension 3
boundary p p p
box tilt large
read_data data.hns-equil
triclinic box = (0 0 0) to (22.326 11.1412 13.779) with tilt (0 -5.02603 0)
2 by 1 by 2 MPI processor grid
reading atoms ...
304 atoms
reading velocities ...
304 velocities
replicate $x $y $z bbox
replicate 2 $y $z bbox
replicate 2 2 $z bbox
replicate 2 2 2 bbox
triclinic box = (0 0 0) to (44.652 22.2824 27.5579) with tilt (0 -10.0521 0)
2 by 1 by 2 MPI processor grid
2432 atoms
Time spent = 0.000398397 secs
pair_style reax/c NULL
pair_coeff * * ffield.reax.hns C H O N
compute reax all pair reax/c
neighbor 1.0 bin
neigh_modify every 20 delay 0 check no
timestep 0.1
thermo_style custom step temp pe press evdwl ecoul vol
thermo_modify norm yes
thermo 10
velocity all create 300.0 41279 loop geom
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 reax/c
run 100
Neighbor list info ...
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 11
ghost atom cutoff = 11
binsize = 5.5, bins = 10 5 6
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 126.6 | 126.6 | 126.6 Mbytes
Step Temp PotEng Press E_vdwl E_coul Volume
0 300 -113.27833 437.52112 -111.57687 -1.7014647 27418.867
10 299.87174 -113.27778 2033.632 -111.57645 -1.7013325 27418.867
20 300.81719 -113.28046 4817.5761 -111.57931 -1.7011463 27418.867
30 301.8622 -113.28323 8302.9767 -111.58237 -1.7008609 27418.867
40 302.4646 -113.28493 10519.481 -111.58446 -1.700467 27418.867
50 300.79064 -113.27989 10402.312 -111.57987 -1.7000217 27418.867
60 296.11534 -113.26599 7929.1393 -111.5664 -1.6995929 27418.867
70 291.73354 -113.25289 5071.5368 -111.5537 -1.6991916 27418.867
80 292.18901 -113.25399 5667.1118 -111.55519 -1.6987993 27418.867
90 298.40793 -113.27253 7513.4029 -111.57409 -1.6984403 27418.867
100 303.58247 -113.28809 10017.892 -111.58991 -1.698177 27418.867
Loop time of 21.3933 on 4 procs for 100 steps with 2432 atoms
Performance: 0.040 ns/day, 594.257 hours/ns, 4.674 timesteps/s
97.6% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 14.863 | 16.367 | 18.027 | 28.6 | 76.51
Neigh | 0.23943 | 0.2422 | 0.24658 | 0.6 | 1.13
Comm | 0.024331 | 1.6845 | 3.189 | 89.2 | 7.87
Output | 0.00051165 | 0.00056899 | 0.00068665 | 0.0 | 0.00
Modify | 3.0933 | 3.0969 | 3.0999 | 0.1 | 14.48
Other | | 0.001784 | | | 0.01
Nlocal: 608 ave 608 max 608 min
Histogram: 4 0 0 0 0 0 0 0 0 0
Nghost: 5738.25 ave 5742 max 5734 min
Histogram: 1 1 0 0 0 0 0 0 0 2
Neighs: 231544 ave 231625 max 231466 min
Histogram: 2 0 0 0 0 0 0 0 0 2
Total # of neighbors = 926176
Ave neighs/atom = 380.829
Neighbor list builds = 5
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:21

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for high energy CHON systems
# .....
@@ -28,43 +29,53 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 18.1116 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 19 | 19 | 19 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10197.932 0 -10197.932 38.347492
3000 510.85923 -10091.694 0 -9933.3253 1668.5084
Loop time of 18.9088 on 1 procs for 3000 steps with 105 atoms
3000 510.63767 -10091.537 0 -9933.2374 1144.545
Loop time of 21.2931 on 1 procs for 3000 steps with 105 atoms
Performance: 3.427 ns/day, 7.003 hours/ns, 158.657 timesteps/s
99.5% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 3.043 ns/day, 7.886 hours/ns, 140.891 timesteps/s
97.6% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 17.724 | 17.724 | 17.724 | 0.0 | 93.73
Neigh | 0.27457 | 0.27457 | 0.27457 | 0.0 | 1.45
Comm | 0.015814 | 0.015814 | 0.015814 | 0.0 | 0.08
Output | 1.1921e-05 | 1.1921e-05 | 1.1921e-05 | 0.0 | 0.00
Modify | 0.89014 | 0.89014 | 0.89014 | 0.0 | 4.71
Other | | 0.004246 | | | 0.02
Pair | 19.887 | 19.887 | 19.887 | 0.0 | 93.40
Neigh | 0.33143 | 0.33143 | 0.33143 | 0.0 | 1.56
Comm | 0.02079 | 0.02079 | 0.02079 | 0.0 | 0.10
Output | 2.5272e-05 | 2.5272e-05 | 2.5272e-05 | 0.0 | 0.00
Modify | 1.0478 | 1.0478 | 1.0478 | 0.0 | 4.92
Other | | 0.006125 | | | 0.03
Nlocal: 105 ave 105 max 105 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 645 ave 645 max 645 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 3061 ave 3061 max 3061 min
Neighs: 3063 ave 3063 max 3063 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 3061
Ave neighs/atom = 29.1524
Total # of neighbors = 3063
Ave neighs/atom = 29.1714
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:19
Total wall time: 0:00:21

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for high energy CHON systems
# .....
@@ -28,43 +29,53 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 12.2102 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 12.14 | 13.04 | 13.9 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10197.932 0 -10197.932 38.347492
3000 504.05354 -10089.494 0 -9933.2351 868.32505
Loop time of 9.70759 on 4 procs for 3000 steps with 105 atoms
3000 509.89257 -10091.36 0 -9933.2916 1406.1215
Loop time of 10.8858 on 4 procs for 3000 steps with 105 atoms
Performance: 6.675 ns/day, 3.595 hours/ns, 309.037 timesteps/s
99.2% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 5.953 ns/day, 4.032 hours/ns, 275.588 timesteps/s
98.1% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 8.4621 | 8.5307 | 8.6001 | 1.9 | 87.88
Neigh | 0.12583 | 0.14931 | 0.17341 | 4.5 | 1.54
Comm | 0.053017 | 0.12311 | 0.19244 | 16.2 | 1.27
Output | 1.9073e-05 | 2.0802e-05 | 2.408e-05 | 0.0 | 0.00
Modify | 0.87638 | 0.9012 | 0.92557 | 1.9 | 9.28
Other | | 0.003213 | | | 0.03
Pair | 9.3081 | 9.4054 | 9.4994 | 2.6 | 86.40
Neigh | 0.15541 | 0.18258 | 0.2099 | 4.7 | 1.68
Comm | 0.070516 | 0.16621 | 0.26541 | 19.7 | 1.53
Output | 2.2173e-05 | 2.5153e-05 | 3.3855e-05 | 0.0 | 0.00
Modify | 1.0979 | 1.1272 | 1.1568 | 2.1 | 10.35
Other | | 0.004379 | | | 0.04
Nlocal: 26.25 ave 46 max 8 min
Histogram: 1 0 0 1 0 1 0 0 0 1
Nghost: 399.5 ave 512 max 288 min
Histogram: 1 0 0 1 0 0 1 0 0 1
Neighs: 1010.75 ave 1818 max 420 min
Neighs: 1011.25 ave 1819 max 420 min
Histogram: 1 0 1 1 0 0 0 0 0 1
Total # of neighbors = 4043
Ave neighs/atom = 38.5048
Total # of neighbors = 4045
Ave neighs/atom = 38.5238
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:10
Total wall time: 0:00:11

View File

@@ -1,70 +0,0 @@
LAMMPS (5 Oct 2016)
# REAX potential for VOH system
# .....
units real
atom_style charge
read_data data.VOH
orthogonal box = (0 0 0) to (25 25 25)
1 by 2 by 2 MPI processor grid
reading atoms ...
100 atoms
pair_style reax/c lmp_control
pair_coeff * * ffield.reax.V_O_C_H H C O V
Reading potential file ffield.reax.V_O_C_H with DATE: 2011-02-18
neighbor 2 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 param.qeq
fix 3 all temp/berendsen 500.0 500.0 100.0
timestep 0.25
#dump 1 all atom 30 dump.reax.voh
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 12.1769 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10246.825 0 -10246.825 42.256092
3000 518.1493 -10196.234 0 -10043.328 -334.5971
Loop time of 5.59178 on 4 procs for 3000 steps with 100 atoms
Performance: 11.588 ns/day, 2.071 hours/ns, 536.502 timesteps/s
99.1% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 4.2807 | 4.3532 | 4.398 | 2.1 | 77.85
Neigh | 0.12328 | 0.14561 | 0.16815 | 4.2 | 2.60
Comm | 0.051619 | 0.097282 | 0.1697 | 14.1 | 1.74
Output | 1.7881e-05 | 1.9372e-05 | 2.3842e-05 | 0.1 | 0.00
Modify | 0.9701 | 0.99258 | 1.0148 | 1.6 | 17.75
Other | | 0.003097 | | | 0.06
Nlocal: 25 ave 38 max 11 min
Histogram: 1 0 0 0 1 0 1 0 0 1
Nghost: 368.25 ave 449 max 283 min
Histogram: 1 0 0 0 1 0 1 0 0 1
Neighs: 1084.5 ave 1793 max 418 min
Histogram: 1 0 0 1 0 0 1 0 0 1
Total # of neighbors = 4338
Ave neighs/atom = 43.38
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:05

View File

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for VOH system
# .....
@@ -28,43 +29,53 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 16.9211 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 17.79 | 17.79 | 17.79 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10246.825 0 -10246.825 42.256089
3000 479.39686 -10186.225 0 -10044.755 -454.82798
Loop time of 10.4348 on 1 procs for 3000 steps with 100 atoms
3000 476.73301 -10185.256 0 -10044.572 -694.70737
Loop time of 11.0577 on 1 procs for 3000 steps with 100 atoms
Performance: 6.210 ns/day, 3.865 hours/ns, 287.499 timesteps/s
99.2% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 5.860 ns/day, 4.095 hours/ns, 271.304 timesteps/s
98.9% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 9.2216 | 9.2216 | 9.2216 | 0.0 | 88.37
Neigh | 0.2757 | 0.2757 | 0.2757 | 0.0 | 2.64
Comm | 0.015626 | 0.015626 | 0.015626 | 0.0 | 0.15
Output | 1.1921e-05 | 1.1921e-05 | 1.1921e-05 | 0.0 | 0.00
Modify | 0.91782 | 0.91782 | 0.91782 | 0.0 | 8.80
Other | | 0.004039 | | | 0.04
Pair | 9.6785 | 9.6785 | 9.6785 | 0.0 | 87.53
Neigh | 0.32599 | 0.32599 | 0.32599 | 0.0 | 2.95
Comm | 0.017231 | 0.017231 | 0.017231 | 0.0 | 0.16
Output | 2.5511e-05 | 2.5511e-05 | 2.5511e-05 | 0.0 | 0.00
Modify | 1.0311 | 1.0311 | 1.0311 | 0.0 | 9.32
Other | | 0.004857 | | | 0.04
Nlocal: 100 ave 100 max 100 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 598 ave 598 max 598 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 3384 ave 3384 max 3384 min
Neighs: 3390 ave 3390 max 3390 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 3384
Ave neighs/atom = 33.84
Total # of neighbors = 3390
Ave neighs/atom = 33.9
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:10
Total wall time: 0:00:11

@@ -0,0 +1,81 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for VOH system
# .....
units real
atom_style charge
read_data data.VOH
orthogonal box = (0 0 0) to (25 25 25)
1 by 2 by 2 MPI processor grid
reading atoms ...
100 atoms
pair_style reax/c lmp_control
pair_coeff * * ffield.reax.V_O_C_H H C O V
Reading potential file ffield.reax.V_O_C_H with DATE: 2011-02-18
neighbor 2 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1e-6 param.qeq
fix 3 all temp/berendsen 500.0 500.0 100.0
timestep 0.25
#dump 1 all atom 30 dump.reax.voh
run 3000
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 11.21 | 12.52 | 13.64 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -10246.825 0 -10246.825 42.256092
3000 489.67803 -10188.866 0 -10044.362 -553.7513
Loop time of 6.49847 on 4 procs for 3000 steps with 100 atoms
Performance: 9.972 ns/day, 2.407 hours/ns, 461.647 timesteps/s
97.7% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 4.7412 | 4.8453 | 4.9104 | 2.9 | 74.56
Neigh | 0.1468 | 0.17834 | 0.20151 | 4.7 | 2.74
Comm | 0.071841 | 0.14037 | 0.24502 | 17.2 | 2.16
Output | 2.1219e-05 | 2.408e-05 | 3.1948e-05 | 0.0 | 0.00
Modify | 1.3072 | 1.3308 | 1.3627 | 1.7 | 20.48
Other | | 0.003713 | | | 0.06
Nlocal: 25 ave 38 max 11 min
Histogram: 1 0 0 0 1 0 1 0 0 1
Nghost: 369.75 ave 453 max 283 min
Histogram: 1 0 0 0 1 1 0 0 0 1
Neighs: 1082.25 ave 1788 max 417 min
Histogram: 1 0 1 0 0 0 1 0 0 1
Total # of neighbors = 4329
Ave neighs/atom = 43.29
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:06

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for ZnOH2 system
# .....
@@ -28,43 +29,53 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 17.485 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 18.36 | 18.36 | 18.36 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -7900.2668 0 -7900.2668 60.076093
3000 522.42599 -7928.9641 0 -7767.0098 -755.28778
Loop time of 6.38119 on 1 procs for 3000 steps with 105 atoms
3000 535.58577 -7934.7287 0 -7768.6948 -475.46237
Loop time of 7.29784 on 1 procs for 3000 steps with 105 atoms
Performance: 10.155 ns/day, 2.363 hours/ns, 470.132 timesteps/s
99.0% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 8.879 ns/day, 2.703 hours/ns, 411.081 timesteps/s
97.3% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 5.2711 | 5.2711 | 5.2711 | 0.0 | 82.60
Neigh | 0.30669 | 0.30669 | 0.30669 | 0.0 | 4.81
Comm | 0.015599 | 0.015599 | 0.015599 | 0.0 | 0.24
Output | 1.0967e-05 | 1.0967e-05 | 1.0967e-05 | 0.0 | 0.00
Modify | 0.78376 | 0.78376 | 0.78376 | 0.0 | 12.28
Other | | 0.004036 | | | 0.06
Pair | 5.9988 | 5.9988 | 5.9988 | 0.0 | 82.20
Neigh | 0.37455 | 0.37455 | 0.37455 | 0.0 | 5.13
Comm | 0.019186 | 0.019186 | 0.019186 | 0.0 | 0.26
Output | 2.4557e-05 | 2.4557e-05 | 2.4557e-05 | 0.0 | 0.00
Modify | 0.89915 | 0.89915 | 0.89915 | 0.0 | 12.32
Other | | 0.006108 | | | 0.08
Nlocal: 105 ave 105 max 105 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 649 ave 649 max 649 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 3956 ave 3956 max 3956 min
Neighs: 3971 ave 3971 max 3971 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 3956
Ave neighs/atom = 37.6762
Total # of neighbors = 3971
Ave neighs/atom = 37.819
Neighbor list builds = 300
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:06
Total wall time: 0:00:07

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# REAX potential for ZnOH2 system
# .....
@@ -28,40 +29,50 @@ timestep 0.25
run 3000
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12
ghost atom cutoff = 12
binsize = 6 -> bins = 5 5 5
Memory usage per processor = 12.0066 Mbytes
binsize = 6, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 11.28 | 12.77 | 14.21 Mbytes
Step Temp E_pair E_mol TotEng Press
0 0 -7900.2668 0 -7900.2668 60.076093
3000 536.8256 -7935.1437 0 -7768.7255 -479.27959
Loop time of 3.77632 on 4 procs for 3000 steps with 105 atoms
3000 538.25796 -7935.6159 0 -7768.7536 -525.47078
Loop time of 4.48824 on 4 procs for 3000 steps with 105 atoms
Performance: 17.160 ns/day, 1.399 hours/ns, 794.423 timesteps/s
99.0% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 14.438 ns/day, 1.662 hours/ns, 668.414 timesteps/s
97.2% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 2.7337 | 2.7808 | 2.8316 | 2.5 | 73.64
Neigh | 0.13455 | 0.16558 | 0.19493 | 5.3 | 4.38
Comm | 0.046741 | 0.099375 | 0.14663 | 13.6 | 2.63
Output | 1.7881e-05 | 2.0027e-05 | 2.408e-05 | 0.1 | 0.00
Modify | 0.69792 | 0.7275 | 0.75887 | 2.5 | 19.26
Other | | 0.003084 | | | 0.08
Pair | 3.1031 | 3.1698 | 3.2378 | 3.3 | 70.62
Neigh | 0.16642 | 0.20502 | 0.25003 | 6.6 | 4.57
Comm | 0.074932 | 0.14224 | 0.21025 | 15.6 | 3.17
Output | 0.00011349 | 0.00011736 | 0.00012231 | 0.0 | 0.00
Modify | 0.92089 | 0.96736 | 1.0083 | 3.2 | 21.55
Other | | 0.003731 | | | 0.08
Nlocal: 26.25 ave 45 max 15 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Nghost: 399 ave 509 max 295 min
Histogram: 1 0 0 0 2 0 0 0 0 1
Neighs: 1150 ave 2061 max 701 min
Neighs: 1151.5 ave 2066 max 701 min
Histogram: 1 2 0 0 0 0 0 0 0 1
Total # of neighbors = 4600
Ave neighs/atom = 43.8095
Total # of neighbors = 4606
Ave neighs/atom = 43.8667
Neighbor list builds = 300
Dangerous builds not checked

@@ -1,4 +1,5 @@
LAMMPS (23 Oct 2017)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
#ci-reax potential for CH systems with tabulated ZBL correction
atom_style charge
units real
@@ -31,6 +32,7 @@ fix 2 all temp/berendsen 500.0 500.0 100.0
#dump 1 all atom 30 dump.ci-reax.lammpstrj
run 3000
WARNING: Total cutoff < 2*bond cutoff. May need to use an increased neighbor list skin. (../pair_reaxc.cpp:392)
Neighbor list info ...
update every 1 steps, delay 10 steps, check yes
max neighbors/atom: 2000, page size: 100000
@@ -52,20 +54,20 @@ Per MPI rank memory allocation (min/avg/max) = 43.46 | 43.46 | 43.46 Mbytes
Step Temp E_pair E_mol TotEng Press
0 508.42043 -28736.654 0 -28260.785 1678.3276
3000 480.41333 -28707.835 0 -28258.181 -3150.0762
Loop time of 21.5509 on 1 procs for 3000 steps with 315 atoms
Loop time of 45.3959 on 1 procs for 3000 steps with 315 atoms
Performance: 3.007 ns/day, 7.982 hours/ns, 139.205 timesteps/s
100.0% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 1.427 ns/day, 16.813 hours/ns, 66.085 timesteps/s
96.6% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 21.315 | 21.315 | 21.315 | 0.0 | 98.91
Neigh | 0.17846 | 0.17846 | 0.17846 | 0.0 | 0.83
Comm | 0.028676 | 0.028676 | 0.028676 | 0.0 | 0.13
Output | 2.6941e-05 | 2.6941e-05 | 2.6941e-05 | 0.0 | 0.00
Modify | 0.018969 | 0.018969 | 0.018969 | 0.0 | 0.09
Other | | 0.009438 | | | 0.04
Pair | 44.955 | 44.955 | 44.955 | 0.0 | 99.03
Neigh | 0.29903 | 0.29903 | 0.29903 | 0.0 | 0.66
Comm | 0.056547 | 0.056547 | 0.056547 | 0.0 | 0.12
Output | 4.8399e-05 | 4.8399e-05 | 4.8399e-05 | 0.0 | 0.00
Modify | 0.058722 | 0.058722 | 0.058722 | 0.0 | 0.13
Other | | 0.02632 | | | 0.06
Nlocal: 315 ave 315 max 315 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@@ -81,4 +83,4 @@ Dangerous builds = 0
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:21
Total wall time: 0:00:45

@@ -0,0 +1,86 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
#ci-reax potential for CH systems with tabulated ZBL correction
atom_style charge
units real
read_data CH4.dat
orthogonal box = (0 0 0) to (20 20 20)
1 by 2 by 2 MPI processor grid
reading atoms ...
315 atoms
reading velocities ...
315 velocities
pair_style hybrid/overlay reax/c control checkqeq no table linear 11000
pair_coeff * * reax/c ffield.ci-reax.CH C H
Reading potential file ffield.ci-reax.CH with DATE: 2017-11-20
pair_coeff 1 1 table ci-reaxFF_ZBL.dat CC_cireaxFF
WARNING: 2 of 10000 force values in table are inconsistent with -dE/dr.
Should only be flagged at inflection points (../pair_table.cpp:481)
pair_coeff 1 2 table ci-reaxFF_ZBL.dat CH_cireaxFF
WARNING: 2 of 11000 force values in table are inconsistent with -dE/dr.
Should only be flagged at inflection points (../pair_table.cpp:481)
pair_coeff 2 2 table ci-reaxFF_ZBL.dat HH_cireaxFF
WARNING: 2 of 6000 force values in table are inconsistent with -dE/dr.
Should only be flagged at inflection points (../pair_table.cpp:481)
timestep 0.25
fix 1 all nve
fix 2 all temp/berendsen 500.0 500.0 100.0
#dump 1 all atom 30 dump.ci-reax.lammpstrj
run 3000
WARNING: Total cutoff < 2*bond cutoff. May need to use an increased neighbor list skin. (../pair_reaxc.cpp:392)
Neighbor list info ...
update every 1 steps, delay 10 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 9.5
ghost atom cutoff = 9.5
binsize = 4.75, bins = 5 5 5
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) pair table, perpetual
attributes: half, newton on
pair build: half/bin/atomonly/newton
stencil: half/bin/3d/newton
bin: standard
Per MPI rank memory allocation (min/avg/max) = 24.48 | 25.61 | 27.27 Mbytes
Step Temp E_pair E_mol TotEng Press
0 508.42043 -28736.654 0 -28260.785 1678.3276
3000 480.41333 -28707.835 0 -28258.181 -3150.0762
Loop time of 24.7034 on 4 procs for 3000 steps with 315 atoms
Performance: 2.623 ns/day, 9.149 hours/ns, 121.441 timesteps/s
95.8% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 18.945 | 21.367 | 24.046 | 39.3 | 86.49
Neigh | 0.1456 | 0.15254 | 0.16101 | 1.6 | 0.62
Comm | 0.39168 | 3.0859 | 5.5185 | 103.9 | 12.49
Output | 3.5763e-05 | 4.065e-05 | 5.2452e-05 | 0.0 | 0.00
Modify | 0.05831 | 0.068811 | 0.077666 | 2.9 | 0.28
Other | | 0.0292 | | | 0.12
Nlocal: 78.75 ave 96 max 65 min
Histogram: 2 0 0 0 0 0 0 1 0 1
Nghost: 1233 ave 1348 max 1116 min
Histogram: 1 0 1 0 0 0 0 1 0 1
Neighs: 9467.25 ave 12150 max 7160 min
Histogram: 1 1 0 0 0 0 0 1 0 1
Total # of neighbors = 37869
Ave neighs/atom = 120.219
Neighbor list builds = 37
Dangerous builds = 0
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:24

@@ -1,101 +0,0 @@
LAMMPS (5 Oct 2016)
# ReaxFF potential for RDX system
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 1 by 1 MPI processor grid
reading atoms ...
21 atoms
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command will be deprecated soon - users should switch to pair_style reax/c (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all custom 10 dump.reax.rdx id type q xs ys zs
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
1 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 3 3 3
Memory usage per processor = 2.95105 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1885.1268 -1885.1268 27233.074 -2958.4712 79.527715 0.31082031 0 97.771125 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79971 0 168.88435
10 1281.7558 -1989.1322 -1912.7188 -19609.913 -2733.8828 -15.775275 0.20055725 0 55.020231 3.1070522 -77.710916 0 14.963568 -5.8082204 843.41939 -180.17724 0 107.5115
20 516.83079 -1941.677 -1910.8655 -12525.41 -2801.8626 7.4107974 0.073134188 0 81.986982 0.2281551 -57.494871 0 30.656735 -10.102557 877.78696 -158.93385 0 88.574158
30 467.2641 -1940.978 -1913.1215 -35957.487 -2755.021 -6.9179959 0.049322439 0 78.853175 0.13604392 -51.653634 0 19.862872 -9.7098575 853.79334 -151.232 0 80.861768
40 647.45541 -1951.1994 -1912.6006 -5883.7147 -2798.3556 17.334807 0.15102863 0 63.23512 0.18070931 -54.598962 0 17.325008 -12.052277 883.01667 -164.21335 0 96.777422
50 716.38057 -1949.4749 -1906.767 5473.2085 -2800.931 9.2056917 0.15413274 0 85.371449 3.2986106 -78.253597 0 34.861773 -8.5531236 882.01435 -193.85275 0 117.2096
60 1175.2707 -1975.9611 -1905.8959 -1939.4971 -2726.5816 -11.651982 0.24296788 0 48.320663 7.1799636 -75.363641 0 16.520132 -4.8869463 844.754 -194.23296 0 119.73837
70 1156.7 -1975.3486 -1906.3905 24628.344 -2880.5223 25.652478 0.26894312 0 83.724884 7.1049303 -68.700942 0 24.750744 -8.6338218 911.20067 -183.4058 0 113.21158
80 840.23687 -1955.4768 -1905.3851 -17731.383 -2755.7295 -8.0168306 0.13867962 0 86.14748 2.2387306 -76.945841 0 23.595858 -7.2609645 853.6346 -167.88289 0 94.603895
90 365.79169 -1926.406 -1904.5989 898.37155 -2842.183 47.368211 0.23109 0 92.288131 0.38031313 -61.361483 0 18.476377 -12.255472 900.24202 -186.48056 0 116.88831
100 801.32078 -1953.4177 -1905.646 -2417.5518 -2802.7244 4.6676973 0.18046558 0 76.730114 5.4177372 -77.102556 0 24.997234 -7.7554179 898.67306 -196.8912 0 120.38952
Loop time of 0.512828 on 1 procs for 100 steps with 21 atoms
Performance: 16.848 ns/day, 1.425 hours/ns, 194.997 timesteps/s
99.4% CPU use with 1 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.51126 | 0.51126 | 0.51126 | 0.0 | 99.69
Neigh | 0.00071597 | 0.00071597 | 0.00071597 | 0.0 | 0.14
Comm | 0.00040317 | 0.00040317 | 0.00040317 | 0.0 | 0.08
Output | 0.00027037 | 0.00027037 | 0.00027037 | 0.0 | 0.05
Modify | 7.2241e-05 | 7.2241e-05 | 7.2241e-05 | 0.0 | 0.01
Other | | 0.000108 | | | 0.02
Nlocal: 21 ave 21 max 21 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 546 ave 546 max 546 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 1106 ave 1106 max 1106 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1106
Ave neighs/atom = 52.6667
Neighbor list builds = 10
Dangerous builds not checked
Total wall time: 0:00:00

@@ -1,101 +0,0 @@
LAMMPS (5 Oct 2016)
# ReaxFF potential for RDX system
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 2 by 2 MPI processor grid
reading atoms ...
21 atoms
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command will be deprecated soon - users should switch to pair_style reax/c (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all custom 10 dump.reax.rdx id type q xs ys zs
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
1 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 3 3 3
Memory usage per processor = 3.0718 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1885.1268 -1885.1268 27233.074 -2958.4712 79.527715 0.31082031 0 97.771125 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79972 0 168.8843
10 1281.7558 -1989.1322 -1912.7188 -19609.913 -2733.8828 -15.775275 0.20055725 0 55.020231 3.1070523 -77.710916 0 14.963568 -5.8082204 843.41939 -180.17725 0 107.51148
20 516.8308 -1941.677 -1910.8655 -12525.411 -2801.8626 7.4107973 0.07313419 0 81.986982 0.2281551 -57.494871 0 30.656735 -10.102557 877.78696 -158.93385 0 88.574155
30 467.26411 -1940.978 -1913.1215 -35957.487 -2755.021 -6.9179966 0.049322437 0 78.853175 0.13604391 -51.653634 0 19.862872 -9.7098574 853.79333 -151.232 0 80.861765
40 647.45584 -1951.1994 -1912.6006 -5883.7102 -2798.3557 17.334812 0.15102857 0 63.235124 0.18070914 -54.598951 0 17.325006 -12.052278 883.01674 -164.21335 0 96.777418
50 716.38108 -1949.4679 -1906.76 5473.1803 -2800.9311 9.2057064 0.15413272 0 85.371443 3.2986124 -78.253597 0 34.861778 -8.5531235 882.01441 -193.85213 0 117.21596
60 1175.2703 -1975.9632 -1905.898 -1939.6676 -2726.5815 -11.652032 0.24296779 0 48.320636 7.1799647 -75.363643 0 16.520124 -4.8869416 844.75396 -194.25563 0 119.75889
70 1156.7016 -1975.3469 -1906.3887 24628.125 -2880.5225 25.65252 0.26894309 0 83.724869 7.1048931 -68.700978 0 24.750754 -8.6338341 911.20067 -183.41947 0 113.22722
80 840.3323 -1955.4867 -1905.3893 -17732.956 -2755.7336 -8.0168615 0.13869303 0 86.143454 2.2388975 -76.946365 0 23.594977 -7.2608903 853.63682 -167.88599 0 94.604168
90 365.75853 -1926.4192 -1904.6141 902.29004 -2842.1715 47.360077 0.23110905 0 92.28805 0.38040356 -61.364192 0 18.473252 -12.253964 900.23128 -186.47889 0 116.88518
100 801.64661 -1953.4392 -1905.6481 -2464.5533 -2802.6922 4.6510183 0.18048786 0 76.715675 5.41849 -77.102069 0 24.987058 -7.7531389 898.65974 -196.87724 0 120.37303
Loop time of 0.405054 on 4 procs for 100 steps with 21 atoms
Performance: 21.331 ns/day, 1.125 hours/ns, 246.881 timesteps/s
96.9% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.16194 | 0.24674 | 0.40012 | 18.4 | 60.92
Neigh | 7.3671e-05 | 0.00024015 | 0.00053477 | 1.1 | 0.06
Comm | 0.0037704 | 0.1575 | 0.24247 | 23.1 | 38.88
Output | 0.00037122 | 0.00040913 | 0.0004406 | 0.1 | 0.10
Modify | 4.22e-05 | 6.175e-05 | 8.3685e-05 | 0.2 | 0.02
Other | | 0.0001087 | | | 0.03
Nlocal: 5.25 ave 15 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Nghost: 355.5 ave 432 max 282 min
Histogram: 1 0 0 0 1 1 0 0 0 1
Neighs: 301.25 ave 827 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Total # of neighbors = 1205
Ave neighs/atom = 57.381
Neighbor list builds = 10
Dangerous builds not checked
Total wall time: 0:00:00

@@ -1,104 +0,0 @@
LAMMPS (5 Oct 2016)
# ReaxFF potential for RDX system
# this run is equivalent to reax/in.reax.rdx
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 1 by 1 MPI processor grid
reading atoms ...
21 atoms
pair_style reax/c control.reax_c.rdx
pair_coeff * * ffield.reax C H O N
Reading potential file ffield.reax with DATE: 2010-02-19
compute reax all pair reax/c
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1.0e-6 reax/c
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all atom 10 dump.reaxc.rdx
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 3 3 3
Memory usage per processor = 14.4462 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1884.3081 -1884.3081 27186.181 -2958.4712 79.527715 0.31082031 0 98.589783 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79931 0 168.88396
10 1288.6116 -1989.6644 -1912.8422 -19456.353 -2734.6769 -15.607221 0.2017796 0 54.629557 3.125229 -77.7067 0 14.933901 -5.8108541 843.92073 -180.43321 0 107.75935
20 538.95819 -1942.7037 -1910.5731 -10725.639 -2803.7394 7.9078269 0.07792668 0 81.610053 0.22951941 -57.557107 0 30.331207 -10.178049 878.99009 -159.68914 0 89.313379
30 463.09535 -1933.5765 -1905.9686 -33255.546 -2749.859 -8.0154745 0.02762893 0 81.627395 0.11972413 -50.262293 0 20.820303 -9.6327015 851.88715 -149.49499 0 79.205727
40 885.49171 -1958.9125 -1906.1229 -4814.6856 -2795.644 9.150669 0.13747498 0 70.947982 0.24360485 -57.862663 0 19.076496 -11.141218 873.73893 -159.99393 0 92.434096
50 861.16578 -1954.4599 -1903.1205 -1896.7713 -2784.845 3.8270515 0.15793266 0 79.851823 3.3492142 -78.06613 0 32.629016 -7.956541 872.81838 -190.98567 0 114.75995
60 1167.7852 -1971.8429 -1902.224 -3482.7305 -2705.863 -17.12171 0.22749077 0 44.507654 7.8560745 -74.788955 0 16.256483 -4.6046431 835.8304 -188.33691 0 114.19413
70 1439.9966 -1989.3024 -1903.4553 23845.651 -2890.7895 31.958845 0.26671721 0 85.758695 3.1803544 -71.002903 0 24.357134 -10.31131 905.86775 -175.38471 0 106.79648
80 502.39438 -1930.7544 -1900.8035 -20356.316 -2703.8115 -18.662467 0.11286011 0 99.804201 2.0329024 -76.171317 0 19.237028 -6.2786907 826.47451 -166.03125 0 92.539398
90 749.08499 -1946.9838 -1902.3262 17798.51 -2863.7576 42.068717 0.2433807 0 96.181613 0.96184887 -69.955448 0 24.615302 -11.582765 903.68818 -190.13843 0 120.69141
100 1109.6968 -1968.5874 -1902.4315 -4490.1018 -2755.8965 -7.1231014 0.21757699 0 61.806018 7.0827673 -75.645345 0 20.114997 -6.2371964 863.5635 -198.56976 0 122.09961
Loop time of 0.362895 on 1 procs for 100 steps with 21 atoms
Performance: 23.809 ns/day, 1.008 hours/ns, 275.562 timesteps/s
100.0% CPU use with 1 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.34367 | 0.34367 | 0.34367 | 0.0 | 94.70
Neigh | 0.0078354 | 0.0078354 | 0.0078354 | 0.0 | 2.16
Comm | 0.00043559 | 0.00043559 | 0.00043559 | 0.0 | 0.12
Output | 0.00019908 | 0.00019908 | 0.00019908 | 0.0 | 0.05
Modify | 0.010645 | 0.010645 | 0.010645 | 0.0 | 2.93
Other | | 0.0001094 | | | 0.03
Nlocal: 21 ave 21 max 21 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 546 ave 546 max 546 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 1096 ave 1096 max 1096 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1096
Ave neighs/atom = 52.1905
Neighbor list builds = 10
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:00

@@ -1,104 +0,0 @@
LAMMPS (5 Oct 2016)
# ReaxFF potential for RDX system
# this run is equivalent to reax/in.reax.rdx
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 2 by 2 MPI processor grid
reading atoms ...
21 atoms
pair_style reax/c control.reax_c.rdx
pair_coeff * * ffield.reax C H O N
Reading potential file ffield.reax with DATE: 2010-02-19
compute reax all pair reax/c
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1.0e-6 reax/c
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all atom 10 dump.reaxc.rdx
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
2 neighbor list requests
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 3 3 3
Memory usage per processor = 12.531 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1884.3081 -1884.3081 27186.18 -2958.4712 79.527715 0.31082031 0 98.589783 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79953 0 168.88418
10 1288.6115 -1989.6644 -1912.8422 -19456.354 -2734.6769 -15.60722 0.2017796 0 54.629558 3.1252288 -77.7067 0 14.933901 -5.8108542 843.92073 -180.43321 0 107.75934
20 538.95831 -1942.7037 -1910.5731 -10725.671 -2803.7395 7.9078306 0.077926651 0 81.610051 0.22951926 -57.557099 0 30.331204 -10.178049 878.99014 -159.69268 0 89.316921
30 463.09502 -1933.5765 -1905.9685 -33255.512 -2749.8591 -8.015455 0.027628766 0 81.6274 0.11972393 -50.262275 0 20.820315 -9.6327041 851.88722 -149.49498 0 79.205714
40 885.49378 -1958.9125 -1906.1228 -4814.644 -2795.644 9.1506485 0.13747497 0 70.948 0.24360511 -57.862677 0 19.076502 -11.141216 873.73898 -159.99393 0 92.43409
50 861.16297 -1954.4602 -1903.1209 -1896.8002 -2784.8451 3.8270162 0.157933 0 79.851673 3.3492148 -78.066132 0 32.628944 -7.9565368 872.81852 -190.98572 0 114.76001
60 1167.7835 -1971.8433 -1902.2245 -3482.8296 -2705.8635 -17.121613 0.2274909 0 44.507674 7.85602 -74.788998 0 16.256483 -4.6046575 835.83058 -188.33691 0 114.19414
70 1439.9939 -1989.3026 -1903.4556 23846.042 -2890.7893 31.958672 0.26671708 0 85.758381 3.1804035 -71.002944 0 24.357195 -10.311284 905.8679 -175.38487 0 106.79661
80 502.39535 -1930.7548 -1900.8039 -20356.194 -2703.8126 -18.662209 0.11286005 0 99.803849 2.0329206 -76.171278 0 19.23716 -6.2787147 826.47505 -166.03123 0 92.539386
90 749.07874 -1946.9841 -1902.3269 17798.394 -2863.7576 42.068612 0.24338059 0 96.181423 0.96185061 -69.95542 0 24.615344 -11.582758 903.68812 -190.13826 0 120.69124
100 1109.6904 -1968.5879 -1902.4323 -4490.0667 -2755.8991 -7.1224194 0.21757691 0 61.805857 7.0827218 -75.645383 0 20.115437 -6.23727 863.56487 -198.56975 0 122.09963
Loop time of 0.293673 on 4 procs for 100 steps with 21 atoms
Performance: 29.420 ns/day, 0.816 hours/ns, 340.514 timesteps/s
99.1% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.24143 | 0.24223 | 0.24409 | 0.2 | 82.48
Neigh | 0.003767 | 0.0049117 | 0.0061524 | 1.2 | 1.67
Comm | 0.0030656 | 0.0048578 | 0.0057402 | 1.5 | 1.65
Output | 0.00033545 | 0.00036347 | 0.00038052 | 0.1 | 0.12
Modify | 0.039885 | 0.041207 | 0.042435 | 0.4 | 14.03
Other | | 0.0001001 | | | 0.03
Nlocal: 5.25 ave 15 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Nghost: 355.5 ave 432 max 282 min
Histogram: 1 0 0 0 1 1 0 0 0 1
Neighs: 298.75 ave 822 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Total # of neighbors = 1195
Ave neighs/atom = 56.9048
Neighbor list builds = 10
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:00

@@ -0,0 +1,107 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for RDX system
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 1 by 1 MPI processor grid
reading atoms ...
21 atoms
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command is unsupported. Please switch to pair_style reax/c instead (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all custom 10 dump.reax.rdx id type q xs ys zs
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25, bins = 3 3 3
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair reax, perpetual
attributes: half, newton off
pair build: half/bin/newtoff
stencil: half/bin/3d/newtoff
bin: standard
Per MPI rank memory allocation (min/avg/max) = 3.278 | 3.278 | 3.278 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1885.1269 -1885.1269 27233.074 -2958.4712 79.527715 0.31082031 0 97.771125 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79973 0 168.8842
10 1281.7558 -1989.1322 -1912.7188 -19609.913 -2733.8828 -15.775275 0.20055725 0 55.02023 3.1070523 -77.710916 0 14.963568 -5.8082203 843.41939 -180.17724 0 107.5115
20 516.83079 -1941.677 -1910.8655 -12525.412 -2801.8626 7.410797 0.073134186 0 81.986983 0.2281551 -57.494871 0 30.656735 -10.102557 877.78695 -158.93385 0 88.574159
30 467.26411 -1940.978 -1913.1215 -35957.489 -2755.021 -6.9179958 0.049322453 0 78.853173 0.13604393 -51.653635 0 19.862871 -9.7098575 853.79334 -151.232 0 80.86177
40 647.45528 -1951.1994 -1912.6006 -5883.713 -2798.3556 17.334814 0.15102862 0 63.235117 0.18070924 -54.598957 0 17.325007 -12.052278 883.0167 -164.21335 0 96.777424
50 716.38088 -1949.4735 -1906.7656 5473.1969 -2800.9309 9.2056861 0.15413274 0 85.371466 3.2986127 -78.253597 0 34.861774 -8.553123 882.01431 -193.85254 0 117.21068
60 1175.2705 -1975.961 -1905.8958 -1939.4966 -2726.5816 -11.651996 0.24296786 0 48.320654 7.1799691 -75.363638 0 16.520127 -4.8869441 844.75401 -194.23297 0 119.73841
70 1156.701 -1975.3497 -1906.3916 24628.304 -2880.5225 25.652501 0.26894311 0 83.724852 7.1049152 -68.70096 0 24.750735 -8.6338267 911.20079 -183.40562 0 113.21047
80 840.23677 -1955.4769 -1905.3851 -17731.334 -2755.7299 -8.0167723 0.1386797 0 86.147417 2.2387319 -76.945843 0 23.595869 -7.260968 853.63487 -167.88288 0 94.603961
90 365.79122 -1926.4061 -1904.599 898.38479 -2842.1832 47.368107 0.23109002 0 92.288071 0.38031213 -61.361485 0 18.476336 -12.25546 900.24233 -186.48046 0 116.88827
100 801.32158 -1953.418 -1905.6462 -2417.6887 -2802.7247 4.6676477 0.18046575 0 76.729987 5.4177322 -77.102566 0 24.997175 -7.7554074 898.67337 -196.89114 0 120.38946
Loop time of 0.463306 on 1 procs for 100 steps with 21 atoms
Performance: 18.649 ns/day, 1.287 hours/ns, 215.840 timesteps/s
99.6% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.46143 | 0.46143 | 0.46143 | 0.0 | 99.60
Neigh | 0.00087953 | 0.00087953 | 0.00087953 | 0.0 | 0.19
Comm | 0.00042653 | 0.00042653 | 0.00042653 | 0.0 | 0.09
Output | 0.00034237 | 0.00034237 | 0.00034237 | 0.0 | 0.07
Modify | 0.00010109 | 0.00010109 | 0.00010109 | 0.0 | 0.02
Other | | 0.000124 | | | 0.03
Nlocal: 21 ave 21 max 21 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 546 ave 546 max 546 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 1106 ave 1106 max 1106 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1106
Ave neighs/atom = 52.6667
Neighbor list builds = 10
Dangerous builds not checked
Total wall time: 0:00:00

@@ -0,0 +1,107 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for RDX system
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 2 by 2 MPI processor grid
reading atoms ...
21 atoms
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command is unsupported. Please switch to pair_style reax/c instead (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all custom 10 dump.reax.rdx id type q xs ys zs
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25, bins = 3 3 3
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair reax, perpetual
attributes: half, newton off
pair build: half/bin/newtoff
stencil: half/bin/3d/newtoff
bin: standard
Per MPI rank memory allocation (min/avg/max) = 3.262 | 3.36 | 3.647 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1885.1268 -1885.1268 27233.074 -2958.4712 79.527715 0.31082031 0 97.771125 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79972 0 168.88428
10 1281.7558 -1989.1322 -1912.7187 -19609.913 -2733.8828 -15.775275 0.20055725 0 55.020231 3.1070523 -77.710916 0 14.963568 -5.8082203 843.41939 -180.17724 0 107.51152
20 516.83079 -1941.677 -1910.8655 -12525.412 -2801.8626 7.410797 0.073134187 0 81.986983 0.2281551 -57.494871 0 30.656735 -10.102557 877.78695 -158.93385 0 88.574168
30 467.26411 -1940.978 -1913.1215 -35957.489 -2755.021 -6.9179959 0.049322449 0 78.853173 0.13604392 -51.653635 0 19.862871 -9.7098575 853.79334 -151.232 0 80.861765
40 647.45479 -1951.1995 -1912.6007 -5883.7199 -2798.3556 17.334805 0.15102868 0 63.235116 0.18070946 -54.59897 0 17.32501 -12.052277 883.0166 -164.21339 0 96.777473
50 716.37927 -1949.466 -1906.7582 5473.2486 -2800.9309 9.2056758 0.15413278 0 85.37143 3.2986099 -78.253596 0 34.861773 -8.5531243 882.01424 -193.85223 0 117.21791
60 1175.2698 -1975.9612 -1905.896 -1939.5206 -2726.5818 -11.651942 0.24296793 0 48.320679 7.1799538 -75.36365 0 16.520134 -4.8869515 844.75405 -194.23289 0 119.7383
70 1156.6963 -1975.3494 -1906.3915 24628.423 -2880.5221 25.65242 0.26894312 0 83.724787 7.1049615 -68.700925 0 24.750729 -8.6338123 911.2006 -183.40591 0 113.21091
80 840.238 -1955.4788 -1905.387 -17731.371 -2755.7301 -8.0167357 0.13868007 0 86.147246 2.2387405 -76.945868 0 23.595868 -7.2609697 853.6349 -167.88312 0 94.602512
90 365.78645 -1926.4072 -1904.6004 898.36945 -2842.1831 47.368307 0.23108998 0 92.288039 0.38031101 -61.361464 0 18.476388 -12.255481 900.24216 -186.48066 0 116.88716
100 801.31322 -1953.4165 -1905.6452 -2417.2041 -2802.7247 4.6678077 0.18046498 0 76.730367 5.4176812 -77.102592 0 24.9973 -7.7554425 898.6732 -196.89097 0 120.39043
Loop time of 0.404551 on 4 procs for 100 steps with 21 atoms
Performance: 21.357 ns/day, 1.124 hours/ns, 247.188 timesteps/s
97.4% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.2191 | 0.28038 | 0.39839 | 13.2 | 69.31
Neigh | 5.8651e-05 | 0.00025928 | 0.00062203 | 0.0 | 0.06
Comm | 0.0046599 | 0.12307 | 0.1845 | 19.9 | 30.42
Output | 0.00055337 | 0.00062728 | 0.00071192 | 0.0 | 0.16
Modify | 5.3167e-05 | 7.844e-05 | 0.00010109 | 0.0 | 0.02
Other | | 0.0001363 | | | 0.03
Nlocal: 5.25 ave 15 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Nghost: 355.5 ave 432 max 282 min
Histogram: 1 0 0 0 1 1 0 0 0 1
Neighs: 301.25 ave 827 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Total # of neighbors = 1205
Ave neighs/atom = 57.381
Neighbor list builds = 10
Dangerous builds not checked
Total wall time: 0:00:00

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for TATB system
units real
@@ -12,7 +13,7 @@ read_data data.tatb
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command will be deprecated soon - users should switch to pair_style reax/c (../pair_reax.cpp:49)
WARNING: The pair_style reax command is unsupported. Please switch to pair_style reax/c instead (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
@@ -54,34 +55,39 @@ fix 2 all reax/bonds 25 bonds.reax.tatb
run 25
Neighbor list info ...
1 neighbor list requests
update every 5 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 5 4 3
Memory usage per processor = 6.61277 Mbytes
binsize = 6.25, bins = 5 4 3
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair reax, perpetual
attributes: half, newton off
pair build: half/bin/newtoff
stencil: half/bin/3d/newtoff
bin: standard
Per MPI rank memory allocation (min/avg/max) = 7.764 | 7.764 | 7.764 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -44767.08 -44767.08 7294.6353 -61120.591 486.4378 4.7236377 0 1568.024 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6378 0 6391.0231
5 0.63682807 -44767.737 -44767.01 8391.5966 -61118.763 486.82916 4.723415 0 1567.835 20.768662 -278.20804 -1557.6962 252.64683 -655.74117 18859.328 -8738.2727 0 6388.8127
10 2.4306957 -44769.41 -44766.635 11717.369 -61113.142 487.89093 4.7227063 0 1567.2936 20.705084 -274.37509 -1560.8546 252.87219 -655.43578 18850.19 -8731.0713 0 6381.7946
15 5.0590478 -44772.63 -44766.854 17125.033 -61103.34 489.28007 4.7214008 0 1566.4744 20.590604 -268.28963 -1566.5961 252.97781 -654.93836 18835.335 -8719.3112 0 6370.4665
20 8.0678579 -44775.923 -44766.713 24620.824 -61088.791 490.42346 4.7193467 0 1565.5541 20.415031 -260.38512 -1574.1001 253.39805 -654.26837 18815.312 -8703.3104 0 6355.1097
25 10.975539 -44777.231 -44764.701 34381.278 -61068.889 490.53149 4.7164093 0 1566.5715 20.169755 -251.2311 -1582.8552 253.88696 -653.46042 18790.855 -8683.8362 0 6336.3099
Loop time of 7.48375 on 1 procs for 25 steps with 384 atoms
5 0.63682806 -44767.737 -44767.01 8391.5964 -61118.763 486.82916 4.723415 0 1567.835 20.768662 -278.20804 -1557.6962 252.64683 -655.74117 18859.328 -8738.2728 0 6388.8127
10 2.4306958 -44769.409 -44766.634 11717.376 -61113.142 487.89093 4.7227063 0 1567.2936 20.705084 -274.37509 -1560.8546 252.87219 -655.43578 18850.19 -8731.0693 0 6381.7942
15 5.0590493 -44772.631 -44766.855 17125.067 -61103.34 489.28007 4.7214008 0 1566.4744 20.590604 -268.28962 -1566.5961 252.97781 -654.93836 18835.335 -8719.3013 0 6370.4551
20 8.067859 -44775.936 -44766.725 24620.627 -61088.791 490.42346 4.7193467 0 1565.5541 20.415031 -260.38512 -1574.1001 253.39805 -654.26837 18815.312 -8703.3748 0 6355.1614
25 10.975538 -44777.233 -44764.702 34381.173 -61068.889 490.53149 4.7164093 0 1566.5715 20.169755 -251.23109 -1582.8552 253.88696 -653.46042 18790.855 -8683.8691 0 6336.3409
Loop time of 7.80129 on 1 procs for 25 steps with 384 atoms
Performance: 0.018 ns/day, 1330.444 hours/ns, 3.341 timesteps/s
99.9% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 0.017 ns/day, 1386.896 hours/ns, 3.205 timesteps/s
99.5% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 7.4284 | 7.4284 | 7.4284 | 0.0 | 99.26
Neigh | 0.051549 | 0.051549 | 0.051549 | 0.0 | 0.69
Comm | 0.0021887 | 0.0021887 | 0.0021887 | 0.0 | 0.03
Output | 0.00025821 | 0.00025821 | 0.00025821 | 0.0 | 0.00
Modify | 0.00099206 | 0.00099206 | 0.00099206 | 0.0 | 0.01
Other | | 0.0003154 | | | 0.00
Pair | 7.7384 | 7.7384 | 7.7384 | 0.0 | 99.19
Neigh | 0.058615 | 0.058615 | 0.058615 | 0.0 | 0.75
Comm | 0.0022428 | 0.0022428 | 0.0022428 | 0.0 | 0.03
Output | 0.00033212 | 0.00033212 | 0.00033212 | 0.0 | 0.00
Modify | 0.0013618 | 0.0013618 | 0.0013618 | 0.0 | 0.02
Other | | 0.0003309 | | | 0.00
Nlocal: 384 ave 384 max 384 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@@ -94,4 +100,4 @@ Total # of neighbors = 286828
Ave neighs/atom = 746.948
Neighbor list builds = 5
Dangerous builds not checked
Total wall time: 0:00:07
Total wall time: 0:00:08

@@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for TATB system
units real
@@ -12,7 +13,7 @@ read_data data.tatb
# reax args: hbcut hbnewflag tripflag precision
pair_style reax 6.0 1 1 1.0e-6
WARNING: The pair_style reax command will be deprecated soon - users should switch to pair_style reax/c (../pair_reax.cpp:49)
WARNING: The pair_style reax command is unsupported. Please switch to pair_style reax/c instead (../pair_reax.cpp:49)
pair_coeff * * ffield.reax 1 2 3 4
compute reax all pair reax
@@ -54,34 +55,39 @@ fix 2 all reax/bonds 25 bonds.reax.tatb
run 25
Neighbor list info ...
1 neighbor list requests
update every 5 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 5 4 3
Memory usage per processor = 4.03843 Mbytes
binsize = 6.25, bins = 5 4 3
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair reax, perpetual
attributes: half, newton off
pair build: half/bin/newtoff
stencil: half/bin/3d/newtoff
bin: standard
Per MPI rank memory allocation (min/avg/max) = 4.402 | 4.402 | 4.402 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -44767.08 -44767.08 7294.6353 -61120.591 486.4378 4.7236377 0 1568.024 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6378 0 6391.0231
5 0.63682726 -44767.816 -44767.089 8391.165 -61118.763 486.82916 4.723415 0 1567.835 20.768662 -278.20804 -1557.6962 252.64683 -655.74117 18859.328 -8738.3995 0 6388.86
10 2.4306905 -44769.408 -44766.633 11717.247 -61113.142 487.89094 4.7227063 0 1567.2936 20.705084 -274.3751 -1560.8546 252.87219 -655.43578 18850.19 -8731.0965 0 6381.8216
15 5.0590422 -44772.626 -44766.85 17124.943 -61103.34 489.2801 4.7214008 0 1566.4744 20.590604 -268.28963 -1566.5961 252.97781 -654.93836 18835.335 -8719.3383 0 6370.4973
20 8.0678512 -44775.934 -44766.723 24620.531 -61088.791 490.42349 4.7193467 0 1565.5541 20.415031 -260.38513 -1574.1001 253.39804 -654.26837 18815.312 -8703.4033 0 6355.1921
25 10.97553 -44777.231 -44764.701 34381.242 -61068.889 490.53154 4.7164093 0 1566.5715 20.169755 -251.23111 -1582.8552 253.88696 -653.46042 18790.855 -8683.8451 0 6336.3185
Loop time of 3.27945 on 4 procs for 25 steps with 384 atoms
5 0.63682727 -44767.816 -44767.089 8391.1708 -61118.763 486.82916 4.723415 0 1567.835 20.768662 -278.20804 -1557.6962 252.64683 -655.74117 18859.328 -8738.3973 0 6388.8581
10 2.4306941 -44769.405 -44766.63 11717.306 -61113.142 487.89094 4.7227063 0 1567.2936 20.705084 -274.3751 -1560.8546 252.87219 -655.43578 18850.19 -8731.08 0 6381.8083
15 5.0590444 -44772.6 -44766.824 17125.207 -61103.34 489.28008 4.7214008 0 1566.4744 20.590604 -268.28963 -1566.5961 252.97781 -654.93836 18835.335 -8719.2653 0 6370.4505
20 8.0678523 -44775.983 -44766.772 24620.114 -61088.791 490.42348 4.7193467 0 1565.5541 20.415031 -260.38513 -1574.1001 253.39804 -654.26837 18815.312 -8703.5228 0 6355.2629
25 10.975532 -44777.234 -44764.704 34381.065 -61068.889 490.53151 4.7164093 0 1566.5715 20.169755 -251.23111 -1582.8552 253.88696 -653.46042 18790.855 -8683.898 0 6336.3682
Loop time of 3.74388 on 4 procs for 25 steps with 384 atoms
Performance: 0.041 ns/day, 583.013 hours/ns, 7.623 timesteps/s
99.8% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 0.036 ns/day, 665.579 hours/ns, 6.678 timesteps/s
98.7% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 3.0329 | 3.1456 | 3.2612 | 5.2 | 95.92
Neigh | 0.011087 | 0.011261 | 0.011608 | 0.2 | 0.34
Comm | 0.0057111 | 0.12121 | 0.23398 | 26.2 | 3.70
Output | 0.00039172 | 0.0005855 | 0.00080633 | 0.6 | 0.02
Modify | 0.00035787 | 0.00059456 | 0.00082469 | 0.7 | 0.02
Other | | 0.0002265 | | | 0.01
Pair | 3.478 | 3.6025 | 3.7215 | 4.8 | 96.22
Neigh | 0.012731 | 0.01299 | 0.013174 | 0.2 | 0.35
Comm | 0.0073411 | 0.12653 | 0.25119 | 25.4 | 3.38
Output | 0.00050354 | 0.00081849 | 0.0011628 | 0.0 | 0.02
Modify | 0.00049281 | 0.00082356 | 0.001157 | 0.0 | 0.02
Other | | 0.0002663 | | | 0.01
Nlocal: 96 ave 96 max 96 min
Histogram: 4 0 0 0 0 0 0 0 0 0

@@ -0,0 +1,115 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for RDX system
# this run is equivalent to reax/in.reax.rdx
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 1 by 1 MPI processor grid
reading atoms ...
21 atoms
pair_style reax/c control.reax_c.rdx
pair_coeff * * ffield.reax C H O N
Reading potential file ffield.reax with DATE: 2010-02-19
compute reax all pair reax/c
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1.0e-6 reax/c
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all atom 10 dump.reaxc.rdx
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25, bins = 3 3 3
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 15.28 | 15.28 | 15.28 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1884.3081 -1884.3081 27186.181 -2958.4712 79.527715 0.31082031 0 98.589783 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79937 0 168.88402
10 1288.6114 -1989.6644 -1912.8422 -19456.35 -2734.6769 -15.607219 0.20177961 0 54.629556 3.1252294 -77.7067 0 14.933901 -5.8108541 843.92074 -180.43322 0 107.75935
20 538.95849 -1942.7037 -1910.5731 -10725.658 -2803.7395 7.9078331 0.077926702 0 81.610043 0.22951937 -57.557104 0 30.331203 -10.178049 878.99015 -159.69092 0 89.315159
30 463.09542 -1933.5765 -1905.9685 -33255.507 -2749.8591 -8.0154628 0.027628767 0 81.627403 0.11972403 -50.262284 0 20.82032 -9.6327022 851.88722 -149.495 0 79.205731
40 885.49449 -1958.9126 -1906.1228 -4814.7123 -2795.644 9.1506221 0.1374749 0 70.948046 0.24360579 -57.8627 0 19.076515 -11.141211 873.73892 -159.9939 0 92.434059
50 861.1646 -1954.4599 -1903.1206 -1896.7387 -2784.8446 3.8269113 0.1579328 0 79.851775 3.3492107 -78.066127 0 32.628975 -7.9565255 872.81826 -190.98565 0 114.75994
60 1167.785 -1971.8432 -1902.2243 -3482.6975 -2705.8638 -17.121582 0.22749067 0 44.507705 7.856069 -74.788959 0 16.256519 -4.6046602 835.8308 -188.33691 0 114.19414
70 1439.9947 -1989.3024 -1903.4554 23845.067 -2890.7896 31.958874 0.26671735 0 85.758608 3.1803486 -71.002907 0 24.357106 -10.311315 905.86799 -175.38482 0 106.79659
80 502.40024 -1930.7547 -1900.8035 -20356.557 -2703.8096 -18.663105 0.11286226 0 99.803799 2.0329394 -76.171387 0 19.236609 -6.2786041 826.47358 -166.03157 0 92.539694
90 749.09267 -1946.9834 -1902.3254 17798.812 -2863.7586 42.068927 0.24338042 0 96.18195 0.96181754 -69.955528 0 24.61541 -11.58277 903.68895 -190.13838 0 120.69139
100 1109.7046 -1968.5875 -1902.4311 -4490.6736 -2755.8953 -7.1235173 0.21757663 0 61.806405 7.0825933 -75.645487 0 20.114745 -6.2371664 863.56285 -198.56939 0 122.09923
Loop time of 0.395195 on 1 procs for 100 steps with 21 atoms
Performance: 21.863 ns/day, 1.098 hours/ns, 253.039 timesteps/s
99.3% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.3722 | 0.3722 | 0.3722 | 0.0 | 94.18
Neigh | 0.0098455 | 0.0098455 | 0.0098455 | 0.0 | 2.49
Comm | 0.00047445 | 0.00047445 | 0.00047445 | 0.0 | 0.12
Output | 0.00034022 | 0.00034022 | 0.00034022 | 0.0 | 0.09
Modify | 0.012187 | 0.012187 | 0.012187 | 0.0 | 3.08
Other | | 0.0001521 | | | 0.04
Nlocal: 21 ave 21 max 21 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 546 ave 546 max 546 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 1096 ave 1096 max 1096 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1096
Ave neighs/atom = 52.1905
Neighbor list builds = 10
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:00

@@ -0,0 +1,115 @@
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for RDX system
# this run is equivalent to reax/in.reax.rdx
units real
atom_style charge
read_data data.rdx
orthogonal box = (35 35 35) to (48 48 48)
1 by 2 by 2 MPI processor grid
reading atoms ...
21 atoms
pair_style reax/c control.reax_c.rdx
pair_coeff * * ffield.reax C H O N
Reading potential file ffield.reax with DATE: 2010-02-19
compute reax all pair reax/c
variable eb equal c_reax[1]
variable ea equal c_reax[2]
variable elp equal c_reax[3]
variable emol equal c_reax[4]
variable ev equal c_reax[5]
variable epen equal c_reax[6]
variable ecoa equal c_reax[7]
variable ehb equal c_reax[8]
variable et equal c_reax[9]
variable eco equal c_reax[10]
variable ew equal c_reax[11]
variable ep equal c_reax[12]
variable efi equal c_reax[13]
variable eqeq equal c_reax[14]
neighbor 2.5 bin
neigh_modify every 10 delay 0 check no
fix 1 all nve
fix 2 all qeq/reax 1 0.0 10.0 1.0e-6 reax/c
thermo 10
thermo_style custom step temp epair etotal press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
timestep 1.0
#dump 1 all atom 10 dump.reaxc.rdx
#dump 2 all image 25 image.*.jpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 2 pad 3
#dump 3 all movie 25 movie.mpg type type # axes yes 0.8 0.02 view 60 -30
#dump_modify 3 pad 3
run 100
Neighbor list info ...
update every 10 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25, bins = 3 3 3
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 10.37 | 11.76 | 13.34 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -1884.3081 -1884.3081 27186.178 -2958.4712 79.527715 0.31082031 0 98.589783 25.846176 -0.18034154 0 16.709078 -9.1620736 938.43732 -244.79988 0 168.88453
10 1288.6115 -1989.6644 -1912.8422 -19456.354 -2734.6769 -15.60722 0.2017796 0 54.629558 3.1252286 -77.7067 0 14.933902 -5.8108544 843.92073 -180.43321 0 107.75934
20 538.95818 -1942.7037 -1910.5731 -10725.623 -2803.7394 7.9078307 0.077926702 0 81.61005 0.22951942 -57.557107 0 30.331206 -10.178049 878.9901 -159.68951 0 89.313749
30 463.09514 -1933.5765 -1905.9685 -33255.525 -2749.859 -8.0154737 0.027628797 0 81.627408 0.11972402 -50.262283 0 20.82031 -9.6327021 851.88715 -149.49499 0 79.205724
40 885.49412 -1958.9125 -1906.1227 -4814.6606 -2795.6439 9.150622 0.13747487 0 70.948029 0.24360517 -57.862679 0 19.076509 -11.141214 873.7389 -159.99392 0 92.434078
50 861.16393 -1954.46 -1903.1207 -1896.7323 -2784.8449 3.8270197 0.1579328 0 79.851743 3.3492115 -78.066132 0 32.628992 -7.9565379 872.81841 -190.98568 0 114.75996
60 1167.7846 -1971.8432 -1902.2243 -3482.8111 -2705.8633 -17.121657 0.2274907 0 44.507681 7.8560366 -74.788989 0 16.256493 -4.6046537 835.8305 -188.33687 0 114.1941
70 1439.9942 -1989.3023 -1903.4554 23845.444 -2890.7894 31.958784 0.26671721 0 85.758586 3.1803655 -71.002918 0 24.357158 -10.311304 905.86792 -175.38481 0 106.79657
80 502.3975 -1930.7546 -1900.8036 -20356.439 -2703.8105 -18.662812 0.11286123 0 99.80391 2.0329293 -76.171334 0 19.236803 -6.2786439 826.47397 -166.03141 0 92.539551
90 749.09048 -1946.9837 -1902.3258 17798.718 -2863.7582 42.068719 0.24338057 0 96.181773 0.96183581 -69.955529 0 24.615414 -11.582758 903.68862 -190.1384 0 120.69139
100 1109.6999 -1968.5875 -1902.4314 -4490.3728 -2755.8964 -7.1231468 0.21757685 0 61.806149 7.0826648 -75.645428 0 20.115002 -6.2371958 863.56343 -198.56957 0 122.09942
Loop time of 0.329552 on 4 procs for 100 steps with 21 atoms
Performance: 26.217 ns/day, 0.915 hours/ns, 303.443 timesteps/s
96.9% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.26372 | 0.26499 | 0.26754 | 0.3 | 80.41
Neigh | 0.0045478 | 0.0062494 | 0.0076699 | 1.5 | 1.90
Comm | 0.0041637 | 0.0064691 | 0.0080271 | 1.8 | 1.96
Output | 0.00054169 | 0.00056636 | 0.00060368 | 0.0 | 0.17
Modify | 0.049433 | 0.051134 | 0.05311 | 0.6 | 15.52
Other | | 0.000141 | | | 0.04
Nlocal: 5.25 ave 15 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Nghost: 355.5 ave 432 max 282 min
Histogram: 1 0 0 0 1 1 0 0 0 1
Neighs: 298.75 ave 822 max 0 min
Histogram: 1 0 2 0 0 0 0 0 0 1
Total # of neighbors = 1195
Ave neighs/atom = 56.9048
Neighbor list builds = 10
Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:00

View File

@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for TATB system
# this run is equivalent to reax/in.reax.tatb,
@ -56,34 +57,44 @@ fix 3 all reax/c/species 1 5 5 species.tatb
run 25
Neighbor list info ...
2 neighbor list requests
update every 5 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 5 4 3
Memory usage per processor = 155.82 Mbytes
binsize = 6.25, bins = 5 4 3
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 176.7 | 176.7 | 176.7 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -44760.998 -44760.998 7827.7879 -61120.591 486.4378 4.7236377 0 1574.1033 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6394 0 6391.0274
5 0.61603942 -44761.698 -44760.994 8934.6281 -61118.769 486.81263 4.7234094 0 1573.9241 20.768834 -278.24084 -1557.6713 252.64377 -655.74435 18859.379 -8738.193 0 6388.6691
10 2.3525551 -44763.227 -44760.541 12288.607 -61113.174 487.82738 4.7226863 0 1573.411 20.705939 -274.50358 -1560.7569 252.85309 -655.44063 18850.391 -8730.9688 0 6381.7066
15 4.9013326 -44766.36 -44760.764 17717.015 -61103.434 489.14721 4.7213644 0 1572.6349 20.593139 -268.56847 -1566.3829 252.95174 -654.96611 18835.777 -8719.237 0 6370.4033
20 7.829471 -44769.686 -44760.747 25205.558 -61089.006 490.21313 4.719302 0 1571.7022 20.420943 -260.85565 -1573.7378 253.3539 -654.31623 18816.07 -8703.5091 0 6355.2604
25 10.697926 -44772.904 -44760.691 34232.793 -61069.308 490.25886 4.7163736 0 1570.7397 20.181346 -251.91377 -1582.3261 253.82253 -653.53184 18791.975 -8684.3608 0 6336.8416
Loop time of 4.34725 on 1 procs for 25 steps with 384 atoms
0 0 -44760.998 -44760.998 7827.7874 -61120.591 486.4378 4.7236377 0 1574.1033 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6395 0 6391.0275
5 0.61603968 -44761.698 -44760.994 8934.6347 -61118.769 486.81263 4.7234094 0 1573.9241 20.768834 -278.24084 -1557.6713 252.64377 -655.74435 18859.379 -8738.1911 0 6388.6671
10 2.3525551 -44763.227 -44760.541 12288.583 -61113.174 487.82738 4.7226863 0 1573.411 20.705939 -274.50357 -1560.7569 252.85309 -655.44063 18850.391 -8730.9768 0 6381.7146
15 4.9013279 -44766.36 -44760.764 17717.01 -61103.434 489.14722 4.7213644 0 1572.6349 20.593139 -268.56847 -1566.3829 252.95174 -654.96611 18835.777 -8719.2375 0 6370.4038
20 7.8294645 -44769.686 -44760.747 25205.624 -61089.006 490.21314 4.719302 0 1571.7022 20.420943 -260.85564 -1573.7378 253.3539 -654.31623 18816.07 -8703.4889 0 6355.2402
25 10.697904 -44772.904 -44760.691 34232.965 -61069.308 490.25888 4.7163736 0 1570.7397 20.181346 -251.91377 -1582.3261 253.82253 -653.53184 18791.975 -8684.3125 0 6336.7934
Loop time of 4.72562 on 1 procs for 25 steps with 384 atoms
Performance: 0.031 ns/day, 772.845 hours/ns, 5.751 timesteps/s
99.8% CPU use with 1 MPI tasks x no OpenMP threads
Performance: 0.029 ns/day, 840.110 hours/ns, 5.290 timesteps/s
99.4% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 3.5264 | 3.5264 | 3.5264 | 0.0 | 81.12
Neigh | 0.40335 | 0.40335 | 0.40335 | 0.0 | 9.28
Comm | 0.0021031 | 0.0021031 | 0.0021031 | 0.0 | 0.05
Output | 0.00019765 | 0.00019765 | 0.00019765 | 0.0 | 0.00
Modify | 0.41479 | 0.41479 | 0.41479 | 0.0 | 9.54
Other | | 0.0004084 | | | 0.01
Pair | 3.775 | 3.775 | 3.775 | 0.0 | 79.88
Neigh | 0.47047 | 0.47047 | 0.47047 | 0.0 | 9.96
Comm | 0.0025151 | 0.0025151 | 0.0025151 | 0.0 | 0.05
Output | 0.0003159 | 0.0003159 | 0.0003159 | 0.0 | 0.01
Modify | 0.47676 | 0.47676 | 0.47676 | 0.0 | 10.09
Other | | 0.0005293 | | | 0.01
Nlocal: 384 ave 384 max 384 min
Histogram: 1 0 0 0 0 0 0 0 0 0
@ -99,4 +110,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:04
Total wall time: 0:00:05

View File

@ -1,4 +1,5 @@
LAMMPS (5 Oct 2016)
LAMMPS (8 Mar 2018)
using 1 OpenMP thread(s) per MPI task
# ReaxFF potential for TATB system
# this run is equivalent to reax/in.reax.tatb,
@ -56,34 +57,44 @@ fix 3 all reax/c/species 1 5 5 species.tatb
run 25
Neighbor list info ...
2 neighbor list requests
update every 5 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 12.5
ghost atom cutoff = 12.5
binsize = 6.25 -> bins = 5 4 3
Memory usage per processor = 105.386 Mbytes
binsize = 6.25, bins = 5 4 3
2 neighbor lists, perpetual/occasional/extra = 2 0 0
(1) pair reax/c, perpetual
attributes: half, newton off, ghost
pair build: half/bin/newtoff/ghost
stencil: half/ghost/bin/3d/newtoff
bin: standard
(2) fix qeq/reax, perpetual, copy from (1)
attributes: half, newton off, ghost
pair build: copy
stencil: none
bin: none
Per MPI rank memory allocation (min/avg/max) = 118 | 118 | 118 Mbytes
Step Temp E_pair TotEng Press v_eb v_ea v_elp v_emol v_ev v_epen v_ecoa v_ehb v_et v_eco v_ew v_ep v_efi v_eqeq
0 0 -44760.998 -44760.998 7827.7867 -61120.591 486.4378 4.7236377 0 1574.1033 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6397 0 6391.0277
5 0.61603967 -44761.698 -44760.994 8934.6339 -61118.769 486.81263 4.7234094 0 1573.9241 20.768834 -278.24084 -1557.6713 252.64377 -655.74435 18859.379 -8738.1905 0 6388.6665
10 2.3525545 -44763.227 -44760.541 12288.586 -61113.174 487.82738 4.7226863 0 1573.411 20.705939 -274.50357 -1560.7569 252.85309 -655.44063 18850.391 -8730.9762 0 6381.714
15 4.9013281 -44766.36 -44760.764 17716.982 -61103.434 489.14722 4.7213644 0 1572.6349 20.593139 -268.56847 -1566.3829 252.95174 -654.96611 18835.777 -8719.2476 0 6370.4138
20 7.8294637 -44769.686 -44760.747 25205.512 -61089.006 490.21314 4.719302 0 1571.7022 20.420943 -260.85565 -1573.7378 253.3539 -654.31623 18816.07 -8703.518 0 6355.2692
25 10.697905 -44772.904 -44760.691 34232.815 -61069.308 490.25887 4.7163736 0 1570.7397 20.181346 -251.91377 -1582.3261 253.82253 -653.53184 18791.975 -8684.3481 0 6336.829
Loop time of 2.60733 on 4 procs for 25 steps with 384 atoms
0 0 -44760.998 -44760.998 7827.7866 -61120.591 486.4378 4.7236377 0 1574.1033 20.788929 -279.51642 -1556.4696 252.57147 -655.84699 18862.412 -8740.6398 0 6391.0277
5 0.61603968 -44761.698 -44760.994 8934.6335 -61118.769 486.81263 4.7234094 0 1573.9241 20.768834 -278.24084 -1557.6713 252.64377 -655.74435 18859.379 -8738.1906 0 6388.6666
10 2.3525544 -44763.227 -44760.541 12288.587 -61113.174 487.82738 4.7226863 0 1573.411 20.705939 -274.50357 -1560.7569 252.85309 -655.44063 18850.391 -8730.9764 0 6381.7141
15 4.9013311 -44766.36 -44760.764 17716.955 -61103.434 489.14721 4.7213644 0 1572.6349 20.593139 -268.56847 -1566.3829 252.95174 -654.96611 18835.777 -8719.2558 0 6370.4221
20 7.8294715 -44769.686 -44760.747 25205.613 -61089.006 490.21314 4.7193021 0 1571.7022 20.420943 -260.85564 -1573.7378 253.3539 -654.31623 18816.07 -8703.4906 0 6355.2419
25 10.697924 -44772.904 -44760.691 34232.794 -61069.308 490.25886 4.7163736 0 1570.7397 20.181347 -251.91376 -1582.3261 253.82253 -653.53183 18791.975 -8684.3641 0 6336.8449
Loop time of 2.84068 on 4 procs for 25 steps with 384 atoms
Performance: 0.052 ns/day, 463.526 hours/ns, 9.588 timesteps/s
99.9% CPU use with 4 MPI tasks x no OpenMP threads
Performance: 0.048 ns/day, 505.009 hours/ns, 8.801 timesteps/s
98.4% CPU use with 4 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 2.1835 | 2.1843 | 2.1854 | 0.0 | 83.77
Neigh | 0.22091 | 0.22364 | 0.22821 | 0.6 | 8.58
Comm | 0.005677 | 0.0069622 | 0.0078082 | 1.0 | 0.27
Output | 0.00036621 | 0.0028675 | 0.0037034 | 2.7 | 0.11
Modify | 0.18736 | 0.18921 | 0.19102 | 0.4 | 7.26
Other | | 0.0003636 | | | 0.01
Pair | 2.3253 | 2.328 | 2.3305 | 0.2 | 81.95
Neigh | 0.2589 | 0.26458 | 0.26897 | 0.7 | 9.31
Comm | 0.0094428 | 0.012062 | 0.014872 | 2.3 | 0.42
Output | 0.00043392 | 0.0042209 | 0.0054941 | 3.4 | 0.15
Modify | 0.22563 | 0.23134 | 0.23579 | 0.8 | 8.14
Other | | 0.0005122 | | | 0.02
Nlocal: 96 ave 96 max 96 min
Histogram: 4 0 0 0 0 0 0 0 0 0
@ -99,4 +110,4 @@ Dangerous builds not checked
Please see the log.cite file for references relevant to this simulation
Total wall time: 0:00:02
Total wall time: 0:00:03

View File

@ -1,9 +1,9 @@
$(COLVARS_OBJ_DIR)colvaratoms.o: colvaratoms.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h \
colvarparse.h colvaratoms.h colvardeps.h
$(COLVARS_OBJ_DIR)colvarbias_abf.o: colvarbias_abf.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h colvar.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h colvar.h \
colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -16,9 +16,9 @@ $(COLVARS_OBJ_DIR)colvarbias_abf.o: colvarbias_abf.cpp colvarmodule.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarbias_abf.h colvarbias.h colvargrid.h colvar_UIestimator.h
$(COLVARS_OBJ_DIR)colvarbias_alb.o: colvarbias_alb.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvarbias_alb.h colvar.h colvarparse.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
colvars_version.h colvarbias.h colvar.h colvarvalue.h colvartypes.h \
colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
@ -27,9 +27,9 @@ $(COLVARS_OBJ_DIR)colvarbias_alb.o: colvarbias_alb.cpp colvarmodule.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarbias.h
colvarbias_alb.h
$(COLVARS_OBJ_DIR)colvarbias.o: colvarbias.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h colvarbias.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h colvarbias.h \
colvar.h colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -42,8 +42,8 @@ $(COLVARS_OBJ_DIR)colvarbias.o: colvarbias.cpp colvarmodule.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvargrid.h
$(COLVARS_OBJ_DIR)colvarbias_histogram.o: colvarbias_histogram.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvar.h colvarparse.h colvardeps.h \
colvarmodule.h colvars_version.h colvarproxy.h colvarvalue.h \
colvartypes.h colvar.h colvarparse.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
@ -54,9 +54,9 @@ $(COLVARS_OBJ_DIR)colvarbias_histogram.o: colvarbias_histogram.cpp \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarbias_histogram.h colvarbias.h colvargrid.h
$(COLVARS_OBJ_DIR)colvarbias_meta.o: colvarbias_meta.cpp colvar.h \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h colvardeps.h lepton/include/Lepton.h \
$(COLVARS_OBJ_DIR)colvarbias_meta.o: colvarbias_meta.cpp colvarmodule.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h colvar.h \
colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
@ -68,8 +68,8 @@ $(COLVARS_OBJ_DIR)colvarbias_meta.o: colvarbias_meta.cpp colvar.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarbias_meta.h colvarbias.h colvargrid.h
$(COLVARS_OBJ_DIR)colvarbias_restraint.o: colvarbias_restraint.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarbias_restraint.h colvarbias.h colvar.h colvarparse.h \
colvarmodule.h colvars_version.h colvarproxy.h colvarvalue.h \
colvartypes.h colvarbias_restraint.h colvarbias.h colvar.h colvarparse.h \
colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -81,9 +81,9 @@ $(COLVARS_OBJ_DIR)colvarbias_restraint.o: colvarbias_restraint.cpp \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h
$(COLVARS_OBJ_DIR)colvarcomp_angles.o: colvarcomp_angles.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvar.h colvarparse.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
colvarmodule.h colvars_version.h colvar.h colvarvalue.h colvartypes.h \
colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
@ -92,10 +92,10 @@ $(COLVARS_OBJ_DIR)colvarcomp_angles.o: colvarcomp_angles.cpp \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h
colvarcomp.h colvaratoms.h colvarproxy.h
$(COLVARS_OBJ_DIR)colvarcomp_coordnums.o: colvarcomp_coordnums.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h colvaratoms.h colvardeps.h colvar.h \
colvarmodule.h colvars_version.h colvarparse.h colvarvalue.h \
colvartypes.h colvaratoms.h colvarproxy.h colvardeps.h colvar.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
@ -107,59 +107,7 @@ $(COLVARS_OBJ_DIR)colvarcomp_coordnums.o: colvarcomp_coordnums.cpp \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h
$(COLVARS_OBJ_DIR)colvarcomp.o: colvarcomp.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h colvar.h \
colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h
$(COLVARS_OBJ_DIR)colvarcomp_distances.o: colvarcomp_distances.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h colvar.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h
$(COLVARS_OBJ_DIR)colvarcomp_protein.o: colvarcomp_protein.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h colvar.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h
$(COLVARS_OBJ_DIR)colvarcomp_rotations.o: colvarcomp_rotations.cpp \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h colvar.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h
$(COLVARS_OBJ_DIR)colvar.o: colvar.cpp colvarmodule.h colvars_version.h \
colvartypes.h colvarproxy.h colvarvalue.h colvarparse.h colvar.h \
colvars_version.h colvarvalue.h colvartypes.h colvar.h colvarparse.h \
colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -170,12 +118,9 @@ $(COLVARS_OBJ_DIR)colvar.o: colvar.cpp colvarmodule.h colvars_version.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvarscript.h colvarbias.h
$(COLVARS_OBJ_DIR)colvardeps.o: colvardeps.cpp colvardeps.h \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarparse.h
$(COLVARS_OBJ_DIR)colvargrid.o: colvargrid.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvarcomp.h colvaratoms.h colvarproxy.h
$(COLVARS_OBJ_DIR)colvarcomp_distances.o: colvarcomp_distances.cpp \
colvarmodule.h colvars_version.h colvarvalue.h colvartypes.h \
colvarparse.h colvar.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -186,9 +131,9 @@ $(COLVARS_OBJ_DIR)colvargrid.o: colvargrid.cpp colvarmodule.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvargrid.h
$(COLVARS_OBJ_DIR)colvarmodule.o: colvarmodule.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvarcomp.h colvaratoms.h colvarproxy.h
$(COLVARS_OBJ_DIR)colvarcomp_protein.o: colvarcomp_protein.cpp \
colvarmodule.h colvars_version.h colvarvalue.h colvartypes.h \
colvarparse.h colvar.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -199,14 +144,67 @@ $(COLVARS_OBJ_DIR)colvarmodule.o: colvarmodule.cpp colvarmodule.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvarproxy.h
$(COLVARS_OBJ_DIR)colvarcomp_rotations.o: colvarcomp_rotations.cpp \
colvarmodule.h colvars_version.h colvarvalue.h colvartypes.h \
colvarparse.h colvar.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvarproxy.h
$(COLVARS_OBJ_DIR)colvar.o: colvar.cpp colvarmodule.h colvars_version.h \
colvarvalue.h colvartypes.h colvarparse.h colvar.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvarproxy.h colvarscript.h colvarbias.h
$(COLVARS_OBJ_DIR)colvardeps.o: colvardeps.cpp colvarmodule.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h colvardeps.h \
colvarparse.h
$(COLVARS_OBJ_DIR)colvargrid.o: colvargrid.cpp colvarmodule.h \
colvars_version.h colvarvalue.h colvartypes.h colvarparse.h colvar.h \
colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarcomp.h colvaratoms.h colvarproxy.h colvargrid.h
$(COLVARS_OBJ_DIR)colvarmodule.o: colvarmodule.cpp colvarmodule.h \
colvars_version.h colvarparse.h colvarvalue.h colvartypes.h \
colvarproxy.h colvar.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/ExpressionProgram.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarbias.h colvarbias_abf.h colvargrid.h colvar_UIestimator.h \
colvarbias_alb.h colvarbias_histogram.h colvarbias_meta.h \
colvarbias_restraint.h colvarscript.h colvaratoms.h colvarcomp.h
$(COLVARS_OBJ_DIR)colvarparse.o: colvarparse.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvarparse.h
colvars_version.h colvarvalue.h colvartypes.h colvarparse.h
$(COLVARS_OBJ_DIR)colvarproxy.o: colvarproxy.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvars_version.h colvarproxy.h colvarvalue.h colvartypes.h \
colvarscript.h colvarbias.h colvar.h colvarparse.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
@ -219,9 +217,9 @@ $(COLVARS_OBJ_DIR)colvarproxy.o: colvarproxy.cpp colvarmodule.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvaratoms.h
$(COLVARS_OBJ_DIR)colvarscript.o: colvarscript.cpp colvarscript.h \
colvarmodule.h colvars_version.h colvartypes.h colvarproxy.h \
colvarvalue.h colvarbias.h colvar.h colvarparse.h colvardeps.h \
lepton/include/Lepton.h lepton/include/lepton/CompiledExpression.h \
colvarmodule.h colvars_version.h colvarvalue.h colvartypes.h \
colvarbias.h colvar.h colvarparse.h colvardeps.h lepton/include/Lepton.h \
lepton/include/lepton/CompiledExpression.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/windowsIncludes.h \
lepton/include/lepton/CustomFunction.h \
@ -229,9 +227,9 @@ $(COLVARS_OBJ_DIR)colvarscript.o: colvarscript.cpp colvarscript.h \
lepton/include/lepton/ExpressionTreeNode.h \
lepton/include/lepton/Operation.h lepton/include/lepton/CustomFunction.h \
lepton/include/lepton/Exception.h \
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h
lepton/include/lepton/ParsedExpression.h lepton/include/lepton/Parser.h \
colvarproxy.h
$(COLVARS_OBJ_DIR)colvartypes.o: colvartypes.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h \
colvarparse.h
colvars_version.h colvartypes.h colvarparse.h colvarvalue.h
$(COLVARS_OBJ_DIR)colvarvalue.o: colvarvalue.cpp colvarmodule.h \
colvars_version.h colvartypes.h colvarproxy.h colvarvalue.h
colvars_version.h colvarvalue.h colvartypes.h
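
The block above is a machine-generated dependency list; the patch reorders it because the header include order changed. Files in this format are typically regenerated with the compiler's preprocessor. The exact colvars update script is not shown in this patch, so the invocation below is an assumption, sketched in shell:

# Hypothetical regeneration command (not from this patch): GCC's -MM mode
# emits "object: headers" rules in exactly this format, omitting system
# headers; -MT sets the printed target name.
for src in *.cpp; do
  g++ -MM -MT "\$(COLVARS_OBJ_DIR)${src%.cpp}.o" -Ilepton/include "$src"
done > Makefile.deps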

View File

@ -195,7 +195,7 @@ int colvar::init(std::string const &conf)
// - it is homogeneous
// - all cvcs are periodic
// - all cvcs have the same period
if (cvcs[0]->b_periodic) { // TODO make this a CVC feature
if (is_enabled(f_cv_homogeneous) && cvcs[0]->b_periodic) { // TODO make this a CVC feature
bool b_periodic = true;
period = cvcs[0]->period;
for (i = 1; i < cvcs.size(); i++) {

View File

@ -11,8 +11,6 @@
#define COLVAR_H
#include <iostream>
#include <iomanip>
#include <cmath>
#include "colvarmodule.h"
#include "colvarvalue.h"
@ -60,10 +58,13 @@ public:
/// \brief Current actual value (not extended DOF)
colvarvalue const & actual_value() const;
/// \brief Current running average (if calculated as set by analysis flag)
colvarvalue const & run_ave() const;
/// \brief Force constant of the spring
cvm::real const & force_constant() const;
/// \brief Current velocity (previously set by calc() or by read_traj())
colvarvalue const & velocity() const;
@ -516,7 +517,7 @@ public:
// collective variable component base class
class cvc;
// currently available collective variable components
// list of available collective variable components
// scalar colvar components
class distance;
@ -611,12 +612,15 @@ inline colvarvalue const & colvar::value() const
return x_reported;
}
inline colvarvalue const & colvar::actual_value() const
{
return x;
}
inline colvarvalue const & colvar::run_ave() const
{
return runave;
}
inline colvarvalue const & colvar::velocity() const
{

View File

@ -45,7 +45,7 @@ namespace UIestimator {
this->width = width;
this->dimension = lowerboundary.size();
this->y_size = y_size; // keep in mind the internal (sparse) matrix is stored in diagonal form
this->y_total_size = int(pow(double(y_size), dimension) + EPSILON);
this->y_total_size = int(std::pow(double(y_size), double(dimension)) + EPSILON);
// the range of the matrix is [lowerboundary, upperboundary]
x_total_size = 1;
@ -121,7 +121,7 @@ namespace UIestimator {
int index = 0;
for (i = 0; i < dimension; i++) {
if (i + 1 < dimension)
index += temp[i] * int(pow(double(y_size), dimension - i - 1) + EPSILON);
index += temp[i] * int(std::pow(double(y_size), double(dimension - i - 1)) + EPSILON);
else
index += temp[i];
}
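
Both hunks make the same change: an unqualified pow() with an integer exponent is replaced by std::pow() on two doubles. With <cmath>, pow(double, int) can resolve to different overloads across compilers and standard libraries, so the explicit cast pins the (double, double) overload; the + EPSILON guard then rounds the result safely back to int. A hedged illustration of the pattern (names invented; the guard value is an assumption, not taken from the header):

#include <cmath>
// Portable integer power via std::pow: cast the exponent so overload
// resolution always picks pow(double, double), then nudge by EPSILON
// before truncating, in case the result lands just below the integer.
static const double EPSILON = 1.0e-8; // assumption for illustration
int int_power(int base, int exponent) {
    return int(std::pow(double(base), double(exponent)) + EPSILON);
}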

View File

@ -8,9 +8,11 @@
// Colvars repository at GitHub.
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvarparse.h"
#include "colvaratoms.h"
cvm::atom::atom()
{
index = -1;

View File

@ -11,6 +11,7 @@
#define COLVARATOMS_H
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvarparse.h"
#include "colvardeps.h"

View File

@ -8,6 +8,7 @@
// Colvars repository at GitHub.
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvarvalue.h"
#include "colvarbias.h"
#include "colvargrid.h"

View File

@ -8,6 +8,7 @@
// Colvars repository at GitHub.
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvar.h"
#include "colvarbias_abf.h"
@ -18,16 +19,18 @@ colvarbias_abf::colvarbias_abf(char const *key)
b_CZAR_estimator(false),
system_force(NULL),
gradients(NULL),
pmf(NULL),
samples(NULL),
z_gradients(NULL),
z_samples(NULL),
czar_gradients(NULL),
czar_pmf(NULL),
last_gradients(NULL),
last_samples(NULL)
last_samples(NULL),
pabf_freq(0)
{
}
int colvarbias_abf::init(std::string const &conf)
{
colvarbias::init(conf);
@ -91,7 +94,7 @@ int colvarbias_abf::init(std::string const &conf)
// ************* checking the associated colvars *******************
if (colvars.size() == 0) {
if (num_variables() == 0) {
cvm::error("Error: no collective variables specified for the ABF bias.\n");
return COLVARS_ERROR;
}
@ -102,7 +105,8 @@ int colvarbias_abf::init(std::string const &conf)
}
bool b_extended = false;
for (size_t i = 0; i < colvars.size(); i++) {
size_t i;
for (i = 0; i < num_variables(); i++) {
if (colvars[i]->value().type() != colvarvalue::type_scalar) {
cvm::error("Error: ABF bias can only use scalar-type variables.\n");
@ -132,10 +136,10 @@ int colvarbias_abf::init(std::string const &conf)
}
if (get_keyval(conf, "maxForce", max_force)) {
if (max_force.size() != colvars.size()) {
if (max_force.size() != num_variables()) {
cvm::error("Error: Number of parameters to maxForce does not match number of colvars.");
}
for (size_t i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
if (max_force[i] < 0.0) {
cvm::error("Error: maxForce should be non-negative.");
}
@ -145,9 +149,9 @@ int colvarbias_abf::init(std::string const &conf)
cap_force = false;
}
bin.assign(colvars.size(), 0);
force_bin.assign(colvars.size(), 0);
system_force = new cvm::real [colvars.size()];
bin.assign(num_variables(), 0);
force_bin.assign(num_variables(), 0);
system_force = new cvm::real [num_variables()];
// Construct empty grids based on the colvars
if (cvm::debug()) {
@ -159,14 +163,14 @@ int colvarbias_abf::init(std::string const &conf)
gradients->samples = samples;
samples->has_parent_data = true;
// Data for eABF z-based estimator
if (b_extended) {
// Data for eABF z-based estimator
if ( b_extended ) {
get_keyval(conf, "CZARestimator", b_CZAR_estimator, true);
// CZAR output files for stratified eABF
get_keyval(conf, "writeCZARwindowFile", b_czar_window_file, false,
colvarparse::parse_silent);
z_bin.assign(colvars.size(), 0);
z_bin.assign(num_variables(), 0);
z_samples = new colvar_grid_count(colvars);
z_samples->request_actual_value();
z_gradients = new colvar_grid_gradient(colvars);
@ -176,6 +180,27 @@ int colvarbias_abf::init(std::string const &conf)
czar_gradients = new colvar_grid_gradient(colvars);
}
// For now, we integrate on-the-fly iff the grid is < 3D
if ( num_variables() <= 3 ) {
pmf = new integrate_potential(colvars, gradients);
if ( b_CZAR_estimator ) {
czar_pmf = new integrate_potential(colvars, czar_gradients);
}
get_keyval(conf, "integrate", b_integrate, true); // Integrate for output
if ( num_variables() > 1 ) {
// Projected ABF
get_keyval(conf, "pABFintegrateFreq", pabf_freq, 0);
// Parameters for integrating initial (and final) gradient data
get_keyval(conf, "integrateInitSteps", integrate_initial_steps, 1e4);
get_keyval(conf, "integrateInitTol", integrate_initial_tol, 1e-6);
// for updating the integrated PMF on the fly
get_keyval(conf, "integrateSteps", integrate_steps, 100);
get_keyval(conf, "integrateTol", integrate_tol, 1e-4);
}
} else {
b_integrate = false;
}
// For shared ABF, we store a second set of grids.
// This used to be only if "shared" was defined,
// but now we allow calling share externally (e.g. from Tcl).
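
The keywords parsed above are the new user-facing surface of this feature. A hypothetical colvars configuration exercising them (keyword names are taken from the get_keyval calls in this hunk; the values and variable names are invented for illustration):

abf {
    colvars            d1 d2    # two scalar variables -> 2D grid, pABF-eligible
    integrate          on       # build the integrated PMF on the fly (<= 3 variables)
    pABFintegrateFreq  1000     # 0 (the default) leaves projected ABF off
    integrateInitSteps 10000    # CG iterations for startup / file-output integration
    integrateInitTol   1e-6
    integrateSteps     100      # cheaper settings for on-the-fly pABF updates
    integrateTol       1e-4
}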
@ -188,6 +213,8 @@ int colvarbias_abf::init(std::string const &conf)
// If custom grids are provided, read them
if ( input_prefix.size() > 0 ) {
read_gradients_samples();
// Update divergence to account for input data
pmf->set_div();
}
// if extendedLagrangian is on, then call UI estimator
@ -202,7 +229,7 @@ int colvarbias_abf::init(std::string const &conf)
bool UI_restart = (input_prefix.size() > 0);
for (size_t i = 0; i < colvars.size(); i++)
for (i = 0; i < num_variables(); i++)
{
UI_lowerboundary.push_back(colvars[i]->lower_boundary);
UI_upperboundary.push_back(colvars[i]->upper_boundary);
@ -238,6 +265,11 @@ colvarbias_abf::~colvarbias_abf()
gradients = NULL;
}
if (pmf) {
delete pmf;
pmf = NULL;
}
if (z_samples) {
delete z_samples;
z_samples = NULL;
@ -253,6 +285,11 @@ colvarbias_abf::~colvarbias_abf()
czar_gradients = NULL;
}
if (czar_pmf) {
delete czar_pmf;
czar_pmf = NULL;
}
// shared ABF
// We used to only do this if "shared" was defined,
// but now we can call shared externally
@ -278,44 +315,48 @@ colvarbias_abf::~colvarbias_abf()
int colvarbias_abf::update()
{
int iter;
if (cvm::debug()) cvm::log("Updating ABF bias " + this->name);
if (cvm::step_relative() == 0) {
size_t i;
for (i = 0; i < num_variables(); i++) {
bin[i] = samples->current_bin_scalar(i);
}
if (cvm::proxy->total_forces_same_step()) {
// e.g. in LAMMPS, total forces are current
force_bin = bin;
}
// At first timestep, do only:
// initialization stuff (file operations relying on n_abf_biases)
// compute current value of colvars
if (cvm::step_relative() > 0 || cvm::proxy->total_forces_same_step()) {
for (size_t i = 0; i < colvars.size(); i++) {
bin[i] = samples->current_bin_scalar(i);
}
if (update_bias) {
// if (b_adiabatic_reweighting) {
// // Update gradients non-locally based on conditional distribution of
// // fictitious variable TODO
//
// } else
if (samples->index_ok(force_bin)) {
// Only if requested and within bounds of the grid...
} else {
for (size_t i = 0; i < colvars.size(); i++) {
bin[i] = samples->current_bin_scalar(i);
}
if ( update_bias && samples->index_ok(force_bin) ) {
// Only if requested and within bounds of the grid...
for (size_t i = 0; i < colvars.size(); i++) {
// get total forces (lagging by 1 timestep) from colvars
// and subtract previous ABF force if necessary
update_system_force(i);
for (i = 0; i < num_variables(); i++) {
// get total forces (lagging by 1 timestep) from colvars
// and subtract previous ABF force if necessary
update_system_force(i);
}
gradients->acc_force(force_bin, system_force);
if ( b_integrate ) {
pmf->update_div_neighbors(force_bin);
}
}
if (cvm::proxy->total_forces_same_step()) {
// e.g. in LAMMPS, total forces are current
force_bin = bin;
}
gradients->acc_force(force_bin, system_force);
}
if ( z_gradients && update_bias ) {
for (size_t i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
z_bin[i] = z_samples->current_bin_scalar(i);
}
if ( z_samples->index_ok(z_bin) ) {
for (size_t i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
// If we are outside the range of xi, the force has not been obtained above
// the function is just an accessor, so cheap to call again anyway
update_system_force(i);
@ -323,6 +364,14 @@ int colvarbias_abf::update()
z_gradients->acc_force(z_bin, system_force);
}
}
if ( b_integrate ) {
if ( pabf_freq && cvm::step_relative() % pabf_freq == 0 ) {
cvm::real err;
iter = pmf->integrate(integrate_steps, integrate_tol, err);
pmf->set_zero_minimum(); // TODO: do this only when necessary
}
}
}
if (!cvm::proxy->total_forces_same_step()) {
@ -332,14 +381,14 @@ int colvarbias_abf::update()
}
// Reset biasing forces from previous timestep
for (size_t i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
colvar_forces[i].reset();
}
// Compute and apply the new bias, if applicable
if (is_enabled(f_cvb_apply_force) && samples->index_ok(bin)) {
size_t count = samples->value(bin);
cvm::real count = samples->value(bin);
cvm::real fact = 1.0;
// Factor that ensures smooth introduction of the force
@ -348,21 +397,34 @@ int colvarbias_abf::update()
(cvm::real(count - min_samples)) / (cvm::real(full_samples - min_samples));
}
const cvm::real * grad = &(gradients->value(bin));
std::vector<cvm::real> grad(num_variables());
if ( pabf_freq ) {
// In projected ABF, the force is the PMF gradient estimate
pmf->vector_gradient_finite_diff(bin, grad);
} else {
// Normal ABF
gradients->vector_value(bin, grad);
}
// if ( b_adiabatic_reweighting) {
// // Average of force according to conditional distribution of fictitious variable
// // need freshly integrated PMF, gradient TODO
// } else
if ( fact != 0.0 ) {
if ( (colvars.size() == 1) && colvars[0]->periodic_boundaries() ) {
if ( (num_variables() == 1) && colvars[0]->periodic_boundaries() ) {
// Enforce a zero-mean bias on periodic, 1D coordinates
// in other words: boundary condition is that the biasing potential is periodic
colvar_forces[0].real_value = fact * (grad[0] / cvm::real(count) - gradients->average());
// This is enforced naturally if using integrated PMF
colvar_forces[0].real_value = fact * (grad[0] - gradients->average());
} else {
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
// subtracting the mean force (opposite of the FE gradient) means adding the gradient
colvar_forces[i].real_value = fact * grad[i] / cvm::real(count);
colvar_forces[i].real_value = fact * grad[i];
}
}
if (cap_force) {
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
if ( colvar_forces[i].real_value * colvar_forces[i].real_value > max_force[i] * max_force[i] ) {
colvar_forces[i].real_value = (colvar_forces[i].real_value > 0 ? max_force[i] : -1.0 * max_force[i]);
}
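
The fact computed above ramps the applied bias in with sampling, so a poorly sampled bin never receives a full-strength force. Only the linear branch is visible in this hunk; a sketch of the complete ramp, with the out-of-range branches assumed:

// Bias ramp factor (sketch; branches outside the visible hunk are assumed):
// zero force below min_samples, full force above full_samples, linear between.
double bias_ramp(size_t count, size_t min_samples, size_t full_samples) {
    if (count >= full_samples) return 1.0;
    if (count <= min_samples)  return 0.0;
    return double(count - min_samples) / double(full_samples - min_samples);
}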
@ -407,9 +469,9 @@ int colvarbias_abf::update()
// update UI estimator every step
if (b_UI_estimator)
{
std::vector<double> x(colvars.size(),0);
std::vector<double> y(colvars.size(),0);
for (size_t i = 0; i < colvars.size(); i++)
std::vector<double> x(num_variables(),0);
std::vector<double> y(num_variables(),0);
for (size_t i = 0; i < num_variables(); i++)
{
x[i] = colvars[i]->actual_value();
y[i] = colvars[i]->value();
@ -509,26 +571,60 @@ void colvarbias_abf::write_gradients_samples(const std::string &prefix, bool app
cvm::proxy->output_stream(samples_out_name, mode);
if (!samples_os) {
cvm::error("Error opening ABF samples file " + samples_out_name + " for writing");
return;
}
samples->write_multicol(*samples_os);
cvm::proxy->close_output_stream(samples_out_name);
// In dimension higher than 2, dx is easier to handle and visualize
if (num_variables() > 2) {
std::string samples_dx_out_name = prefix + ".count.dx";
std::ostream *samples_dx_os = cvm::proxy->output_stream(samples_dx_out_name, mode);
if (!samples_dx_os) {
cvm::error("Error opening samples file " + samples_dx_out_name + " for writing");
return;
}
samples->write_opendx(*samples_dx_os);
*samples_dx_os << std::endl;
cvm::proxy->close_output_stream(samples_dx_out_name);
}
std::ostream *gradients_os =
cvm::proxy->output_stream(gradients_out_name, mode);
if (!gradients_os) {
cvm::error("Error opening ABF gradient file " + gradients_out_name + " for writing");
return;
}
gradients->write_multicol(*gradients_os);
cvm::proxy->close_output_stream(gradients_out_name);
if (colvars.size() == 1) {
// Do numerical integration and output a PMF
if (b_integrate) {
// Do numerical integration (to high precision) and output a PMF
cvm::real err;
pmf->integrate(integrate_initial_steps, integrate_initial_tol, err);
pmf->set_zero_minimum();
std::string pmf_out_name = prefix + ".pmf";
std::ostream *pmf_os = cvm::proxy->output_stream(pmf_out_name, mode);
if (!pmf_os) {
cvm::error("Error opening pmf file " + pmf_out_name + " for writing");
return;
}
gradients->write_1D_integral(*pmf_os);
pmf->write_multicol(*pmf_os);
// In dimension higher than 2, dx is easier to handle and visualize
if (num_variables() > 2) {
std::string pmf_dx_out_name = prefix + ".pmf.dx";
std::ostream *pmf_dx_os = cvm::proxy->output_stream(pmf_dx_out_name, mode);
if (!pmf_dx_os) {
cvm::error("Error opening pmf file " + pmf_dx_out_name + " for writing");
return;
}
pmf->write_opendx(*pmf_dx_os);
*pmf_dx_os << std::endl;
cvm::proxy->close_output_stream(pmf_dx_out_name);
}
*pmf_os << std::endl;
cvm::proxy->close_output_stream(pmf_out_name);
}
@ -542,6 +638,7 @@ void colvarbias_abf::write_gradients_samples(const std::string &prefix, bool app
cvm::proxy->output_stream(z_samples_out_name, mode);
if (!z_samples_os) {
cvm::error("Error opening eABF z-histogram file " + z_samples_out_name + " for writing");
return;
}
z_samples->write_multicol(*z_samples_os);
cvm::proxy->close_output_stream(z_samples_out_name);
@ -553,6 +650,7 @@ void colvarbias_abf::write_gradients_samples(const std::string &prefix, bool app
cvm::proxy->output_stream(z_gradients_out_name, mode);
if (!z_gradients_os) {
cvm::error("Error opening eABF z-gradient file " + z_gradients_out_name + " for writing");
return;
}
z_gradients->write_multicol(*z_gradients_os);
cvm::proxy->close_output_stream(z_gradients_out_name);
@ -563,8 +661,7 @@ void colvarbias_abf::write_gradients_samples(const std::string &prefix, bool app
czar_gradients->index_ok(ix); czar_gradients->incr(ix)) {
for (size_t n = 0; n < czar_gradients->multiplicity(); n++) {
czar_gradients->set_value(ix, z_gradients->value_output(ix, n)
- cvm::temperature() * cvm::boltzmann() * z_samples->log_gradient_finite_diff(ix, n),
n);
- cvm::temperature() * cvm::boltzmann() * z_samples->log_gradient_finite_diff(ix, n), n);
}
}
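
The loop above is the CZAR estimator itself; restated as a formula (a direct transcription of the set_value line, where \langle G \rangle_z is the z-averaged gradient grid, N_z the z-sample histogram, and the log-derivative is taken by finite differences):

\partial_\xi A_{\mathrm{CZAR}}(\xi) \;=\; \langle G \rangle_z(\xi) \;-\; k_B T \,\partial_\xi \ln N_z(\xi)

with k_B T supplied by cvm::temperature() * cvm::boltzmann().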
@ -574,17 +671,39 @@ void colvarbias_abf::write_gradients_samples(const std::string &prefix, bool app
cvm::proxy->output_stream(czar_gradients_out_name, mode);
if (!czar_gradients_os) {
cvm::error("Error opening CZAR gradient file " + czar_gradients_out_name + " for writing");
return;
}
czar_gradients->write_multicol(*czar_gradients_os);
cvm::proxy->close_output_stream(czar_gradients_out_name);
if (colvars.size() == 1) {
// Do numerical integration and output a PMF
if (b_integrate) {
// Do numerical integration (to high precision) and output a PMF
cvm::real err;
czar_pmf->set_div();
czar_pmf->integrate(integrate_initial_steps, integrate_initial_tol, err);
czar_pmf->set_zero_minimum();
std::string czar_pmf_out_name = prefix + ".czar.pmf";
std::ostream *czar_pmf_os =
cvm::proxy->output_stream(czar_pmf_out_name, mode);
if (!czar_pmf_os) cvm::error("Error opening CZAR pmf file " + czar_pmf_out_name + " for writing");
czar_gradients->write_1D_integral(*czar_pmf_os);
std::ostream *czar_pmf_os = cvm::proxy->output_stream(czar_pmf_out_name, mode);
if (!czar_pmf_os) {
cvm::error("Error opening CZAR pmf file " + czar_pmf_out_name + " for writing");
return;
}
czar_pmf->write_multicol(*czar_pmf_os);
// In dimension higher than 2, dx is easier to handle and visualize
if (num_variables() > 2) {
std::string czar_pmf_dx_out_name = prefix + ".czar.pmf.dx";
std::ostream *czar_pmf_dx_os = cvm::proxy->output_stream(czar_pmf_dx_out_name, mode);
if (!czar_pmf_dx_os) {
cvm::error("Error opening CZAR pmf file " + czar_pmf_dx_out_name + " for writing");
return;
}
czar_pmf->write_opendx(*czar_pmf_dx_os);
*czar_pmf_dx_os << std::endl;
cvm::proxy->close_output_stream(czar_pmf_dx_out_name);
}
*czar_pmf_os << std::endl;
cvm::proxy->close_output_stream(czar_pmf_out_name);
}
@ -708,6 +827,10 @@ std::istream & colvarbias_abf::read_state_data(std::istream& is)
if (! gradients->read_raw(is)) {
return is;
}
if (b_integrate) {
// Update divergence to account for restart data
pmf->set_div();
}
if (b_CZAR_estimator) {

View File

@ -40,28 +40,44 @@ private:
/// Base filename(s) for reading previous gradient data (replaces data from restart file)
std::vector<std::string> input_prefix;
bool update_bias;
bool hide_Jacobian;
size_t full_samples;
size_t min_samples;
bool update_bias;
bool hide_Jacobian;
bool b_integrate;
size_t full_samples;
size_t min_samples;
/// frequency for updating output files
int output_freq;
int output_freq;
/// Write combined files with a history of all output data?
bool b_history_files;
bool b_history_files;
/// Write CZAR output file for stratified eABF (.zgrad)
bool b_czar_window_file;
size_t history_freq;
bool b_czar_window_file;
size_t history_freq;
/// Umbrella Integration estimator of free energy from eABF
UIestimator::UIestimator eabf_UI;
// Run UI estimator?
bool b_UI_estimator;
// Run CZAR estimator?
bool b_CZAR_estimator;
/// Run UI estimator?
bool b_UI_estimator;
/// Run CZAR estimator?
bool b_CZAR_estimator;
/// Cap applied biasing force?
/// Frequency for updating pABF PMF (if zero, pABF is not used)
int pabf_freq;
/// Max number of CG iterations for integrating PMF at startup and for file output
int integrate_initial_steps;
/// Tolerance for integrating PMF at startup and for file output
cvm::real integrate_initial_tol;
/// Max number of CG iterations for integrating PMF at on-the-fly pABF updates
int integrate_steps;
/// Tolerance for integrating PMF at on-the-fly pABF updates
cvm::real integrate_tol;
/// Cap the biasing force to be applied?
bool cap_force;
std::vector<cvm::real> max_force;
// Frequency for updating 2D gradients
int integrate_freq;
// Internal data and methods
std::vector<int> bin, force_bin, z_bin;
@ -71,12 +87,16 @@ private:
colvar_grid_gradient *gradients;
/// n-dim grid of number of samples
colvar_grid_count *samples;
/// n-dim grid of pmf (dimension 1 to 3)
integrate_potential *pmf;
/// n-dim grid: average force on "real" coordinate for eABF z-based estimator
colvar_grid_gradient *z_gradients;
/// n-dim grid of number of samples on "real" coordinate for eABF z-based estimator
colvar_grid_count *z_samples;
/// n-dim grid containing CZAR estimator of "real" free energy gradients
colvar_grid_gradient *czar_gradients;
/// n-dim grid of CZAR pmf (dimension 1 to 3)
integrate_potential *czar_pmf;
inline int update_system_force(size_t i)
{
@ -96,9 +116,9 @@ private:
}
// shared ABF
bool shared_on;
size_t shared_freq;
int shared_last_step;
bool shared_on;
size_t shared_freq;
int shared_last_step;
// Share between replicas -- may be called independently of update
virtual int replica_share();
@ -114,12 +134,12 @@ private:
//// Give the count at a given bin index.
virtual int bin_count(int bin_index);
/// Write human-readable FE gradients and sample count
void write_gradients_samples(const std::string &prefix, bool append = false);
void write_last_gradients_samples(const std::string &prefix, bool append = false);
/// Write human-readable FE gradients and sample count, and DX file in dim > 2
void write_gradients_samples(const std::string &prefix, bool append = false);
void write_last_gradients_samples(const std::string &prefix, bool append = false);
/// Read human-readable FE gradients and sample count (if not using restart)
void read_gradients_samples();
void read_gradients_samples();
std::istream& read_state_data(std::istream&);
std::ostream& write_state_data(std::ostream&);

View File

@ -7,13 +7,11 @@
// If you wish to distribute your changes, please submit them to the
// Colvars repository at GitHub.
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cstdlib>
#include "colvarmodule.h"
#include "colvarbias_alb.h"
#include "colvarbias.h"
#include "colvarbias_alb.h"
#ifdef _MSC_VER
#if _MSC_VER <= 1700
@ -45,22 +43,22 @@ int colvarbias_alb::init(std::string const &conf)
size_t i;
// get the initial restraint centers
colvar_centers.resize(colvars.size());
colvar_centers.resize(num_variables());
means.resize(colvars.size());
ssd.resize(colvars.size()); //sum of squares of differences from mean
means.resize(num_variables());
ssd.resize(num_variables()); //sum of squares of differences from mean
//setup force vectors
max_coupling_range.resize(colvars.size());
max_coupling_rate.resize(colvars.size());
coupling_accum.resize(colvars.size());
set_coupling.resize(colvars.size());
current_coupling.resize(colvars.size());
coupling_rate.resize(colvars.size());
max_coupling_range.resize(num_variables());
max_coupling_rate.resize(num_variables());
coupling_accum.resize(num_variables());
set_coupling.resize(num_variables());
current_coupling.resize(num_variables());
coupling_rate.resize(num_variables());
enable(f_cvb_apply_force);
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
colvar_centers[i].type(colvars[i]->value());
//zero moments
means[i] = ssd[i] = 0;
@ -70,7 +68,7 @@ int colvarbias_alb::init(std::string const &conf)
}
if (get_keyval(conf, "centers", colvar_centers, colvar_centers)) {
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
colvar_centers[i].apply_constraints();
}
} else {
@ -78,7 +76,7 @@ int colvarbias_alb::init(std::string const &conf)
cvm::fatal_error("Error: must define the initial centers of adaptive linear bias .\n");
}
if (colvar_centers.size() != colvars.size())
if (colvar_centers.size() != num_variables())
cvm::fatal_error("Error: number of centers does not match "
"that of collective variables.\n");
@ -100,17 +98,17 @@ int colvarbias_alb::init(std::string const &conf)
//initial guess
if (!get_keyval(conf, "forceConstant", set_coupling, set_coupling))
for (i =0 ; i < colvars.size(); i++)
for (i =0 ; i < num_variables(); i++)
set_coupling[i] = 0.;
//how we're going to increase to that point
for (i = 0; i < colvars.size(); i++)
for (i = 0; i < num_variables(); i++)
coupling_rate[i] = (set_coupling[i] - current_coupling[i]) / update_freq;
if (!get_keyval(conf, "forceRange", max_coupling_range, max_coupling_range)) {
//set to default
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
if (cvm::temperature() > 0)
max_coupling_range[i] = 3 * cvm::temperature() * cvm::boltzmann();
else
@ -120,7 +118,7 @@ int colvarbias_alb::init(std::string const &conf)
if (!get_keyval(conf, "rateMax", max_coupling_rate, max_coupling_rate)) {
//set to default
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
max_coupling_rate[i] = max_coupling_range[i] / (10 * update_freq);
}
}
@ -151,7 +149,7 @@ int colvarbias_alb::update()
// Force and energy calculation
bool finished_equil_flag = 1;
cvm::real delta;
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
colvar_forces[i] = -1.0 * restraint_force(restraint_convert_k(current_coupling[i], colvars[i]->width),
colvars[i],
colvar_centers[i]);
@ -168,7 +166,9 @@ int colvarbias_alb::update()
} else {
//check if we've reached the setpoint
if (coupling_rate[i] == 0 || pow(current_coupling[i] - set_coupling[i],2) < pow(coupling_rate[i],2)) {
cvm::real const coupling_diff = current_coupling[i] - set_coupling[i];
if ((coupling_rate[i] == 0) ||
((coupling_diff*coupling_diff) < (coupling_rate[i]*coupling_rate[i]))) {
finished_equil_flag &= 1; //we continue equilibrating as long as we haven't reached all the set points
}
else {
@ -209,7 +209,7 @@ int colvarbias_alb::update()
cvm::real temp;
//reset means and sum of squares of differences
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
temp = 2. * (means[i] / (static_cast<cvm::real> (colvar_centers[i])) - 1) * ssd[i] / (update_calls - 1);
@ -222,7 +222,7 @@ int colvarbias_alb::update()
ssd[i] = 0;
//stochastic if we do that update or not
if (colvars.size() == 1 || rand() < RAND_MAX / ((int) colvars.size())) {
if (num_variables() == 1 || rand() < RAND_MAX / ((int) num_variables())) {
coupling_accum[i] += step_size * step_size;
current_coupling[i] = set_coupling[i];
set_coupling[i] += max_coupling_range[i] / sqrt(coupling_accum[i]) * step_size;
@ -284,37 +284,37 @@ std::string const colvarbias_alb::get_state_params() const
std::ostringstream os;
os << " setCoupling ";
size_t i;
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << set_coupling[i] << "\n";
}
os << " currentCoupling ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << current_coupling[i] << "\n";
}
os << " maxCouplingRange ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << max_coupling_range[i] << "\n";
}
os << " couplingRate ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << coupling_rate[i] << "\n";
}
os << " couplingAccum ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << coupling_accum[i] << "\n";
}
os << " mean ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << means[i] << "\n";
}
os << " ssd ";
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
os << std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << ssd[i] << "\n";
}
@ -350,7 +350,7 @@ std::ostream & colvarbias_alb::write_traj_label(std::ostream &os)
}
if (b_output_centers)
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
size_t const this_cv_width = (colvars[i]->value()).output_width(cvm::cv_width);
os << " x0_"
<< cvm::wrap_string(colvars[i]->name, this_cv_width-3);
@ -378,7 +378,7 @@ std::ostream & colvarbias_alb::write_traj(std::ostream &os)
if (b_output_centers)
for (size_t i = 0; i < colvars.size(); i++) {
for (size_t i = 0; i < num_variables(); i++) {
os << " "
<< std::setprecision(cvm::cv_prec) << std::setw(cvm::cv_width)
<< colvar_centers[i];

View File

@ -8,10 +8,10 @@
// Colvars repository at GitHub.
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvar.h"
#include "colvarbias_histogram.h"
/// Histogram "bias" constructor
colvarbias_histogram::colvarbias_histogram(char const *key)
: colvarbias(key),
@ -44,7 +44,7 @@ int colvarbias_histogram::init(std::string const &conf)
get_keyval(conf, "gatherVectorColvars", colvar_array, colvar_array);
if (colvar_array) {
for (i = 0; i < colvars.size(); i++) { // should be all vector
for (i = 0; i < num_variables(); i++) { // should be all vector
if (colvars[i]->value().type() != colvarvalue::type_vector) {
cvm::error("Error: used gatherVectorColvars with non-vector colvar.\n", INPUT_ERROR);
return INPUT_ERROR;
@ -63,7 +63,7 @@ int colvarbias_histogram::init(std::string const &conf)
}
}
} else {
for (i = 0; i < colvars.size(); i++) { // should be all scalar
for (i = 0; i < num_variables(); i++) { // should be all scalar
if (colvars[i]->value().type() != colvarvalue::type_scalar) {
cvm::error("Error: only scalar colvars are supported when gatherVectorColvars is off.\n", INPUT_ERROR);
return INPUT_ERROR;
@ -77,7 +77,7 @@ int colvarbias_histogram::init(std::string const &conf)
get_keyval(conf, "weights", weights, weights);
}
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
colvars[i]->enable(f_cv_grid);
}
@ -116,7 +116,7 @@ int colvarbias_histogram::update()
}
// assign a valid bin size
bin.assign(colvars.size(), 0);
bin.assign(num_variables(), 0);
if (out_name.size() == 0) {
// At the first timestep, we need to assign out_name since
@ -137,7 +137,7 @@ int colvarbias_histogram::update()
if (colvar_array_size == 0) {
// update indices for scalar values
size_t i;
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
bin[i] = grid->current_bin_scalar(i);
}
@ -148,7 +148,7 @@ int colvarbias_histogram::update()
// update indices for vector/array values
size_t iv, i;
for (iv = 0; iv < colvar_array_size; iv++) {
for (i = 0; i < colvars.size(); i++) {
for (i = 0; i < num_variables(); i++) {
bin[i] = grid->current_bin_scalar(i, iv);
}


@ -27,7 +27,8 @@
#define PATHSEP "/"
#endif
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvar.h"
#include "colvarbias_meta.h"


@ -7,7 +7,10 @@
// If you wish to distribute your changes, please submit them to the
// Colvars repository at GitHub.
#include <cmath>
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvarvalue.h"
#include "colvarbias_restraint.h"
@ -150,13 +153,14 @@ colvarbias_restraint_k::colvarbias_restraint_k(char const *key)
: colvarbias(key), colvarbias_ti(key), colvarbias_restraint(key)
{
force_k = -1.0;
check_positive_k = true;
}
int colvarbias_restraint_k::init(std::string const &conf)
{
get_keyval(conf, "forceConstant", force_k, (force_k > 0.0 ? force_k : 1.0));
if (force_k < 0.0) {
if (check_positive_k && (force_k < 0.0)) {
cvm::error("Error: undefined or invalid force constant.\n", INPUT_ERROR);
return INPUT_ERROR;
}
@ -177,6 +181,7 @@ colvarbias_restraint_moving::colvarbias_restraint_moving(char const *key)
target_nstages = 0;
target_nsteps = 0;
stage = 0;
acc_work = 0.0;
b_chg_centers = false;
b_chg_force_k = false;
}
@ -203,6 +208,14 @@ int colvarbias_restraint_moving::init(std::string const &conf)
cvm::error("Error: targetNumStages and lambdaSchedule are incompatible.\n", INPUT_ERROR);
return cvm::get_error();
}
get_keyval_feature(this, conf, "outputAccumulatedWork",
f_cvb_output_acc_work,
is_enabled(f_cvb_output_acc_work));
if (is_enabled(f_cvb_output_acc_work) && (target_nstages > 0)) {
return cvm::error("Error: outputAccumulatedWork and targetNumStages "
"are incompatible.\n", INPUT_ERROR);
}
}
return COLVARS_OK;
@ -246,8 +259,6 @@ colvarbias_restraint_centers_moving::colvarbias_restraint_centers_moving(char co
{
b_chg_centers = false;
b_output_centers = false;
b_output_acc_work = false;
acc_work = 0.0;
}
@ -288,9 +299,6 @@ int colvarbias_restraint_centers_moving::init(std::string const &conf)
0.5);
}
get_keyval(conf, "outputAccumulatedWork", b_output_acc_work,
b_output_acc_work); // TODO this conflicts with stages
} else {
target_centers.clear();
}
@ -382,12 +390,14 @@ int colvarbias_restraint_centers_moving::update()
int colvarbias_restraint_centers_moving::update_acc_work()
{
if (b_output_acc_work) {
if ((cvm::step_relative() > 0) &&
(cvm::step_absolute() <= target_nsteps)) {
for (size_t i = 0; i < num_variables(); i++) {
// project forces on the calculated increments at this step
acc_work += colvar_forces[i] * centers_incr[i];
if (b_chg_centers) {
if (is_enabled(f_cvb_output_acc_work)) {
if ((cvm::step_relative() > 0) &&
(cvm::step_absolute() <= target_nsteps)) {
for (size_t i = 0; i < num_variables(); i++) {
// project forces on the calculated increments at this step
acc_work += colvar_forces[i] * centers_incr[i];
}
}
}
}
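
The accumulated work updated here is W = Σ_steps f_i · Δx0_i: the bias force on each variable projected on that step's increment of the restraint center. A hedged one-variable sketch, with illustrative names that are not the colvars API:

// Sketch: accumulating restraint work W = sum_t f(t) * dx0(t) for one
// scalar variable whose center moves at constant speed.
#include <cstdio>

int main() {
  double k = 2.0;              // force constant
  double x = 0.0;              // colvar value, held fixed for simplicity
  double x0 = 0.0, dx0 = 0.01; // moving center and its per-step increment
  double acc_work = 0.0;
  for (int step = 0; step < 100; step++) {
    x0 += dx0;                       // move the restraint center
    double force = -k * (x - x0);    // force on the colvar: -dU/dx
    acc_work += force * dx0;         // project force on the center increment
  }
  std::printf("accumulated work: %g\n", acc_work);
  return 0;
}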
@ -410,7 +420,7 @@ std::string const colvarbias_restraint_centers_moving::get_state_params() const
}
os << "\n";
if (b_output_acc_work) {
if (is_enabled(f_cvb_output_acc_work)) {
os << "accumulatedWork "
<< std::setprecision(cvm::en_prec) << std::setw(cvm::en_width)
<< acc_work << "\n";
@ -429,7 +439,7 @@ int colvarbias_restraint_centers_moving::set_state_params(std::string const &con
// cvm::log ("Reading the updated restraint centers from the restart.\n");
if (!get_keyval(conf, "centers", colvar_centers))
cvm::error("Error: restraint centers are missing from the restart.\n");
if (b_output_acc_work) {
if (is_enabled(f_cvb_output_acc_work)) {
if (!get_keyval(conf, "accumulatedWork", acc_work))
cvm::error("Error: accumulatedWork is missing from the restart.\n");
}
@ -449,7 +459,7 @@ std::ostream & colvarbias_restraint_centers_moving::write_traj_label(std::ostrea
}
}
if (b_output_acc_work) {
if (b_chg_centers && is_enabled(f_cvb_output_acc_work)) {
os << " W_"
<< cvm::wrap_string(this->name, cvm::en_width-2);
}
@ -468,7 +478,7 @@ std::ostream & colvarbias_restraint_centers_moving::write_traj(std::ostream &os)
}
}
if (b_output_acc_work) {
if (b_chg_centers && is_enabled(f_cvb_output_acc_work)) {
os << " "
<< std::setprecision(cvm::en_prec) << std::setw(cvm::en_width)
<< acc_work;
@ -488,10 +498,11 @@ colvarbias_restraint_k_moving::colvarbias_restraint_k_moving(char const *key)
{
b_chg_force_k = false;
target_equil_steps = 0;
target_force_k = 0.0;
starting_force_k = 0.0;
target_force_k = -1.0;
starting_force_k = -1.0;
force_k_exp = 1.0;
restraint_FE = 0.0;
force_k_incr = 0.0;
}
@ -569,14 +580,13 @@ int colvarbias_restraint_k_moving::update()
if (target_equil_steps == 0 || cvm::step_absolute() % target_nsteps >= target_equil_steps) {
// Start averaging after equilibration period, if requested
// Square distance normalized by square colvar width
cvm::real dist_sq = 0.0;
// Derivative of energy with respect to force_k
cvm::real dU_dk = 0.0;
for (size_t i = 0; i < num_variables(); i++) {
dist_sq += d_restraint_potential_dk(i);
dU_dk += d_restraint_potential_dk(i);
}
restraint_FE += 0.5 * force_k_exp * std::pow(lambda, force_k_exp - 1.0)
* (target_force_k - starting_force_k) * dist_sq;
restraint_FE += force_k_exp * std::pow(lambda, force_k_exp - 1.0)
* (target_force_k - starting_force_k) * dU_dk;
}
// Finish current stage...
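
The fix above drops the extra factor of 0.5 and renames dist_sq to dU_dk: with k(λ) = k0 + (k1 − k0)·λ^α, thermodynamic integration needs dU/dλ = (dU/dk)·α·λ^(α−1)·(k1 − k0), and the ½ of the harmonic form presumably already lives in d_restraint_potential_dk(). A standalone sketch of the estimator under these assumptions:

// Sketch: TI term for a harmonic restraint whose force constant follows
// k(lambda) = k0 + (k1 - k0) * lambda^alpha. Illustrative names only.
#include <cmath>
#include <cstdio>

int main() {
  const double k0 = 0.0, k1 = 10.0, alpha = 1.0;
  const double dev = 0.3;                 // (x - x0) / width, held fixed here
  const int nsteps = 1000;
  double fe = 0.0;
  for (int s = 1; s <= nsteps; s++) {
    double lambda = double(s) / nsteps;
    double dU_dk = 0.5 * dev * dev;       // dU/dk for U = (k/2) * dev^2
    // dU/dlambda = dU/dk * dk/dlambda, integrated with dlambda = 1/nsteps
    fe += alpha * std::pow(lambda, alpha - 1.0) * (k1 - k0) * dU_dk / nsteps;
  }
  // Exact result for fixed dev: (k1 - k0)/2 * dev^2
  std::printf("TI estimate: %g  exact: %g\n", fe, 0.5 * (k1 - k0) * dev * dev);
  return 0;
}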
@ -607,10 +617,13 @@ int colvarbias_restraint_k_moving::update()
} else if (cvm::step_absolute() <= target_nsteps) {
// update force constant (slow growth)
lambda = cvm::real(cvm::step_absolute()) / cvm::real(target_nsteps);
cvm::real const force_k_old = force_k;
force_k = starting_force_k + (target_force_k - starting_force_k)
* std::pow(lambda, force_k_exp);
force_k_incr = force_k - force_k_old;
}
}
@ -618,6 +631,23 @@ int colvarbias_restraint_k_moving::update()
}
int colvarbias_restraint_k_moving::update_acc_work()
{
if (b_chg_force_k) {
if (is_enabled(f_cvb_output_acc_work)) {
if (cvm::step_relative() > 0) {
cvm::real dU_dk = 0.0;
for (size_t i = 0; i < num_variables(); i++) {
dU_dk += d_restraint_potential_dk(i);
}
acc_work += dU_dk * force_k_incr;
}
}
}
return COLVARS_OK;
}
std::string const colvarbias_restraint_k_moving::get_state_params() const
{
std::ostringstream os;
@ -626,6 +656,12 @@ std::string const colvarbias_restraint_k_moving::get_state_params() const
os << "forceConstant "
<< std::setprecision(cvm::en_prec)
<< std::setw(cvm::en_width) << force_k << "\n";
if (is_enabled(f_cvb_output_acc_work)) {
os << "accumulatedWork "
<< std::setprecision(cvm::en_prec) << std::setw(cvm::en_width)
<< acc_work << "\n";
}
}
return os.str();
}
@ -639,6 +675,10 @@ int colvarbias_restraint_k_moving::set_state_params(std::string const &conf)
// cvm::log ("Reading the updated force constant from the restart.\n");
if (!get_keyval(conf, "forceConstant", force_k, force_k))
cvm::error("Error: force constant is missing from the restart.\n");
if (is_enabled(f_cvb_output_acc_work)) {
if (!get_keyval(conf, "accumulatedWork", acc_work))
cvm::error("Error: accumulatedWork is missing from the restart.\n");
}
}
return COLVARS_OK;
@ -647,12 +687,21 @@ int colvarbias_restraint_k_moving::set_state_params(std::string const &conf)
std::ostream & colvarbias_restraint_k_moving::write_traj_label(std::ostream &os)
{
if (b_chg_force_k && is_enabled(f_cvb_output_acc_work)) {
os << " W_"
<< cvm::wrap_string(this->name, cvm::en_width-2);
}
return os;
}
std::ostream & colvarbias_restraint_k_moving::write_traj(std::ostream &os)
{
if (b_chg_force_k && is_enabled(f_cvb_output_acc_work)) {
os << " "
<< std::setprecision(cvm::en_prec) << std::setw(cvm::en_width)
<< acc_work;
}
return os;
}
@ -765,6 +814,7 @@ int colvarbias_restraint_harmonic::update()
// update accumulated work using the current forces
error_code |= colvarbias_restraint_centers_moving::update_acc_work();
error_code |= colvarbias_restraint_k_moving::update_acc_work();
return error_code;
}
@ -876,8 +926,8 @@ colvarbias_restraint_harmonic_walls::colvarbias_restraint_harmonic_walls(char co
colvarbias_restraint_moving(key),
colvarbias_restraint_k_moving(key)
{
lower_wall_k = 0.0;
upper_wall_k = 0.0;
lower_wall_k = -1.0;
upper_wall_k = -1.0;
}
@ -887,26 +937,6 @@ int colvarbias_restraint_harmonic_walls::init(std::string const &conf)
colvarbias_restraint_moving::init(conf);
colvarbias_restraint_k_moving::init(conf);
get_keyval(conf, "lowerWallConstant", lower_wall_k,
(lower_wall_k > 0.0) ? lower_wall_k : force_k);
get_keyval(conf, "upperWallConstant", upper_wall_k,
(upper_wall_k > 0.0) ? upper_wall_k : force_k);
if (lower_wall_k * upper_wall_k > 0.0) {
for (size_t i = 0; i < num_variables(); i++) {
if (variables(i)->width != 1.0)
cvm::log("The lower and upper wall force constants for colvar \""+
variables(i)->name+
"\" will be rescaled to "+
cvm::to_str(lower_wall_k /
(variables(i)->width * variables(i)->width))+
" and "+
cvm::to_str(upper_wall_k /
(variables(i)->width * variables(i)->width))+
" according to the specified width.\n");
}
}
enable(f_cvb_scalar_variables);
size_t i;
@ -942,16 +972,23 @@ int colvarbias_restraint_harmonic_walls::init(std::string const &conf)
}
if ((lower_walls.size() == 0) && (upper_walls.size() == 0)) {
cvm::error("Error: no walls provided.\n", INPUT_ERROR);
return INPUT_ERROR;
return cvm::error("Error: no walls provided.\n", INPUT_ERROR);
}
if (lower_walls.size() > 0) {
get_keyval(conf, "lowerWallConstant", lower_wall_k,
(lower_wall_k > 0.0) ? lower_wall_k : force_k);
}
if (upper_walls.size() > 0) {
get_keyval(conf, "upperWallConstant", upper_wall_k,
(upper_wall_k > 0.0) ? upper_wall_k : force_k);
}
if ((lower_walls.size() == 0) || (upper_walls.size() == 0)) {
for (i = 0; i < num_variables(); i++) {
if (variables(i)->is_enabled(f_cv_periodic)) {
cvm::error("Error: at least one variable is periodic, "
"both walls must be provided.\n", INPUT_ERROR);
return INPUT_ERROR;
return cvm::error("Error: at least one variable is periodic, "
"both walls must be provided.\n", INPUT_ERROR);
}
}
}
@ -972,19 +1009,49 @@ int colvarbias_restraint_harmonic_walls::init(std::string const &conf)
INPUT_ERROR);
return INPUT_ERROR;
}
force_k = lower_wall_k * upper_wall_k;
// transform the two constants to relative values
force_k = std::sqrt(lower_wall_k * upper_wall_k);
// transform the two constants to relative values using the geometric mean
// as reference, to preserve force_k if provided as a single parameter
// (allows changing both via force_k)
lower_wall_k /= force_k;
upper_wall_k /= force_k;
} else {
// If only one wall is defined, need to rescale as well
if (lower_walls.size() > 0) {
force_k = lower_wall_k;
lower_wall_k = 1.0;
}
if (upper_walls.size() > 0) {
force_k = upper_wall_k;
upper_wall_k = 1.0;
}
}
for (i = 0; i < num_variables(); i++) {
if (variables(i)->width != 1.0)
cvm::log("The force constant for colvar \""+variables(i)->name+
"\" will be rescaled to "+
cvm::to_str(force_k / (variables(i)->width * variables(i)->width))+
" according to the specified width.\n");
// Initialize starting value of the force constant (in case it's changing)
starting_force_k = force_k;
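
After this normalization the stored wall constants are relative to force_k = sqrt(lower_k · upper_k), so rescaling force_k rescales both walls while preserving their ratio; with a single wall, force_k takes that wall's constant and the relative value becomes 1. A small sketch of the arithmetic (not the colvars implementation):

// Sketch: normalizing two wall constants by their geometric mean so that a
// single force_k can rescale both.
#include <cmath>
#include <cstdio>

int main() {
  double lower_k = 4.0, upper_k = 9.0;
  double force_k = std::sqrt(lower_k * upper_k);  // 6.0
  lower_k /= force_k;                             // 2/3
  upper_k /= force_k;                             // 3/2
  // Effective constants are recovered as relative value * force_k:
  std::printf("lower: %g  upper: %g\n", lower_k * force_k, upper_k * force_k);
  return 0;
}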
if (lower_walls.size() > 0) {
for (i = 0; i < num_variables(); i++) {
if (variables(i)->width != 1.0)
cvm::log("The lower wall force constant for colvar \""+
variables(i)->name+
"\" will be rescaled to "+
cvm::to_str(lower_wall_k * force_k /
(variables(i)->width * variables(i)->width))+
" according to the specified width.\n");
}
}
if (upper_walls.size() > 0) {
for (i = 0; i < num_variables(); i++) {
if (variables(i)->width != 1.0)
cvm::log("The upper wall force constant for colvar \""+
variables(i)->name+
"\" will be rescaled to "+
cvm::to_str(upper_wall_k * force_k /
(variables(i)->width * variables(i)->width))+
" according to the specified width.\n");
}
}
return COLVARS_OK;
@ -1001,6 +1068,8 @@ int colvarbias_restraint_harmonic_walls::update()
error_code |= colvarbias_restraint::update();
error_code |= colvarbias_restraint_k_moving::update_acc_work();
return error_code;
}
@ -1134,6 +1203,7 @@ colvarbias_restraint_linear::colvarbias_restraint_linear(char const *key)
colvarbias_restraint_centers_moving(key),
colvarbias_restraint_k_moving(key)
{
check_positive_k = false;
}
@ -1177,6 +1247,7 @@ int colvarbias_restraint_linear::update()
// update accumulated work using the current forces
error_code |= colvarbias_restraint_centers_moving::update_acc_work();
error_code |= colvarbias_restraint_k_moving::update_acc_work();
return error_code;
}


@ -89,8 +89,12 @@ public:
virtual int change_configuration(std::string const &conf);
protected:
/// \brief Restraint force constant
cvm::real force_k;
/// \brief Whether the force constant should be positive
bool check_positive_k;
};
@ -129,6 +133,9 @@ protected:
/// \brief Number of steps required to reach the target force constant
/// or restraint centers
long target_nsteps;
/// \brief Accumulated work (computed when outputAccumulatedWork == true)
cvm::real acc_work;
};
@ -157,8 +164,7 @@ protected:
/// \brief Initial value of the restraint centers
std::vector<colvarvalue> initial_centers;
/// \brief Amplitude of the restraint centers' increment at each step
/// towards the new values (calculated from target_nsteps)
/// \brief Increment of the restraint centers at each step
std::vector<colvarvalue> centers_incr;
/// \brief Update the centers by interpolating between initial and target
@ -167,12 +173,6 @@ protected:
/// Whether to write the current restraint centers to the trajectory file
bool b_output_centers;
/// Whether to write the current accumulated work to the trajectory file
bool b_output_acc_work;
/// \brief Accumulated work
cvm::real acc_work;
/// Update the accumulated work
int update_acc_work();
};
@ -212,6 +212,12 @@ protected:
/// \brief Equilibration steps for restraint FE calculation through TI
cvm::real target_equil_steps;
/// \brief Increment of the force constant at each step
cvm::real force_k_incr;
/// Update the accumulated work
int update_acc_work();
};


@ -20,52 +20,48 @@
// simple_scalar_dist_functions (derived_class)
#include <fstream>
#include <cmath>
#include "colvarmodule.h"
#include "colvar.h"
#include "colvaratoms.h"
/// \brief Colvar component (base class); most implementations of
/// \link cvc \endlink utilize one or more \link
/// colvarmodule::atom \endlink or \link colvarmodule::atom_group
/// \endlink objects to access atoms.
/// \brief Colvar component (base class for collective variables)
///
/// A \link cvc \endlink object (or an object of a
/// cvc-derived class) specifies how to calculate a collective
/// variable, its gradients and other related physical quantities
/// which do not depend only on the numeric value (the \link colvar
/// \endlink class already serves this purpose).
/// cvc-derived class) implements the calculation of a collective
/// variable, its gradients and any other related physical quantities
/// that depend on microscopic degrees of freedom.
///
/// No restriction is set to what kind of calculation a \link
/// cvc \endlink object performs (usually calculate an
/// analytical function of atomic coordinates). The only constraint
/// is that the value calculated is implemented as a \link colvarvalue
/// \endlink object. This serves to provide a unique way to calculate
/// scalar and non-scalar collective variables, and specify if and how
/// they can be combined together by the parent \link colvar \endlink
/// object.
/// No restriction is set to what kind of calculation a \link cvc \endlink
/// object performs (usually an analytical function of atomic coordinates).
/// The only constraints are that: \par
///
/// - The value is calculated by the \link calc_value() \endlink
/// method, and is an object of \link colvarvalue \endlink class. This
/// provides a transparent way to treat scalar and non-scalar variables
/// alike, and allows an automatic selection of the applicable algorithms.
///
/// - The object provides an implementation of \link apply_force() \endlink to
/// apply forces to atoms. Typically, one or more \link cvm::atom_group
/// \endlink objects are used, but this is not a requirement as long as
/// the \link cvc \endlink object communicates with the simulation program.
///
/// <b> If you wish to implement a new collective variable component, you
/// should write your own class by inheriting directly from \link
/// cvc \endlink, or one of its derived classes (for instance,
/// \link distance \endlink is frequently used, because it provides
/// colvar::cvc \endlink, or one of its derived classes (for instance,
/// \link colvar::distance \endlink is frequently used, because it provides
/// useful data and function members for any colvar based on two
/// atom groups). The steps are: \par
/// 1. add the name of this class under colvar.h \par
/// 2. add a call to the parser in colvar.C, within the function colvar::colvar() \par
/// 3. declare the class in colvarcomp.h \par
/// 4. implement the class in one of the files colvarcomp_*.C
/// atom groups).</b>
///
/// The steps are: \par
/// 1. Declare the new class as a derivative of \link colvar::cvc \endlink
/// in the file \link colvarcomp.h \endlink
/// 2. Implement the new class in a file named colvarcomp_<something>.cpp
/// 3. Declare the name of the new class inside the \link colvar \endlink class
/// in \link colvar.h \endlink (see "list of available components")
/// 4. Add a call for the new class in colvar::init_components()
/// (file: colvar.cpp)
///
/// The cvm::atom and cvm::atom_group classes are available to
/// transparently communicate with the simulation program. However,
/// they are not strictly needed, as long as all the degrees of
/// freedom associated with the cvc are properly evolved by a simple
/// call to e.g. apply_force().
class colvar::cvc
: public colvarparse, public colvardeps
@ -155,7 +151,7 @@ public:
/// \brief Calculate the atomic gradients, to be reused later in
/// order to apply forces
virtual void calc_gradients() = 0;
virtual void calc_gradients() {}
/// \brief Calculate the atomic fit gradients
void calc_fit_gradients();


@ -581,6 +581,12 @@ colvar::distance_inv::distance_inv(std::string const &conf)
}
}
if (is_enabled(f_cvc_debug_gradient)) {
cvm::log("Warning: debugGradients will not give correct results "
"for distanceInv, because its value and gradients are computed "
"simultaneously.\n");
}
x.type(colvarvalue::type_scalar);
}
@ -601,11 +607,9 @@ void colvar::distance_inv::calc_value()
for (cvm::atom_iter ai2 = group2->begin(); ai2 != group2->end(); ai2++) {
cvm::rvector const dv = ai2->pos - ai1->pos;
cvm::real const d2 = dv.norm2();
cvm::real dinv = 1.0;
for (int ne = 0; ne < exponent/2; ne++)
dinv *= 1.0/d2;
cvm::real const dinv = cvm::integer_power(d2, -1*(exponent/2));
x.real_value += dinv;
cvm::rvector const dsumddv = -(cvm::real(exponent)) * dinv/d2 * dv;
cvm::rvector const dsumddv = -1.0*(exponent/2) * dinv/d2 * 2.0 * dv;
ai1->grad += -1.0 * dsumddv;
ai2->grad += dsumddv;
}
@ -615,11 +619,9 @@ void colvar::distance_inv::calc_value()
for (cvm::atom_iter ai2 = group2->begin(); ai2 != group2->end(); ai2++) {
cvm::rvector const dv = cvm::position_distance(ai1->pos, ai2->pos);
cvm::real const d2 = dv.norm2();
cvm::real dinv = 1.0;
for (int ne = 0; ne < exponent/2; ne++)
dinv *= 1.0/d2;
cvm::real const dinv = cvm::integer_power(d2, -1*(exponent/2));
x.real_value += dinv;
cvm::rvector const dsumddv = -(cvm::real(exponent)) * dinv/d2 * dv;
cvm::rvector const dsumddv = -1.0*(exponent/2) * dinv/d2 * 2.0 * dv;
ai1->grad += -1.0 * dsumddv;
ai2->grad += dsumddv;
}
@ -627,13 +629,11 @@ void colvar::distance_inv::calc_value()
}
x.real_value *= 1.0 / cvm::real(group1->size() * group2->size());
x.real_value = std::pow(x.real_value, -1.0/(cvm::real(exponent)));
}
x.real_value = std::pow(x.real_value, -1.0/cvm::real(exponent));
void colvar::distance_inv::calc_gradients()
{
cvm::real const dxdsum = (-1.0/(cvm::real(exponent))) * std::pow(x.real_value, exponent+1) / cvm::real(group1->size() * group2->size());
cvm::real const dxdsum = (-1.0/(cvm::real(exponent))) *
cvm::integer_power(x.real_value, exponent+1) /
cvm::real(group1->size() * group2->size());
for (cvm::atom_iter ai1 = group1->begin(); ai1 != group1->end(); ai1++) {
ai1->grad *= dxdsum;
}
@ -643,6 +643,11 @@ void colvar::distance_inv::calc_gradients()
}
void colvar::distance_inv::calc_gradients()
{
}
void colvar::distance_inv::apply_force(colvarvalue const &force)
{
if (!group1->noforce)


@ -8,10 +8,16 @@
// Colvars repository at GitHub.
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvardeps.h"
colvardeps::colvardeps()
: time_step_factor (1) {}
{
time_step_factor = 1;
}
colvardeps::~colvardeps() {
size_t i;
@ -416,6 +422,9 @@ void colvardeps::init_cvb_requires() {
init_feature(f_cvb_get_total_force, "obtain total force", f_type_dynamic);
f_req_children(f_cvb_get_total_force, f_cv_total_force);
init_feature(f_cvb_output_acc_work, "output accumulated work", f_type_user);
f_req_self(f_cvb_output_acc_work, f_cvb_apply_force);
init_feature(f_cvb_history_dependent, "history-dependent", f_type_static);
init_feature(f_cvb_time_dependent, "time-dependent", f_type_static);


@ -225,6 +225,8 @@ public:
f_cvb_apply_force,
/// \brief requires total forces
f_cvb_get_total_force,
/// \brief whether this bias should record the accumulated work
f_cvb_output_acc_work,
/// \brief depends on simulation history
f_cvb_history_dependent,
/// \brief depends on time


@ -14,6 +14,7 @@
#include "colvarcomp.h"
#include "colvargrid.h"
#include <ctime>
colvar_grid_count::colvar_grid_count()
: colvar_grid<size_t>()
@ -22,43 +23,37 @@ colvar_grid_count::colvar_grid_count()
}
colvar_grid_count::colvar_grid_count(std::vector<int> const &nx_i,
size_t const &def_count)
size_t const &def_count)
: colvar_grid<size_t>(nx_i, def_count, 1)
{}
colvar_grid_count::colvar_grid_count(std::vector<colvar *> &colvars,
size_t const &def_count)
: colvar_grid<size_t>(colvars, def_count, 1)
size_t const &def_count,
bool margin)
: colvar_grid<size_t>(colvars, def_count, 1, margin)
{}
colvar_grid_scalar::colvar_grid_scalar()
: colvar_grid<cvm::real>(), samples(NULL), grad(NULL)
: colvar_grid<cvm::real>(), samples(NULL)
{}
colvar_grid_scalar::colvar_grid_scalar(colvar_grid_scalar const &g)
: colvar_grid<cvm::real>(g), samples(NULL), grad(NULL)
: colvar_grid<cvm::real>(g), samples(NULL)
{
grad = new cvm::real[nd];
}
colvar_grid_scalar::colvar_grid_scalar(std::vector<int> const &nx_i)
: colvar_grid<cvm::real>(nx_i, 0.0, 1), samples(NULL), grad(NULL)
: colvar_grid<cvm::real>(nx_i, 0.0, 1), samples(NULL)
{
grad = new cvm::real[nd];
}
colvar_grid_scalar::colvar_grid_scalar(std::vector<colvar *> &colvars, bool margin)
: colvar_grid<cvm::real>(colvars, 0.0, 1, margin), samples(NULL), grad(NULL)
: colvar_grid<cvm::real>(colvars, 0.0, 1, margin), samples(NULL)
{
grad = new cvm::real[nd];
}
colvar_grid_scalar::~colvar_grid_scalar()
{
if (grad) {
delete [] grad;
grad = NULL;
}
}
cvm::real colvar_grid_scalar::maximum_value() const
@ -143,18 +138,18 @@ void colvar_grid_gradient::write_1D_integral(std::ostream &os)
os << "# xi A(xi)\n";
if ( cv.size() != 1 ) {
if (cv.size() != 1) {
cvm::error("Cannot write integral for multi-dimensional gradient grids.");
return;
}
integral = 0.0;
int_vals.push_back( 0.0 );
int_vals.push_back(0.0);
min = 0.0;
// correction for periodic colvars, so that the PMF is periodic
cvm::real corr;
if ( periodic[0] ) {
if (periodic[0]) {
corr = average();
} else {
corr = 0.0;
@ -171,7 +166,7 @@ void colvar_grid_gradient::write_1D_integral(std::ostream &os)
}
if ( integral < min ) min = integral;
int_vals.push_back( integral );
int_vals.push_back(integral);
}
bin = 0.0;
@ -192,3 +187,670 @@ void colvar_grid_gradient::write_1D_integral(std::ostream &os)
integrate_potential::integrate_potential(std::vector<colvar *> &colvars, colvar_grid_gradient * gradients)
: colvar_grid_scalar(colvars, true),
gradients(gradients)
{
// parent class colvar_grid_scalar is constructed with margin option set to true
// hence PMF grid is wider than gradient grid if non-PBC
if (nd > 1) {
divergence.resize(nt);
// Compute inverse of Laplacian diagonal for Jacobi preconditioning
// For now all code related to preconditioning is commented out
// until a method better than Jacobi is implemented
// cvm::log("Preparing inverse diagonal for preconditioning...");
// inv_lap_diag.resize(nt);
// std::vector<cvm::real> id(nt), lap_col(nt);
// for (int i = 0; i < nt; i++) {
// if (i % (nt / 100) == 0)
// cvm::log(cvm::to_str(i));
// id[i] = 1.;
// atimes(id, lap_col);
// id[i] = 0.;
// inv_lap_diag[i] = 1. / lap_col[i];
// }
// cvm::log("Done.");
}
}
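
The commented-out block shows the standard trick for extracting the diagonal of a matrix-free operator: apply it to each unit vector and read the i-th entry; it stays disabled because Jacobi preconditioning gave little benefit here. A sketch of the probing, with a hypothetical operator in place of atimes():

// Sketch: probing the diagonal of a matrix-free operator with unit vectors,
// as in the commented-out Jacobi preconditioner setup above.
#include <cstdio>
#include <vector>

static void apply_op(const std::vector<double> &x, std::vector<double> &y) {
  // Hypothetical operator: 1D Laplacian stencil with zero beyond the ends
  const int n = (int)x.size();
  for (int i = 0; i < n; i++) {
    double l = (i > 0) ? x[i-1] : 0.0, r = (i < n-1) ? x[i+1] : 0.0;
    y[i] = l + r - 2.0 * x[i];
  }
}

int main() {
  const int n = 4;
  std::vector<double> id(n, 0.0), col(n), diag(n);
  for (int i = 0; i < n; i++) {
    id[i] = 1.0;
    apply_op(id, col);   // column i of the operator
    id[i] = 0.0;
    diag[i] = col[i];    // its diagonal entry
  }
  for (int i = 0; i < n; i++) std::printf("diag[%d] = %g\n", i, diag[i]);
  return 0;
}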
int integrate_potential::integrate(const int itmax, const cvm::real &tol, cvm::real & err)
{
int iter = 0;
if (nd == 1) {
cvm::real sum = 0.0;
cvm::real corr;
if ( periodic[0] ) {
corr = gradients->average(); // Enforce PBC by subtracting average gradient
} else {
corr = 0.0;
}
std::vector<int> ix;
// Iterate over valid indices in gradient grid
for (ix = new_index(); gradients->index_ok(ix); incr(ix)) {
set_value(ix, sum);
sum += (gradients->value_output(ix) - corr) * widths[0];
}
if (index_ok(ix)) {
// This will happen if non-periodic: then PMF grid has one extra bin wrt gradient grid
set_value(ix, sum);
}
} else if (nd <= 3) {
nr_linbcg_sym(divergence, data, tol, itmax, iter, err);
cvm::log("Integrated in " + cvm::to_str(iter) + " steps, error: " + cvm::to_str(err));
} else {
cvm::error("Cannot integrate PMF in dimension > 3\n");
}
return iter;
}
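
In the 1D branch the PMF is a running sum A(x_j) = Σ_{i<j} (g_i − ḡ)·Δx, where the average gradient ḡ is subtracted only in the periodic case so that the integral closes on itself. A standalone sketch of that correction:

// Sketch: integrating a 1D periodic gradient into a potential, subtracting
// the mean gradient so the result wraps around. Illustrative only.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
  const double PI = 3.14159265358979323846;
  const int n = 100;
  const double width = 2.0 * PI / n;
  std::vector<double> grad(n), pmf(n, 0.0);
  double avg = 0.0;
  for (int i = 0; i < n; i++) {
    grad[i] = std::cos(i * width) + 0.1;   // the 0.1 breaks periodicity
    avg += grad[i] / n;
  }
  double sum = 0.0;
  for (int i = 0; i < n; i++) {
    pmf[i] = sum;
    sum += (grad[i] - avg) * width;        // corrected running integral
  }
  std::printf("closure error: %g\n", sum - pmf[0]);  // ~0 thanks to avg
  return 0;
}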
void integrate_potential::set_div()
{
if (nd == 1) return;
for (std::vector<int> ix = new_index(); index_ok(ix); incr(ix)) {
update_div_local(ix);
}
}
void integrate_potential::update_div_neighbors(const std::vector<int> &ix0)
{
std::vector<int> ix(ix0);
int i, j, k;
// If not periodic, expanded grid ensures that neighbors of ix0 are valid grid points
if (nd == 1) {
return;
} else if (nd == 2) {
update_div_local(ix);
ix[0]++; wrap(ix);
update_div_local(ix);
ix[1]++; wrap(ix);
update_div_local(ix);
ix[0]--; wrap(ix);
update_div_local(ix);
} else if (nd == 3) {
for (i = 0; i<2; i++) {
ix[1] = ix0[1];
for (j = 0; j<2; j++) {
ix[2] = ix0[2];
for (k = 0; k<2; k++) {
wrap(ix);
update_div_local(ix);
ix[2]++;
}
ix[1]++;
}
ix[0]++;
}
}
}
void integrate_potential::get_grad(cvm::real * g, std::vector<int> &ix)
{
size_t count, i;
bool edge = gradients->wrap_edge(ix); // Detect edge if non-PBC
if (gradients->samples) {
count = gradients->samples->value(ix);
} else {
count = 1;
}
if (!edge && count) {
cvm::real const *grad = &(gradients->value(ix));
cvm::real const fact = 1.0 / count;
for ( i = 0; i<nd; i++ ) {
g[i] = fact * grad[i];
}
} else {
for ( i = 0; i<nd; i++ ) {
g[i] = 0.0;
}
}
}
void integrate_potential::update_div_local(const std::vector<int> &ix0)
{
const int linear_index = address(ix0);
int i, j, k;
std::vector<int> ix = ix0;
const cvm::real * g;
if (nd == 2) {
// gradients at grid points surrounding the current scalar grid point
cvm::real g00[2], g01[2], g10[2], g11[2];
get_grad(g11, ix);
ix[0] = ix0[0] - 1;
get_grad(g01, ix);
ix[1] = ix0[1] - 1;
get_grad(g00, ix);
ix[0] = ix0[0];
get_grad(g10, ix);
divergence[linear_index] = ((g10[0]-g00[0] + g11[0]-g01[0]) / widths[0]
+ (g01[1]-g00[1] + g11[1]-g10[1]) / widths[1]) * 0.5;
} else if (nd == 3) {
cvm::real gc[24]; // stores 3d gradients in 8 contiguous bins
int index = 0;
ix[0] = ix0[0] - 1;
for (i = 0; i<2; i++) {
ix[1] = ix0[1] - 1;
for (j = 0; j<2; j++) {
ix[2] = ix0[2] - 1;
for (k = 0; k<2; k++) {
get_grad(gc + index, ix);
index += 3;
ix[2]++;
}
ix[1]++;
}
ix[0]++;
}
divergence[linear_index] =
((gc[3*4]-gc[0] + gc[3*5]-gc[3*1] + gc[3*6]-gc[3*2] + gc[3*7]-gc[3*3])
/ widths[0]
+ (gc[3*2+1]-gc[0+1] + gc[3*3+1]-gc[3*1+1] + gc[3*6+1]-gc[3*4+1] + gc[3*7+1]-gc[3*5+1])
/ widths[1]
+ (gc[3*1+2]-gc[0+2] + gc[3*3+2]-gc[3*2+2] + gc[3*5+2]-gc[3*4+2] + gc[3*7+2]-gc[3*6+2])
/ widths[2]) * 0.25;
}
}
/// Multiplication by sparse matrix representing Laplacian
/// NOTE: Laplacian must be symmetric for solving with CG
void integrate_potential::atimes(const std::vector<cvm::real> &A, std::vector<cvm::real> &LA)
{
if (nd == 2) {
// DIMENSION 2
size_t index, index2;
int i, j;
cvm::real fact;
const cvm::real ffx = 1.0 / (widths[0] * widths[0]);
const cvm::real ffy = 1.0 / (widths[1] * widths[1]);
const int h = nx[1];
const int w = nx[0];
// offsets for 4 reference points of the Laplacian stencil
int xm = -h;
int xp = h;
int ym = -1;
int yp = 1;
// NOTE on performance: this version is slightly sub-optimal because
// it contains two double loops on the core of the array (for x and y terms)
// The slightly faster version is in commit 0254cb5a2958cb2e135f268371c4b45fad34866b
// yet it is much uglier, and probably horrible to extend to dimension 3
// All terms in the matrix are assigned (=) during the x loops, then updated (+=)
// with the y (and z) contributions
// All x components except on x edges
index = h; // Skip first column
// Halve the term on y edges (if any) to preserve symmetry of the Laplacian matrix
// (Long Chen, Finite Difference Methods, UCI, 2017)
fact = periodic[1] ? 1.0 : 0.5;
for (i=1; i<w-1; i++) {
// Full range of j, but factor may change on y edges (j == 0 and j == h-1)
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
for (j=1; j<h-1; j++) {
LA[index] = ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
}
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
}
// Edges along x (x components only)
index = 0; // Follows left edge
index2 = h * (w - 1); // Follows right edge
if (periodic[0]) {
xm = h * (w - 1);
xp = h;
fact = periodic[1] ? 1.0 : 0.5;
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = fact * ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
index++;
index2++;
for (j=1; j<h-1; j++) {
LA[index] = ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
index++;
index2++;
}
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = fact * ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
} else {
xm = -h;
xp = h;
fact = periodic[1] ? 1.0 : 0.5; // Halve in corners in full PBC only
// lower corner, "j == 0"
LA[index] = fact * ffx * (A[index + xp] - A[index]);
LA[index2] = fact * ffx * (A[index2 + xm] - A[index2]);
index++;
index2++;
for (j=1; j<h-1; j++) {
// x gradient (+ y term of laplacian, calculated below)
LA[index] = ffx * (A[index + xp] - A[index]);
LA[index2] = ffx * (A[index2 + xm] - A[index2]);
index++;
index2++;
}
// upper corner, j == h-1
LA[index] = fact * ffx * (A[index + xp] - A[index]);
LA[index2] = fact * ffx * (A[index2 + xm] - A[index2]);
}
// Now adding all y components
// All y components except on y edges
index = 1; // Skip first element (in first row)
fact = periodic[0] ? 1.0 : 0.5; // for i == 0
for (i=0; i<w; i++) {
// Factor of 1/2 on x edges if non-periodic
if (i == 1) fact = 1.0;
if (i == w - 1) fact = periodic[0] ? 1.0 : 0.5;
for (j=1; j<h-1; j++) {
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
index++;
}
index += 2; // skip the edges and move to next column
}
// Edges along y (y components only)
index = 0; // Follows bottom edge
index2 = h - 1; // Follows top edge
if (periodic[1]) {
fact = periodic[0] ? 1.0 : 0.5;
ym = h - 1;
yp = 1;
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += fact * ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
index += h;
index2 += h;
for (i=1; i<w-1; i++) {
LA[index] += ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
index += h;
index2 += h;
}
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += fact * ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
} else {
ym = -1;
yp = 1;
fact = periodic[0] ? 1.0 : 0.5; // Halve in corners in full PBC only
// Left corner
LA[index] += fact * ffy * (A[index + yp] - A[index]);
LA[index2] += fact * ffy * (A[index2 + ym] - A[index2]);
index += h;
index2 += h;
for (i=1; i<w-1; i++) {
// y gradient (+ x term of laplacian, calculated above)
LA[index] += ffy * (A[index + yp] - A[index]);
LA[index2] += ffy * (A[index2 + ym] - A[index2]);
index += h;
index2 += h;
}
// Right corner
LA[index] += fact * ffy * (A[index + yp] - A[index]);
LA[index2] += fact * ffy * (A[index2 + ym] - A[index2]);
}
} else if (nd == 3) {
// DIMENSION 3
int i, j, k;
size_t index, index2;
cvm::real fact = 1.0;
const cvm::real ffx = 1.0 / (widths[0] * widths[0]);
const cvm::real ffy = 1.0 / (widths[1] * widths[1]);
const cvm::real ffz = 1.0 / (widths[2] * widths[2]);
const int h = nx[2]; // height
const int d = nx[1]; // depth
const int w = nx[0]; // width
// offsets for 6 reference points of the Laplacian stencil
int xm = -d * h;
int xp = d * h;
int ym = -h;
int yp = h;
int zm = -1;
int zp = 1;
cvm::real factx = periodic[0] ? 1 : 0.5; // factor to be applied on x edges
cvm::real facty = periodic[1] ? 1 : 0.5; // same for y
cvm::real factz = periodic[2] ? 1 : 0.5; // same for z
cvm::real ifactx = 1 / factx;
cvm::real ifacty = 1 / facty;
cvm::real ifactz = 1 / factz;
// All x components except on x edges
index = d * h; // Skip left slab
fact = facty * factz;
for (i=1; i<w-1; i++) {
for (j=0; j<d; j++) { // full range of y
if (j == 1) fact *= ifacty;
if (j == d-1) fact *= facty;
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
fact *= ifactz;
for (k=1; k<h-1; k++) { // full range of z
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
}
fact *= factz;
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
index++;
}
}
// Edges along x (x components only)
index = 0; // Follows left slab
index2 = d * h * (w - 1); // Follows right slab
if (periodic[0]) {
xm = d * h * (w - 1);
xp = d * h;
fact = facty * factz;
for (j=0; j<d; j++) {
if (j == 1) fact *= ifacty;
if (j == d-1) fact *= facty;
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = fact * ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
index++;
index2++;
fact *= ifactz;
for (k=1; k<h-1; k++) {
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = fact * ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
index++;
index2++;
}
fact *= factz;
LA[index] = fact * ffx * (A[index + xm] + A[index + xp] - 2.0 * A[index]);
LA[index2] = fact * ffx * (A[index2 - xp] + A[index2 - xm] - 2.0 * A[index2]);
index++;
index2++;
}
} else {
xm = -d * h;
xp = d * h;
fact = facty * factz;
for (j=0; j<d; j++) {
if (j == 1) fact *= ifacty;
if (j == d-1) fact *= facty;
LA[index] = fact * ffx * (A[index + xp] - A[index]);
LA[index2] = fact * ffx * (A[index2 + xm] - A[index2]);
index++;
index2++;
fact *= ifactz;
for (k=1; k<h-1; k++) {
// x gradient (+ y, z terms of laplacian, calculated below)
LA[index] = fact * ffx * (A[index + xp] - A[index]);
LA[index2] = fact * ffx * (A[index2 + xm] - A[index2]);
index++;
index2++;
}
fact *= factz;
LA[index] = fact * ffx * (A[index + xp] - A[index]);
LA[index2] = fact * ffx * (A[index2 + xm] - A[index2]);
index++;
index2++;
}
}
// Now adding all y components
// All y components except on y edges
index = h; // Skip first column (in front slab)
fact = factx * factz;
for (i=0; i<w; i++) { // full range of x
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
for (j=1; j<d-1; j++) {
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
index++;
fact *= ifactz;
for (k=1; k<h-1; k++) {
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
index++;
}
fact *= factz;
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
index++;
}
index += 2 * h; // skip columns in front and back slabs
}
// Edges along y (y components only)
index = 0; // Follows front slab
index2 = h * (d - 1); // Follows back slab
if (periodic[1]) {
ym = h * (d - 1);
yp = h;
fact = factx * factz;
for (i=0; i<w; i++) {
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += fact * ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
index++;
index2++;
fact *= ifactz;
for (k=1; k<h-1; k++) {
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += fact * ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
index++;
index2++;
}
fact *= factz;
LA[index] += fact * ffy * (A[index + ym] + A[index + yp] - 2.0 * A[index]);
LA[index2] += fact * ffy * (A[index2 - yp] + A[index2 - ym] - 2.0 * A[index2]);
index++;
index2++;
index += h * (d - 1);
index2 += h * (d - 1);
}
} else {
ym = -h;
yp = h;
fact = factx * factz;
for (i=0; i<w; i++) {
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
LA[index] += fact * ffy * (A[index + yp] - A[index]);
LA[index2] += fact * ffy * (A[index2 + ym] - A[index2]);
index++;
index2++;
fact *= ifactz;
for (k=1; k<h-1; k++) {
// y gradient (+ x, z terms of laplacian, calculated above and below)
LA[index] += fact * ffy * (A[index + yp] - A[index]);
LA[index2] += fact * ffy * (A[index2 + ym] - A[index2]);
index++;
index2++;
}
fact *= factz;
LA[index] += fact * ffy * (A[index + yp] - A[index]);
LA[index2] += fact * ffy * (A[index2 + ym] - A[index2]);
index++;
index2++;
index += h * (d - 1);
index2 += h * (d - 1);
}
}
// Now adding all z components
// All z components except on z edges
index = 1; // Skip first element (in bottom slab)
fact = factx * facty;
for (i=0; i<w; i++) { // full range of x
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
for (k=1; k<h-1; k++) {
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
index++;
}
fact *= ifacty;
index += 2; // skip edge slabs
for (j=1; j<d-1; j++) { // full range of y
for (k=1; k<h-1; k++) {
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
index++;
}
index += 2; // skip edge slabs
}
fact *= facty;
for (k=1; k<h-1; k++) {
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
index++;
}
index += 2; // skip edge slabs
}
// Edges along z (z components only)
index = 0; // Follows bottom slab
index2 = h - 1; // Follows top slab
if (periodic[2]) {
zm = h - 1;
zp = 1;
fact = factx * facty;
for (i=0; i<w; i++) {
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
LA[index2] += fact * ffz * (A[index2 - zp] + A[index2 - zm] - 2.0 * A[index2]);
index += h;
index2 += h;
fact *= ifacty;
for (j=1; j<d-1; j++) {
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
LA[index2] += fact * ffz * (A[index2 - zp] + A[index2 - zm] - 2.0 * A[index2]);
index += h;
index2 += h;
}
fact *= facty;
LA[index] += fact * ffz * (A[index + zm] + A[index + zp] - 2.0 * A[index]);
LA[index2] += fact * ffz * (A[index2 - zp] + A[index2 - zm] - 2.0 * A[index2]);
index += h;
index2 += h;
}
} else {
zm = -1;
zp = 1;
fact = factx * facty;
for (i=0; i<w; i++) {
if (i == 1) fact *= ifactx;
if (i == w-1) fact *= factx;
LA[index] += fact * ffz * (A[index + zp] - A[index]);
LA[index2] += fact * ffz * (A[index2 + zm] - A[index2]);
index += h;
index2 += h;
fact *= ifacty;
for (j=1; j<d-1; j++) {
// z gradient (+ x, y terms of laplacian, calculated above)
LA[index] += fact * ffz * (A[index + zp] - A[index]);
LA[index2] += fact * ffz * (A[index2 + zm] - A[index2]);
index += h;
index2 += h;
}
fact *= facty;
LA[index] += fact * ffz * (A[index + zp] - A[index]);
LA[index2] += fact * ffz * (A[index2 + zm] - A[index2]);
index += h;
index2 += h;
}
}
}
}
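
The rows written by atimes() implement the discrete Laplacian with modified Neumann boundary rows; in 2D and 3D the cross-direction terms on edges are additionally halved (quartered in corners) so the overall matrix stays symmetric, which the CG solver below requires. A 1D analogue of a symmetric Neumann operator, as a sketch:

// Sketch: 1D non-periodic Laplacian with one-sided (Neumann-like) edge rows;
// the resulting matrix is symmetric, as nr_linbcg_sym() requires.
#include <cstdio>
#include <vector>

void atimes_1d(const std::vector<double> &a, std::vector<double> &la, double w) {
  const int n = (int)a.size();
  const double ff = 1.0 / (w * w);
  la[0]   = ff * (a[1] - a[0]);                    // edge row: one-sided
  for (int i = 1; i < n - 1; i++)
    la[i] = ff * (a[i-1] + a[i+1] - 2.0 * a[i]);   // interior stencil
  la[n-1] = ff * (a[n-2] - a[n-1]);                // edge row: one-sided
}

int main() {
  // Symmetry check: <e_i, L e_j> must equal <e_j, L e_i> for all i, j
  const int n = 5;
  for (int i = 0; i < n; i++)
    for (int j = 0; j < i; j++) {
      std::vector<double> ei(n, 0.0), ej(n, 0.0), lei(n), lej(n);
      ei[i] = 1.0; ej[j] = 1.0;
      atimes_1d(ei, lei, 1.0);
      atimes_1d(ej, lej, 1.0);
      if (lei[j] != lej[i]) std::printf("asymmetry at (%d,%d)\n", i, j);
    }
  std::printf("symmetry check done\n");
  return 0;
}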
/*
/// Inversion of preconditioner matrix (e.g. diagonal of the Laplacian)
void integrate_potential::asolve(const std::vector<cvm::real> &b, std::vector<cvm::real> &x)
{
for (size_t i=0; i<nt; i++) {
x[i] = b[i] * inv_lap_diag[i]; // Jacobi preconditioner - little benefit in tests so far
}
return;
}*/
// b : RHS of the equation
// x : initial guess for the solution; on output, the solution
// tol : relative convergence criterion
void integrate_potential::nr_linbcg_sym(const std::vector<cvm::real> &b, std::vector<cvm::real> &x, const cvm::real &tol,
const int itmax, int &iter, cvm::real &err)
{
cvm::real ak,akden,bk,bkden,bknum,bnrm;
const cvm::real EPS=1.0e-14;
int j;
std::vector<cvm::real> p(nt), r(nt), z(nt);
iter=0;
atimes(x,r);
for (j=0;j<nt;j++) {
r[j]=b[j]-r[j];
}
bnrm=l2norm(b);
if (bnrm < EPS) {
return; // Target is zero, will break relative error calc
}
// asolve(r,z); // precon
bkden = 1.0;
while (iter < itmax) {
++iter;
for (bknum=0.0,j=0;j<nt;j++) {
bknum += r[j]*r[j]; // precon: z[j]*r[j]
}
if (iter == 1) {
for (j=0;j<nt;j++) {
p[j] = r[j]; // precon: p[j] = z[j]
}
} else {
bk=bknum/bkden;
for (j=0;j<nt;j++) {
p[j] = bk*p[j] + r[j]; // precon: bk*p[j] + z[j]
}
}
bkden = bknum;
atimes(p,z);
for (akden=0.0,j=0;j<nt;j++) {
akden += z[j]*p[j];
}
ak = bknum/akden;
for (j=0;j<nt;j++) {
x[j] += ak*p[j];
r[j] -= ak*z[j];
}
// asolve(r,z); // precon
err = l2norm(r)/bnrm;
if (cvm::debug())
std::cout << "iter=" << std::setw(4) << iter+1 << std::setw(12) << err << std::endl;
if (err <= tol)
break;
}
}
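
nr_linbcg_sym() is a plain conjugate-gradient solver (the Numerical Recipes biconjugate scheme specialized to symmetric operators, with preconditioning commented out). The same loop structure on a tiny SPD system, as a standalone sketch:

// Sketch: the CG recurrence above applied to a 2x2 SPD system A x = b.
// Standalone illustration, not the colvars class.
#include <cmath>
#include <cstdio>
#include <vector>

static void matvec(const std::vector<double> &x, std::vector<double> &y) {
  // A = [[4, 1], [1, 3]] (symmetric positive definite)
  y[0] = 4.0 * x[0] + 1.0 * x[1];
  y[1] = 1.0 * x[0] + 3.0 * x[1];
}

int main() {
  std::vector<double> b = {1.0, 2.0}, x = {0.0, 0.0}, r(2), p(2), z(2);
  matvec(x, r);
  for (int j = 0; j < 2; j++) r[j] = b[j] - r[j];
  double bkden = 1.0;
  for (int iter = 0; iter < 10; iter++) {
    double bknum = r[0]*r[0] + r[1]*r[1];
    if (iter == 0) { p = r; }
    else {
      double bk = bknum / bkden;
      for (int j = 0; j < 2; j++) p[j] = bk * p[j] + r[j];
    }
    bkden = bknum;
    matvec(p, z);
    double ak = bknum / (z[0]*p[0] + z[1]*p[1]);
    for (int j = 0; j < 2; j++) { x[j] += ak * p[j]; r[j] -= ak * z[j]; }
    if (std::sqrt(r[0]*r[0] + r[1]*r[1]) < 1e-12) break;
  }
  std::printf("x = (%g, %g)\n", x[0], x[1]);  // expected (1/11, 7/11)
  return 0;
}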
cvm::real integrate_potential::l2norm(const std::vector<cvm::real> &x)
{
size_t i;
cvm::real sum = 0.0;
for (i=0;i<x.size();i++)
sum += x[i]*x[i];
return sqrt(sum);
}


@ -97,8 +97,8 @@ public:
/// Whether this grid has been filled with data or is still empty
bool has_data;
/// Return the number of colvars
inline size_t number_of_colvars() const
/// Return the number of colvar objects
inline size_t num_variables() const
{
return nd;
}
@ -374,6 +374,20 @@ public:
}
}
/// Wrap an index vector around periodic boundary conditions
/// or detects edges if non-periodic
inline bool wrap_edge(std::vector<int> & ix) const
{
bool edge = false;
for (size_t i = 0; i < nd; i++) {
if (periodic[i]) {
ix[i] = (ix[i] + nx[i]) % nx[i]; // to ensure a non-negative result
} else if (ix[i] == -1 || ix[i] == nx[i]) {
edge = true;
}
}
return edge;
}
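
wrap_edge() folds indices back into range along periodic dimensions and flags out-of-range indices (−1 or nx) along non-periodic ones as edges; callers such as get_grad() then substitute zero gradients there. A one-dimensional sketch of the same logic:

// Sketch: wrap-or-flag logic of wrap_edge() for a single dimension.
// Illustrative stand-in, not the colvar_grid member itself.
#include <cstdio>

bool wrap_edge_1d(int &i, int n, bool periodic) {
  if (periodic) {
    i = (i + n) % n;          // fold back into [0, n), also for i == -1
    return false;
  }
  return (i == -1 || i == n); // out of range by one bin: an edge
}

int main() {
  int i = -1;
  bool edge = wrap_edge_1d(i, 10, true);
  std::printf("periodic:     i=%d edge=%d\n", i, edge);   // i=9, edge=0
  i = 10;
  edge = wrap_edge_1d(i, 10, false);
  std::printf("non-periodic: i=%d edge=%d\n", i, edge);   // i=10, edge=1
  return 0;
}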
/// \brief Report the bin corresponding to the current value of variable i
inline int current_bin_scalar(int const i) const
@ -445,6 +459,12 @@ public:
has_data = true;
}
/// Set the value at the point with linear address i (for speed)
inline void set_value(size_t i, T const &t)
{
data[i] = t;
}
/// \brief Get the change from this to other_grid
/// and store the result in this.
/// this_grid := other_grid - this_grid
@ -518,6 +538,11 @@ public:
return data[this->address(ix) + imult];
}
/// \brief Get the binned value indexed by linear address i
inline T const & value(size_t i) const
{
return data[i];
}
/// \brief Add a constant to all elements (fast loop)
inline void add_constant(T const &t)
@ -1110,20 +1135,20 @@ public:
// write the header
os << "object 1 class gridpositions counts";
size_t icv;
for (icv = 0; icv < number_of_colvars(); icv++) {
for (icv = 0; icv < num_variables(); icv++) {
os << " " << number_of_points(icv);
}
os << "\n";
os << "origin";
for (icv = 0; icv < number_of_colvars(); icv++) {
for (icv = 0; icv < num_variables(); icv++) {
os << " " << (lower_boundaries[icv].real_value + 0.5 * widths[icv]);
}
os << "\n";
for (icv = 0; icv < number_of_colvars(); icv++) {
for (icv = 0; icv < num_variables(); icv++) {
os << "delta";
for (size_t icv2 = 0; icv2 < number_of_colvars(); icv2++) {
for (size_t icv2 = 0; icv2 < num_variables(); icv2++) {
if (icv == icv2) os << " " << widths[icv];
else os << " " << 0.0;
}
@ -1131,7 +1156,7 @@ public:
}
os << "object 2 class gridconnections counts";
for (icv = 0; icv < number_of_colvars(); icv++) {
for (icv = 0; icv < num_variables(); icv++) {
os << " " << number_of_points(icv);
}
os << "\n";
@ -1167,7 +1192,8 @@ public:
/// Constructor from a vector of colvars
colvar_grid_count(std::vector<colvar *> &colvars,
size_t const &def_count = 0);
size_t const &def_count = 0,
bool margin = false);
/// Increment the counter at given position
inline void incr_count(std::vector<int> const &ix)
@ -1210,12 +1236,13 @@ public:
int A0, A1, A2;
std::vector<int> ix = ix0;
// TODO this can be rewritten more concisely with wrap_edge()
if (periodic[n]) {
ix[n]--; wrap(ix);
A0 = data[address(ix)];
A0 = value(ix);
ix = ix0;
ix[n]++; wrap(ix);
A1 = data[address(ix)];
A1 = value(ix);
if (A0 * A1 == 0) {
return 0.; // can't handle empty bins
} else {
@ -1224,10 +1251,10 @@ public:
}
} else if (ix[n] > 0 && ix[n] < nx[n]-1) { // not an edge
ix[n]--;
A0 = data[address(ix)];
A0 = value(ix);
ix = ix0;
ix[n]++;
A1 = data[address(ix)];
A1 = value(ix);
if (A0 * A1 == 0) {
return 0.; // can't handle empty bins
} else {
@ -1238,9 +1265,9 @@ public:
// edge: use 2nd order derivative
int increment = (ix[n] == 0 ? 1 : -1);
// move right from left edge, or the other way around
A0 = data[address(ix)];
ix[n] += increment; A1 = data[address(ix)];
ix[n] += increment; A2 = data[address(ix)];
A0 = value(ix);
ix[n] += increment; A1 = value(ix);
ix[n] += increment; A2 = value(ix);
if (A0 * A1 * A2 == 0) {
return 0.; // can't handle empty bins
} else {
@ -1249,6 +1276,49 @@ public:
}
}
}
/// \brief Return the gradient of discrete count from finite differences
/// on the *same* grid for dimension n
inline cvm::real gradient_finite_diff(const std::vector<int> &ix0,
int n = 0)
{
int A0, A1, A2;
std::vector<int> ix = ix0;
// FIXME this can be rewritten more concisely with wrap_edge()
if (periodic[n]) {
ix[n]--; wrap(ix);
A0 = value(ix);
ix = ix0;
ix[n]++; wrap(ix);
A1 = value(ix);
if (A0 * A1 == 0) {
return 0.; // can't handle empty bins
} else {
return cvm::real(A1 - A0) / (widths[n] * 2.);
}
} else if (ix[n] > 0 && ix[n] < nx[n]-1) { // not an edge
ix[n]--;
A0 = value(ix);
ix = ix0;
ix[n]++;
A1 = value(ix);
if (A0 * A1 == 0) {
return 0.; // can't handle empty bins
} else {
return cvm::real(A1 - A0) / (widths[n] * 2.);
}
} else {
// edge: use 2nd order derivative
int increment = (ix[n] == 0 ? 1 : -1);
// move right from left edge, or the other way around
A0 = value(ix);
ix[n] += increment; A1 = value(ix);
ix[n] += increment; A2 = value(ix);
return (-1.5 * cvm::real(A0) + 2. * cvm::real(A1)
- 0.5 * cvm::real(A2)) * increment / widths[n];
}
}
};
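
At a grid edge the new gradient_finite_diff() falls back to the second-order one-sided formula f'(x) ≈ (−1.5·f0 + 2·f1 − 0.5·f2)/h, which is exact for quadratics. A quick standalone check:

// Sketch: the one-sided second-order stencil used at grid edges above,
// checked against the exact derivative of a quadratic.
#include <cstdio>

int main() {
  const double h = 0.1;
  // f(x) = 3x^2 + 2x + 1 sampled at x = 0, h, 2h
  auto f = [](double x) { return 3.0 * x * x + 2.0 * x + 1.0; };
  double a0 = f(0.0), a1 = f(h), a2 = f(2.0 * h);
  double deriv = (-1.5 * a0 + 2.0 * a1 - 0.5 * a2) / h;
  std::printf("stencil: %g  exact f'(0): %g\n", deriv, 2.0);  // both 2
  return 0;
}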
@ -1289,27 +1359,57 @@ public:
has_data = true;
}
/// Return the gradient of the scalar field from finite differences
inline const cvm::real * gradient_finite_diff( const std::vector<int> &ix0 )
/// \brief Return the gradient of the scalar field from finite differences
/// Input coordinates are those of the gradient grid, shifted wrt the scalar grid
/// Should not be called on the edges of the scalar grid, provided the latter
/// has margins wrt the gradient grid
inline void vector_gradient_finite_diff( const std::vector<int> &ix0, std::vector<cvm::real> &grad)
{
cvm::real A0, A1;
std::vector<int> ix;
if (nd != 2) {
cvm::error("Finite differences available in dimension 2 only.");
return grad;
}
for (unsigned int n = 0; n < nd; n++) {
size_t i, j, k, n;
if (nd == 2) {
for (n = 0; n < 2; n++) {
ix = ix0;
A0 = value(ix);
ix[n]++; wrap(ix);
A1 = value(ix);
ix[1-n]++; wrap(ix);
A1 += value(ix);
ix[n]--; wrap(ix);
A0 += value(ix);
grad[n] = 0.5 * (A1 - A0) / widths[n];
}
} else if (nd == 3) {
cvm::real p[8]; // potential values within cube, indexed in binary (4 i + 2 j + k)
ix = ix0;
A0 = data[address(ix)];
ix[n]++; wrap(ix);
A1 = data[address(ix)];
ix[1-n]++; wrap(ix);
A1 += data[address(ix)];
ix[n]--; wrap(ix);
A0 += data[address(ix)];
grad[n] = 0.5 * (A1 - A0) / widths[n];
int index = 0;
for (i = 0; i<2; i++) {
ix[1] = ix0[1];
for (j = 0; j<2; j++) {
ix[2] = ix0[2];
for (k = 0; k<2; k++) {
wrap(ix);
p[index++] = value(ix);
ix[2]++;
}
ix[1]++;
}
ix[0]++;
}
// The following would be easier to read using binary literals
// 100 101 110 111 000 001 010 011
grad[0] = 0.25 * ((p[4] + p[5] + p[6] + p[7]) - (p[0] + p[1] + p[2] + p[3])) / widths[0];
// 010 011 110 111 000 001 100 101
grad[1] = 0.25 * ((p[2] + p[3] + p[6] + p[7]) - (p[0] + p[1] + p[4] + p[5])) / widths[1];
// 001 011 101 111 000 010 100 110
grad[2] = 0.25 * ((p[1] + p[3] + p[5] + p[7]) - (p[0] + p[2] + p[4] + p[6])) / widths[2];
} else {
cvm::error("Finite differences available in dimension 2 and 3 only.");
}
return grad;
}
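
With each gradient component divided by its own width (see the correction above), the corner-averaged 3D formula is exact for linear fields. A quick standalone check on a unit cell:

// Sketch: corner-averaged gradient on one grid cell, exact for a linear
// field V = a*x + b*y + c*z. Illustrative only.
#include <cstdio>

int main() {
  const double a = 1.0, b = 2.0, c = 3.0, wx = 0.5, wy = 0.25, wz = 0.2;
  double p[8];  // values at corners, index = 4*i + 2*j + k
  for (int i = 0; i < 2; i++)
    for (int j = 0; j < 2; j++)
      for (int k = 0; k < 2; k++)
        p[4*i + 2*j + k] = a * i * wx + b * j * wy + c * k * wz;
  double gx = 0.25 * ((p[4]+p[5]+p[6]+p[7]) - (p[0]+p[1]+p[2]+p[3])) / wx;
  double gy = 0.25 * ((p[2]+p[3]+p[6]+p[7]) - (p[0]+p[1]+p[4]+p[5])) / wy;
  double gz = 0.25 * ((p[1]+p[3]+p[5]+p[7]) - (p[0]+p[2]+p[4]+p[6])) / wz;
  std::printf("grad = (%g, %g, %g), expected (1, 2, 3)\n", gx, gy, gz);
  return 0;
}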
/// \brief Return the value of the function at ix divided by its
@ -1373,10 +1473,6 @@ public:
/// \brief Assuming that the map is a normalized probability density,
/// calculates the entropy (uses widths if they are defined)
cvm::real entropy() const;
private:
// gradient
cvm::real * grad;
};
@ -1390,6 +1486,10 @@ public:
/// should be divided
colvar_grid_count *samples;
/// \brief Provide the floating point weights by which each binned value
/// should be divided (alternate to samples, only one should be non-null)
colvar_grid_scalar *weights;
/// Default constructor
colvar_grid_gradient();
@ -1403,6 +1503,29 @@ public:
/// Constructor from a vector of colvars
colvar_grid_gradient(std::vector<colvar *> &colvars);
/// \brief Get a vector with the binned value(s) indexed by ix, normalized if applicable
inline void vector_value(std::vector<int> const &ix, std::vector<cvm::real> &v) const
{
cvm::real const * p = &value(ix);
if (samples) {
int count = samples->value(ix);
if (count) {
cvm::real invcount = 1.0 / count;
for (size_t i = 0; i < mult; i++) {
v[i] = invcount * p[i];
}
} else {
for (size_t i = 0; i < mult; i++) {
v[i] = 0.0;
}
}
} else {
for (size_t i = 0; i < mult; i++) {
v[i] = p[i];
}
}
}
/// \brief Accumulate the value
inline void acc_value(std::vector<int> const &ix, std::vector<colvarvalue> const &values) {
for (size_t imult = 0; imult < mult; imult++) {
@ -1412,15 +1535,6 @@ public:
samples->incr_count(ix);
}
/// \brief Accumulate the gradient
inline void acc_grad(std::vector<int> const &ix, cvm::real const *grads) {
for (size_t imult = 0; imult < mult; imult++) {
data[address(ix) + imult] += grads[imult];
}
if (samples)
samples->incr_count(ix);
}
/// \brief Accumulate the gradient based on the force (i.e. sums the
/// opposite of the force)
inline void acc_force(std::vector<int> const &ix, cvm::real const *forces) {
@ -1431,6 +1545,17 @@ public:
samples->incr_count(ix);
}
/// \brief Accumulate the gradient based on the force (i.e. sums the
/// opposite of the force) with a non-integer weight
inline void acc_force_weighted(std::vector<int> const &ix,
cvm::real const *forces,
cvm::real weight) {
for (size_t imult = 0; imult < mult; imult++) {
data[address(ix) + imult] -= forces[imult] * weight;
}
weights->acc_value(ix, weight);
}
/// \brief Return the value of the function at ix divided by its
/// number of samples (if the count grid is defined)
virtual inline cvm::real value_output(std::vector<int> const &ix,
@ -1498,5 +1623,70 @@ public:
};
/// Integrate (1D, 2D or 3D) gradients
class integrate_potential : public colvar_grid_scalar
{
public:
integrate_potential();
virtual ~integrate_potential()
{}
/// Constructor from a vector of colvars + gradient grid
integrate_potential (std::vector<colvar *> &colvars, colvar_grid_gradient * gradients);
/// \brief Calculate potential from divergence (in 2D); return number of steps
int integrate (const int itmax, const cvm::real & tol, cvm::real & err);
/// \brief Update matrix containing divergence and boundary conditions
/// based on new gradient point value, in neighboring bins
void update_div_neighbors(const std::vector<int> &ix);
/// \brief Set matrix containing divergence and boundary conditions
/// based on complete gradient grid
void set_div();
/// \brief Add constant to potential so that its minimum value is zero
/// Useful e.g. for output
inline void set_zero_minimum() {
add_constant(-1.0 * minimum_value());
}
protected:
// Reference to gradient grid
colvar_grid_gradient *gradients;
/// Array holding divergence + boundary terms (modified Neumann) if not periodic
std::vector<cvm::real> divergence;
// std::vector<cvm::real> inv_lap_diag; // Inverse of the diagonal of the Laplacian; for conditioning
/// \brief Update matrix containing divergence and boundary conditions
/// called by update_div_neighbors
void update_div_local(const std::vector<int> &ix);
/// Obtain the gradient vector at given location ix, if available
/// or zero if it is on the edge of the gradient grid
/// ix gets wrapped in PBC
void get_grad(cvm::real * g, std::vector<int> &ix);
/// \brief Solve linear system based on CG, valid for symmetric matrices only
void nr_linbcg_sym(const std::vector<cvm::real> &b, std::vector<cvm::real> &x,
const cvm::real &tol, const int itmax, int &iter, cvm::real &err);
/// l2 norm of a vector
cvm::real l2norm(const std::vector<cvm::real> &x);
/// Multiplication by the sparse matrix representing the Laplacian (or its transpose)
void atimes(const std::vector<cvm::real> &x, std::vector<cvm::real> &r);
// /// Inversion of preconditioner matrix
// void asolve(const std::vector<cvm::real> &b, std::vector<cvm::real> &x);
};
#endif


@ -24,6 +24,7 @@
#include "colvaratoms.h"
#include "colvarcomp.h"
colvarmodule::colvarmodule(colvarproxy *proxy_in)
{
depth_s = 0;
@ -417,10 +418,10 @@ int colvarmodule::parse_biases(std::string const &conf)
"Please ensure that their forces do not counteract each other.\n");
}
if (biases.size() || use_scripted_forces) {
if (num_biases() || use_scripted_forces) {
cvm::log(cvm::line_marker);
cvm::log("Collective variables biases initialized, "+
cvm::to_str(biases.size())+" in total.\n");
cvm::to_str(num_biases())+" in total.\n");
} else {
if (!use_scripted_forces) {
cvm::log("No collective variables biases were defined.\n");
@ -431,12 +432,37 @@ int colvarmodule::parse_biases(std::string const &conf)
}
int colvarmodule::num_variables() const
{
return colvars.size();
}
int colvarmodule::num_variables_feature(int feature_id) const
{
size_t n = 0;
for (std::vector<colvar *>::const_iterator cvi = colvars.begin();
cvi != colvars.end();
cvi++) {
if ((*cvi)->is_enabled(feature_id)) {
n++;
}
}
return n;
}
int colvarmodule::num_biases() const
{
return biases.size();
}
int colvarmodule::num_biases_feature(int feature_id) const
{
colvarmodule *cv = cvm::main();
size_t n = 0;
for (std::vector<colvarbias *>::iterator bi = cv->biases.begin();
bi != cv->biases.end();
for (std::vector<colvarbias *>::const_iterator bi = biases.begin();
bi != biases.end();
bi++) {
if ((*bi)->is_enabled(feature_id)) {
n++;
@ -448,10 +474,9 @@ int colvarmodule::num_biases_feature(int feature_id) const
int colvarmodule::num_biases_type(std::string const &type) const
{
colvarmodule *cv = cvm::main();
size_t n = 0;
for (std::vector<colvarbias *>::iterator bi = cv->biases.begin();
bi != cv->biases.end();
for (std::vector<colvarbias *>::const_iterator bi = biases.begin();
bi != biases.end();
bi++) {
if ((*bi)->bias_type == type) {
n++;
@ -465,7 +490,7 @@ std::vector<std::string> const colvarmodule::time_dependent_biases() const
{
size_t i;
std::vector<std::string> biases_names;
for (i = 0; i < biases.size(); i++) {
for (i = 0; i < num_biases(); i++) {
if (biases[i]->is_enabled(colvardeps::f_cvb_apply_force) &&
biases[i]->is_enabled(colvardeps::f_cvb_active) &&
(biases[i]->is_enabled(colvardeps::f_cvb_history_dependent) ||
@ -790,7 +815,7 @@ int colvarmodule::calc_biases()
{
// update the biases and communicate their forces to the collective
// variables
if (cvm::debug() && biases.size())
if (cvm::debug() && num_biases())
cvm::log("Updating collective variable biases.\n");
std::vector<colvarbias *>::iterator bi;
@ -852,7 +877,7 @@ int colvarmodule::update_colvar_forces()
std::vector<colvarbias *>::iterator bi;
// sum the forces from all biases for each collective variable
if (cvm::debug() && biases.size())
if (cvm::debug() && num_biases())
cvm::log("Collecting forces from all biases.\n");
cvm::increase_depth();
for (bi = biases_active()->begin(); bi != biases_active()->end(); bi++) {
@ -1073,8 +1098,6 @@ int colvarmodule::reset()
int colvarmodule::setup_input()
{
if (this->size() == 0) return cvm::get_error();
std::string restart_in_name("");
// read the restart configuration, if available
@ -1107,14 +1130,12 @@ int colvarmodule::setup_input()
}
}
return (cvm::get_error() ? COLVARS_ERROR : COLVARS_OK);
return cvm::get_error();
}
int colvarmodule::setup_output()
{
if (this->size() == 0) return cvm::get_error();
int error_code = COLVARS_OK;
// output state file (restart)
@ -1123,7 +1144,8 @@ int colvarmodule::setup_output()
std::string("");
if (restart_out_name.size()) {
cvm::log("The restart output state file will be \""+restart_out_name+"\".\n");
cvm::log("The restart output state file will be \""+
restart_out_name+"\".\n");
}
output_prefix() = proxy->output_prefix();
@ -1154,7 +1176,7 @@ int colvarmodule::setup_output()
set_error_bits(FILE_ERROR);
}
return (cvm::get_error() ? COLVARS_ERROR : COLVARS_OK);
return cvm::get_error();
}
@ -1738,6 +1760,89 @@ int cvm::load_coords_xyz(char const *filename,
}
// Wrappers to proxy functions: these may go away in the future
cvm::real cvm::unit_angstrom()
{
return proxy->unit_angstrom();
}
cvm::real cvm::boltzmann()
{
return proxy->boltzmann();
}
cvm::real cvm::temperature()
{
return proxy->temperature();
}
cvm::real cvm::dt()
{
return proxy->dt();
}
void cvm::request_total_force()
{
proxy->request_total_force(true);
}
cvm::rvector cvm::position_distance(atom_pos const &pos1,
atom_pos const &pos2)
{
return proxy->position_distance(pos1, pos2);
}
cvm::real cvm::position_dist2(cvm::atom_pos const &pos1,
cvm::atom_pos const &pos2)
{
return proxy->position_dist2(pos1, pos2);
}
cvm::real cvm::rand_gaussian(void)
{
return proxy->rand_gaussian();
}
bool cvm::replica_enabled()
{
return proxy->replica_enabled();
}
int cvm::replica_index()
{
return proxy->replica_index();
}
int cvm::replica_num()
{
return proxy->replica_num();
}
void cvm::replica_comm_barrier()
{
return proxy->replica_comm_barrier();
}
int cvm::replica_comm_recv(char* msg_data, int buf_len, int src_rep)
{
return proxy->replica_comm_recv(msg_data,buf_len,src_rep);
}
int cvm::replica_comm_send(char* msg_data, int msg_len, int dest_rep)
{
return proxy->replica_comm_send(msg_data,msg_len,dest_rep);
}
// shared pointer to the proxy object
colvarproxy *colvarmodule::proxy = NULL;


@ -39,16 +39,14 @@ You can browse the class hierarchy or the list of source files.
#define FILE_ERROR (1<<4)
#define MEMORY_ERROR (1<<5)
#define FATAL_ERROR (1<<6) // Should be set, or not, together with other bits
#define DELETE_COLVARS (1<<7) // Instruct the caller to delete cvm
//#define DELETE_COLVARS (1<<7) // Instruct the caller to delete cvm
#define COLVARS_NO_SUCH_FRAME (1<<8) // Cannot load the requested frame
#include <iostream>
#include <iomanip>
#include <string>
#include <cstring>
#include <sstream>
#include <fstream>
#include <cmath>
#include <sstream>
#include <string>
#include <vector>
#include <list>
@ -84,12 +82,18 @@ public:
/// Defining an abstract real number allows switching precision
typedef double real;
/// Override std::pow with a product for n positive integer
static inline real integer_power(real x, int n)
/// Override std::pow with exponentiation by squaring for integer n
static inline real integer_power(real const &x, int const n)
{
real result = 1.0;
for (int i = 0; i < n; i++) result *= x;
return result;
// Original code: math_special.h in LAMMPS
double yy, ww;
if (x == 0.0) return 0.0;
int nn = (n > 0) ? n : -n;
ww = x;
for (yy = 1.0; nn != 0; nn >>= 1, ww *= ww) {
if (nn & 1) yy *= ww;
}
return (n > 0) ? yy : 1.0/yy;
}
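The replacement computes x^n by binary exponentiation (repeated squaring), so it needs O(log n) multiplications instead of n and, unlike the old loop, also handles negative exponents through the final 1.0/yy branch. A quick self-contained check of those semantics follows; it uses a local copy of the algorithm and arbitrary test values, purely for illustration.

#include <cassert>

// Local copy of the squaring algorithm above, for illustration only.
static double integer_power(double x, int n) {
  if (x == 0.0) return 0.0;
  int nn = (n > 0) ? n : -n;
  double yy = 1.0, ww = x;
  for (; nn != 0; nn >>= 1, ww *= ww) {
    if (nn & 1) yy *= ww;     // multiply in the squares matching set bits of n
  }
  return (n > 0) ? yy : 1.0 / yy;
}

int main() {
  assert(integer_power(2.0, 10) == 1024.0); // ~4 squarings instead of 10 multiplies
  assert(integer_power(2.0, -2) == 0.25);   // negative n handled via 1/x^|n|
  assert(integer_power(3.0, 0) == 1.0);     // empty loop leaves yy == 1
  return 0;
}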
/// Residue identifier
@ -301,13 +305,23 @@ private:
public:
/// Return how many variables are defined
int num_variables() const;
/// Return how many variables have this feature enabled
int num_variables_feature(int feature_id) const;
/// Return how many biases are defined
int num_biases() const;
/// Return how many biases have this feature enabled
int num_biases_feature(int feature_id) const;
/// Return how many biases are defined with this type
/// Return how many biases of this type are defined
int num_biases_type(std::string const &type) const;
/// Return the names of time-dependent biases with forces enabled
/// Return the names of time-dependent biases with forces enabled (ABF,
/// metadynamics, etc)
std::vector<std::string> const time_dependent_biases() const;
private:
@ -602,16 +616,14 @@ public:
typedef colvarmodule cvm;
#include "colvartypes.h"
std::ostream & operator << (std::ostream &os, cvm::rvector const &v);
std::istream & operator >> (std::istream &is, cvm::rvector &v);
template<typename T> std::string cvm::to_str(T const &x,
size_t const &width,
size_t const &prec) {
std::ostringstream os;
if (width) os.width(width);
if (prec) {
@ -622,9 +634,10 @@ template<typename T> std::string cvm::to_str(T const &x,
return os.str();
}
template<typename T> std::string cvm::to_str(std::vector<T> const &x,
size_t const &width,
size_t const &prec) {
if (!x.size()) return std::string("");
std::ostringstream os;
if (prec) {
@ -645,70 +658,4 @@ template<typename T> std::string cvm::to_str(std::vector<T> const &x,
}
#include "colvarproxy.h"
inline cvm::real cvm::unit_angstrom()
{
return proxy->unit_angstrom();
}
inline cvm::real cvm::boltzmann()
{
return proxy->boltzmann();
}
inline cvm::real cvm::temperature()
{
return proxy->temperature();
}
inline cvm::real cvm::dt()
{
return proxy->dt();
}
// Replica exchange commands
inline bool cvm::replica_enabled() {
return proxy->replica_enabled();
}
inline int cvm::replica_index() {
return proxy->replica_index();
}
inline int cvm::replica_num() {
return proxy->replica_num();
}
inline void cvm::replica_comm_barrier() {
return proxy->replica_comm_barrier();
}
inline int cvm::replica_comm_recv(char* msg_data, int buf_len, int src_rep) {
return proxy->replica_comm_recv(msg_data,buf_len,src_rep);
}
inline int cvm::replica_comm_send(char* msg_data, int msg_len, int dest_rep) {
return proxy->replica_comm_send(msg_data,msg_len,dest_rep);
}
inline void cvm::request_total_force()
{
proxy->request_total_force(true);
}
inline cvm::rvector cvm::position_distance(atom_pos const &pos1,
atom_pos const &pos2)
{
return proxy->position_distance(pos1, pos2);
}
inline cvm::real cvm::position_dist2(cvm::atom_pos const &pos1,
cvm::atom_pos const &pos2)
{
return proxy->position_dist2(pos1, pos2);
}
inline cvm::real cvm::rand_gaussian(void)
{
return proxy->rand_gaussian();
}
#endif


@ -553,7 +553,8 @@ bool colvarparse::key_lookup(std::string const &conf,
size_t *save_pos)
{
if (cvm::debug()) {
cvm::log("Looking for the keyword \""+std::string(key_in)+"\" and its value.\n");
cvm::log("Looking for the keyword \""+std::string(key_in)+
"\" and its value.\n");
}
// add this keyword to the register (in its camelCase version)


@ -14,6 +14,11 @@
#include <omp.h>
#endif
#if defined(NAMD_TCL) || defined(VMDTCL)
#define COLVARS_TCL
#include <tcl.h>
#endif
#include "colvarmodule.h"
#include "colvarproxy.h"
#include "colvarscript.h"
@ -420,8 +425,10 @@ colvarproxy_script::colvarproxy_script()
colvarproxy_script::~colvarproxy_script() {}
char *colvarproxy_script::script_obj_to_str(unsigned char *obj)
char const *colvarproxy_script::script_obj_to_str(unsigned char *obj)
{
cvm::error("Error: trying to print a script object without a scripting "
"language interface.\n", BUG_ERROR);
return reinterpret_cast<char *>(obj);
}
@ -451,6 +458,140 @@ int colvarproxy_script::run_colvar_gradient_callback(
colvarproxy_tcl::colvarproxy_tcl()
{
_tcl_interp = NULL;
}
colvarproxy_tcl::~colvarproxy_tcl()
{
}
void colvarproxy_tcl::init_tcl_pointers()
{
cvm::error("Error: Tcl support is currently unavailable "
"outside NAMD or VMD.\n", COLVARS_NOT_IMPLEMENTED);
}
char const *colvarproxy_tcl::tcl_obj_to_str(unsigned char *obj)
{
#if defined(COLVARS_TCL)
return Tcl_GetString(reinterpret_cast<Tcl_Obj *>(obj));
#else
return NULL;
#endif
}
int colvarproxy_tcl::tcl_run_force_callback()
{
#if defined(COLVARS_TCL)
Tcl_Interp *const tcl_interp = reinterpret_cast<Tcl_Interp *>(_tcl_interp);
std::string cmd = std::string("calc_colvar_forces ")
+ cvm::to_str(cvm::step_absolute());
int err = Tcl_Eval(tcl_interp, cmd.c_str());
if (err != TCL_OK) {
cvm::log(std::string("Error while executing calc_colvar_forces:\n"));
cvm::error(Tcl_GetStringResult(tcl_interp));
return COLVARS_ERROR;
}
return cvm::get_error();
#else
return COLVARS_NOT_IMPLEMENTED;
#endif
}
int colvarproxy_tcl::tcl_run_colvar_callback(
std::string const &name,
std::vector<const colvarvalue *> const &cvc_values,
colvarvalue &value)
{
#if defined(COLVARS_TCL)
Tcl_Interp *const tcl_interp = reinterpret_cast<Tcl_Interp *>(_tcl_interp);
size_t i;
std::string cmd = std::string("calc_") + name;
for (i = 0; i < cvc_values.size(); i++) {
cmd += std::string(" {") + (*(cvc_values[i])).to_simple_string() +
std::string("}");
}
int err = Tcl_Eval(tcl_interp, cmd.c_str());
const char *result = Tcl_GetStringResult(tcl_interp);
if (err != TCL_OK) {
return cvm::error(std::string("Error while executing ")
+ cmd + std::string(":\n") +
std::string(Tcl_GetStringResult(tcl_interp)), COLVARS_ERROR);
}
std::istringstream is(result);
if (value.from_simple_string(is.str()) != COLVARS_OK) {
cvm::log("Error parsing colvar value from script:");
cvm::error(result);
return COLVARS_ERROR;
}
return cvm::get_error();
#else
return COLVARS_NOT_IMPLEMENTED;
#endif
}
int colvarproxy_tcl::tcl_run_colvar_gradient_callback(
std::string const &name,
std::vector<const colvarvalue *> const &cvc_values,
std::vector<cvm::matrix2d<cvm::real> > &gradient)
{
#if defined(COLVARS_TCL)
Tcl_Interp *const tcl_interp = reinterpret_cast<Tcl_Interp *>(_tcl_interp);
size_t i;
std::string cmd = std::string("calc_") + name + "_gradient";
for (i = 0; i < cvc_values.size(); i++) {
cmd += std::string(" {") + (*(cvc_values[i])).to_simple_string() +
std::string("}");
}
int err = Tcl_Eval(tcl_interp, cmd.c_str());
if (err != TCL_OK) {
return cvm::error(std::string("Error while executing ")
+ cmd + std::string(":\n") +
std::string(Tcl_GetStringResult(tcl_interp)), COLVARS_ERROR);
}
Tcl_Obj **list;
int n;
Tcl_ListObjGetElements(tcl_interp, Tcl_GetObjResult(tcl_interp),
&n, &list);
if (n != int(gradient.size())) {
cvm::error("Error parsing list of gradient values from script: found "
+ cvm::to_str(n) + " values instead of " +
cvm::to_str(gradient.size()));
return COLVARS_ERROR;
}
for (i = 0; i < gradient.size(); i++) {
std::istringstream is(Tcl_GetString(list[i]));
if (gradient[i].from_simple_string(is.str()) != COLVARS_OK) {
cvm::log("Gradient matrix size: " + cvm::to_str(gradient[i].size()));
cvm::log("Gradient string: " + cvm::to_str(Tcl_GetString(list[i])));
cvm::error("Error parsing gradient value from script", COLVARS_ERROR);
return COLVARS_ERROR;
}
}
return cvm::get_error();
#else
return COLVARS_NOT_IMPLEMENTED;
#endif
}
colvarproxy_io::colvarproxy_io() {}
@ -541,6 +682,7 @@ colvarproxy::colvarproxy()
{
colvars = NULL;
b_simulation_running = true;
b_delete_requested = false;
}
@ -556,6 +698,14 @@ int colvarproxy::reset()
}
int colvarproxy::request_deletion()
{
return cvm::error("Error: \"delete\" command is only available in VMD; "
"please use \"reset\" instead.\n",
COLVARS_NOT_IMPLEMENTED);
}
int colvarproxy::setup()
{
return COLVARS_OK;
@ -579,13 +729,3 @@ size_t colvarproxy::restart_frequency()
return 0;
}


@ -415,7 +415,7 @@ public:
};
/// Method for scripting language interface (Tcl or Python)
/// Methods for scripting language interface (Tcl or Python)
class colvarproxy_script {
public:
@ -427,7 +427,7 @@ public:
virtual ~colvarproxy_script();
/// Convert a script object (Tcl or Python call argument) to a C string
virtual char *script_obj_to_str(unsigned char *obj);
virtual char const *script_obj_to_str(unsigned char *obj);
/// Pointer to the scripting interface object
/// (does not need to be allocated in a new interface)
@ -454,6 +454,46 @@ public:
};
/// Methods for using Tcl within Colvars
class colvarproxy_tcl {
public:
/// Constructor
colvarproxy_tcl();
/// Destructor
virtual ~colvarproxy_tcl();
/// Is Tcl available? (trigger initialization if needed)
int tcl_available();
/// Tcl implementation of script_obj_to_str()
char const *tcl_obj_to_str(unsigned char *obj);
/// Run a user-defined colvar forces script
int tcl_run_force_callback();
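/// Run a user-defined collective variable script (evaluates the Tcl proc calc_<name>)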
int tcl_run_colvar_callback(
std::string const &name,
std::vector<const colvarvalue *> const &cvcs,
colvarvalue &value);
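/// Run a user-defined collective variable gradient script (evaluates calc_<name>_gradient)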
int tcl_run_colvar_gradient_callback(
std::string const &name,
std::vector<const colvarvalue *> const &cvcs,
std::vector<cvm::matrix2d<cvm::real> > &gradient);
protected:
/// Pointer to Tcl interpreter object
void *_tcl_interp;
/// Set Tcl pointers
virtual void init_tcl_pointers();
};
/// Methods for data input/output
class colvarproxy_io {
@ -540,6 +580,7 @@ class colvarproxy
public colvarproxy_smp,
public colvarproxy_replicas,
public colvarproxy_script,
public colvarproxy_tcl,
public colvarproxy_io
{
@ -554,6 +595,15 @@ public:
/// Destructor
virtual ~colvarproxy();
/// Request deallocation of the module (currently only implemented by VMD)
virtual int request_deletion();
/// Whether deallocation was requested
inline bool delete_requested()
{
return b_delete_requested;
}
/// \brief Reset proxy state, e.g. requested atoms
virtual int reset();
@ -591,6 +641,9 @@ protected:
/// Whether a simulation is running (warn against irrecoverable errors)
bool b_simulation_running;
/// Whether the entire module should be deallocated by the host engine
bool b_delete_requested;
};


@ -1,5 +1,5 @@
#ifndef COLVARS_VERSION
#define COLVARS_VERSION "2017-10-20"
#define COLVARS_VERSION "2018-01-17"
// This file is part of the Collective Variables module (Colvars).
// The original version of Colvars and its updates are located at:
// https://github.com/colvars/colvars


@ -74,7 +74,9 @@ int colvarscript::run(int objc, unsigned char *const objv[])
}
if (objc < 2) {
return exec_command(cv_help, NULL, objc, objv);
set_str_result("No commands given: use \"cv help\" "
"for a list of commands.");
return COLVARSCRIPT_ERROR;
}
std::string const cmd(obj_to_str(objv[1]));
@ -123,8 +125,7 @@ int colvarscript::run(int objc, unsigned char *const objv[])
if (cmd == "delete") {
// Note: the delete bit may be ignored by some backends
// it is mostly useful in VMD
colvars->set_error_bits(DELETE_COLVARS);
return COLVARS_OK;
return proxy->request_deletion();
}
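// With this change, a host engine (e.g. VMD) can poll
// colvarproxy::delete_requested() instead of checking the
// retired DELETE_COLVARS error bit.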
if (cmd == "update") {
@ -272,6 +273,11 @@ int colvarscript::proc_colvar(colvar *cv, int objc, unsigned char *const objv[])
return COLVARS_OK;
}
if (subcmd == "run_ave") {
result = (cv->run_ave()).to_simple_string();
return COLVARS_OK;
}
if (subcmd == "width") {
result = cvm::to_str(cv->width, 0, cvm::cv_prec);
return COLVARS_OK;


@ -1,5 +1,49 @@
# Change Log
## [2.6.00](https://github.com/kokkos/kokkos/tree/2.6.00) (2018-03-07)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.5.00...2.6.00)
**Part of the Kokkos C++ Performance Portability Programming EcoSystem 2.6**
**Implemented enhancements:**
- Support NVIDIA Volta microarchitecture [\#1466](https://github.com/kokkos/kokkos/issues/1466)
- Kokkos - Define empty functions when profiling disabled [\#1424](https://github.com/kokkos/kokkos/issues/1424)
- Don't use \_\_constant\_\_ cache for lock arrays, enable once per run update instead of once per call [\#1385](https://github.com/kokkos/kokkos/issues/1385)
- task dag enhancement. [\#1354](https://github.com/kokkos/kokkos/issues/1354)
- Cuda task team collectives and stack size [\#1353](https://github.com/kokkos/kokkos/issues/1353)
- Replace View operator acceptance of more than rank integers with 'access' function [\#1333](https://github.com/kokkos/kokkos/issues/1333)
- Interoperability: Do not shut down backend execution space runtimes upon calling finalize. [\#1305](https://github.com/kokkos/kokkos/issues/1305)
- shmem\_size for LayoutStride [\#1291](https://github.com/kokkos/kokkos/issues/1291)
- Kokkos::resize performs poorly on 1D Views [\#1270](https://github.com/kokkos/kokkos/issues/1270)
- stride\(\) is inconsistent with dimension\(\), extent\(\), etc. [\#1214](https://github.com/kokkos/kokkos/issues/1214)
- Kokkos::sort defaults to std::sort on host [\#1208](https://github.com/kokkos/kokkos/issues/1208)
- DynamicView with host size grow [\#1206](https://github.com/kokkos/kokkos/issues/1206)
- Unmanaged View with Anonymous Memory Space [\#1175](https://github.com/kokkos/kokkos/issues/1175)
- Sort subset of Kokkos::DynamicView [\#1160](https://github.com/kokkos/kokkos/issues/1160)
- MDRange policy doesn't support lambda reductions [\#1054](https://github.com/kokkos/kokkos/issues/1054)
- Add ability to set hook on Kokkos::finalize [\#714](https://github.com/kokkos/kokkos/issues/714)
- Atomics with Serial Backend - Default should be Disable? [\#549](https://github.com/kokkos/kokkos/issues/549)
- KOKKOS\_ENABLE\_DEPRECATED\_CODE [\#1359](https://github.com/kokkos/kokkos/issues/1359)
**Fixed bugs:**
- cuda\_internal\_maximum\_warp\_count returns 8, but I believe it should return 16 for P100 [\#1269](https://github.com/kokkos/kokkos/issues/1269)
- Cuda: level 1 scratch memory bug \(reported by Stan Moore\) [\#1434](https://github.com/kokkos/kokkos/issues/1434)
- MDRangePolicy Reduction requires value\_type typedef in Functor [\#1379](https://github.com/kokkos/kokkos/issues/1379)
- Kokkos DeepCopy between empty views fails [\#1369](https://github.com/kokkos/kokkos/issues/1369)
- Several issues with new CMake build infrastructure \(reported by Eric Phipps\) [\#1365](https://github.com/kokkos/kokkos/issues/1365)
- deep\_copy between rank-1 host/device views of differing layouts without UVM no longer works \(reported by Eric Phipps\) [\#1363](https://github.com/kokkos/kokkos/issues/1363)
- Profiling can't be disabled in CMake, and a parallel\_for is missing for tasks \(reported by Kyungjoo Kim\) [\#1349](https://github.com/kokkos/kokkos/issues/1349)
- get\_work\_partition int overflow \(reported by berryj5\) [\#1327](https://github.com/kokkos/kokkos/issues/1327)
- Kokkos::deep\_copy must fence even if the two views are the same [\#1303](https://github.com/kokkos/kokkos/issues/1303)
- CudaUVMSpace::allocate/deallocate must fence [\#1302](https://github.com/kokkos/kokkos/issues/1302)
- ViewResize on CUDA fails in Debug because of too many resources requested [\#1299](https://github.com/kokkos/kokkos/issues/1299)
- Cuda 9 and intrepid2 calls from Panzer. [\#1183](https://github.com/kokkos/kokkos/issues/1183)
- Slowdown due to tracking\_enabled\(\) in 2.04.00 \(found by Albany app\) [\#1016](https://github.com/kokkos/kokkos/issues/1016)
- Bounds checking fails with zero-span Views \(reported by Stan Moore\) [\#1411](https://github.com/kokkos/kokkos/issues/1411)
## [2.5.00](https://github.com/kokkos/kokkos/tree/2.5.00) (2017-12-15)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.04.11...2.5.00)


@ -7,7 +7,7 @@ ELSE()
ENDIF()
IF(NOT KOKKOS_HAS_TRILINOS)
cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
cmake_minimum_required(VERSION 3.3 FATAL_ERROR)
# Define Project Name if this is a standalone build
IF(NOT DEFINED ${PROJECT_NAME})
@ -37,9 +37,19 @@ IF(NOT KOKKOS_HAS_TRILINOS)
COMMAND ${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings
WORKING_DIRECTORY "${Kokkos_BINARY_DIR}"
OUTPUT_FILE ${Kokkos_BINARY_DIR}/core_src_make.out
RESULT_VARIABLE res
RESULT_VARIABLE GEN_SETTINGS_RESULT
)
if (GEN_SETTINGS_RESULT)
message(FATAL_ERROR "Kokkos settings generation failed:\n"
"${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings")
endif()
include(${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake)
string(REPLACE " " ";" KOKKOS_TPL_INCLUDE_DIRS "${KOKKOS_GMAKE_TPL_INCLUDE_DIRS}")
string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_DIRS "${KOKKOS_GMAKE_TPL_LIBRARY_DIRS}")
string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_NAMES "${KOKKOS_GMAKE_TPL_LIBRARY_NAMES}")
list(REMOVE_ITEM KOKKOS_TPL_INCLUDE_DIRS "")
list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_DIRS "")
list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_NAMES "")
set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})
#------------ NOW BUILD ------------------------------------------------------


@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER


@ -19,7 +19,7 @@ snapshot Kokkos from github.com/kokkos to Trilinos.
3) Snapshot the current commit in the Kokkos clone into the Trilinos clone.
This overwrites ${TRILINOS}/packages/kokkos with the content of ${KOKKOS}:
${KOKKOS}/config/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
${KOKKOS}/scripts/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
4) Verify the snapshot commit happened as expected
cd ${TRILINOS}/packages/kokkos


@ -36,7 +36,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER


@ -9,8 +9,8 @@ KOKKOS_DEVICES ?= "OpenMP"
#KOKKOS_DEVICES ?= "Pthreads"
# Options:
# Intel: KNC,KNL,SNB,HSW,BDW,SKX
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61
# ARM: ARMv80,ARMv81,ARMv8-ThunderX
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72
# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2
# IBM: BGQ,Power7,Power8,Power9
# AMD-GPUS: Kaveri,Carrizo,Fiji,Vega
# AMD-CPUS: AMDAVX,Ryzen,Epyc
@ -21,7 +21,7 @@ KOKKOS_DEBUG ?= "no"
KOKKOS_USE_TPLS ?= ""
# Options: c++11,c++1z
KOKKOS_CXX_STANDARD ?= "c++11"
# Options: aggressive_vectorization,disable_profiling
# Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
KOKKOS_OPTIONS ?= ""
# Default settings specific options.
@ -48,6 +48,7 @@ KOKKOS_INTERNAL_USE_MEMKIND := $(call kokkos_has_string,$(KOKKOS_USE_TPLS),exper
KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS := $(call kokkos_has_string,$(KOKKOS_OPTIONS),compiler_warnings)
KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION := $(call kokkos_has_string,$(KOKKOS_OPTIONS),aggressive_vectorization)
KOKKOS_INTERNAL_DISABLE_PROFILING := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_profiling)
KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_deprecated_code)
KOKKOS_INTERNAL_DISABLE_DUALVIEW_MODIFY_CHECK := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_dualview_modify_check)
KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_profile_load_print)
KOKKOS_INTERNAL_CUDA_USE_LDG := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),use_ldg)
@ -93,7 +94,7 @@ KOKKOS_INTERNAL_COMPILER_INTEL := $(call kokkos_has_string,$(KOKKOS_CXX_VE
KOKKOS_INTERNAL_COMPILER_PGI := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),PGI)
KOKKOS_INTERNAL_COMPILER_XL := $(strip $(shell $(CXX) -qversion 2>&1 | grep XL | wc -l))
KOKKOS_INTERNAL_COMPILER_CRAY := $(strip $(shell $(CXX) -craype-verbose 2>&1 | grep "CC-" | wc -l))
KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
KOKKOS_INTERNAL_COMPILER_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),clang)
KOKKOS_INTERNAL_COMPILER_APPLE_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),apple-darwin)
KOKKOS_INTERNAL_COMPILER_HCC := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),HCC)
@ -229,12 +230,16 @@ KOKKOS_INTERNAL_USE_ARCH_MAXWELL52 := $(call kokkos_has_string,$(KOKKOS_ARCH),Ma
KOKKOS_INTERNAL_USE_ARCH_MAXWELL53 := $(call kokkos_has_string,$(KOKKOS_ARCH),Maxwell53)
KOKKOS_INTERNAL_USE_ARCH_PASCAL61 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal61)
KOKKOS_INTERNAL_USE_ARCH_PASCAL60 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal60)
KOKKOS_INTERNAL_USE_ARCH_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
KOKKOS_INTERNAL_USE_ARCH_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)
KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLER30) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER32) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER35) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -249,6 +254,8 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -267,7 +274,8 @@ endif
KOKKOS_INTERNAL_USE_ARCH_ARMV80 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv80)
KOKKOS_INTERNAL_USE_ARCH_ARMV81 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv81)
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-ThunderX)
KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX) | bc))
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-TX2)
KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2) | bc))
# IBM based.
KOKKOS_INTERNAL_USE_ARCH_BGQ := $(call kokkos_has_string,$(KOKKOS_ARCH),BGQ)
@ -316,6 +324,9 @@ endif
# Generating the list of Flags.
KOKKOS_CPPFLAGS = -I./ -I$(KOKKOS_PATH)/core/src -I$(KOKKOS_PATH)/containers/src -I$(KOKKOS_PATH)/algorithms/src
KOKKOS_TPL_INCLUDE_DIRS =
KOKKOS_TPL_LIBRARY_DIRS =
KOKKOS_TPL_LIBRARY_NAMES =
KOKKOS_CXXFLAGS =
ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
@ -323,7 +334,9 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
endif
KOKKOS_LIBS = -ldl
KOKKOS_TPL_LIBRARY_NAMES += dl
KOKKOS_LDFLAGS = -L$(shell pwd)
KOKKOS_LINK_FLAGS =
KOKKOS_SRC =
KOKKOS_HEADERS =
@ -437,21 +450,32 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT), 1)
endif
ifeq ($(KOKKOS_INTERNAL_USE_HWLOC), 1)
KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
ifneq ($(HWLOC_PATH),)
KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
KOKKOS_TPL_INCLUDE_DIRS += $(HWLOC_PATH)/include
KOKKOS_TPL_LIBRARY_DIRS += $(HWLOC_PATH)/lib
endif
KOKKOS_LIBS += -lhwloc
KOKKOS_TPL_LIBRARY_NAMES += hwloc
tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HWLOC")
endif
ifeq ($(KOKKOS_INTERNAL_USE_LIBRT), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_USE_LIBRT")
KOKKOS_LIBS += -lrt
KOKKOS_TPL_LIBRARY_NAMES += rt
endif
ifeq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
ifneq ($(MEMKIND_PATH),)
KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
KOKKOS_TPL_INCLUDE_DIRS += $(MEMKIND_PATH)/include
KOKKOS_TPL_LIBRARY_DIRS += $(MEMKIND_PATH)/lib
endif
KOKKOS_LIBS += -lmemkind -lnuma
KOKKOS_TPL_LIBRARY_NAMES += memkind numa
tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HBWSPACE")
endif
@ -459,6 +483,10 @@ ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 0)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_PROFILING")
endif
ifeq ($(KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE), 0)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEPRECATED_CODE")
endif
tmp := $(call kokkos_append_header,"/* Optimization Settings */")
ifeq ($(KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION), 1)
@ -560,6 +588,24 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX), 1)
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV81")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV8_THUNDERX2")
ifeq ($(KOKKOS_INTERNAL_COMPILER_CRAY), 1)
KOKKOS_CXXFLAGS +=
KOKKOS_LDFLAGS +=
else
ifeq ($(KOKKOS_INTERNAL_COMPILER_PGI), 1)
KOKKOS_CXXFLAGS +=
KOKKOS_LDFLAGS +=
else
KOKKOS_CXXFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
KOKKOS_LDFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
endif
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_SSE42), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_SSE42")
@ -754,10 +800,11 @@ endif
ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
ifeq ($(KOKKOS_INTERNAL_COMPILER_NVCC), 1)
KOKKOS_INTERNAL_CUDA_ARCH_FLAG=-arch
endif
ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
KOKKOS_INTERNAL_CUDA_ARCH_FLAG=--cuda-gpu-arch
KOKKOS_CXXFLAGS += -x cuda
else ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
KOKKOS_INTERNAL_CUDA_ARCH_FLAG=--cuda-gpu-arch
KOKKOS_CXXFLAGS += -x cuda
else
$(error Makefile.kokkos: CUDA is enabled but the compiler is neither NVCC nor Clang)
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_KEPLER30), 1)
@ -805,6 +852,16 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_PASCAL61")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_61
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA70), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA70")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_70
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA72), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA72")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_72
endif
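# e.g. a build configured with KOKKOS_ARCH=Volta70 now appends -arch=sm_70
# (or --cuda-gpu-arch=sm_70 under Clang) and defines KOKKOS_ARCH_VOLTA70.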
ifneq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)
@ -850,6 +907,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_ROCM), 1)
KOKKOS_CXXFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --cxxflags)
KOKKOS_LDFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --ldflags) -lhc_am -lm
KOKKOS_TPL_LIBRARY_NAMES += hc_am m
KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_ROCM_ARCH_FLAG)
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/ROCm/*.cpp)
@ -880,13 +938,17 @@ KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/containers/src/impl/*.cpp)
ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.cpp)
KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.hpp)
KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
KOKKOS_LIBS += -lcudart -lcuda
ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
ifneq ($(CUDA_PATH),)
KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
KOKKOS_TPL_INCLUDE_DIRS += $(CUDA_PATH)/include
KOKKOS_TPL_LIBRARY_DIRS += $(CUDA_PATH)/lib64
ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
endif
endif
KOKKOS_LIBS += -lcudart -lcuda
KOKKOS_TPL_LIBRARY_NAMES += cudart cuda
endif
ifeq ($(KOKKOS_INTERNAL_USE_OPENMPTARGET), 1)
@ -911,20 +973,27 @@ ifeq ($(KOKKOS_INTERNAL_USE_OPENMP), 1)
endif
KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
KOKKOS_LINK_FLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
endif
ifeq ($(KOKKOS_INTERNAL_USE_PTHREADS), 1)
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.cpp)
KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.hpp)
KOKKOS_LIBS += -lpthread
KOKKOS_TPL_LIBRARY_NAMES += pthread
endif
ifeq ($(KOKKOS_INTERNAL_USE_QTHREADS), 1)
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.cpp)
KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.hpp)
KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
ifneq ($(QTHREADS_PATH),)
KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
KOKKOS_TPL_INCLUDE_DIRS += $(QTHREADS_PATH)/include
KOKKOS_TPL_LIBRARY_DIRS += $(QTHREADS_PATH)/lib64
endif
KOKKOS_LIBS += -lqthread
KOKKOS_TPL_LIBRARY_NAMES += qthread
endif
# Explicitly set the GCC Toolchain for Clang.
@ -940,11 +1009,6 @@ ifneq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_HBWSpace.cpp,$(KOKKOS_SRC))
endif
# Don't include Kokkos_Profiling_Interface.cpp if not using profiling to avoid a link warning.
ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 1)
KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_Profiling_Interface.cpp,$(KOKKOS_SRC))
endif
# Don't include Kokkos_Serial.cpp or Kokkos_Serial_Task.cpp if not using Serial
# device to avoid a link warning.
ifneq ($(KOKKOS_INTERNAL_USE_SERIAL), 1)


@ -1,87 +1,101 @@
Kokkos implements a programming model in C++ for writing performance portable
Kokkos Core implements a programming model in C++ for writing performance portable
applications targeting all major HPC platforms. For that purpose it provides
abstractions for both parallel execution of code and data management.
Kokkos is designed to target complex node architectures with N-level memory
hierarchies and multiple types of execution resources. It currently can use
OpenMP, Pthreads and CUDA as backend programming models.
Kokkos is licensed under standard 3-clause BSD terms of use. For specifics
see the LICENSE file contained in the repository or distribution.
Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as
profiling and debugging tools (https://github.com/kokkos/kokkos-tools).
The core developers of Kokkos are Carter Edwards and Christian Trott
at the Computer Science Research Institute of the Sandia National
Laboratories.
# Learning about Kokkos
The KokkosP interface and associated tools are developed by the Application
Performance Team and Kokkos core developers at Sandia National Laboratories.
A programming guide can be found on the Wiki; the API reference is under development.
To learn more about Kokkos consider watching one of our presentations:
GTC 2015:
http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
For questions find us on Slack: https://kokkosteam.slack.com or open a github issue.
A programming guide can be found under doc/Kokkos_PG.pdf. This is an initial version
and feedback is greatly appreciated.
For non-public questions send an email to
crtrott(at)sandia.gov
A separate repository with extensive tutorial material can be found under
https://github.com/kokkos/kokkos-tutorials.
If you have a patch to contribute please feel free to issue a pull request against
the develop branch. For major contributions it is better to contact us first
for guidance.
Furthermore, the 'example/tutorial' directory provides step by step tutorial
examples which explain many of the features of Kokkos. They work with
simple Makefiles. To build with g++ and OpenMP simply type 'make'
in the 'example/tutorial' directory. This will build all examples in the
subfolders. To change the build options refer to the Programming Guide
in the compilation section.
For questions please send an email to
kokkos-users@software.sandia.gov
To learn more about Kokkos consider watching one of our presentations:
* GTC 2015:
- http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
- http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
For non-public questions send an email to
hcedwar(at)sandia.gov and crtrott(at)sandia.gov
============================================================================
====Requirements============================================================
============================================================================
# Contributing to Kokkos
Primary tested compilers on X86 are:
GCC 4.8.4
GCC 4.9.3
GCC 5.1.0
GCC 5.3.0
GCC 6.1.0
Intel 15.0.2
Intel 16.0.1
Intel 17.1.043
Intel 17.4.196
Intel 18.0.128
Clang 3.5.2
Clang 3.6.1
Clang 3.7.1
Clang 3.8.1
Clang 3.9.0
Clang 4.0.0
Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
PGI 17.10
NVCC 7.0 for CUDA (with gcc 4.8.4)
NVCC 7.5 for CUDA (with gcc 4.8.4)
NVCC 8.0.44 for CUDA (with gcc 5.3.0)
We are open to, and try to encourage, contributions from external developers.
To do so please first open an issue describing the contribution and then issue
a pull request against the develop branch. For larger features it may be good
to get guidance from the core development team first through the GitHub issue.
Primary tested compilers on Power 8 are:
GCC 5.4.0 (OpenMP,Serial)
IBM XL 13.1.5 (OpenMP, Serial) (There is a workaround in place to avoid a compiler bug)
NVCC 8.0.44 for CUDA (with gcc 5.4.0)
NVCC 9.0.103 for CUDA (with gcc 6.3.0)
Note that Kokkos Core is licensed under standard 3-clause BSD terms of use.
This means that contributing to Kokkos allows anyone else to use your contributions,
not just for public purposes but also for closed-source commercial projects.
For specifics see the LICENSE file contained in the repository or distribution.
Primary tested compilers on Intel KNL are:
GCC 6.2.0
Intel 16.4.258 (with gcc 4.7.2)
Intel 17.2.174 (with gcc 4.9.3)
Intel 18.0.128 (with gcc 4.9.3)
# Requirements
Other compilers working:
X86:
Cygwin 2.1.0 64bit with gcc 4.9.3
### Primary tested compilers on X86 are:
* GCC 4.8.4
* GCC 4.9.3
* GCC 5.1.0
* GCC 5.3.0
* GCC 6.1.0
* Intel 15.0.2
* Intel 16.0.1
* Intel 17.1.043
* Intel 17.4.196
* Intel 18.0.128
* Clang 3.6.1
* Clang 3.7.1
* Clang 3.8.1
* Clang 3.9.0
* Clang 4.0.0
* Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
* Clang 6.0.0 for CUDA (CUDA Toolkit 9.1)
* PGI 17.10
* NVCC 7.0 for CUDA (with gcc 4.8.4)
* NVCC 7.5 for CUDA (with gcc 4.8.4)
* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
* NVCC 9.1 for CUDA (with gcc 6.1.0)
Known non-working combinations:
Power8:
Pthreads backend
### Primary tested compilers on Power 8 are:
* GCC 5.4.0 (OpenMP,Serial)
* IBM XL 13.1.6 (OpenMP, Serial)
* NVCC 8.0.44 for CUDA (with gcc 5.4.0)
* NVCC 9.0.103 for CUDA (with gcc 6.3.0 and XL 13.1.6)
### Primary tested compilers on Intel KNL are:
* GCC 6.2.0
* Intel 16.4.258 (with gcc 4.7.2)
* Intel 17.2.174 (with gcc 4.9.3)
* Intel 18.0.128 (with gcc 4.9.3)
### Primary tested compilers on ARM
* GCC 6.1.0
### Other compilers working:
* X86:
- Cygwin 2.1.0 64bit with gcc 4.9.3
### Known non-working combinations:
* Power8:
- Pthreads backend
* ARM
- Pthreads backend
Primary tested compilers are passing in release mode
@ -97,20 +111,7 @@ NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitiali
Other compilers are tested occasionally, in particular when pushing from develop to
master branch, without -Werror and only for a select set of backends.
============================================================================
====Getting started=========================================================
============================================================================
In the 'example/tutorial' directory you will find step by step tutorial
examples which explain many of the features of Kokkos. They work with
simple Makefiles. To build with g++ and OpenMP simply type 'make'
in the 'example/tutorial' directory. This will build all examples in the
subfolders. To change the build options refer to the Programming Guide
in the compilation section.
============================================================================
====Running Unit Tests======================================================
============================================================================
# Running Unit Tests
To run the unit tests create a build directory and run the following commands
@ -121,30 +122,35 @@ make test
Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
changing the device type for which to build.
============================================================================
====Install the library=====================================================
============================================================================
# Installing the library
To install Kokkos as a library create a build directory and run the following
KOKKOS_PATH/generate_makefile.bash --prefix=INSTALL_PATH
make lib
make kokkoslib
make install
KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
changing the device type for which to build.
============================================================================
====CMakeFiles==============================================================
============================================================================
Note that in many cases it is preferable to build Kokkos inline with an
application. The main reason is that you may otherwise need many different
configurations of Kokkos installed depending on the required compile time
features an application needs. For example there is only one default
execution space, which means you need different installations to have OpenMP
or Pthreads as the default space. Also for the CUDA backend there are certain
choices, such as allowing relocatable device code, which must be made at
installation time. Building Kokkos inline uses largely the same process
as compiling an application against an installed Kokkos library. See for
example benchmarks/bytes_and_flops/Makefile which can be used with an installed
library and for an inline build.
The CMake files contained in this repository require Tribits and are used
for integration with Trilinos. They do not currently support a standalone
CMake build.
### CMake
===========================================================================
====Kokkos and CUDA UVM====================================================
===========================================================================
Kokkos supports being built as part of a CMake application. An example can
be found in example/cmake_build.
# Kokkos and CUDA UVM
Kokkos does support UVM as a specific memory space called CudaUVMSpace.
Allocations made with that space are accessible from host and device.
@ -154,25 +160,16 @@ In either case UVM comes with a number of restrictions:
running. This will lead to segfaults. To avoid that you either need to
call Kokkos::Cuda::fence() (or just Kokkos::fence()) after kernels, or
you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
Furthermore in multi socket multi GPU machines, UVM defaults to using
zero copy allocations for technical reasons related to using multiple
Furthermore, in multi-socket multi-GPU machines without NVLINK, UVM defaults
to using zero copy allocations for technical reasons related to using multiple
GPUs from the same process. If an executable doesn't do that (e.g. each
MPI rank of an application uses a single GPU [can be the same GPU for
multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
This will enforce proper UVM allocations, but can lead to errors if
more than a single GPU is used by a single process.
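A minimal sketch of the fencing rule described above, assuming a CUDA-enabled Kokkos build with lambda support; the view name and kernel label are illustrative only.

#include <Kokkos_Core.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    // UVM allocation: the same memory is visible from host and device.
    Kokkos::View<double*, Kokkos::CudaUVMSpace> a("a", 1000);
    Kokkos::parallel_for("fill", 1000, KOKKOS_LAMBDA(const int i) {
      a(i) = 2.0 * i;
    });
    // Required before the host touches 'a' while a kernel may still be
    // running; alternatively set CUDA_LAUNCH_BLOCKING=1 in the environment.
    Kokkos::fence();
    double first = a(0);  // safe host access after the fence
    (void)first;
  }
  Kokkos::finalize();
  return 0;
}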
===========================================================================
====Contributing===========================================================
===========================================================================
Contributions to Kokkos are welcome. In order to do so, please open an issue
where a feature request or bug can be discussed. Then issue a pull request
with your contribution. Pull requests must be issued against the develop branch.
===========================================================================
====Citing Kokkos==========================================================
===========================================================================
# Citing Kokkos
If you publish work which mentions Kokkos, please cite the following paper:


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -1530,7 +1530,7 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,1,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0()))
if(idx<static_cast<IndexType>(a.extent(0)))
a(idx) = Rand::draw(gen,range);
}
rand_pool.free_state(gen);
@ -1555,8 +1555,8 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,2,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
a(idx,k) = Rand::draw(gen,range);
}
}
@ -1583,9 +1583,9 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,3,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
a(idx,k,l) = Rand::draw(gen,range);
}
}
@ -1611,10 +1611,10 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,4, IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
a(idx,k,l,m) = Rand::draw(gen,range);
}
}
@ -1640,11 +1640,11 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,5,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
a(idx,k,l,m,n) = Rand::draw(gen,range);
}
}
@ -1670,12 +1670,12 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,6,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
a(idx,k,l,m,n,o) = Rand::draw(gen,range);
}
}
@ -1701,13 +1701,13 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,7,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
a(idx,k,l,m,n,o,p) = Rand::draw(gen,range);
}
}
@ -1733,14 +1733,14 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,8,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,range);
}
}
@ -1765,7 +1765,7 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,1,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0()))
if(idx<static_cast<IndexType>(a.extent(0)))
a(idx) = Rand::draw(gen,begin,end);
}
rand_pool.free_state(gen);
@ -1790,8 +1790,8 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,2,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
a(idx,k) = Rand::draw(gen,begin,end);
}
}
@ -1818,9 +1818,9 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,3,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
a(idx,k,l) = Rand::draw(gen,begin,end);
}
}
@ -1846,10 +1846,10 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,4,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
a(idx,k,l,m) = Rand::draw(gen,begin,end);
}
}
@ -1875,11 +1875,11 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,5,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())){
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_1());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_2());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_3());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_4());o++)
if(idx<static_cast<IndexType>(a.extent(0))){
for(IndexType l=0;l<static_cast<IndexType>(a.extent(1));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(2));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(3));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(4));o++)
a(idx,l,m,n,o) = Rand::draw(gen,begin,end);
}
}
@ -1905,12 +1905,12 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,6,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
a(idx,k,l,m,n,o) = Rand::draw(gen,begin,end);
}
}
@ -1937,13 +1937,13 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,7,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
a(idx,k,l,m,n,o,p) = Rand::draw(gen,begin,end);
}
}
@ -1969,14 +1969,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
typename RandomPool::generator_type gen = rand_pool.get_state();
for(IndexType j=0;j<loops;j++) {
const IndexType idx = i*loops+j;
if(idx<static_cast<IndexType>(a.dimension_0())) {
for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
if(idx<static_cast<IndexType>(a.extent(0))) {
for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,begin,end);
}
}
@ -1988,14 +1988,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
template<class ViewType, class RandomPool, class IndexType = int64_t>
void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type range) {
int64_t LDA = a.dimension_0();
int64_t LDA = a.extent(0);
if(LDA>0)
parallel_for((LDA+127)/128,Impl::fill_random_functor_range<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,range));
}
template<class ViewType, class RandomPool, class IndexType = int64_t>
void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type begin,typename ViewType::const_value_type end ) {
int64_t LDA = a.dimension_0();
int64_t LDA = a.extent(0);
if(LDA>0)
parallel_for((LDA+127)/128,Impl::fill_random_functor_begin_end<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,begin,end));
}
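
For context, the functors above back the public Kokkos::fill_random() entry points at the end of this hunk; the change is purely mechanical, swapping the deprecated dimension_N() accessors for extent(N). A minimal usage sketch (names, sizes, and the seed are illustrative, assuming the Kokkos 2.6-era headers shown in this diff):

#include <Kokkos_Core.hpp>
#include <Kokkos_Random.hpp>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    // Rank-2 view: 1000 x 3 doubles in the default execution space.
    Kokkos::View<double*[3]> a("A", 1000);
    // Pool of counter-based generators; 1931 is an arbitrary seed.
    Kokkos::Random_XorShift64_Pool<> pool(1931);
    // Draw uniformly from [10.0, 20.0); internally each thread guards
    // its indices with idx < a.extent(0), as in the functors above.
    Kokkos::fill_random(a, pool, 10.0, 20.0);
  }
  Kokkos::finalize();
  return 0;
}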

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -120,7 +120,6 @@ public:
KOKKOS_INLINE_FUNCTION
void operator() (const int& i) const {
// printf("copy: dst(%i) src(%i)\n",i+dst_offset,i);
copy_op::copy(dst_values,i+dst_offset,src_values,i);
}
};
@ -151,20 +150,22 @@ public:
DstViewType dst_values ;
perm_view_type sort_order ;
src_view_type src_values ;
int src_offset ;
copy_permute_functor( DstViewType const & dst_values_
, PermuteViewType const & sort_order_
, SrcViewType const & src_values_
, int const & src_offset_
)
: dst_values( dst_values_ )
, sort_order( sort_order_ )
, src_values( src_values_ )
, src_offset( src_offset_ )
{}
KOKKOS_INLINE_FUNCTION
void operator() (const int& i) const {
// printf("copy_permute: dst(%i) src(%i)\n",i,sort_order(i));
copy_op::copy(dst_values,i,src_values,sort_order(i));
copy_op::copy(dst_values,i,src_values,src_offset+sort_order(i));
}
};
@ -259,19 +260,21 @@ public:
// Create the permutation vector, the bin_offset array and the bin_count array. Can be called again if keys changed
void create_permute_vector() {
const size_t len = range_end - range_begin ;
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
Kokkos::parallel_scan(Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
Kokkos::parallel_for ("Kokkos::Sort::BinCount",Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
Kokkos::parallel_scan("Kokkos::Sort::BinOffset",Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
Kokkos::deep_copy(bin_count_atomic,0);
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
Kokkos::parallel_for ("Kokkos::Sort::BinBinning",Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
if(sort_within_bins)
Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
Kokkos::parallel_for ("Kokkos::Sort::BinSort",Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
}
// Sort a view with respect ot the first dimension using the permutation array
// Sort a subset of a view with respect to the first dimension using the permutation array
template<class ValuesViewType>
void sort( ValuesViewType const & values)
void sort( ValuesViewType const & values
, int values_range_begin
, int values_range_end) const
{
typedef
Kokkos::View< typename ValuesViewType::data_type,
@ -280,6 +283,10 @@ public:
scratch_view_type ;
const size_t len = range_end - range_begin ;
const size_t values_len = values_range_end - values_range_begin ;
if (len != values_len) {
Kokkos::abort("BinSort::sort: values range length != permutation vector length");
}
scratch_view_type
sorted_values("Scratch",
@ -297,19 +304,25 @@ public:
, offset_type /* PermuteViewType */
, ValuesViewType /* SrcViewType */
>
functor( sorted_values , sort_order , values );
functor( sorted_values , sort_order , values, values_range_begin - range_begin );
parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
parallel_for("Kokkos::Sort::CopyPermute", Kokkos::RangePolicy<execution_space>(0,len),functor);
}
{
copy_functor< ValuesViewType , scratch_view_type >
functor( values , range_begin , sorted_values );
parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
parallel_for("Kokkos::Sort::Copy", Kokkos::RangePolicy<execution_space>(0,len),functor);
}
}
template<class ValuesViewType>
void sort( ValuesViewType const & values ) const
{
this->sort( values, 0, /*values.extent(0)*/ range_end - range_begin );
}
// Get the permutation vector
KOKKOS_INLINE_FUNCTION
offset_type get_permute_vector() const { return sort_order;}
@ -327,7 +340,7 @@ public:
KOKKOS_INLINE_FUNCTION
void operator() (const bin_count_tag& tag, const int& i) const {
const int j = range_begin + i ;
bin_count_atomic(bin_op.bin(keys,j))++;
bin_count_atomic(bin_op.bin(keys, j))++;
}
KOKKOS_INLINE_FUNCTION
@ -512,7 +525,7 @@ void sort( ViewType const & view , bool const always_use_kokkos_sort = false)
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
parallel_reduce(Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
parallel_reduce("Kokkos::Sort::FindExtent",Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
Impl::min_max_functor<ViewType>(view),reducer);
if(result.min_val == result.max_val) return;
BinSort<ViewType, CompType> bin_sort(view,CompType(view.extent(0)/2,result.min_val,result.max_val),true);
@ -532,7 +545,7 @@ void sort( ViewType view
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
parallel_reduce( range_policy( begin , end )
parallel_reduce("Kokkos::Sort::FindExtent", range_policy( begin , end )
, Impl::min_max_functor<ViewType>(view),reducer );
if(result.min_val == result.max_val) return;
@ -541,8 +554,9 @@ void sort( ViewType view
bin_sort(view,begin,end,CompType((end-begin)/2,result.min_val,result.max_val),true);
bin_sort.create_permute_vector();
bin_sort.sort(view);
bin_sort.sort(view,begin,end);
}
}
#endif
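
Taken together, the changes above give every BinSort kernel a profiling label and add a sub-range sort() overload that permutes only values(values_range_begin..values_range_end), aborting if that range's length differs from the permutation vector's. A sketch of the resulting usage pattern (the template and function names here are illustrative, not from the diff):

#include <Kokkos_Core.hpp>
#include <Kokkos_Sort.hpp>

// Sort keys(begin..end) in place and apply the same permutation to a
// coupled values view, via the sub-range overload added above.
template <class KeyView, class ValueView>
void sort_coupled(KeyView keys, ValueView values, int begin, int end,
                  typename KeyView::non_const_value_type min,
                  typename KeyView::non_const_value_type max) {
  using BinOp = Kokkos::BinOp1D<KeyView>;
  BinOp binner((end - begin) / 2, min, max);
  Kokkos::BinSort<KeyView, BinOp> sorter(keys, begin, end, binner, false);
  sorter.create_permute_vector();  // runs the labeled kernels above
  sorter.sort(keys, begin, end);   // new sub-range overload
  sorter.sort(values, begin, end); // same permutation, second view
}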

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -61,14 +61,9 @@ class cuda : public ::testing::Test {
protected:
static void SetUpTestCase()
{
std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Cuda::initialize( Kokkos::Cuda::SelectDevice(0) );
}
static void TearDownTestCase()
{
Kokkos::Cuda::finalize();
Kokkos::HostSpace::execution_space::finalize();
}
};

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -60,25 +60,10 @@ protected:
static void SetUpTestCase()
{
std::cout << std::setprecision(5) << std::scientific;
int threads_count = 0;
#pragma omp parallel
{
#pragma omp atomic
++threads_count;
}
if (threads_count > 3) {
threads_count /= 2;
}
Kokkos::OpenMP::initialize( threads_count );
Kokkos::OpenMP::print_configuration( std::cout );
}
static void TearDownTestCase()
{
Kokkos::OpenMP::finalize();
}
};

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -62,13 +62,9 @@ protected:
static void SetUpTestCase()
{
std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Experimental::ROCm::initialize( Kokkos::Experimental::ROCm::SelectDevice(0) );
}
static void TearDownTestCase()
{
Kokkos::Experimental::ROCm::finalize();
Kokkos::HostSpace::execution_space::finalize();
}
};

View File

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -62,13 +62,10 @@ class serial : public ::testing::Test {
protected:
static void SetUpTestCase()
{
std::cout << std::setprecision (5) << std::scientific;
Kokkos::Serial::initialize ();
}
static void TearDownTestCase ()
{
Kokkos::Serial::finalize ();
}
};

View File

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -171,10 +171,10 @@ void test_3D_sort(unsigned int n) {
double sum_after = 0.0;
unsigned int sort_fails = 0;
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
int bin_1d = 1;
while( bin_1d*bin_1d*bin_1d*4< (int) keys.dimension_0() ) bin_1d*=2;
while( bin_1d*bin_1d*bin_1d*4< (int) keys.extent(0) ) bin_1d*=2;
int bin_max[3] = {bin_1d,bin_1d,bin_1d};
typename KeyViewType::value_type min[3] = {0,0,0};
typename KeyViewType::value_type max[3] = {100,100,100};
@ -186,8 +186,8 @@ void test_3D_sort(unsigned int n) {
Sorter.create_permute_vector();
Sorter.template sort< KeyViewType >(keys);
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
Kokkos::parallel_reduce(keys.dimension_0()-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
Kokkos::parallel_reduce(keys.extent(0)-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
double ratio = sum_before/sum_after;
double epsilon = 1e-10;
@ -205,24 +205,13 @@ void test_3D_sort(unsigned int n) {
template<class ExecutionSpace, typename KeyType>
void test_dynamic_view_sort(unsigned int n )
{
typedef typename ExecutionSpace::memory_space memory_space ;
typedef Kokkos::Experimental::DynamicView<KeyType*,ExecutionSpace> KeyDynamicViewType;
typedef Kokkos::View<KeyType*,ExecutionSpace> KeyViewType;
const size_t upper_bound = 2 * n ;
const size_t min_chunk_size = 1024;
const size_t total_alloc_size = n * sizeof(KeyType) * 1.2 ;
const size_t superblock_size = std::min(total_alloc_size, size_t(1000000));
typename KeyDynamicViewType::memory_pool
pool( memory_space()
, n * sizeof(KeyType) * 1.2
, 500 /* min block size in bytes */
, 30000 /* max block size in bytes */
, superblock_size
);
KeyDynamicViewType keys("Keys",pool,upper_bound);
KeyDynamicViewType keys("Keys", min_chunk_size, upper_bound);
keys.resize_serial(n);
@ -230,13 +219,15 @@ void test_dynamic_view_sort(unsigned int n )
// Test sorting array with all numbers equal
Kokkos::deep_copy(keys_view,KeyType(1));
Kokkos::Experimental::deep_copy(keys,keys_view);
Kokkos::deep_copy(keys,keys_view);
Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931);
Kokkos::fill_random(keys_view,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND);
Kokkos::Experimental::deep_copy(keys,keys_view);
ExecutionSpace::fence();
Kokkos::deep_copy(keys,keys_view);
//ExecutionSpace::fence();
double sum_before = 0.0;
double sum_after = 0.0;
@ -246,7 +237,9 @@ void test_dynamic_view_sort(unsigned int n )
Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
Kokkos::Experimental::deep_copy( keys_view , keys );
ExecutionSpace::fence(); // Need this fence to prevent BusError with Cuda
Kokkos::deep_copy( keys_view , keys );
//ExecutionSpace::fence();
Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_after);
Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys_view),sort_fails);
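
The DynamicView changes above track the Kokkos 2.6 API: the explicit memory_pool argument is gone, the constructor now takes a minimum chunk size (in elements) and an upper bound on the total element count, and plain Kokkos::deep_copy() replaces the Experimental:: variant. A small sketch of the updated allocation pattern (function and variable names are illustrative):

#include <Kokkos_Core.hpp>
#include <Kokkos_DynamicView.hpp>

void make_keys(unsigned n) {
  using DynView = Kokkos::Experimental::DynamicView<int*>;
  // Label, minimum chunk size in elements, upper bound on the extent.
  DynView keys("Keys", /*min_chunk_size=*/1024, /*upper_bound=*/2 * n);
  keys.resize_serial(n);           // grow to n elements (host-serial)
  Kokkos::View<int*> flat("Flat", n);
  Kokkos::deep_copy(keys, flat);   // deep_copy now handles DynamicView
}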
@ -269,6 +262,74 @@ void test_dynamic_view_sort(unsigned int n )
//----------------------------------------------------------------------------
template<class ExecutionSpace>
void test_issue_1160()
{
Kokkos::View<int*, ExecutionSpace> element_("element", 10);
Kokkos::View<double*, ExecutionSpace> x_("x", 10);
Kokkos::View<double*, ExecutionSpace> v_("y", 10);
auto h_element = Kokkos::create_mirror_view(element_);
auto h_x = Kokkos::create_mirror_view(x_);
auto h_v = Kokkos::create_mirror_view(v_);
h_element(0) = 9;
h_element(1) = 8;
h_element(2) = 7;
h_element(3) = 6;
h_element(4) = 5;
h_element(5) = 4;
h_element(6) = 3;
h_element(7) = 2;
h_element(8) = 1;
h_element(9) = 0;
for (int i = 0; i < 10; ++i) {
h_v.access(i, 0) = h_x.access(i, 0) = double(h_element(i));
}
Kokkos::deep_copy(element_, h_element);
Kokkos::deep_copy(x_, h_x);
Kokkos::deep_copy(v_, h_v);
typedef decltype(element_) KeyViewType;
typedef Kokkos::BinOp1D< KeyViewType > BinOp;
int begin = 3;
int end = 8;
auto max = h_element(begin);
auto min = h_element(end - 1);
BinOp binner(end - begin, min, max);
Kokkos::BinSort<KeyViewType , BinOp > Sorter(element_,begin,end,binner,false);
Sorter.create_permute_vector();
Sorter.sort(element_,begin,end);
Sorter.sort(x_,begin,end);
Sorter.sort(v_,begin,end);
Kokkos::deep_copy(h_element, element_);
Kokkos::deep_copy(h_x, x_);
Kokkos::deep_copy(h_v, v_);
ASSERT_EQ(h_element(0), 9);
ASSERT_EQ(h_element(1), 8);
ASSERT_EQ(h_element(2), 7);
ASSERT_EQ(h_element(3), 2);
ASSERT_EQ(h_element(4), 3);
ASSERT_EQ(h_element(5), 4);
ASSERT_EQ(h_element(6), 5);
ASSERT_EQ(h_element(7), 6);
ASSERT_EQ(h_element(8), 1);
ASSERT_EQ(h_element(9), 0);
for (int i = 0; i < 10; ++i) {
ASSERT_EQ(h_element(i), int(h_x.access(i, 0)));
ASSERT_EQ(h_element(i), int(h_v.access(i, 0)));
}
}
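
Note what this regression test checks: only elements 3 through 7 are handed to BinSort (begin = 3, end = 8), so their keys 6,5,4,3,2 come back ascending as 2,3,4,5,6 while everything outside the range, including the misordered tail, is left untouched; the x_ and v_ views end up permuted identically to element_ because all three sort() calls reuse the same permutation vector.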
//----------------------------------------------------------------------------
template<class ExecutionSpace, typename KeyType>
void test_sort(unsigned int N)
{
@ -278,6 +339,7 @@ void test_sort(unsigned int N)
test_3D_sort<ExecutionSpace,KeyType>(N);
test_dynamic_view_sort<ExecutionSpace,KeyType>(N*N);
#endif
test_issue_1160<ExecutionSpace>();
}
}

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
@ -63,25 +63,10 @@ protected:
static void SetUpTestCase()
{
std::cout << std::setprecision(5) << std::scientific;
unsigned num_threads = 4;
if (Kokkos::hwloc::available()) {
num_threads = Kokkos::hwloc::get_available_numa_count()
* Kokkos::hwloc::get_available_cores_per_numa()
// * Kokkos::hwloc::get_available_threads_per_core()
;
}
std::cout << "Threads: " << num_threads << std::endl;
Kokkos::Threads::initialize( num_threads );
}
static void TearDownTestCase()
{
Kokkos::Threads::finalize();
}
};

View File

@ -35,16 +35,20 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER
*/
#include <gtest/gtest.h>
#include <Kokkos_Core.hpp>
int main(int argc, char *argv[]) {
Kokkos::initialize(argc,argv);
::testing::InitGoogleTest(&argc,argv);
return RUN_ALL_TESTS();
int result = RUN_ALL_TESTS();
Kokkos::finalize();
return result;
}
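
This single Kokkos::initialize()/finalize() pair in main() is what allows the per-backend SetUpTestCase()/TearDownTestCase() bodies deleted above (Cuda, OpenMP, ROCm, Serial, Threads) to go away. The RUN_ALL_TESTS() result is captured in a local so that Kokkos::finalize() can run before main() returns; returning the call directly would leave Kokkos initialized while the process exits.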

View File

@ -10,7 +10,7 @@ default: build
ifneq (,$(findstring Cuda,$(KOKKOS_DEVICES)))
CXX = ${KOKKOS_PATH}/config/nvcc_wrapper
CXX = ${KOKKOS_PATH}/bin/nvcc_wrapper
EXE = ${EXE_NAME}.cuda
KOKKOS_CUDA_OPTIONS = "enable_lambda"
else

View File

@ -3,7 +3,7 @@
# BytesAndFlops
cd build/bytes_and_flops
USE_CUDA=`grep "_CUDA 1" KokkosCore_config.h | wc -l`
USE_CUDA=`grep "_CUDA" KokkosCore_config.h | wc -l`
if [[ ${USE_CUDA} > 0 ]]; then
BAF_EXE=bytes_and_flops.cuda
@ -41,4 +41,4 @@ cd ../..
echo "MiniFE: ${FE_PERF_1} ${FE_PERF_2}"
PERF_RESULT=`echo "${BAF_PERF_1} ${BAF_PERF_2} ${MD_PERF_1} ${MD_PERF_2} ${FE_PERF_1} ${FE_PERF_2}" | awk '{print ($1+$2+$3+$4+$5+$6)/6}'`
echo "Total Result: " ${PERF_RESULT}
echo "Total Result: " ${PERF_RESULT}

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
//
// ************************************************************************
//@HEADER

View File

@ -2,7 +2,7 @@
# FindHWLOC
# ----------
#
# Try to find HWLOC.
# Try to find HWLOC, based on KOKKOS_HWLOC_DIR
#
# The following variables are defined:
#
@ -10,8 +10,8 @@
# HWLOC_INCLUDE_DIR - HWLOC include directory
# HWLOC_LIBRARIES - Libraries needed to use HWLOC
find_path(HWLOC_INCLUDE_DIR hwloc.h)
find_library(HWLOC_LIBRARIES hwloc)
find_path(HWLOC_INCLUDE_DIR hwloc.h PATHS "${KOKKOS_HWLOC_DIR}/include")
find_library(HWLOC_LIBRARIES hwloc PATHS "${KOKKOS_HWLOC_DIR}/lib")
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(HWLOC DEFAULT_MSG
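
With the PATHS hints above, configuring with -DKOKKOS_HWLOC_DIR=/path/to/hwloc lets find_path() and find_library() locate a non-system hwloc install under its include/ and lib/ subdirectories, while leaving KOKKOS_HWLOC_DIR unset preserves the old system-wide search.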

View File

@ -1,7 +1,3 @@
# kokkos_generated_settings.cmake includes the kokkos library itself in KOKKOS_LIBS
# which we do not want to use for the cmake builds so clean this up
string(REGEX REPLACE "-lkokkos" "" KOKKOS_LIBS ${KOKKOS_LIBS})
############################ Detect if submodule ###############################
#
# With thanks to StackOverflow:
@ -73,6 +69,19 @@ IF(KOKKOS_SEPARATE_LIBS)
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
)
target_include_directories(
kokkoscore
PUBLIC
${KOKKOS_TPL_INCLUDE_DIRS}
)
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
target_link_libraries(kokkoscore PUBLIC ${LIB_${lib}})
endforeach()
target_link_libraries(kokkoscore PUBLIC "${KOKKOS_LINK_FLAGS}")
# Install the kokkoscore library
INSTALL (TARGETS kokkoscore
EXPORT KokkosTargets
@ -81,12 +90,6 @@ IF(KOKKOS_SEPARATE_LIBS)
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
)
TARGET_LINK_LIBRARIES(
kokkoscore
${KOKKOS_LD_FLAGS}
${KOKKOS_EXTRA_LIBS_LIST}
)
# kokkoscontainers
if (DEFINED KOKKOS_CONTAINERS_SRCS)
ADD_LIBRARY(
@ -144,12 +147,19 @@ ELSE()
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
)
TARGET_LINK_LIBRARIES(
target_include_directories(
kokkos
${KOKKOS_LD_FLAGS}
${KOKKOS_EXTRA_LIBS_LIST}
PUBLIC
${KOKKOS_TPL_INCLUDE_DIRS}
)
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
target_link_libraries(kokkos PUBLIC ${LIB_${lib}})
endforeach()
target_link_libraries(kokkos PUBLIC "${KOKKOS_LINK_FLAGS}")
# Install the kokkos library
INSTALL (TARGETS kokkos
EXPORT KokkosTargets

View File

@ -25,11 +25,12 @@ list(APPEND KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST
Cuda_LDG_Intrinsic
Debug
Debug_DualView_Modify_Check
Debug_Bounds_Checkt
Debug_Bounds_Check
Compiler_Warnings
Profiling
Profiling_Load_Print
Aggressive_Vectorization
Deprecated_Code
)
#-------------------------------------------------------------------------------
@ -263,7 +264,8 @@ set(KOKKOS_ENABLE_PROFILING ${KOKKOS_INTERNAL_ENABLE_PROFILING_DEFAULT} CACHE BO
set_kokkos_default_default(PROFILING_LOAD_PRINT OFF)
set(KOKKOS_ENABLE_PROFILING_LOAD_PRINT ${KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT_DEFAULT} CACHE BOOL "Enable profile load print.")
set_kokkos_default_default(DEPRECATED_CODE ON)
set(KOKKOS_ENABLE_DEPRECATED_CODE ${KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE_DEFAULT} CACHE BOOL "Enable deprecated code.")
#-------------------------------------------------------------------------------

View File

@ -14,6 +14,13 @@
#-------------------------------------------------------------------------------
# Ensure that KOKKOS_ARCH is in the ARCH_LIST
if (KOKKOS_ARCH MATCHES ",")
message("-- Detected a comma in: KOKKOS_ARCH=${KOKKOS_ARCH}")
message("-- Although we prefer KOKKOS_ARCH to be semicolon-delimited, we do allow")
message("-- comma-delimited values for compatibility with scripts (see github.com/trilinos/Trilinos/issues/2330)")
string(REPLACE "," ";" KOKKOS_ARCH "${KOKKOS_ARCH}")
message("-- Commas were changed to semicolons, now KOKKOS_ARCH=${KOKKOS_ARCH}")
endif()
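
So, for example, a cache entry such as KOKKOS_ARCH=SNB,Volta70 handed in by a driver script is rewritten to SNB;Volta70 before the per-architecture validation loop below runs.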
foreach(arch ${KOKKOS_ARCH})
list(FIND KOKKOS_ARCH_LIST ${arch} indx)
if (indx EQUAL -1)
@ -23,14 +30,13 @@ foreach(arch ${KOKKOS_ARCH})
endforeach()
# KOKKOS_SETTINGS uses KOKKOS_ARCH
string(REPLACE ";" "," KOKKOS_ARCH "${KOKKOS_ARCH}")
set(KOKKOS_ARCH ${KOKKOS_ARCH})
string(REPLACE ";" "," KOKKOS_GMAKE_ARCH "${KOKKOS_ARCH}")
# From Makefile.kokkos: Options: yes,no
if(${KOKKOS_ENABLE_DEBUG})
set(KOKKOS_DEBUG yes)
set(KOKKOS_GMAKE_DEBUG yes)
else()
set(KOKKOS_DEBUG no)
set(KOKKOS_GMAKE_DEBUG no)
endif()
#------------------------------- KOKKOS_DEVICES --------------------------------
@ -43,10 +49,10 @@ foreach(devopt ${KOKKOS_DEVICES_LIST})
endif ()
endforeach()
# List needs to be comma-delmitted
string(REPLACE ";" "," KOKKOS_DEVICES "${KOKKOS_DEVICESl}")
string(REPLACE ";" "," KOKKOS_GMAKE_DEVICES "${KOKKOS_DEVICESl}")
#------------------------------- KOKKOS_OPTIONS --------------------------------
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
#compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print
set(KOKKOS_OPTIONSl)
@ -57,7 +63,10 @@ if(${KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION})
list(APPEND KOKKOS_OPTIONSl aggressive_vectorization)
endif()
if(NOT ${KOKKOS_ENABLE_PROFILING})
list(APPEND KOKKOS_OPTIONSl disable_vectorization)
list(APPEND KOKKOS_OPTIONSl disable_profiling)
endif()
if(NOT ${KOKKOS_ENABLE_DEPRECATED_CODE})
list(APPEND KOKKOS_OPTIONSl disable_deprecated_code)
endif()
if(NOT ${KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK})
list(APPEND KOKKOS_OPTIONSl disable_dualview_modify_check)
@ -66,7 +75,7 @@ if(${KOKKOS_ENABLE_PROFILING_LOAD_PRINT})
list(APPEND KOKKOS_OPTIONSl enable_profile_load_print)
endif()
# List needs to be comma-delimitted
string(REPLACE ";" "," KOKKOS_OPTIONS "${KOKKOS_OPTIONSl}")
string(REPLACE ";" "," KOKKOS_GMAKE_OPTIONS "${KOKKOS_OPTIONSl}")
#------------------------------- KOKKOS_USE_TPLS -------------------------------
@ -78,19 +87,19 @@ foreach(tplopt ${KOKKOS_USE_TPLS_LIST})
endif ()
endforeach()
# List needs to be comma-delimitted
string(REPLACE ";" "," KOKKOS_USE_TPLS "${KOKKOS_USE_TPLSl}")
string(REPLACE ";" "," KOKKOS_GMAKE_USE_TPLS "${KOKKOS_USE_TPLSl}")
#------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
# Construct the Makefile options
set(KOKKOS_CUDA_OPTIONS)
set(KOKKOS_CUDA_OPTIONSl)
foreach(cudaopt ${KOKKOS_CUDA_OPTIONS_LIST})
if (${KOKKOS_ENABLE_CUDA_${cudaopt}})
list(APPEND KOKKOS_CUDA_OPTIONSl ${KOKKOS_INTERNAL_${cudaopt}})
endif ()
endforeach()
# List needs to be comma-delmitted
string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
string(REPLACE ";" "," KOKKOS_GMAKE_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
#------------------------------- PATH VARIABLES --------------------------------
# Want makefile to use same executables specified which means modifying
@ -100,10 +109,10 @@ string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
set(KOKKOS_INTERNAL_PATHS)
set(addpathl)
foreach(kvar "CUDA;QTHREADS;${KOKKOS_USE_TPLS_LIST}")
foreach(kvar IN LISTS KOKKOS_USE_TPLS_LIST ITEMS CUDA QTHREADS)
if(${KOKKOS_ENABLE_${kvar}})
if(DEFINED KOKKOS_${kvar}_DIR)
set(KOKKOS_INTERNAL_PATHS "${KOKKOS_INTERNAL_PATHS} ${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
set(KOKKOS_INTERNAL_PATHS ${KOKKOS_INTERNAL_PATHS} "${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
if(IS_DIRECTORY ${KOKKOS_${kvar}_DIR}/bin)
list(APPEND addpathl ${KOKKOS_${kvar}_DIR}/bin)
endif()
@ -124,10 +133,9 @@ set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_INSTALL_PATH=${CMAKE_INSTALL_PREFI
# Form of KOKKOS_foo=$KOKKOS_foo
foreach(kvar ARCH;DEVICES;DEBUG;OPTIONS;CUDA_OPTIONS;USE_TPLS)
set(KOKKOS_VAR KOKKOS_${kvar})
if(DEFINED KOKKOS_${kvar})
if (NOT "${${KOKKOS_VAR}}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_VAR}=${${KOKKOS_VAR}})
if(DEFINED KOKKOS_GMAKE_${kvar})
if (NOT "${KOKKOS_GMAKE_${kvar}}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_${kvar}=${KOKKOS_GMAKE_${kvar}})
endif()
endif()
endforeach()
@ -147,7 +155,7 @@ if (NOT "${KOKKOS_INTERNAL_PATHS}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_INTERNAL_PATHS})
endif()
if (NOT "${KOKKOS_INTERNAL_ADDTOPATH}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} PATH=${KOKKOS_INTERNAL_ADDTOPATH}:\${PATH})
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} "PATH=\"${KOKKOS_INTERNAL_ADDTOPATH}:$ENV{PATH}\"")
endif()
# Final form that gets passed to make
@ -185,7 +193,7 @@ if(KOKKOS_CMAKE_VERBOSE)
message(STATUS "")
message(STATUS "Architectures:")
message(STATUS " ${KOKKOS_ARCH}")
message(STATUS " ${KOKKOS_GMAKE_ARCH}")
message(STATUS "")
message(STATUS "Enabled options")
@ -194,43 +202,14 @@ if(KOKKOS_CMAKE_VERBOSE)
message(STATUS " KOKKOS_SEPARATE_LIBS")
endif()
if(KOKKOS_ENABLE_HWLOC)
message(STATUS " KOKKOS_ENABLE_HWLOC")
endif()
if(KOKKOS_ENABLE_MEMKIND)
message(STATUS " KOKKOS_ENABLE_MEMKIND")
endif()
if(KOKKOS_ENABLE_DEBUG)
message(STATUS " KOKKOS_ENABLE_DEBUG")
endif()
if(KOKKOS_ENABLE_PROFILING)
message(STATUS " KOKKOS_ENABLE_PROFILING")
endif()
if(KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION)
message(STATUS " KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION")
endif()
foreach(opt IN LISTS KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST)
string(TOUPPER ${opt} OPT)
if (KOKKOS_ENABLE_${OPT})
message(STATUS " KOKKOS_ENABLE_${OPT}")
endif()
endforeach()
if(KOKKOS_ENABLE_CUDA)
if(KOKKOS_ENABLE_CUDA_LDG_INTRINSIC)
message(STATUS " KOKKOS_ENABLE_CUDA_LDG_INTRINSIC")
endif()
if(KOKKOS_ENABLE_CUDA_UVM)
message(STATUS " KOKKOS_ENABLE_CUDA_UVM")
endif()
if(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
message(STATUS " KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE")
endif()
if(KOKKOS_ENABLE_CUDA_LAMBDA)
message(STATUS " KOKKOS_ENABLE_CUDA_LAMBDA")
endif()
if(KOKKOS_CUDA_DIR)
message(STATUS " KOKKOS_CUDA_DIR: ${KOKKOS_CUDA_DIR}")
endif()

View File

@ -3,7 +3,7 @@ INCLUDE(CTest)
cmake_policy(SET CMP0054 NEW)
MESSAGE(WARNING "The project name is: ${PROJECT_NAME}")
MESSAGE(STATUS "The project name is: ${PROJECT_NAME}")
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP)
SET(${PROJECT_NAME}_ENABLE_OpenMP OFF)
@ -84,9 +84,6 @@ ENDFUNCTION()
MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
message(STATUS "ProjectName: " ${PROJECT_NAME})
message(STATUS "Tests: " ${${PROJECT_NAME}_ENABLE_TESTS})
IF(${${PROJECT_NAME}_ENABLE_TESTS})
FOREACH(TEST_DIR ${ARGN})
ADD_SUBDIRECTORY(${TEST_DIR})
@ -95,13 +92,11 @@ MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
ENDMACRO()
MACRO(TRIBITS_ADD_EXAMPLE_DIRECTORIES)
IF(${PACKAGE_NAME}_ENABLE_EXAMPLES OR ${PARENT_PACKAGE_NAME}_ENABLE_EXAMPLES)
FOREACH(EXAMPLE_DIR ${ARGN})
ADD_SUBDIRECTORY(${EXAMPLE_DIR})
ENDFOREACH()
ENDIF()
ENDMACRO()

View File

@ -1,190 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/host/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
CUDA_ARCH=""
# CUDA_ARCH="20"
# CUDA_ARCH="30"
# CUDA_ARCH="35"
# Build with Intel compiler
INTEL=ON
# Build for MIC architecture:
# INTEL_XEON_PHI=ON
# Build with HWLOC at location:
HWLOC_BASE_DIR="/home/projects/libraries/host/hwloc/1.6.2"
# Location for MPI to use in examples:
MPI_BASE_DIR=""
#-----------------------------------------------------------------------------
# MPI configuation only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP configuation:
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------

Some files were not shown because too many files have changed in this diff.