Merge pull request #1971 from charlessievers/chem_snap

Chemical specificity for SNAP
2020-06-18 10:11:15 -04:00
parent 586c81b1be a221f13308
commit 77f6fecc86
27 changed files with 2060 additions and 624 deletions
--- a/doc/src/compute_sna_atom.rst
+++ b/doc/src/compute_sna_atom.rst
@ -30,7 +30,7 @@ Syntax
 * R_1, R_2,... = list of cutoff radii, one for each type (distance units)
 * w_1, w_2,... = list of neighbor weights, one for each type
 * zero or more keyword/value pairs may be appended
-* keyword = *rmin0* or *switchflag* or *bzeroflag* or *quadraticflag*
+* keyword = *rmin0* or *switchflag* or *bzeroflag* or *quadraticflag* or *chem* or *bnormflag* or *wselfallflag*

  .. parsed-literal::

@ -44,6 +44,15 @@ Syntax
       *quadraticflag* value = *0* or *1*
          *0* = do not generate quadratic terms
          *1* = generate quadratic terms
+       *chem* values = *nelements* *elementlist*
+          *nelements* = number of SNAP elements
+          *elementlist* = *ntypes* integers in range [0, *nelements*)
+       *bnormflag* value = *0* or *1*
+          *0* = do not normalize
+          *1* = normalize bispectrum components
+       *wselfallflag* value = *0* or *1*
+          *0* = self-contribution only for element of central atom
+          *1* = self-contribution for all elements

 Examples
 """"""""
@ -54,6 +63,7 @@ Examples
   compute db all sna/atom 1.4 0.95 6 2.0 1.0
   compute vb all sna/atom 1.4 0.95 6 2.0 1.0
   compute snap all snap 1.4 0.95 6 2.0 1.0
+   compute snap all snap 1.0 0.99363 6 3.81 3.83 1.0 0.93 chem 2 0 1

 Description
 """""""""""
@ -71,27 +81,26 @@ mathematical definition is given in the paper by Thompson et
 al. :ref:`(Thompson) <Thompson20141>`

 The position of a neighbor atom *i'* relative to a central atom *i* is
-a point within the 3D ball of radius *R_ii' = rcutfac\*(R_i + R_i')*
+a point within the 3D ball of radius :math:`R_{ii'}` = *rcutfac* :math:`(R_i + R_{i'})`.

 Bartok et al. :ref:`(Bartok) <Bartok20101>`, proposed mapping this 3D ball
 onto the 3-sphere, the surface of the unit ball in a four-dimensional
-space.  The radial distance *r* within *R_ii'* is mapped on to a third
-polar angle *theta0* defined by,
+space.  The radial distance *r* within :math:`R_{ii'}` is mapped on to a third
+polar angle :math:`\theta_0` defined by,

 .. math::

-  \theta_0 = {\tt rfac0} \frac{r-r_{min0}}{R_{ii'}-r_{min0}} \pi
+  \theta_0 = {\sf rfac0} \frac{r-r_{min0}}{R_{ii'}-r_{min0}} \pi

 In this way, all possible neighbor positions are mapped on to a subset
-of the 3-sphere.  Points south of the latitude *theta0max=rfac0\*Pi*
+of the 3-sphere.  Points south of the latitude :math:`\theta_0` = *rfac0* :math:`\pi`
 are excluded.

-The natural basis for functions on the 3-sphere is formed by the 4D
-hyperspherical harmonics *U\^j_m,m'(theta, phi, theta0).*  These
-functions are better known as *D\^j_m,m',* the elements of the Wigner
+The natural basis for functions on the 3-sphere is formed by the
+representatives of *SU(2)*, the matrices :math:`U^j_{m,m'}(\theta, \phi, \theta_0)`.
+These functions are better known as :math:`D^j_{m,m'}`, the elements of the Wigner
 *D*\ -matrices :ref:`(Meremianin <Meremianin2006>`,
-:ref:`Varshalovich) <Varshalovich1987>`.
-
+:ref:`Varshalovich <Varshalovich1987>`, :ref:`Mason) <Mason2009>`.
+
 The density of neighbors on the 3-sphere can be written as a sum of
 Dirac-delta functions, one for each neighbor, weighted by species and
 radial distance. Expanding this density function as a generalized
@ -100,20 +109,20 @@ coefficient as

 .. math::

-  u^j_{m,m'} = U^j_{m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{f_c(r_{ii'}) w_{i'} U^j_{m,m'}(\theta_0,\theta,\phi)}
+  u^j_{m,m'} = U^j_{m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{f_c(r_{ii'}) w_{\mu_{i'}} U^j_{m,m'}(\theta_0,\theta,\phi)}

-The *w_i'* neighbor weights are dimensionless numbers that are chosen
-to distinguish atoms of different types, while the central atom is
-arbitrarily assigned a unit weight.  The function *fc(r)* ensures that
+The :math:`w_{\mu_{i'}}` neighbor weights are dimensionless numbers that depend on
+:math:`\mu_{i'}`, the SNAP element of atom *i'*, while the central atom is
+arbitrarily assigned a unit weight.  The function :math:`f_c(r)` ensures that
 the contribution of each neighbor atom goes smoothly to zero at
-*R_ii'*:
+:math:`R_{ii'}`:

 .. math::

  f_c(r)   = & \frac{1}{2}(\cos(\pi \frac{r-r_{min0}}{R_{ii'}-r_{min0}}) + 1), r \leq R_{ii'} \\
           = & 0,  r > R_{ii'}
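
As a rough numerical illustration of the radial mapping and the switching function defined above, the following Python sketch evaluates :math:`\theta_0` and :math:`f_c(r)` for a single neighbor distance. The function names ``theta0`` and ``switching_function`` are hypothetical and chosen only for this example; this is not the LAMMPS implementation.

```python
import math


def theta0(r, rcut, rfac0, rmin0=0.0):
    """Map radial distance r onto the third polar angle theta_0.

    rcut plays the role of R_ii'; rmin0 defaults to 0 as in the doc page.
    """
    return rfac0 * (r - rmin0) / (rcut - rmin0) * math.pi


def switching_function(r, rcut, rmin0=0.0):
    """Cosine switching function f_c(r): 1 at r = rmin0, 0 at and beyond rcut."""
    if r > rcut:
        return 0.0
    return 0.5 * (math.cos(math.pi * (r - rmin0) / (rcut - rmin0)) + 1.0)
```

With ``rfac0`` = 1 a neighbor at the cutoff maps to :math:`\theta_0 = \pi`; values of ``rfac0`` slightly below 1 keep neighbors away from that pole.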

-The expansion coefficients *u\^j_m,m'* are complex-valued and they are
+The expansion coefficients :math:`u^j_{m,m'}` are complex-valued and they are
 not directly useful as descriptors, because they are not invariant
 under rotation of the polar coordinate frame. However, the following
 scalar triple products of expansion coefficients can be shown to be
@ -128,7 +137,8 @@ real-valued and invariant under rotation :ref:`(Bartok) <Bartok20101>`.
        {j_2} {m_2} {m'_2} \end{array}}
        u^{j_1}_{m_1,m'_1} u^{j_2}_{m_2,m'_2}

-The constants *H\^jmm'_j1m1m1'_j2m2m2'* are coupling coefficients,
+The constants :math:`H^{jmm'}_{j_1 m_1 m'_1, j_2 m_2 m'_2}`
+are coupling coefficients,
 analogous to Clebsch-Gordan coefficients for rotations on the
 2-sphere. These invariants are the components of the bispectrum and
 these are the quantities calculated by the compute *sna/atom*\ . They
@ -136,13 +146,12 @@ characterize the strength of density correlations at three points on
 the 3-sphere. The j2=0 subset form the power spectrum, which
 characterizes the correlations of two points. The lowest-order
 components describe the coarsest features of the density function,
-while higher-order components reflect finer detail.  Note that the
-central atom is included in the expansion, so three point-correlations
-can be either due to three neighbors, or two neighbors and the central
-atom.
+while higher-order components reflect finer detail. Each bispectrum
+component contains terms that depend on the positions of up to 4
+atoms (3 neighbors and the central atom).

 Compute *snad/atom* calculates the derivative of the bispectrum components
-summed separately for each atom type:
+summed separately for each LAMMPS atom type:

 .. math::

@ -165,7 +174,7 @@ Again, the sum is over all atoms *i'* of atom type *I*\ .  For each atom
 virial components, each atom type, and each bispectrum component.  See
 section below on output for a detailed explanation.

-Compute *snap* calculates a global array contains information related
+Compute *snap* calculates a global array containing information related
 to all three of the above per-atom computes *sna/atom*\ , *snad/atom*\ ,
 and *snav/atom*\ . The first row of the array contains the summation of
 *sna/atom* over all atoms, but broken out by type. The last six rows
@ -201,8 +210,8 @@ The argument *rcutfac* is a scale factor that controls the ratio of
 atomic radius to radial cutoff distance.

 The argument *rfac0* and the optional keyword *rmin0* define the
-linear mapping from radial distance to polar angle *theta0* on the
-3-sphere.
+linear mapping from radial distance to polar angle :math:`\theta_0` on the
+3-sphere, given above.

 The argument *twojmax* defines which
 bispectrum components are generated. See section below on output for a
@ -210,7 +219,7 @@ detailed explanation of the number of bispectrum components and the
 ordered in which they are listed.

 The keyword *switchflag* can be used to turn off the switching
-function.
+function :math:`f_c(r)`.

 The keyword *bzeroflag* determines whether or not *B0*\ , the bispectrum
 components of an atom with no neighbors, are subtracted from
@ -219,13 +228,72 @@ normally only affects compute *sna/atom*\ . However, when
 *quadraticflag* is on, it also affects *snad/atom* and *snav/atom*\ .

 The keyword *quadraticflag* determines whether or not the
-quadratic analogs to the bispectrum quantities are generated.
+quadratic combinations of bispectrum quantities are generated.
 These are formed by taking the outer product of the vector
 of bispectrum components with itself.
 See section below on output for a
 detailed explanation of the number of quadratic terms and the
 ordered in which they are listed.

+The keyword *chem* activates the explicit multi-element variant
+of the SNAP bispectrum components. The argument *nelements*
+specifies the number of SNAP elements that will be handled.
+This is followed by *elementlist*, a list of integers of
+length *ntypes*, with values in the range [0, *nelements* ),
+which maps each LAMMPS type to one of the SNAP elements.
+Note that multiple LAMMPS types can be mapped to the same element,
+and some elements may be mapped by no LAMMPS type. However, in typical
+use cases (training SNAP potentials) the mapping from LAMMPS types
+to elements is one-to-one.
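
The constraints on the *chem* values can be sketched in Python as follows. The helper ``parse_chem`` is hypothetical, written only to illustrate the rules stated above (list length equal to *ntypes*, entries in [0, *nelements*)); it does not reproduce how LAMMPS parses the keyword.

```python
def parse_chem(values, ntypes):
    """Split the values following 'chem' into (nelements, elementlist).

    values: list of strings, e.g. ["2", "0", "1"] for 'chem 2 0 1'.
    ntypes: number of LAMMPS atom types.
    """
    nelements = int(values[0])
    elementlist = [int(v) for v in values[1:]]
    if len(elementlist) != ntypes:
        raise ValueError("elementlist must contain exactly ntypes integers")
    for element in elementlist:
        if not 0 <= element < nelements:
            raise ValueError("element index must lie in [0, nelements)")
    return nelements, elementlist


# 'chem 2 0 1': LAMMPS type 1 -> element 0, LAMMPS type 2 -> element 1
nelements, elementlist = parse_chem(["2", "0", "1"], ntypes=2)
```

A list such as ``["2", "0", "0"]`` is also accepted by this sketch, since several LAMMPS types may share an element and an element may be left unused.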
+
+The explicit multi-element variant invoked by the *chem* keyword
+partitions the density of neighbors into partial densities
+for each chemical element.  This is described in detail in the
+paper by :ref:`Cusentino et al. <Cusentino2020>`.
+The bispectrum components are indexed on
+ordered triplets of elements:
+
+.. math::
+
+   B_{j_1,j_2,j}^{\kappa\lambda\mu} =
+   \sum_{m_1,m'_1=-j_1}^{j_1}\sum_{m_2,m'_2=-j_2}^{j_2}\sum_{m,m'=-j}^{j} (u^{\mu}_{j,m,m'})^*
+   H {\scriptscriptstyle \begin{array}{l} {j} {m} {m'} \\
+        {j_1} {m_1} {m'_1} \\
+        {j_2} {m_2} {m'_2} \end{array}}
+        u^{\kappa}_{j_1,m_1,m'_1} u^{\lambda}_{j_2,m_2,m'_2}
+
+where :math:`u^{\mu}_{j,m,m'}` is an expansion coefficient for the partial density of neighbors
+of element :math:`\mu`
+
+.. math::
+
+  u^{\mu}_{j,m,m'} =  w^{self}_{\mu_{i}\mu} U^j_{m,m'}(0,0,0) + \sum_{r_{ii'} < R_{ii'}}{\delta_{\mu\mu_{i'}}f_c(r_{ii'}) w_{\mu_{i'}} U^j_{m,m'}(\theta_0,\theta,\phi)}
+
+where :math:`w^{self}_{\mu_{i}\mu}` is the self-contribution, which is either 1 or 0
+(see keyword *wselfallflag* below), :math:`\delta_{\mu\mu_{i'}}` indicates
+that the sum is only over neighbor atoms of element :math:`\mu`,
+and all other quantities are the same as those appearing in the
+original equation for :math:`u^j_{m,m'}` given above.
+
+The keyword *wselfallflag* defines the rule used for the self-contribution.
+If *wselfallflag* is on, then :math:`w^{self}_{\mu_{i}\mu}` = 1. If it is
+off then :math:`w^{self}_{\mu_{i}\mu}` = 0, except in the case
+of :math:`{\mu_{i}=\mu}`, when :math:`w^{self}_{\mu_{i}\mu}` = 1.
+When the *chem* keyword is not used, this keyword has no effect.
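
The self-contribution rule can be written as a small Python function. The name ``self_weight`` is hypothetical; the sketch only restates the rule given above for :math:`w^{self}_{\mu_{i}\mu}`.

```python
def self_weight(mu_i, mu, wselfallflag):
    """Return w^self for central-atom element mu_i and partial density mu."""
    if wselfallflag:
        # Self-contribution is added to the partial density of every element.
        return 1.0
    # Self-contribution only for the element of the central atom.
    return 1.0 if mu_i == mu else 0.0
```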
+
+The keyword *bnormflag* determines whether or not the bispectrum
+component :math:`B_{j_1,j_2,j}` is divided by a factor of :math:`2j+1`.
+This normalization simplifies force calculations because of the
+following symmetry relation
+
+.. math::
+
+ \frac{B_{j_1,j_2,j}}{2j+1} = \frac{B_{j,j_2,j_1}}{2j_1+1} = \frac{B_{j_1,j,j_2}}{2j_2+1}
+
+This option is typically used in conjunction with the *chem* keyword,
+and LAMMPS will generate a warning if only one of *chem* and
+*bnormflag* is set.
+
 .. note::

   If you have a bonded system, then the settings of
@ -257,6 +325,8 @@ described by the following piece of python code:
           for j in range(j1-j2,min(twojmax,j1+j2)+1,2):
               if (j>=j1): print j1/2.,j2/2.,j/2.

+For even twojmax = 2(*m*\ -1), :math:`K = m(m+1)(2m+1)/6`, the *m*\ -th pyramidal number. For odd twojmax = 2 *m*\ -1, :math:`K = m(m+1)(m+2)/3`, twice the *m*\ -th tetrahedral number.
+
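The closed-form expressions for *K* can be checked against the enumeration loop shown above. The following Python 3 sketch, with hypothetical function names, counts the components by direct enumeration and compares the result with the two formulas.

```python
def count_bispectrum(twojmax):
    """Count bispectrum components by enumerating (j1, j2, j) as in the loop above."""
    count = 0
    for j1 in range(twojmax + 1):
        for j2 in range(j1 + 1):
            for j in range(j1 - j2, min(twojmax, j1 + j2) + 1, 2):
                if j >= j1:
                    count += 1
    return count


def closed_form(twojmax):
    """Closed-form K for even twojmax = 2(m-1) and odd twojmax = 2m-1."""
    if twojmax % 2 == 0:
        m = twojmax // 2 + 1
        return m * (m + 1) * (2 * m + 1) // 6
    m = (twojmax + 1) // 2
    return m * (m + 1) * (m + 2) // 3


for twojmax in range(7):
    print(twojmax, count_bispectrum(twojmax), closed_form(twojmax))
```
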
 .. note::

   the *diagonal* keyword allowing other possible choices
@ -267,16 +337,15 @@ described by the following piece of python code:
 Compute *snad/atom* evaluates a per-atom array. The columns are
 arranged into *ntypes* blocks, listed in order of atom type *I*\ .  Each
 block contains three sub-blocks corresponding to the *x*\ , *y*\ , and *z*
-components of the atom position.  Each of these sub-blocks contains
-one column for each bispectrum component, the same as for compute
-*sna/atom*
+components of the atom position.  Each of these sub-blocks contains *K*
+columns for the *K* bispectrum components, the same as for compute *sna/atom*.

 Compute *snav/atom* evaluates a per-atom array. The columns are
 arranged into *ntypes* blocks, listed in order of atom type *I*\ .  Each
 block contains six sub-blocks corresponding to the *xx*\ , *yy*\ , *zz*\ ,
 *yz*\ , *xz*\ , and *xy* components of the virial tensor in Voigt
-notation.  Each of these sub-blocks contains one column for each
-bispectrum component, the same as for compute *sna/atom*
+notation.  Each of these sub-blocks contains *K*
+columns for the *K* bispectrum components, the same as for compute *sna/atom*.

 Compute *snap* evaluates a global array.
 The columns are arranged into
@ -312,6 +381,14 @@ of linear terms i.e. linear and quadratic terms are contiguous.
 So the nesting order from inside to outside is bispectrum component,
 linear then quadratic, vector/tensor component, type.

+If the *chem* keyword is used, then the data is arranged into :math:`N_{elem}^3`
+sub-blocks, each sub-block corresponding to a particular chemical labeling
+:math:`\kappa\lambda\mu` with the last label changing fastest.
+Each sub-block contains *K* bispectrum components. For the purposes
+of handling contributions to force, virial, and quadratic combinations,
+these :math:`N_{elem}^3` sub-blocks are treated as a single block
+of :math:`K N_{elem}^3` columns.
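
One way to picture this layout is as a flattened index over :math:`(\kappa, \lambda, \mu, k)`. The Python sketch below assumes zero-based indices and that the *K* components of each sub-block are stored contiguously; the function name ``chem_column`` is hypothetical and the code is an illustration of the ordering described above, not an excerpt from LAMMPS.

```python
def chem_column(kappa, lam, mu, k, nelements, ncoeff):
    """Zero-based column of component k in the (kappa, lam, mu) sub-block.

    The last element label (mu) changes fastest; ncoeff is K.
    """
    block = (kappa * nelements + lam) * nelements + mu
    return block * ncoeff + k
```

For two elements and *K* = 5, the sub-block (0, 0, 1) starts at column 5 and the last column overall is 39, giving :math:`K N_{elem}^3 = 40` columns.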
+
 These values can be accessed by any command that uses per-atom values
 from a compute as input.  See the :doc:`Howto output <Howto_output>` doc
 page for an overview of LAMMPS output options.
@ -320,7 +397,8 @@ Restrictions
 """"""""""""

 These computes are part of the SNAP package.  They are only enabled if
-LAMMPS was built with that package.  See the :doc:`Build package <Build_package>` doc page for more info.
+LAMMPS was built with that package.  See the :doc:`Build package <Build_package>`
+doc page for more info.

 Related commands
 """"""""""""""""
@ -332,6 +410,7 @@ Default

 The optional keyword defaults are *rmin0* = 0,
 *switchflag* = 1, *bzeroflag* = 1, *quadraticflag* = 0,
+*bnormflag* = 0, *wselfallflag* = 0.

 ----------

@ -352,3 +431,12 @@ available at `arXiv:1409.3880 <http://arxiv.org/abs/1409.3880>`_

 **(Varshalovich)** Varshalovich, Moskalev, Khersonskii, Quantum Theory
 of Angular Momentum, World Scientific, Singapore (1987).
+
+.. _Mason2009:
+
+**(Mason)** J. K. Mason, Acta Cryst A65, 259 (2009).
+
+.. _Cusentino2020:
+
+**(Cusentino)** Cusentino, Wood, and Thompson, J Phys Chem A, xxx, xxxxx, (2020)
--- a/doc/src/pair_snap.rst
+++ b/doc/src/pair_snap.rst
@ -24,27 +24,30 @@ Examples
 Description
 """""""""""

-Pair style *snap* computes interactions using the spectral
-neighbor analysis potential (SNAP) :ref:`(Thompson) <Thompson20142>`.
-Like the GAP framework of Bartok et al. :ref:`(Bartok2010) <Bartok20102>`,
-:ref:`(Bartok2013) <Bartok2013>` which uses bispectrum components
+Pair style *snap* defines the spectral
+neighbor analysis potential (SNAP), a machine-learning 
+interatomic potential :ref:`(Thompson) <Thompson20142>`.
+Like the GAP framework of Bartok et al. :ref:`(Bartok2010) <Bartok20102>`, 
+SNAP uses bispectrum components
 to characterize the local neighborhood of each atom
 in a very general way. The mathematical definition of the
-bispectrum calculation used by SNAP is identical
-to that used by :doc:`compute sna/atom <compute_sna_atom>`.
+bispectrum calculation and its derivatives with respect to atom positions
+is identical to that used by :doc:`compute snap <compute_sna_atom>`,
+which is used to fit SNAP potentials to *ab initio* energy, force,
+and stress data.
 In SNAP, the total energy is decomposed into a sum over
 atom energies. The energy of atom *i* is
 expressed as a weighted sum over bispectrum components.

 .. math::

-   E^i_{SNAP}(B_1^i,...,B_K^i) = \beta^{\alpha_i}_0 + \sum_{k=1}^K \beta_k^{\alpha_i} B_k^i
+   E^i_{SNAP}(B_1^i,...,B_K^i) = \beta^{\mu_i}_0 + \sum_{k=1}^K \beta_k^{\mu_i} B_k^i

 where :math:`B_k^i` is the *k*\ -th bispectrum component of atom *i*\ ,
-and :math:`\beta_k^{\alpha_i}` is the corresponding linear coefficient
-that depends on :math:\alpha_i`, the SNAP element of atom *i*\ . The
+and :math:`\beta_k^{\mu_i}` is the corresponding linear coefficient
+that depends on :math:`\mu_i`, the SNAP element of atom *i*\ . The
 number of bispectrum components used and their definitions
-depend on the value of *twojmax*
+depend on the value of *twojmax* and other parameters
 defined in the SNAP parameter file described below.
 The bispectrum calculation is described in more detail
 in :doc:`compute sna/atom <compute_sna_atom>`.
@ -136,17 +139,51 @@ The SNAP parameter file can contain blank and comment lines (start
 with #) anywhere. Each non-blank non-comment line must contain one
 keyword/value pair. The required keywords are *rcutfac* and
 *twojmax*\ . Optional keywords are *rfac0*\ , *rmin0*\ ,
-*switchflag*\ , *bzeroflag*\, and *chunksize*\.
+*switchflag*\ , *bzeroflag*\ , *quadraticflag*\ , *chemflag*\ , 
+*bnormflag*\ , *wselfallflag*\ , and *chunksize*\ .

 The default values for these keywords are

 * *rfac0* = 0.99363
 * *rmin0* = 0.0
-* *switchflag* = 0
+* *switchflag* = 1
 * *bzeroflag* = 1
-* *quadraticflag* = 1
+* *quadraticflag* = 0
+* *chemflag* = 0
+* *bnormflag* = 0
+* *wselfallflag* = 0
 * *chunksize* = 2000

+If *quadraticflag* is set to 1, then the SNAP energy expression includes additional quadratic terms 
+that have been shown to increase the overall accuracy of the potential without much increase
+in computational cost :ref:`(Wood) <Wood20182>`. 
+
+.. math::
+
+   E^i_{SNAP}(\mathbf{B}_i) = \beta^{\mu_i}_0 + \boldsymbol{\beta}^{\mu_i} \cdot \mathbf{B}_i + \frac{1}{2}\mathbf{B}^t_i \cdot \boldsymbol{\alpha}^{\mu_i} \cdot \mathbf{B}_i
+
+where :math:`\mathbf{B}_i` is the *K*-vector of bispectrum components, 
+:math:`\boldsymbol{\beta}^{\mu_i}` is the *K*-vector of linear coefficients 
+for element :math:`\mu_i`, and :math:`\boldsymbol{\alpha}^{\mu_i}` 
+is the symmetric *K* by *K* matrix of quadratic coefficients.
+The SNAP element file should contain *K*\ (\ *K*\ +1)/2 additional coefficients
+for each element, the upper-triangular elements of :math:`\boldsymbol{\alpha}^{\mu_i}`.
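
A minimal Python sketch of the quadratic energy expression is given below. It assumes the full symmetric matrix :math:`\boldsymbol{\alpha}` is available as a nested list; how the upper-triangular coefficients in the element file are unpacked into that matrix is not shown, and the function names are hypothetical.

```python
def snap_energy_quadratic(beta0, beta, alpha, b):
    """Evaluate beta0 + beta . B + 0.5 * B^t . alpha . B for one atom.

    beta and b are length-K lists; alpha is a symmetric K x K nested list.
    """
    k = len(b)
    linear = sum(beta[p] * b[p] for p in range(k))
    quadratic = 0.5 * sum(
        b[p] * alpha[p][q] * b[q] for p in range(k) for q in range(k)
    )
    return beta0 + linear + quadratic


def num_quadratic_coeffs(k):
    """Number of upper-triangular entries of a symmetric K x K matrix."""
    return k * (k + 1) // 2
```

For example, *K* = 5 bispectrum components require 15 additional quadratic coefficients per element.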
+
+If *chemflag* is set to 1, then the energy expression is written in terms of explicit multi-element bispectrum
+components indexed on ordered triplets of elements, which has been shown to increase the ability of the SNAP
+potential to capture energy differences in chemically complex systems, 
+at the expense of a significant increase in computational cost :ref:`(Cusentino) <Cusentino20202>`.
+
+.. math::
+
+   E^i_{SNAP}(\mathbf{B}_i) = \beta^{\mu_i}_0 + \sum_{\kappa,\lambda,\mu} \boldsymbol{\beta}^{\kappa\lambda\mu}_{\mu_i} \cdot \mathbf{B}^{\kappa\lambda\mu}_i
+
+where :math:`\mathbf{B}^{\kappa\lambda\mu}_i` is the *K*-vector of bispectrum components 
+for neighbors of elements :math:`\kappa`, :math:`\lambda`, and :math:`\mu` and 
+:math:`\boldsymbol{\beta}^{\kappa\lambda\mu}_{\mu_i}` is the corresponding *K*-vector 
+of linear coefficients for element :math:`\mu_i`. The SNAP element file should contain 
+a total of :math:`K N_{elem}^3` coefficients for each of the :math:`N_{elem}` elements.
+
 The keyword *chunksize* is only applicable when using the
 pair style *snap* with the KOKKOS package and is ignored otherwise.
 This keyword controls
@ -159,10 +196,6 @@ into two passes.
 Detailed definitions for all the other keywords
 are given on the :doc:`compute sna/atom <compute_sna_atom>` doc page.

-If *quadraticflag* is set to 1, then the SNAP energy expression includes the quadratic term, 0.5\*B\^t.alpha.B, where alpha is a symmetric *K* by *K* matrix.
-The SNAP element file should contain *K*\ (\ *K*\ +1)/2 additional coefficients
-for each element, the upper-triangular elements of alpha.
-
 .. note::

   The previously used *diagonalstyle* keyword was removed in 2019,
@ -221,7 +254,8 @@ Related commands

 :doc:`compute sna/atom <compute_sna_atom>`,
 :doc:`compute snad/atom <compute_sna_atom>`,
-:doc:`compute snav/atom <compute_sna_atom>`
+:doc:`compute snav/atom <compute_sna_atom>`,
+:doc:`compute snap <compute_sna_atom>`

 **Default:** none

@ -235,6 +269,10 @@ Related commands

 **(Bartok2010)** Bartok, Payne, Risi, Csanyi, Phys Rev Lett, 104, 136403 (2010).

-.. _Bartok2013:
+.. _Wood20182:

-**(Bartok2013)** Bartok, Gillan, Manby, Csanyi, Phys Rev B 87, 184115 (2013).
+**(Wood)** Wood and Thompson, J Chem Phys, 148, 241721, (2018)
+
+.. _Cusentino20202:
+
+**(Cusentino)** Cusentino, Wood, and Thompson, J Phys Chem A, xxx, xxxxx, (2020)