various documentation fixups, dedup references, wrap paragraphs, adjust underlines, add missing index

2024-06-26 07:26:03 -04:00
parent 8173142950
commit 44b66cb56b
4 changed files with 131 additions and 102 deletions
--- a/doc/src/compute_pod_atom.rst
+++ b/doc/src/compute_pod_atom.rst
@ -10,10 +10,10 @@ compute podd/atom command
 =========================

 compute pod/local command
-=======================
+=========================

 compute pod/global command
-=======================
+==========================

 Syntax
 """"""
@ -50,41 +50,50 @@ Description
 Define a computation that calculates a set of quantities related to the
 POD descriptors of the atoms in a group. These computes are used
 primarily for calculating the dependence of energy and force components
-on the linear coefficients in the :doc:`pod pair_style
-<pair_pod>`, which is useful when training a POD potential to match
-target data. POD descriptors of an atom are characterized by the
-radial and angular distribution of neighbor atoms. The detailed
-mathematical definition is given in the papers by :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>`.
+on the linear coefficients in the :doc:`pod pair_style <pair_pod>`,
+which is useful when training a POD potential to match target data. POD
+descriptors of an atom are characterized by the radial and angular
+distribution of neighbor atoms. The detailed mathematical definition is
+given in the papers by :ref:`(Nguyen and Rohskopf) <Nguyen20222c>`,
+:ref:`(Nguyen2023) <Nguyen20232c>`, :ref:`(Nguyen2024) <Nguyen20242c>`,
+and :ref:`(Nguyen and Sema) <Nguyen20243c>`.

 Compute *pod/atom* calculates the per-atom POD descriptors.

-Compute *podd/atom* calculates derivatives of the per-atom POD descriptors with respect to atom positions.
+Compute *podd/atom* calculates derivatives of the per-atom POD
+descriptors with respect to atom positions.

-Compute *pod/local* calculates the per-atom POD descriptors and their derivatives with respect to atom positions.
+Compute *pod/local* calculates the per-atom POD descriptors and their
+derivatives with respect to atom positions.

-Compute *pod/global* calculates the global POD descriptors and their derivatives with respect to atom positions.
+Compute *pod/global* calculates the global POD descriptors and their
+derivatives with respect to atom positions.

-Examples how to use Compute POD commands are found in the directory lammps/examples/PACKAGES/pod.
+Examples of how to use the compute POD commands are found in the directory
+``examples/PACKAGES/pod``.

 ----------

 Output info
 """""""""""

-Compute *pod/atom* produces an 2D array of size :math:`N \times M`, where :math:`N` is the number of atoms
-and :math:`M` is the number of descriptors. Each column corresponds to a particular POD descriptor.
+Compute *pod/atom* produces a 2D array of size :math:`N \times M`,
+where :math:`N` is the number of atoms and :math:`M` is the number of
+descriptors. Each column corresponds to a particular POD descriptor.

-Compute *podd/atom* produces an 2D array of size :math:`N \times (M * 3 N)`. Each column
-corresponds to a particular derivative of a POD descriptor.
+Compute *podd/atom* produces a 2D array of size :math:`N \times (M * 3
+N)`. Each column corresponds to a particular derivative of a POD
+descriptor.

-Compute *pod/local* produces an 2D array of size :math:`(1 + 3N) \times (M * N)`.
-The first row contains the per-atom descriptors, and the last 3N rows contain the derivatives
-of the per-atom descriptors with respect to atom positions.
+Compute *pod/local* produces a 2D array of size :math:`(1 + 3N) \times
+(M * N)`.  The first row contains the per-atom descriptors, and the last
+3N rows contain the derivatives of the per-atom descriptors with respect
+to atom positions.

-Compute *pod/global* produces an 2D array of size :math:`(1 + 3N) \times (M)`.
-The first row contains the global descriptors, and the last 3N rows contain the derivatives
-of the global descriptors with respect to atom positions.
+Compute *pod/global* produces a 2D array of size :math:`(1 + 3N) \times
+(M)`.  The first row contains the global descriptors, and the last 3N
+rows contain the derivatives of the global descriptors with respect to
+atom positions.

 Restrictions
 """"""""""""
@ -107,19 +116,19 @@ none

 ----------

-.. _Nguyen20222:
+.. _Nguyen20222c:

 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).

-.. _Nguyen20232:
+.. _Nguyen20232c:

 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).

-.. _Nguyen20242:
+.. _Nguyen20242c:

 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).

-.. _Nguyen20243:
+.. _Nguyen20243c:

 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).

--- a/doc/src/fitpod_command.rst
+++ b/doc/src/fitpod_command.rst
@ -1,7 +1,7 @@
 .. index:: fitpod

 fitpod command
-======================
+==============

 Syntax
 """"""
@ -28,15 +28,19 @@ Description
 .. versionadded:: 22Dec2022

 Fit a machine-learning interatomic potential (ML-IAP) based on proper
-orthogonal descriptors (POD); please see :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>` for details.
-The fitted POD potential can be used to run MD simulations via :doc:`pair_style pod <pair_pod>`.
+orthogonal descriptors (POD); please see :ref:`(Nguyen and Rohskopf)
+<Nguyen20222a>`, :ref:`(Nguyen2023) <Nguyen20232a>`, :ref:`(Nguyen2024)
+<Nguyen20242a>`, and :ref:`(Nguyen and Sema) <Nguyen20243a>` for details.
+The fitted POD potential can be used to run MD simulations via
+:doc:`pair_style pod <pair_pod>`.

-Two input files are required for this command. The first input file describes a POD potential parameter
-settings, while the second input file specifies the DFT data used for
-the fitting procedure. All keywords except *species* have default values. If a keyword is not
-set in the input file, its default value is used. The table below has one-line descriptions of all the keywords that can
-be used in the first input file  (i.e. ``Ta_param.pod``)
+Two input files are required for this command. The first input file
+describes the POD potential parameter settings, while the second input
+file specifies the DFT data used for the fitting procedure. All keywords
+except *species* have default values. If a keyword is not set in the
+input file, its default value is used. The table below has one-line
+descriptions of all the keywords that can be used in the first input
+file (i.e. ``Ta_param.pod``):

 .. list-table::
   :header-rows: 1
@ -127,8 +131,10 @@ be used in the first input file  (i.e. ``Ta_param.pod``)
     - INT
     - angular degree for seven-body potential

-Note that both the number of radial basis functions and angular degree must decrease as the body order increases. The next table describes all keywords that can be used in the second input file
-(i.e. ``Ta_data.pod`` in the example above):
+Note that both the number of radial basis functions and angular degree
+must decrease as the body order increases. The next table describes all
+keywords that can be used in the second input file (i.e. ``Ta_data.pod``
+in the example above):


 .. list-table::
@ -218,17 +224,19 @@ successful training, a number of output files are produced, if enabled:
 * ``<basename>_test_analysis.pod`` reports detailed errors for all test configurations
 * ``<basename>_coefficients.pod`` contains the coefficients of the POD potential

-After training the POD potential, ``Ta_param.pod`` and ``<basename>_coefficients.pod``
-are the two files needed to use the POD potential in LAMMPS.
-See :doc:`pair_style pod <pair_pod>` for using the POD potential. Examples
-about training and using POD potentials are found in the directory
-lammps/examples/PACKAGES/pod and the Github repo https://github.com/cesmix-mit/pod-examples.
+After training the POD potential, ``Ta_param.pod`` and
+``<basename>_coefficients.pod`` are the two files needed to use the POD
+potential in LAMMPS.  See :doc:`pair_style pod <pair_pod>` for using the
+POD potential. Examples of training and using POD potentials are
+found in the directory lammps/examples/PACKAGES/pod and the GitHub repo
+https://github.com/cesmix-mit/pod-examples.

 Loss Function Group Weights
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

-The ``group_weights`` keyword in the ``data.pod`` file is responsible for weighting certain groups
-of configurations in the loss function. For example:
+The *group_weights* keyword in the ``data.pod`` file is responsible for
+weighting certain groups of configurations in the loss function. For
+example:

 .. code-block:: LAMMPS

@ -246,9 +254,10 @@ of configurations in the loss function. For example:
    Volume_BCC    100.0 1.0
    Volume_FCC    100.0 1.0

-This will apply an energy weight of ``100.0`` and a force weight of ``1.0`` for all groups in the
-``Ta`` example. The groups are named by their respective filename. If certain groups are left out of
-this table, then the globally defined weights from the ``fitting_weight_energy`` and
+This will apply an energy weight of ``100.0`` and a force weight of
+``1.0`` for all groups in the ``Ta`` example. The groups are named after
+their respective filenames. If certain groups are left out of this table,
+then the globally defined weights from the ``fitting_weight_energy`` and
 ``fitting_weight_force`` keywords will be used.

 POD Potential
@ -269,38 +278,43 @@ POD potential is expressed as :math:`E(\boldsymbol R, \boldsymbol Z) =
    E_i(\boldsymbol R_i, \boldsymbol Z_i) \ = \ \sum_{m=1}^M c_m \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)


-Here :math:`c_m` are trainable coefficients and :math:`\mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`
-are per-atom POD descriptors. Summing the per-atom descriptors over :math:`i` yields the
-global descriptors :math:`d_m(\boldsymbol R, \boldsymbol Z) = \sum_{i=1}^N \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`.
-It thus follows that :math:`E(\boldsymbol R, \boldsymbol Z) =
-\sum_{m=1}^M c_m d_m(\boldsymbol R, \boldsymbol Z)`.
+Here :math:`c_m` are trainable coefficients and
+:math:`\mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)` are per-atom
+POD descriptors. Summing the per-atom descriptors over :math:`i` yields
+the global descriptors :math:`d_m(\boldsymbol R, \boldsymbol Z) =
+\sum_{i=1}^N \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`.  It
+thus follows that :math:`E(\boldsymbol R, \boldsymbol Z) = \sum_{m=1}^M
+c_m d_m(\boldsymbol R, \boldsymbol Z)`.

-The per-atom POD descriptors include one, two, three, four, five, six, and seven-body
-descriptors, which can be specified in the first input file. Furthermore, the per-atom POD descriptors
-also depend on the number of environment clusters specified in the first input file.
-Please see :ref:`(Nguyen2024) <Nguyen20242>` and :ref:`(Nguyen and Sema) <Nguyen20243>` for the detailed description of the per-atom POD descriptors.
+The per-atom POD descriptors include one, two, three, four, five, six,
+and seven-body descriptors, which can be specified in the first input
+file. Furthermore, the per-atom POD descriptors also depend on the
+number of environment clusters specified in the first input file.
+Please see :ref:`(Nguyen2024) <Nguyen20242a>` and :ref:`(Nguyen and Sema)
+<Nguyen20243a>` for the detailed description of the per-atom POD
+descriptors.

 Training
 """"""""

-POD potential is trained using the least-squares regression against
+A POD potential is trained using least-squares regression against
 density functional theory (DFT) data.  Let :math:`J` be the number of
 training configurations, with :math:`N_j` being the number of atoms in
 the j-th configuration. The training configurations are extracted from
-the extended XYZ files located in a directory (i.e., path_to_training_data_set
-in the second input file).  Let :math:`\{E^{\star}_j\}_{j=1}^{J}` and
-:math:`\{\boldsymbol F^{\star}_j\}_{j=1}^{J}` be the DFT energies and
-forces for :math:`J` configurations. Next, we calculate the global
-descriptors and their derivatives for all training configurations. Let
-:math:`d_{jm}, 1 \le m \le M`, be the global descriptors associated with
-the j-th configuration, where :math:`M` is the number of global
-descriptors. We then form a matrix :math:`\boldsymbol A \in
-\mathbb{R}^{J \times M}` with entries :math:`A_{jm} = d_{jm}/ N_j` for
-:math:`j=1,\ldots,J` and :math:`m=1,\ldots,M`.  Moreover, we form a
-matrix :math:`\boldsymbol B \in \mathbb{R}^{\mathcal{N} \times M}` by
-stacking the derivatives of the global descriptors for all training
-configurations from top to bottom, where :math:`\mathcal{N} =
-3\sum_{j=1}^{J} N_j`.
+the extended XYZ files located in a directory (i.e.,
+path_to_training_data_set in the second input file).  Let
+:math:`\{E^{\star}_j\}_{j=1}^{J}` and :math:`\{\boldsymbol
+F^{\star}_j\}_{j=1}^{J}` be the DFT energies and forces for :math:`J`
+configurations. Next, we calculate the global descriptors and their
+derivatives for all training configurations. Let :math:`d_{jm}, 1 \le m
+\le M`, be the global descriptors associated with the j-th
+configuration, where :math:`M` is the number of global descriptors. We
+then form a matrix :math:`\boldsymbol A \in \mathbb{R}^{J \times M}`
+with entries :math:`A_{jm} = d_{jm}/ N_j` for :math:`j=1,\ldots,J` and
+:math:`m=1,\ldots,M`.  Moreover, we form a matrix :math:`\boldsymbol B
+\in \mathbb{R}^{\mathcal{N} \times M}` by stacking the derivatives of
+the global descriptors for all training configurations from top to
+bottom, where :math:`\mathcal{N} = 3\sum_{j=1}^{J} N_j`.

 The coefficient vector :math:`\boldsymbol c` of the POD potential is
 found by solving the following least-squares problem
@ -311,20 +325,22 @@ found by solving the following least-squares problem

 where :math:`w_E` and :math:`w_F` are weights for the energy
 (*fitting_weight_energy*) and force (*fitting_weight_force*),
-respectively; and :math:`w_R` is the regularization parameter (*fitting_regularization_parameter*).  Here :math:`\bar{\boldsymbol E}^{\star} \in
-\mathbb{R}^{J}` is a vector of with entries :math:`\bar{E}^{\star}_j =
-E^{\star}_j/N_j` and :math:`\boldsymbol F^{\star}` is a vector of
-:math:`\mathcal{N}` entries obtained by stacking :math:`\{\boldsymbol
-F^{\star}_j\}_{j=1}^{J}` from top to bottom.
+respectively; and :math:`w_R` is the regularization parameter
+(*fitting_regularization_parameter*).  Here :math:`\bar{\boldsymbol
+E}^{\star} \in \mathbb{R}^{J}` is a vector with entries
+:math:`\bar{E}^{\star}_j = E^{\star}_j/N_j` and :math:`\boldsymbol
+F^{\star}` is a vector of :math:`\mathcal{N}` entries obtained by
+stacking :math:`\{\boldsymbol F^{\star}_j\}_{j=1}^{J}` from top to
+bottom.

 Validation
 """"""""""

-POD potential can be validated on a test dataset in a directory specified
-by setting path_to_test_data_set in the second input file. It is possible to
-validate the POD potential after the training is complete. This is done by
-providing the coefficient file as an input to :doc:`fitpod <fitpod_command>`,
-for example,
+A POD potential can be validated on a test dataset in a directory
+specified by setting path_to_test_data_set in the second input file.  It
+is possible to validate the POD potential after the training is
+complete.  This is done by providing the coefficient file as an input to
+:doc:`fitpod <fitpod_command>`, for example,

 .. code-block:: LAMMPS

@ -353,19 +369,19 @@ The keyword defaults are also given in the description of the input files.

 ----------

-.. _Nguyen20222:
+.. _Nguyen20222a:

 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).

-.. _Nguyen20232:
+.. _Nguyen20232a:

 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).

-.. _Nguyen20242:
+.. _Nguyen20242a:

 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).

-.. _Nguyen20243:
+.. _Nguyen20243a:

 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).

--- a/doc/src/pair_pod.rst
+++ b/doc/src/pair_pod.rst
@ -1,4 +1,5 @@
 .. index:: pair_style pod
+.. index:: pair_style pod/kk

 pair_style pod command
 ========================
@ -26,23 +27,25 @@ Description
 .. versionadded:: 22Dec2022

 Pair style *pod* defines the proper orthogonal descriptor (POD)
-potential :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>`.
-The :doc:`fitpod <fitpod_command>` is used to fit the POD potential.
+potential :ref:`(Nguyen and Rohskopf) <Nguyen20222b>`,
+:ref:`(Nguyen2023) <Nguyen20232b>`, :ref:`(Nguyen2024) <Nguyen20242b>`,
+and :ref:`(Nguyen and Sema) <Nguyen20243b>`.  The :doc:`fitpod
+<fitpod_command>` command is used to fit the POD potential.

 Only a single pair_coeff command is used with the *pod* style which
-specifies a POD parameter file followed by a coefficient file,
-a projection matrix file, and a centroid file.
+specifies a POD parameter file followed by a coefficient file, a
+projection matrix file, and a centroid file.

-The POD parameter file (``Ta_param.pod``) can contain blank and comment lines
-(start with #) anywhere. Each non-blank non-comment line must contain
-one keyword/value pair. See :doc:`fitpod <fitpod_command>` for the description
-of all the keywords that can be assigned in the parameter file.
+The POD parameter file (``Ta_param.pod``) can contain blank and comment
+lines (starting with #) anywhere. Each non-blank non-comment line must
+contain one keyword/value pair. See :doc:`fitpod <fitpod_command>` for
+the description of all the keywords that can be assigned in the
+parameter file.

-The coefficient file (``Ta_coefficients.pod``) contains coefficients for the
-POD potential. The top of the coefficient file can contain any number of
-blank and comment lines (start with #), but follows a strict format
-after that. The first non-blank non-comment line must contain:
+The coefficient file (``Ta_coefficients.pod``) contains coefficients for
+the POD potential. The top of the coefficient file can contain any
+number of blank and comment lines (starting with #), but follows a strict
+format after that. The first non-blank non-comment line must contain:

 * model_coefficients: *ncoeff* *nproj* *ncentroid*

@ -124,19 +127,19 @@ none

 ----------

-.. _Nguyen20222:
+.. _Nguyen20222b:

 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).

-.. _Nguyen20232:
+.. _Nguyen20232b:

 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).

-.. _Nguyen20242:
+.. _Nguyen20242b:

 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).

-.. _Nguyen20243:
+.. _Nguyen20243b:

 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).

--- a/doc/utils/sphinx-config/false_positives.txt
+++ b/doc/utils/sphinx-config/false_positives.txt
@ -3816,6 +3816,7 @@ typeJ
 typelabel
 typeN
 typesafe
+typestr
 Tz
 Tzou
 ub
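
The Training section of ``fitpod_command.rst`` touched by this patch describes a weighted, regularized least-squares fit for the coefficient vector. The following is a minimal NumPy sketch of that idea, not the fitpod implementation. The objective formula itself is not visible in this diff, so the form used here, w_E ||E_bar - A c||^2 + w_F ||F + B c||^2 + w_R ||c||^2, including the sign on B, is an assumption. The function name ``fit_pod_coefficients`` is hypothetical, and random synthetic matrices stand in for real descriptors and DFT data.

```python
import numpy as np


def fit_pod_coefficients(A, B, E_bar, F, w_E=100.0, w_F=1.0, w_R=1e-10):
    """Hypothetical helper: solve the assumed objective
    w_E*||E_bar - A c||^2 + w_F*||F + B c||^2 + w_R*||c||^2
    by stacking all three terms into one ordinary least-squares system."""
    M = A.shape[1]
    # ||F + B c||^2 equals ||(-B) c - F||^2, hence the minus sign on B.
    lhs = np.vstack([
        np.sqrt(w_E) * A,
        -np.sqrt(w_F) * B,
        np.sqrt(w_R) * np.eye(M),
    ])
    rhs = np.concatenate([
        np.sqrt(w_E) * E_bar,
        np.sqrt(w_F) * F,
        np.zeros(M),
    ])
    c, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return c


# Synthetic stand-in data: J configurations, N_j atoms each, M descriptors.
rng = np.random.default_rng(0)
J, N_j, M = 5, 4, 3
A = rng.normal(size=(J, M))            # per-atom-normalized global descriptors
B = rng.normal(size=(3 * J * N_j, M))  # stacked descriptor derivatives
c_true = rng.normal(size=M)
E_bar = A @ c_true                     # energies consistent with c_true
F = -(B @ c_true)                      # forces under the assumed sign convention

# With no regularization and consistent data, the fit should recover c_true.
c_fit = fit_pod_coefficients(A, B, E_bar, F, w_R=0.0)
```

Stacking the terms keeps the sketch to a single ``np.linalg.lstsq`` call; a nonzero ``w_R`` adds ridge-style damping and biases ``c_fit`` slightly toward zero.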