various documentation fixups, dedup references, wrap paragraphs, adjust underlines, add missing index

2024-06-26 07:26:03 -04:00
parent 8173142950
commit 44b66cb56b
4 changed files with 131 additions and 102 deletions
--- a/doc/src/compute_pod_atom.rst
+++ b/doc/src/compute_pod_atom.rst
@ -10,10 +10,10 @@ compute podd/atom command
 =========================
 compute pod/local command
-=======================
+=========================
 compute pod/global command
-=======================
+==========================
 Syntax
 """"""
@ -50,41 +50,50 @@ Description
 Define a computation that calculates a set of quantities related to the
 POD descriptors of the atoms in a group. These computes are used
 primarily for calculating the dependence of energy and force components
-on the linear coefficients in the :doc:`pod pair_style
+on the linear coefficients in the :doc:`pod pair_style <pair_pod>`,
-<pair_pod>`, which is useful when training a POD potential to match
+which is useful when training a POD potential to match target data. POD
-target data. POD descriptors of an atom are characterized by the
+descriptors of an atom are characterized by the radial and angular
-radial and angular distribution of neighbor atoms. The detailed
+distribution of neighbor atoms. The detailed mathematical definition is
-mathematical definition is given in the papers by :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
+given in the papers by :ref:`(Nguyen and Rohskopf) <Nguyen20222c>`,
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>`.
+:ref:`(Nguyen2023) <Nguyen20232c>`, :ref:`(Nguyen2024) <Nguyen20242c>`,
 and :ref:`(Nguyen and Sema) <Nguyen20243c>`.
 Compute *pod/atom* calculates the per-atom POD descriptors.
-Compute *podd/atom* calculates derivatives of the per-atom POD descriptors with respect to atom positions.
+Compute *podd/atom* calculates derivatives of the per-atom POD
 descriptors with respect to atom positions.
-Compute *pod/local* calculates the per-atom POD descriptors and their derivatives with respect to atom positions.
+Compute *pod/local* calculates the per-atom POD descriptors and their
 derivatives with respect to atom positions.
-Compute *pod/global* calculates the global POD descriptors and their derivatives with respect to atom positions.
+Compute *pod/global* calculates the global POD descriptors and their
 derivatives with respect to atom positions.
-Examples how to use Compute POD commands are found in the directory lammps/examples/PACKAGES/pod.
+Examples of how to use the compute POD commands are found in the directory
 ``examples/PACKAGES/pod``.
 ----------
 Output info
 """""""""""
-Compute *pod/atom* produces an 2D array of size :math:`N \times M`, where :math:`N` is the number of atoms
+Compute *pod/atom* produces a 2D array of size :math:`N \times M`,
-and :math:`M` is the number of descriptors. Each column corresponds to a particular POD descriptor.
+where :math:`N` is the number of atoms and :math:`M` is the number of
 descriptors. Each column corresponds to a particular POD descriptor.
-Compute *podd/atom* produces an 2D array of size :math:`N \times (M * 3 N)`. Each column
+Compute *podd/atom* produces a 2D array of size :math:`N \times (M * 3
-corresponds to a particular derivative of a POD descriptor.
+N)`. Each column corresponds to a particular derivative of a POD
 descriptor.
-Compute *pod/local* produces an 2D array of size :math:`(1 + 3N) \times (M * N)`.
+Compute *pod/local* produces a 2D array of size :math:`(1 + 3N) \times
-The first row contains the per-atom descriptors, and the last 3N rows contain the derivatives
+(M * N)`.  The first row contains the per-atom descriptors, and the last
-of the per-atom descriptors with respect to atom positions.
+3N rows contain the derivatives of the per-atom descriptors with respect
 to atom positions.
-Compute *pod/global* produces an 2D array of size :math:`(1 + 3N) \times (M)`.
+Compute *pod/global* produces a 2D array of size :math:`(1 + 3N) \times
-The first row contains the global descriptors, and the last 3N rows contain the derivatives
+(M)`.  The first row contains the global descriptors, and the last 3N
-of the global descriptors with respect to atom positions.
+rows contain the derivatives of the global descriptors with respect to
 atom positions.
 Restrictions
 """"""""""""
@ -107,19 +116,19 @@ none
 ----------
-.. _Nguyen20222:
+.. _Nguyen20222c:
 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).
-.. _Nguyen20232:
+.. _Nguyen20232c:
 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).
-.. _Nguyen20242:
+.. _Nguyen20242c:
 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).
-.. _Nguyen20243:
+.. _Nguyen20243c:
 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).
--- a/doc/src/fitpod_command.rst
+++ b/doc/src/fitpod_command.rst
@ -1,7 +1,7 @@
 .. index:: fitpod
 fitpod command
-======================
+==============
 Syntax
 """"""
@ -28,15 +28,19 @@ Description
 .. versionadded:: 22Dec2022
 Fit a machine-learning interatomic potential (ML-IAP) based on proper
-orthogonal descriptors (POD); please see :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
+orthogonal descriptors (POD); please see :ref:`(Nguyen and Rohskopf)
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>` for details.
+<Nguyen20222a>`, :ref:`(Nguyen2023) <Nguyen20232a>`, :ref:`(Nguyen2024)
-The fitted POD potential can be used to run MD simulations via :doc:`pair_style pod <pair_pod>`.
+<Nguyen20242a>`, and :ref:`(Nguyen and Sema) <Nguyen20243a>` for details.
 The fitted POD potential can be used to run MD simulations via
 :doc:`pair_style pod <pair_pod>`.
-Two input files are required for this command. The first input file describes a POD potential parameter
+Two input files are required for this command. The first input file
-settings, while the second input file specifies the DFT data used for
+describes the POD potential parameter settings, while the second input
-the fitting procedure. All keywords except *species* have default values. If a keyword is not
+file specifies the DFT data used for the fitting procedure. All keywords
-set in the input file, its default value is used. The table below has one-line descriptions of all the keywords that can
+except *species* have default values. If a keyword is not set in the
-be used in the first input file  (i.e. ``Ta_param.pod``)
+input file, its default value is used. The table below has one-line
 descriptions of all the keywords that can be used in the first input
 file (e.g., ``Ta_param.pod``):
 .. list-table::
   :header-rows: 1
@ -127,8 +131,10 @@ be used in the first input file  (i.e. ``Ta_param.pod``)
     - INT
     - angular degree for seven-body potential
-Note that both the number of radial basis functions and angular degree must decrease as the body order increases. The next table describes all keywords that can be used in the second input file
+Note that both the number of radial basis functions and angular degree
-(i.e. ``Ta_data.pod`` in the example above):
+must decrease as the body order increases. The next table describes all
 keywords that can be used in the second input file (i.e. ``Ta_data.pod``
 in the example above):
 .. list-table::
@ -218,17 +224,19 @@ successful training, a number of output files are produced, if enabled:
 * ``<basename>_test_analysis.pod`` reports detailed errors for all test configurations
 * ``<basename>_coefficients.pod`` contains the coefficients of the POD potential
-After training the POD potential, ``Ta_param.pod`` and ``<basename>_coefficients.pod``
+After training the POD potential, ``Ta_param.pod`` and
-are the two files needed to use the POD potential in LAMMPS.
+``<basename>_coefficients.pod`` are the two files needed to use the POD
-See :doc:`pair_style pod <pair_pod>` for using the POD potential. Examples
+potential in LAMMPS.  See :doc:`pair_style pod <pair_pod>` for using the
-about training and using POD potentials are found in the directory
+POD potential. Examples about training and using POD potentials are
-lammps/examples/PACKAGES/pod and the Github repo https://github.com/cesmix-mit/pod-examples.
+found in the directory lammps/examples/PACKAGES/pod and the GitHub repo
 https://github.com/cesmix-mit/pod-examples.
 Loss Function Group Weights
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
-The ``group_weights`` keyword in the ``data.pod`` file is responsible for weighting certain groups
+The *group_weights* keyword in the ``data.pod`` file is responsible for
-of configurations in the loss function. For example:
+weighting certain groups of configurations in the loss function. For
 example:
 .. code-block:: LAMMPS
@ -246,9 +254,10 @@ of configurations in the loss function. For example:
    Volume_BCC    100.0 1.0
    Volume_FCC    100.0 1.0
-This will apply an energy weight of ``100.0`` and a force weight of ``1.0`` for all groups in the
+This will apply an energy weight of ``100.0`` and a force weight of
-``Ta`` example. The groups are named by their respective filename. If certain groups are left out of
+``1.0`` for all groups in the ``Ta`` example. The groups are named by
-this table, then the globally defined weights from the ``fitting_weight_energy`` and
+their respective filename. If certain groups are left out of this table,
 then the globally defined weights from the ``fitting_weight_energy`` and
 ``fitting_weight_force`` keywords will be used.
 POD Potential
@ -269,38 +278,43 @@ POD potential is expressed as :math:`E(\boldsymbol R, \boldsymbol Z) =
    E_i(\boldsymbol R_i, \boldsymbol Z_i) \ = \ \sum_{m=1}^M c_m \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)
-Here :math:`c_m` are trainable coefficients and :math:`\mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`
+Here :math:`c_m` are trainable coefficients and
-are per-atom POD descriptors. Summing the per-atom descriptors over :math:`i` yields the
+:math:`\mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)` are per-atom
-global descriptors :math:`d_m(\boldsymbol R, \boldsymbol Z) = \sum_{i=1}^N \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`.
+POD descriptors. Summing the per-atom descriptors over :math:`i` yields
-It thus follows that :math:`E(\boldsymbol R, \boldsymbol Z) =
+the global descriptors :math:`d_m(\boldsymbol R, \boldsymbol Z) =
-\sum_{m=1}^M c_m d_m(\boldsymbol R, \boldsymbol Z)`.
+\sum_{i=1}^N \mathcal{D}_{im}(\boldsymbol R_i, \boldsymbol Z_i)`.  It
 thus follows that :math:`E(\boldsymbol R, \boldsymbol Z) = \sum_{m=1}^M
 c_m d_m(\boldsymbol R, \boldsymbol Z)`.
-The per-atom POD descriptors include one, two, three, four, five, six, and seven-body
+The per-atom POD descriptors include one, two, three, four, five, six,
-descriptors, which can be specified in the first input file. Furthermore, the per-atom POD descriptors
+and seven-body descriptors, which can be specified in the first input
-also depend on the number of environment clusters specified in the first input file.
+file. Furthermore, the per-atom POD descriptors also depend on the
-Please see :ref:`(Nguyen2024) <Nguyen20242>` and :ref:`(Nguyen and Sema) <Nguyen20243>` for the detailed description of the per-atom POD descriptors.
+number of environment clusters specified in the first input file.
 Please see :ref:`(Nguyen2024) <Nguyen20242a>` and :ref:`(Nguyen and Sema)
 <Nguyen20243a>` for the detailed description of the per-atom POD
 descriptors.
 Training
 """"""""
-POD potential is trained using the least-squares regression against
+A POD potential is trained using least-squares regression against
 density functional theory (DFT) data.  Let :math:`J` be the number of
 training configurations, with :math:`N_j` being the number of atoms in
 the j-th configuration. The training configurations are extracted from
-the extended XYZ files located in a directory (i.e., path_to_training_data_set
+the extended XYZ files located in a directory (i.e.,
-in the second input file).  Let :math:`\{E^{\star}_j\}_{j=1}^{J}` and
+path_to_training_data_set in the second input file).  Let
-:math:`\{\boldsymbol F^{\star}_j\}_{j=1}^{J}` be the DFT energies and
+:math:`\{E^{\star}_j\}_{j=1}^{J}` and :math:`\{\boldsymbol
-forces for :math:`J` configurations. Next, we calculate the global
+F^{\star}_j\}_{j=1}^{J}` be the DFT energies and forces for :math:`J`
-descriptors and their derivatives for all training configurations. Let
+configurations. Next, we calculate the global descriptors and their
-:math:`d_{jm}, 1 \le m \le M`, be the global descriptors associated with
+derivatives for all training configurations. Let :math:`d_{jm}, 1 \le m
-the j-th configuration, where :math:`M` is the number of global
+\le M`, be the global descriptors associated with the j-th
-descriptors. We then form a matrix :math:`\boldsymbol A \in
+configuration, where :math:`M` is the number of global descriptors. We
-\mathbb{R}^{J \times M}` with entries :math:`A_{jm} = d_{jm}/ N_j` for
+then form a matrix :math:`\boldsymbol A \in \mathbb{R}^{J \times M}`
-:math:`j=1,\ldots,J` and :math:`m=1,\ldots,M`.  Moreover, we form a
+with entries :math:`A_{jm} = d_{jm}/ N_j` for :math:`j=1,\ldots,J` and
-matrix :math:`\boldsymbol B \in \mathbb{R}^{\mathcal{N} \times M}` by
+:math:`m=1,\ldots,M`.  Moreover, we form a matrix :math:`\boldsymbol B
-stacking the derivatives of the global descriptors for all training
+\in \mathbb{R}^{\mathcal{N} \times M}` by stacking the derivatives of
-configurations from top to bottom, where :math:`\mathcal{N} =
+the global descriptors for all training configurations from top to
-3\sum_{j=1}^{J} N_j`.
+bottom, where :math:`\mathcal{N} = 3\sum_{j=1}^{J} N_j`.
 The coefficient vector :math:`\boldsymbol c` of the POD potential is
 found by solving the following least-squares problem
@ -311,20 +325,22 @@ found by solving the following least-squares problem
 where :math:`w_E` and :math:`w_F` are weights for the energy
 (*fitting_weight_energy*) and force (*fitting_weight_force*),
-respectively; and :math:`w_R` is the regularization parameter (*fitting_regularization_parameter*).  Here :math:`\bar{\boldsymbol E}^{\star} \in
+respectively; and :math:`w_R` is the regularization parameter
-\mathbb{R}^{J}` is a vector of with entries :math:`\bar{E}^{\star}_j =
+(*fitting_regularization_parameter*).  Here :math:`\bar{\boldsymbol
-E^{\star}_j/N_j` and :math:`\boldsymbol F^{\star}` is a vector of
+E}^{\star} \in \mathbb{R}^{J}` is a vector with entries
-:math:`\mathcal{N}` entries obtained by stacking :math:`\{\boldsymbol
+:math:`\bar{E}^{\star}_j = E^{\star}_j/N_j` and :math:`\boldsymbol
-F^{\star}_j\}_{j=1}^{J}` from top to bottom.
+F^{\star}` is a vector of :math:`\mathcal{N}` entries obtained by
 stacking :math:`\{\boldsymbol F^{\star}_j\}_{j=1}^{J}` from top to
 bottom.
 Validation
 """"""""""
-POD potential can be validated on a test dataset in a directory specified
+A POD potential can be validated on a test dataset in a directory
-by setting path_to_test_data_set in the second input file. It is possible to
+specified by setting path_to_test_data_set in the second input file.  It
-validate the POD potential after the training is complete. This is done by
+is possible to validate the POD potential after the training is
-providing the coefficient file as an input to :doc:`fitpod <fitpod_command>`,
+complete.  This is done by providing the coefficient file as an input to
-for example,
+:doc:`fitpod <fitpod_command>`, for example,
 .. code-block:: LAMMPS
@ -353,19 +369,19 @@ The keyword defaults are also given in the description of the input files.
 ----------
-.. _Nguyen20222:
+.. _Nguyen20222a:
 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).
-.. _Nguyen20232:
+.. _Nguyen20232a:
 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).
-.. _Nguyen20242:
+.. _Nguyen20242a:
 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).
-.. _Nguyen20243:
+.. _Nguyen20243a:
 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).
--- a/doc/src/pair_pod.rst
+++ b/doc/src/pair_pod.rst
@ -1,4 +1,5 @@
 .. index:: pair_style pod
 .. index:: pair_style pod/kk
 pair_style pod command
 ========================
@ -26,23 +27,25 @@ Description
 .. versionadded:: 22Dec2022
 Pair style *pod* defines the proper orthogonal descriptor (POD)
-potential :ref:`(Nguyen and Rohskopf) <Nguyen20222>`,
+potential :ref:`(Nguyen and Rohskopf) <Nguyen20222b>`,
-:ref:`(Nguyen2023) <Nguyen20232>`, :ref:`(Nguyen2024) <Nguyen20242>`, and :ref:`(Nguyen and Sema) <Nguyen20243>`.
+:ref:`(Nguyen2023) <Nguyen20232b>`, :ref:`(Nguyen2024) <Nguyen20242b>`,
-The :doc:`fitpod <fitpod_command>` is used to fit the POD potential.
+and :ref:`(Nguyen and Sema) <Nguyen20243b>`.  The :doc:`fitpod
 <fitpod_command>` command is used to fit the POD potential.
 Only a single pair_coeff command is used with the *pod* style which
-specifies a POD parameter file followed by a coefficient file,
+specifies a POD parameter file followed by a coefficient file, a
-a projection matrix file, and a centroid file.
+projection matrix file, and a centroid file.
-The POD parameter file (``Ta_param.pod``) can contain blank and comment lines
+The POD parameter file (``Ta_param.pod``) can contain blank and comment
-(start with #) anywhere. Each non-blank non-comment line must contain
+lines (starting with #) anywhere. Each non-blank non-comment line must
-one keyword/value pair. See :doc:`fitpod <fitpod_command>` for the description
+contain one keyword/value pair. See :doc:`fitpod <fitpod_command>` for
-of all the keywords that can be assigned in the parameter file.
+the description of all the keywords that can be assigned in the
 parameter file.
-The coefficient file (``Ta_coefficients.pod``) contains coefficients for the
+The coefficient file (``Ta_coefficients.pod``) contains coefficients for
-POD potential. The top of the coefficient file can contain any number of
+the POD potential. The top of the coefficient file can contain any
-blank and comment lines (start with #), but follows a strict format
+number of blank and comment lines (starting with #), but follows a strict
-after that. The first non-blank non-comment line must contain:
+format after that. The first non-blank non-comment line must contain:
 * model_coefficients: *ncoeff* *nproj* *ncentroid*
@ -124,19 +127,19 @@ none
 ----------
-.. _Nguyen20222:
+.. _Nguyen20222b:
 **(Nguyen and Rohskopf)** Nguyen and Rohskopf,  Journal of Computational Physics, 480, 112030, (2023).
-.. _Nguyen20232:
+.. _Nguyen20232b:
 **(Nguyen2023)** Nguyen, Physical Review B, 107(14), 144103, (2023).
-.. _Nguyen20242:
+.. _Nguyen20242b:
 **(Nguyen2024)** Nguyen, Journal of Computational Physics, 113102, (2024).
-.. _Nguyen20243:
+.. _Nguyen20243b:
 **(Nguyen and Sema)** Nguyen and Sema, https://arxiv.org/abs/2405.00306, (2024).
--- a/doc/utils/sphinx-config/false_positives.txt
+++ b/doc/utils/sphinx-config/false_positives.txt
@ -3816,6 +3816,7 @@ typeJ
 typelabel
 typeN
 typesafe
 typestr
 Tz
 Tzou
 ub