lammps/doc/src/Developer_utils.rst


Utility functions
-----------------

The ``utils`` sub-namespace inside the ``LAMMPS_NS`` namespace provides
a collection of convenience functions and utilities that perform common
tasks that are required repeatedly throughout the LAMMPS code like
reading or writing to files with error checking or translation of
strings into specific types of numbers with checking for validity.  This
reduces redundant implementations and encourages consistent behavior and
thus has some overlap with the :doc:`"platform" sub-namespace
<Developer_platform>`.

I/O with status check and similar functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The the first two functions are wrappers around the corresponding C
library calls ``fgets()`` or ``fread()``.  They will check if there
were errors on reading or an unexpected end-of-file state was reached.
In that case, the functions will stop with an error message, indicating
the name of the problematic file, if possible unless the *error* argument
is a NULL pointer.

The :cpp:func:`utils::fgets_trunc() <LAMMPS_NS::utils::fgets_trunc>`
function will work similar for ``fgets()`` but it will read in a whole
line (i.e. until the end of line or end of file), but store only as many
characters as will fit into the buffer including a final newline
character and the terminating NULL byte.  If the line in the file is
longer it will thus be truncated in the buffer.  This function is used
by :cpp:func:`utils::read_lines_from_file()
<LAMMPS_NS::utils::read_lines_from_file>` to read individual lines but
make certain they follow the size constraints.

The :cpp:func:`utils::read_lines_from_file()
<LAMMPS_NS::utils::read_lines_from_file>` function will read the
requested number of lines of a maximum length into a buffer and will
return 0 if successful or 1 if not. It also guarantees that all lines
are terminated with a newline character and the entire buffer with a
NULL character.

----------

.. doxygenfunction:: sfgets
   :project: progguide

.. doxygenfunction:: sfread
   :project: progguide

.. doxygenfunction:: fgets_trunc
   :project: progguide

.. doxygenfunction:: read_lines_from_file
   :project: progguide

----------

String to number conversions with validity check
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These functions should be used to convert strings to numbers. They are
are strongly preferred over C library calls like ``atoi()`` or
``atof()`` since they check if the **entire** string is a valid
(floating-point or integer) number, and will error out instead of
silently returning the result of a partial conversion or zero in cases
where the string is not a valid number.  This behavior improves
detecting typos or issues when processing input files.

Similarly the :cpp:func:`utils::logical() <LAMMPS_NS::utils::logical>` function
will convert a string into a boolean and will only accept certain words.

The *do_abort* flag should be set to ``true`` in case  this function
is called only on a single MPI rank, as that will then trigger the
a call to ``Error::one()`` for errors instead of ``Error::all()``
and avoids a "hanging" calculation when run in parallel.

Please also see :cpp:func:`utils::is_integer() <LAMMPS_NS::utils::is_integer>`
and :cpp:func:`utils::is_double() <LAMMPS_NS::utils::is_double>` for testing
strings for compliance without conversion.

----------

.. doxygenfunction:: numeric(const char *file, int line, const std::string &str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: numeric(const char *file, int line, const char *str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: inumeric(const char *file, int line, const std::string &str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: inumeric(const char *file, int line, const char *str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: bnumeric(const char *file, int line, const std::string &str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: bnumeric(const char *file, int line, const char *str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: tnumeric(const char *file, int line, const std::string &str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: tnumeric(const char *file, int line, const char *str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: logical(const char *file, int line, const std::string &str, bool do_abort, LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: logical(const char *file, int line, const char *str, bool do_abort, LAMMPS *lmp)
   :project: progguide


String processing
^^^^^^^^^^^^^^^^^

The following are functions to help with processing strings
and parsing files or arguments.

----------

.. doxygenfunction:: strdup
   :project: progguide

.. doxygenfunction:: lowercase
   :project: progguide

.. doxygenfunction:: uppercase
   :project: progguide

.. doxygenfunction:: trim
   :project: progguide

.. doxygenfunction:: trim_comment
   :project: progguide

.. doxygenfunction:: strcompress
   :project: progguide

.. doxygenfunction:: strip_style_suffix
   :project: progguide

.. doxygenfunction:: star_subst
   :project: progguide

.. doxygenfunction:: has_utf8
   :project: progguide

.. doxygenfunction:: utf8_subst
   :project: progguide

.. doxygenfunction:: count_words(const char *text)
   :project: progguide

.. doxygenfunction:: count_words(const std::string &text)
   :project: progguide

.. doxygenfunction:: count_words(const std::string &text, const std::string &separators)
   :project: progguide

.. doxygenfunction:: trim_and_count_words
   :project: progguide

.. doxygenfunction:: join_words
   :project: progguide

.. doxygenfunction:: split_words
   :project: progguide

.. doxygenfunction:: split_lines
   :project: progguide

.. doxygenfunction:: strsame
   :project: progguide

.. doxygenfunction:: strmatch
   :project: progguide

.. doxygenfunction:: strfind
   :project: progguide

.. doxygenfunction:: is_integer
   :project: progguide

.. doxygenfunction:: is_double
   :project: progguide

.. doxygenfunction:: is_id
   :project: progguide

.. doxygenfunction:: is_type
   :project: progguide

Potential file functions
^^^^^^^^^^^^^^^^^^^^^^^^

.. doxygenfunction:: get_potential_file_path
   :project: progguide

.. doxygenfunction:: get_potential_date
   :project: progguide

.. doxygenfunction:: get_potential_units
   :project: progguide

.. doxygenfunction:: get_supported_conversions
   :project: progguide

.. doxygenfunction:: get_conversion_factor
   :project: progguide

.. doxygenfunction:: open_potential(const std::string &name, LAMMPS *lmp, int *auto_convert)
   :project: progguide

Argument processing
^^^^^^^^^^^^^^^^^^^

.. doxygenfunction:: bounds
   :project: progguide

.. doxygenfunction:: bounds_typelabel
   :project: progguide

.. doxygenfunction:: expand_args
   :project: progguide

.. doxygenfunction:: parse_grid_id
   :project: progguide

.. doxygenfunction:: expand_type
   :project: progguide

Convenience functions
^^^^^^^^^^^^^^^^^^^^^

.. doxygenfunction:: logmesg(LAMMPS *lmp, const std::string &format, Args&&... args)
   :project: progguide

.. doxygenfunction:: logmesg(LAMMPS *lmp, const std::string &mesg)
   :project: progguide

.. doxygenfunction:: print(FILE *fp, const std::string &format, Args&&... args)
   :project: progguide

.. doxygenfunction:: print(FILE *fp, const std::string &mesg)
   :project: progguide

.. doxygenfunction:: errorurl
   :project: progguide

.. doxygenfunction:: missing_cmd_args
   :project: progguide

.. doxygenfunction:: point_to_error
   :project: progguide

.. doxygenfunction:: flush_buffers(LAMMPS *lmp)
   :project: progguide

.. doxygenfunction:: getsyserror
   :project: progguide

.. doxygenfunction:: check_packages_for_style
   :project: progguide

.. doxygenfunction:: timespec2seconds
   :project: progguide

.. doxygenfunction:: date2num
   :project: progguide

.. doxygenfunction:: current_date
   :project: progguide

Customized standard functions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. doxygenfunction:: binary_search
   :project: progguide

.. doxygenfunction:: merge_sort
   :project: progguide

---------------------------

Special Math functions
----------------------

The ``MathSpecial`` namespace implements a selection of custom and optimized
mathematical functions for a variety of applications.

.. doxygenfunction:: factorial
   :project: progguide

.. doxygenfunction:: exp2_x86
   :project: progguide

.. doxygenfunction:: fm_exp
   :project: progguide

.. doxygenfunction:: my_erfcx
   :project: progguide

.. doxygenfunction:: expmsq
   :project: progguide

.. doxygenfunction:: square
   :project: progguide

.. doxygenfunction:: cube
   :project: progguide

.. doxygenfunction:: powsign
   :project: progguide

.. doxygenfunction:: powint
   :project: progguide

.. doxygenfunction:: powsinxx
   :project: progguide

---------------------------

Tokenizer classes
-----------------

The purpose of the tokenizer classes is to simplify the recurring task
of breaking lines of text down into words and/or numbers.
Traditionally, LAMMPS code would be using the ``strtok()`` function from
the C library for that purpose, but that function has two significant
disadvantages: 1) it cannot be used concurrently from different LAMMPS
instances since it stores its status in a global variable and 2) it
modifies the string that it is processing.  These classes were
implemented to avoid both of these issues and also to reduce the amount
of code that needs to be written.

The basic procedure is to create an instance of the tokenizer class with
the string to be processed as an argument and then do a loop until all
available tokens are read.  The constructor has a default set of
separator characters, but that can be overridden. The default separators
are all "whitespace" characters, i.e. the space character, the tabulator
character, the carriage return character, the linefeed character, and
the form feed character.

.. code-block:: c++
   :caption: Tokenizer class example listing entries of the PATH environment variable

   #include "tokenizer.h"
   #include <cstdlib>
   #include <string>
   #include <iostream>

   using namespace LAMMPS_NS;

   int main(int, char **)
   {
       const char *path = getenv("PATH");

       if (path != nullptr) {
           Tokenizer p(path,":");
           while (p.has_next())
               std::cout << "Entry: " << p.next() << "\n";
       }
       return 0;
   }

Most tokenizer operations cannot fail except for
:cpp:func:`LAMMPS_NS::Tokenizer::next` (when used without first
checking with :cpp:func:`LAMMPS_NS::Tokenizer::has_next`) and
:cpp:func:`LAMMPS_NS::Tokenizer::skip`.  In case of failure, the class
will throw an exception, so you may need to wrap the code using the
tokenizer into a ``try`` / ``catch`` block to handle errors.  The
:cpp:class:`LAMMPS_NS::ValueTokenizer` class may also throw an exception
when a (type of) number is requested as next token that is not
compatible with the string representing the next word.

.. code-block:: c++
   :caption: ValueTokenizer class example with exception handling

   #include "tokenizer.h"
   #include <cstdlib>
   #include <string>
   #include <iostream>

   using namespace LAMMPS_NS;

   int main(int, char **)
   {
       const char *text = "1 2 3 4 5 20.0 21 twentytwo 2.3";
       double num1(0),num2(0),num3(0),num4(0);

       ValueTokenizer t(text);
       // read 4 doubles after skipping over 5 numbers
       try {
           t.skip(5);
           num1 = t.next_double();
           num2 = t.next_double();
           num3 = t.next_double();
           num4 = t.next_double();
       } catch (TokenizerException &e) {
           std::cout << "Reading numbers failed: " << e.what() << "\n";
       }
       std::cout << "Values: " << num1 << " " << num2 << " " << num3 << " " << num4 << "\n";
       return 0;
   }

This code example should produce the following output:

.. code-block::

   Reading numbers failed: Not a valid floating-point number: 'twentytwo'
   Values: 20 21 0 0

----------

.. doxygenclass:: LAMMPS_NS::Tokenizer
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::TokenizerException
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::ValueTokenizer
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::InvalidIntegerException
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::InvalidFloatException
   :project: progguide
   :members:

----------


Argument parsing classes
---------------------------

The purpose of argument parsing classes it to simplify and unify how
arguments of commands in LAMMPS are parsed and to make abstractions of
repetitive tasks.

The :cpp:class:`LAMMPS_NS::ArgInfo` class provides an abstraction
for parsing references to compute or fix styles, variables or custom
integer or double properties handled by :doc:`fix property/atom <fix_property_atom>`.
These would start with a "c\_", "f\_", "v\_", "d\_", "d2\_", "i\_", or "i2\_"
followed by the ID or name of than instance and may be postfixed with
one or two array indices "[<number>]" with numbers > 0.

A typical code segment would look like this:

.. code-block:: c++
   :caption: Usage example for ArgInfo class

   int nvalues = 0;
   for (iarg = 0; iarg < nargnew; iarg++) {
     ArgInfo argi(arg[iarg]);

     which[nvalues] = argi.get_type();
     argindex[nvalues] = argi.get_index1();
     ids[nvalues] = argi.copy_name();

     if ((which[nvalues] == ArgInfo::UNKNOWN)
          || (which[nvalues] == ArgInfo::NONE)
          || (argi.get_dim() > 1))
       error->all(FLERR,"Illegal compute XXX command");

     nvalues++;
   }

----------

.. doxygenclass:: LAMMPS_NS::ArgInfo
   :project: progguide
   :members:


----------

.. _file-reader-classes:

File reader classes
-------------------

The purpose of the file reader classes is to simplify the recurring task
of reading and parsing files. They can use the
:cpp:class:`ValueTokenizer <LAMMPS_NS::ValueTokenizer>` class to process
the read in text.  The :cpp:class:`TextFileReader
<LAMMPS_NS::TextFileReader>` is a more general version while
:cpp:class:`PotentialFileReader <LAMMPS_NS::PotentialFileReader>` is
specialized to implement the behavior expected for looking up and
reading/parsing files with potential parameters in LAMMPS.  The
potential file reader class requires a LAMMPS instance, requires to be
run on MPI rank 0 only, will use the
:cpp:func:`utils::get_potential_file_path
<LAMMPS_NS::utils::get_potential_file_path>` function to look up and
open the file, and will call the :cpp:class:`LAMMPS_NS::Error` class in
case of failures to read or to convert numbers, so that LAMMPS will be
aborted.

.. code-block:: c++
   :caption: Use of PotentialFileReader class in pair style coul/streitz

    PotentialFileReader reader(lmp, file, "coul/streitz");
    char * line;

    while((line = reader.next_line(NPARAMS_PER_LINE))) {
      try {
        ValueTokenizer values(line);
        std::string iname = values.next_string();

        int ielement;
        for (ielement = 0; ielement < nelements; ielement++)
          if (iname == elements[ielement]) break;

        if (nparams == maxparam) {
          maxparam += DELTA;
          params = (Param *) memory->srealloc(params,maxparam*sizeof(Param),
                                              "pair:params");
        }

        params[nparams].ielement = ielement;
        params[nparams].chi = values.next_double();
        params[nparams].eta = values.next_double();
        params[nparams].gamma = values.next_double();
        params[nparams].zeta = values.next_double();
        params[nparams].zcore = values.next_double();

      } catch (TokenizerException & e) {
        error->one(FLERR, e.what());
      }
      nparams++;
    }

A file that would be parsed by the reader code fragment looks like this:

.. parsed-literal::

   # DATE: 2015-02-19 UNITS: metal CONTRIBUTOR: Ray Shan CITATION: Streitz and Mintmire, Phys Rev B, 50, 11996-12003 (1994)
   #
   # X (eV)                J (eV)          gamma (1/\AA)   zeta (1/\AA)    Z (e)

   Al      0.000000        10.328655       0.000000        0.968438        0.763905
   O       5.484763        14.035715       0.000000        2.143957        0.000000


----------

.. doxygenclass:: LAMMPS_NS::TextFileReader
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::PotentialFileReader
   :project: progguide
   :members:

----------

Memory pool classes
-------------------

The memory pool classes are used for cases where otherwise many
small memory allocations would be needed and where the data would
be either all used or all freed.  One example for that is the
storage of neighbor lists.  The memory management strategy is
based on the assumption that allocations will be in chunks of similar
sizes.  The allocation is then not done per individual call for a
reserved chunk of memory, but for a "page" that can hold multiple
chunks of data.  A parameter for the maximum chunk size must be
provided, as that is used to determine whether a new page of memory
must be used.

The :cpp:class:`MyPage <LAMMPS_NS::MyPage>` class offers two ways to
reserve a chunk: 1) with :cpp:func:`MyPage::get() <LAMMPS_NS::MyPage::get>` the
chunk size needs to be known in advance, 2) with :cpp:func:`MyPage::vget()
<LAMMPS_NS::MyPage::vget>` a pointer to the next chunk is returned, but
its size is registered later with :cpp:func:`MyPage::vgot()
<LAMMPS_NS::MyPage::vgot>`.

.. code-block:: c++
   :caption: Example of using :cpp:class:`MyPage <LAMMPS_NS::MyPage>`

      #include "my_page.h"
      using namespace LAMMPS_NS;

      MyPage<double> *dpage = new MyPage<double>;
      // max size of chunk: 256, size of page: 10240 doubles (=81920 bytes)
      dpage->init(256,10240);

      double **build_some_lists(int num)
      {
          dpage->reset();
          double **dlist = new double*[num];
          for (int i=0; i < num; ++i) {
              double *dptr = dpage.vget();
              int jnum = 0;
              for (int j=0; j < jmax; ++j) {
                  // compute some dvalue for eligible loop index j
                  dptr[j] = dvalue;
                  ++jnum;
              }
              if (dpage.status() != 0) {
                  // handle out of memory or jnum too large errors
              }
              dpage.vgot(jnum);
              dlist[i] = dptr;
          }
          return dlist;
      }

----------

.. doxygenclass:: LAMMPS_NS::MyPage
   :project: progguide
   :members:

.. doxygenclass:: LAMMPS_NS::MyPoolChunk
   :project: progguide
   :members:

----------

Eigensolver functions
---------------------

The ``MathEigen`` sub-namespace of the ``LAMMPS_NS`` namespace contains
functions and classes for eigensolvers. Currently only the
:cpp:func:`jacobi3 function <MathEigen::jacobi3>` is used in various
places in LAMMPS.  That function is built on top of a group of more
generic eigensolvers that are maintained in the ``math_eigen_impl.h``
header file.  This header contains the implementation of three template
classes:

#. "Jacobi" calculates all of the eigenvalues and eigenvectors
   of a dense, symmetric, real matrix.

#. The "PEigenDense" class only calculates the principal eigenvalue
   (i.e. the largest or smallest eigenvalue), and its corresponding
   eigenvector.  However it is much more efficient than "Jacobi" when
   applied to large matrices (larger than 13x13).  PEigenDense also can
   understand complex-valued Hermitian matrices.

#. The "LambdaLanczos" class is a generalization of "PEigenDense" which can be
   applied to arbitrary sparse matrices.

The "math_eigen_impl.h" code is an amalgamation of `jacobi_pd
<https://github.com/jewettaij/jacobi_pd>`_ by Andrew Jewett at Scripps
Research (under CC0-1.0 license) and `Lambda Lanczos
<https://github.com/mrcdr/lambda-lanczos>`_ by Yuya Kurebayashi at
Tohoku University (under MIT license)

----------

.. doxygenfunction:: MathEigen::jacobi3(double const *const *mat, double *eval, double **evec, int sort)
   :project: progguide

.. doxygenfunction:: MathEigen::jacobi3(double const mat[3][3], double *eval, double evec[3][3], int sort)
   :project: progguide

---------------------------

.. _communication_buffer_coding_with_ubuf:

Communication buffer coding with *ubuf*
---------------------------------------

LAMMPS uses communication buffers where it collects data from various
class instances and then exchanges the data with neighboring subdomains.
For simplicity those buffers are defined as ``double`` buffers and
used for doubles and integer numbers. This presents a unique problem
when 64-bit integers are used.  While the storage needed for a ``double``
is also 64-bit, it cannot be used by a simple assignment.  To get around
that limitation, LAMMPS uses the :cpp:union:`ubuf <LAMMPS_NS::ubuf>`
union.  It is used in the various "pack" and "unpack" functions in the
LAMMPS classes to store and retrieve integers that may be 64-bit from
the communication buffers.

---------------------------

.. doxygenunion:: LAMMPS_NS::ubuf
   :project: progguide