reformate for improved readability and make some updates due to changes in the code

2025-06-24 00:09:46 -04:00
parent 24b15f7e46
commit 80fbdceff2
1 changed files with 130 additions and 105 deletions
--- a/doc/src/Errors_common.rst
+++ b/doc/src/Errors_common.rst
@ -1,124 +1,149 @@
-Common problems
-===============
+Common issues that are often regarded as bugs
+=============================================

-If two LAMMPS runs do not produce the exact same answer on different
-machines or different numbers of processors, this is typically not a
-bug.  In theory you should get identical answers on any number of
-processors and on any machine.  In practice, numerical round-off can
-cause slight differences and eventual divergence of molecular dynamics
-phase space trajectories within a few 100s or few 1000s of timesteps.
-However, the statistical properties of the two runs (e.g. average
-energy or temperature) should still be the same.
+The list below are some random notes on behavior of LAMMPS that is
+sometimes unexpected or even considered a bug.  Most of the time, these
+are just issues of understanding how LAMMPS is implemented and
+parallelized.  Please also have a look at the :doc:`Error details
+discussions page <Errors_details>` that contains recommendations for
+tracking down issues and explanations for error messages that may
+sometimes be confusing or need additional explanations.

-If the :doc:`velocity <velocity>` command is used to set initial atom
-velocities, a particular atom can be assigned a different velocity
-when the problem is run on a different number of processors or on
-different machines.  If this happens, the phase space trajectories of
-the two simulations will rapidly diverge.  See the discussion of the
-*loop* option in the :doc:`velocity <velocity>` command for details and
-options that avoid this issue.
+- A LAMMPS simulation typically has two stages, 1) issuing commands
+  and 2) run or minimize.  Most LAMMPS errors are detected in stage 1),
+  others at the beginning of stage 2), and finally others like a bond
+  stretching too far may or lost atoms or bonds may not occur until the
+  middle of a run.

-Similarly, the :doc:`create_atoms <create_atoms>` command generates a
-lattice of atoms.  For the same physical system, the ordering and
-numbering of atoms by atom ID may be different depending on the number
-of processors.
+- If two LAMMPS runs do not produce the exact same answer on different
+  machines or different numbers of processors, this is typically not a
+  bug.  In theory you should get identical answers on any number of
+  processors and on any machine.  In practice, numerical round-off can
+  cause slight differences and eventual divergence of molecular dynamics
+  phase space trajectories within a few 100s or few 1000s of timesteps.
+  This can be triggered by different ordering of atoms due to different
+  domain decompositions, but also through different CPU architectures,
+  different operating systems, different compilers or compiler versions,
+  different compiler optimization levels, different FFT libraries.
+  However, the statistical properties of the two runs (e.g. average
+  energy or temperature) should still be the same.

-Some commands use random number generators which may be setup to
-produce different random number streams on each processor and hence
-will produce different effects when run on different numbers of
-processors.  A commonly-used example is the :doc:`fix langevin <fix_langevin>` command for thermostatting.
+- If the :doc:`velocity <velocity>` command is used to set initial atom
+  velocities, a particular atom can be assigned a different velocity
+  when the problem is run on a different number of processors or on
+  different machines.  If this happens, the phase space trajectories of
+  the two simulations will rapidly diverge.  See the discussion of the
+  *loop* option in the :doc:`velocity <velocity>` command for details
+  and options that avoid this issue.

-A LAMMPS simulation typically has two stages, setup and run.  Most
-LAMMPS errors are detected at setup time; others like a bond
-stretching too far may not occur until the middle of a run.
+- Similarly, the :doc:`create_atoms <create_atoms>` command generates a
+  lattice of atoms.  For the same physical system, the ordering and
+  numbering of atoms by atom ID may be different depending on the number
+  of processors.

-LAMMPS tries to flag errors and print informative error messages so
-you can fix the problem.  For most errors it will also print the last
-input script command that it was processing.  Of course, LAMMPS cannot
-figure out your physics or numerical mistakes, like choosing too big a
-timestep, specifying erroneous force field coefficients, or putting 2
-atoms on top of each other!  If you run into errors that LAMMPS
-does not catch that you think it should flag, please send an email to
-the `developers <https://www.lammps.org/authors.html>`_ or create an new
-topic on the dedicated `MatSci forum section <https://matsci.org/lammps/>`_.
+- Some commands use random number generators which may be setup to
+  produce different random number streams on each processor and hence
+  will produce different effects when run on different numbers of
+  processors.  A commonly-used example is the :doc:`fix langevin
+  <fix_langevin>` command for thermostatting.

-If you get an error message about an invalid command in your input
-script, you can determine what command is causing the problem by
-looking in the log.lammps file or using the :doc:`echo command <echo>`
-to see it on the screen.  If you get an error like "Invalid ...
-style", with ... being fix, compute, pair, etc, it means that you
-mistyped the style name or that the command is part of an optional
-package which was not compiled into your executable.  The list of
-available styles in your executable can be listed by using
-:doc:`the -h command-line switch <Run_options>`.  The installation and
-compilation of optional packages is explained on the
-:doc:`Build packages <Build_package>` doc page.
+- LAMMPS tries to flag errors and print informative error messages so
+  you can fix the problem.  For most errors it will also print the last
+  input script command that it was processing or even point to the
+  keyword that is causing troubles.  Of course, LAMMPS cannot figure out
+  your physics or numerical mistakes, like choosing too big a timestep,
+  specifying erroneous force field coefficients, or putting 2 atoms on
+  top of each other!  Also, LAMMPS does not know what you *intend* to
+  do, but very strictly applies the syntax as described in the
+  documentation.  If you run into errors that LAMMPS does not catch that
+  you think it should flag, please send an email to the `developers
+  <https://www.lammps.org/authors.html>`_ or create an new topic on the
+  dedicated `MatSci forum section <https://matsci.org/lammps/>`_.

-For a given command, LAMMPS expects certain arguments in a specified
-order.  If you mess this up, LAMMPS will often flag the error, but it
-may also simply read a bogus argument and assign a value that is
-valid, but not what you wanted.  E.g. trying to read the string "abc"
-as an integer value of 0.  Careful reading of the associated doc page
-for the command should allow you to fix these problems. In most cases,
-where LAMMPS expects to read a number, either integer or floating point,
-it performs a stringent test on whether the provided input actually
-is an integer or floating-point number, respectively, and reject the
-input with an error message (for instance, when an integer is required,
-but a floating-point number 1.0 is provided):
+- If you get an error message about an invalid command in your input
+  script, you can determine what command is causing the problem by
+  looking in the log.lammps file or using the :doc:`echo command <echo>`
+  to see it on the screen.  If you get an error like "Invalid ...
+  style", with ... being fix, compute, pair, etc, it means that you
+  mistyped the style name or that the command is part of an optional
+  package which was not compiled into your executable.  The list of
+  available styles in your executable can be listed by using
+  :doc:`the -h command-line switch <Run_options>`.  The installation and
+  compilation of optional packages is explained on the :doc:`Build
+  packages <Build_package>` doc page.

-.. parsed-literal::
+- For a given command, LAMMPS expects certain arguments in a specified
+  order.  If you mess this up, LAMMPS will often flag the error, but it
+  may also simply read a bogus argument and assign a value that is
+  valid, but not what you wanted.  E.g. trying to read the string "abc"
+  as an integer value of 0.  Careful reading of the associated doc page
+  for the command should allow you to fix these problems. In most cases,
+  where LAMMPS expects to read a number, either integer or floating
+  point, it performs a stringent test on whether the provided input
+  actually is an integer or floating-point number, respectively, and
+  reject the input with an error message (for instance, when an integer
+  is required, but a floating-point number 1.0 is provided):

-   ERROR: Expected integer parameter instead of '1.0' in input script or data file
+  .. parsed-literal::

-Some commands allow for using variable references in place of numeric
-constants so that the value can be evaluated and may change over the
-course of a run.  This is typically done with the syntax *v_name* for a
-parameter, where name is the name of the variable. On the other hand,
-immediate variable expansion with the syntax ${name} is performed while
-reading the input and before parsing commands,
+     ERROR: Expected integer parameter instead of '1.0' in input script or data file

-.. note::
+- Some commands allow for using variable references in place of numeric
+  constants so that the value can be evaluated and may change over the
+  course of a run.  This is typically done with the syntax *v_name* for
+  a parameter, where name is the name of the variable. On the other
+  hand, immediate variable expansion with the syntax ${name} is
+  performed while reading the input and before parsing commands,

-   Using a variable reference (i.e. *v_name*) is only allowed if
-   the documentation of the corresponding command explicitly says it is.
-   Otherwise, you will receive an error message of this kind:
+  .. note::

-.. parsed-literal::
+     Using a variable reference (i.e. *v_name*) is only allowed if
+     the documentation of the corresponding command explicitly says it is.
+     Otherwise, you will receive an error message of this kind:

-   ERROR: Expected floating point parameter instead of 'v_name' in input script or data file
+  .. parsed-literal::

-Generally, LAMMPS will print a message to the screen and logfile and
-exit gracefully when it encounters a fatal error.  Sometimes it will
-print a WARNING to the screen and logfile and continue on; you can
-decide if the WARNING is important or not.  A WARNING message that is
-generated in the middle of a run is only printed to the screen, not to
-the logfile, to avoid cluttering up thermodynamic output.  If LAMMPS
-crashes or hangs without spitting out an error message first then it
-could be a bug (see :doc:`this section <Errors_bugs>`) or one of the following
-cases:
+     ERROR: Expected floating point parameter instead of 'v_name' in input script or data file

-LAMMPS runs in the available memory a processor allows to be
-allocated.  Most reasonable MD runs are compute limited, not memory
-limited, so this should not be a bottleneck on most platforms.  Almost
-all large memory allocations in the code are done via C-style malloc's
-which will generate an error message if you run out of memory.
-Smaller chunks of memory are allocated via C++ "new" statements.  If
-you are unlucky you could run out of memory just when one of these
-small requests is made, in which case the code will crash or hang (in
-parallel), since LAMMPS does not trap on those errors.
+- Generally, LAMMPS will print a message to the screen and logfile and
+  exit gracefully when it encounters a fatal error.  When running in
+  parallel this message may be stuck in an I/O buffer and LAMMPS will be
+  terminated before that buffer is printed.  In that case you can try
+  adding the ``-nonblock`` or ``-nb`` command-line flag to turn off that
+  buffering.  Please note that this should not be used for production
+  runs, since turning off buffering usually has a significant negative
+  impact on performance (even worse than :doc:`thermo_modify flush yes
+  <thermo_modify>`).  Sometimes LAMMPS will print a WARNING to the
+  screen and logfile and continue on; you can decide if the WARNING is
+  important or not, but as a general rule do not ignore warnings that
+  you not understand.  A WARNING message that is generated in the middle
+  of a run is only printed to the screen, not to the logfile, to avoid
+  cluttering up thermodynamic output.  If LAMMPS crashes or hangs
+  without generating an error message first then it could be a bug
+  (see :doc:`this section <Errors_bugs>`).

-Illegal arithmetic can cause LAMMPS to run slow or crash.  This is
-typically due to invalid physics and numerics that your simulation is
-computing.  If you see wild thermodynamic values or NaN values in your
-LAMMPS output, something is wrong with your simulation.  If you
-suspect this is happening, it is a good idea to print out
-thermodynamic info frequently (e.g. every timestep) via the
-:doc:`thermo <thermo>` so you can monitor what is happening.
-Visualizing the atom movement is also a good idea to ensure your model
-is behaving as you expect.
+- LAMMPS runs in the available memory a processor allows to be
+  allocated.  Most reasonable MD runs are compute limited, not memory
+  limited, so this should not be a bottleneck on most platforms.  Almost
+  all large memory allocations in the code are done via C-style malloc's
+  which will generate an error message if you run out of memory.
+  Smaller chunks of memory are allocated via C++ "new" statements.  If
+  you are unlucky you could run out of memory just when one of these
+  small requests is made, in which case the code will crash or hang (in
+  parallel).

-In parallel, one way LAMMPS can hang is due to how different MPI
-implementations handle buffering of messages.  If the code hangs
-without an error message, it may be that you need to specify an MPI
-setting or two (usually via an environment variable) to enable
-buffering or boost the sizes of messages that can be buffered.
+- Illegal arithmetic can cause LAMMPS to run slow or crash.  This is
+  typically due to invalid physics and numerics that your simulation is
+  computing.  If you see wild thermodynamic values or NaN values in your
+  LAMMPS output, something is wrong with your simulation.  If you
+  suspect this is happening, it is a good idea to print out
+  thermodynamic info frequently (e.g. every timestep) via the
+  :doc:`thermo <thermo>` so you can monitor what is happening.
+  Visualizing the atom movement is also a good idea to ensure your model
+  is behaving as you expect.
+
+- When running in parallel with MPI, one way LAMMPS can hang is because
+  LAMMPS has come across an error condition, but only on one or a few
+  MPI processes and not all of them.  LAMMPS has two different "stop
+  with an error message" functions and the correct one has to be called
+  or else it will hang.