reformate for improved readability and make some updates due to changes in the code
This commit is contained in:
@ -1,124 +1,149 @@
|
|||||||
Common problems
|
Common issues that are often regarded as bugs
|
||||||
===============
|
=============================================
|
||||||
|
|
||||||
If two LAMMPS runs do not produce the exact same answer on different
|
The list below are some random notes on behavior of LAMMPS that is
|
||||||
machines or different numbers of processors, this is typically not a
|
sometimes unexpected or even considered a bug. Most of the time, these
|
||||||
bug. In theory you should get identical answers on any number of
|
are just issues of understanding how LAMMPS is implemented and
|
||||||
processors and on any machine. In practice, numerical round-off can
|
parallelized. Please also have a look at the :doc:`Error details
|
||||||
cause slight differences and eventual divergence of molecular dynamics
|
discussions page <Errors_details>` that contains recommendations for
|
||||||
phase space trajectories within a few 100s or few 1000s of timesteps.
|
tracking down issues and explanations for error messages that may
|
||||||
However, the statistical properties of the two runs (e.g. average
|
sometimes be confusing or need additional explanations.
|
||||||
energy or temperature) should still be the same.
|
|
||||||
|
|
||||||
If the :doc:`velocity <velocity>` command is used to set initial atom
|
- A LAMMPS simulation typically has two stages, 1) issuing commands
|
||||||
velocities, a particular atom can be assigned a different velocity
|
and 2) run or minimize. Most LAMMPS errors are detected in stage 1),
|
||||||
when the problem is run on a different number of processors or on
|
others at the beginning of stage 2), and finally others like a bond
|
||||||
different machines. If this happens, the phase space trajectories of
|
stretching too far may or lost atoms or bonds may not occur until the
|
||||||
the two simulations will rapidly diverge. See the discussion of the
|
middle of a run.
|
||||||
*loop* option in the :doc:`velocity <velocity>` command for details and
|
|
||||||
options that avoid this issue.
|
|
||||||
|
|
||||||
Similarly, the :doc:`create_atoms <create_atoms>` command generates a
|
- If two LAMMPS runs do not produce the exact same answer on different
|
||||||
lattice of atoms. For the same physical system, the ordering and
|
machines or different numbers of processors, this is typically not a
|
||||||
numbering of atoms by atom ID may be different depending on the number
|
bug. In theory you should get identical answers on any number of
|
||||||
of processors.
|
processors and on any machine. In practice, numerical round-off can
|
||||||
|
cause slight differences and eventual divergence of molecular dynamics
|
||||||
|
phase space trajectories within a few 100s or few 1000s of timesteps.
|
||||||
|
This can be triggered by different ordering of atoms due to different
|
||||||
|
domain decompositions, but also through different CPU architectures,
|
||||||
|
different operating systems, different compilers or compiler versions,
|
||||||
|
different compiler optimization levels, different FFT libraries.
|
||||||
|
However, the statistical properties of the two runs (e.g. average
|
||||||
|
energy or temperature) should still be the same.
|
||||||
|
|
||||||
Some commands use random number generators which may be setup to
|
- If the :doc:`velocity <velocity>` command is used to set initial atom
|
||||||
produce different random number streams on each processor and hence
|
velocities, a particular atom can be assigned a different velocity
|
||||||
will produce different effects when run on different numbers of
|
when the problem is run on a different number of processors or on
|
||||||
processors. A commonly-used example is the :doc:`fix langevin <fix_langevin>` command for thermostatting.
|
different machines. If this happens, the phase space trajectories of
|
||||||
|
the two simulations will rapidly diverge. See the discussion of the
|
||||||
|
*loop* option in the :doc:`velocity <velocity>` command for details
|
||||||
|
and options that avoid this issue.
|
||||||
|
|
||||||
A LAMMPS simulation typically has two stages, setup and run. Most
|
- Similarly, the :doc:`create_atoms <create_atoms>` command generates a
|
||||||
LAMMPS errors are detected at setup time; others like a bond
|
lattice of atoms. For the same physical system, the ordering and
|
||||||
stretching too far may not occur until the middle of a run.
|
numbering of atoms by atom ID may be different depending on the number
|
||||||
|
of processors.
|
||||||
|
|
||||||
LAMMPS tries to flag errors and print informative error messages so
|
- Some commands use random number generators which may be setup to
|
||||||
you can fix the problem. For most errors it will also print the last
|
produce different random number streams on each processor and hence
|
||||||
input script command that it was processing. Of course, LAMMPS cannot
|
will produce different effects when run on different numbers of
|
||||||
figure out your physics or numerical mistakes, like choosing too big a
|
processors. A commonly-used example is the :doc:`fix langevin
|
||||||
timestep, specifying erroneous force field coefficients, or putting 2
|
<fix_langevin>` command for thermostatting.
|
||||||
atoms on top of each other! If you run into errors that LAMMPS
|
|
||||||
does not catch that you think it should flag, please send an email to
|
|
||||||
the `developers <https://www.lammps.org/authors.html>`_ or create an new
|
|
||||||
topic on the dedicated `MatSci forum section <https://matsci.org/lammps/>`_.
|
|
||||||
|
|
||||||
If you get an error message about an invalid command in your input
|
- LAMMPS tries to flag errors and print informative error messages so
|
||||||
script, you can determine what command is causing the problem by
|
you can fix the problem. For most errors it will also print the last
|
||||||
looking in the log.lammps file or using the :doc:`echo command <echo>`
|
input script command that it was processing or even point to the
|
||||||
to see it on the screen. If you get an error like "Invalid ...
|
keyword that is causing troubles. Of course, LAMMPS cannot figure out
|
||||||
style", with ... being fix, compute, pair, etc, it means that you
|
your physics or numerical mistakes, like choosing too big a timestep,
|
||||||
mistyped the style name or that the command is part of an optional
|
specifying erroneous force field coefficients, or putting 2 atoms on
|
||||||
package which was not compiled into your executable. The list of
|
top of each other! Also, LAMMPS does not know what you *intend* to
|
||||||
available styles in your executable can be listed by using
|
do, but very strictly applies the syntax as described in the
|
||||||
:doc:`the -h command-line switch <Run_options>`. The installation and
|
documentation. If you run into errors that LAMMPS does not catch that
|
||||||
compilation of optional packages is explained on the
|
you think it should flag, please send an email to the `developers
|
||||||
:doc:`Build packages <Build_package>` doc page.
|
<https://www.lammps.org/authors.html>`_ or create an new topic on the
|
||||||
|
dedicated `MatSci forum section <https://matsci.org/lammps/>`_.
|
||||||
|
|
||||||
For a given command, LAMMPS expects certain arguments in a specified
|
- If you get an error message about an invalid command in your input
|
||||||
order. If you mess this up, LAMMPS will often flag the error, but it
|
script, you can determine what command is causing the problem by
|
||||||
may also simply read a bogus argument and assign a value that is
|
looking in the log.lammps file or using the :doc:`echo command <echo>`
|
||||||
valid, but not what you wanted. E.g. trying to read the string "abc"
|
to see it on the screen. If you get an error like "Invalid ...
|
||||||
as an integer value of 0. Careful reading of the associated doc page
|
style", with ... being fix, compute, pair, etc, it means that you
|
||||||
for the command should allow you to fix these problems. In most cases,
|
mistyped the style name or that the command is part of an optional
|
||||||
where LAMMPS expects to read a number, either integer or floating point,
|
package which was not compiled into your executable. The list of
|
||||||
it performs a stringent test on whether the provided input actually
|
available styles in your executable can be listed by using
|
||||||
is an integer or floating-point number, respectively, and reject the
|
:doc:`the -h command-line switch <Run_options>`. The installation and
|
||||||
input with an error message (for instance, when an integer is required,
|
compilation of optional packages is explained on the :doc:`Build
|
||||||
but a floating-point number 1.0 is provided):
|
packages <Build_package>` doc page.
|
||||||
|
|
||||||
.. parsed-literal::
|
- For a given command, LAMMPS expects certain arguments in a specified
|
||||||
|
order. If you mess this up, LAMMPS will often flag the error, but it
|
||||||
|
may also simply read a bogus argument and assign a value that is
|
||||||
|
valid, but not what you wanted. E.g. trying to read the string "abc"
|
||||||
|
as an integer value of 0. Careful reading of the associated doc page
|
||||||
|
for the command should allow you to fix these problems. In most cases,
|
||||||
|
where LAMMPS expects to read a number, either integer or floating
|
||||||
|
point, it performs a stringent test on whether the provided input
|
||||||
|
actually is an integer or floating-point number, respectively, and
|
||||||
|
reject the input with an error message (for instance, when an integer
|
||||||
|
is required, but a floating-point number 1.0 is provided):
|
||||||
|
|
||||||
ERROR: Expected integer parameter instead of '1.0' in input script or data file
|
.. parsed-literal::
|
||||||
|
|
||||||
Some commands allow for using variable references in place of numeric
|
ERROR: Expected integer parameter instead of '1.0' in input script or data file
|
||||||
constants so that the value can be evaluated and may change over the
|
|
||||||
course of a run. This is typically done with the syntax *v_name* for a
|
|
||||||
parameter, where name is the name of the variable. On the other hand,
|
|
||||||
immediate variable expansion with the syntax ${name} is performed while
|
|
||||||
reading the input and before parsing commands,
|
|
||||||
|
|
||||||
.. note::
|
- Some commands allow for using variable references in place of numeric
|
||||||
|
constants so that the value can be evaluated and may change over the
|
||||||
|
course of a run. This is typically done with the syntax *v_name* for
|
||||||
|
a parameter, where name is the name of the variable. On the other
|
||||||
|
hand, immediate variable expansion with the syntax ${name} is
|
||||||
|
performed while reading the input and before parsing commands,
|
||||||
|
|
||||||
Using a variable reference (i.e. *v_name*) is only allowed if
|
.. note::
|
||||||
the documentation of the corresponding command explicitly says it is.
|
|
||||||
Otherwise, you will receive an error message of this kind:
|
|
||||||
|
|
||||||
.. parsed-literal::
|
Using a variable reference (i.e. *v_name*) is only allowed if
|
||||||
|
the documentation of the corresponding command explicitly says it is.
|
||||||
|
Otherwise, you will receive an error message of this kind:
|
||||||
|
|
||||||
ERROR: Expected floating point parameter instead of 'v_name' in input script or data file
|
.. parsed-literal::
|
||||||
|
|
||||||
Generally, LAMMPS will print a message to the screen and logfile and
|
ERROR: Expected floating point parameter instead of 'v_name' in input script or data file
|
||||||
exit gracefully when it encounters a fatal error. Sometimes it will
|
|
||||||
print a WARNING to the screen and logfile and continue on; you can
|
|
||||||
decide if the WARNING is important or not. A WARNING message that is
|
|
||||||
generated in the middle of a run is only printed to the screen, not to
|
|
||||||
the logfile, to avoid cluttering up thermodynamic output. If LAMMPS
|
|
||||||
crashes or hangs without spitting out an error message first then it
|
|
||||||
could be a bug (see :doc:`this section <Errors_bugs>`) or one of the following
|
|
||||||
cases:
|
|
||||||
|
|
||||||
LAMMPS runs in the available memory a processor allows to be
|
- Generally, LAMMPS will print a message to the screen and logfile and
|
||||||
allocated. Most reasonable MD runs are compute limited, not memory
|
exit gracefully when it encounters a fatal error. When running in
|
||||||
limited, so this should not be a bottleneck on most platforms. Almost
|
parallel this message may be stuck in an I/O buffer and LAMMPS will be
|
||||||
all large memory allocations in the code are done via C-style malloc's
|
terminated before that buffer is printed. In that case you can try
|
||||||
which will generate an error message if you run out of memory.
|
adding the ``-nonblock`` or ``-nb`` command-line flag to turn off that
|
||||||
Smaller chunks of memory are allocated via C++ "new" statements. If
|
buffering. Please note that this should not be used for production
|
||||||
you are unlucky you could run out of memory just when one of these
|
runs, since turning off buffering usually has a significant negative
|
||||||
small requests is made, in which case the code will crash or hang (in
|
impact on performance (even worse than :doc:`thermo_modify flush yes
|
||||||
parallel), since LAMMPS does not trap on those errors.
|
<thermo_modify>`). Sometimes LAMMPS will print a WARNING to the
|
||||||
|
screen and logfile and continue on; you can decide if the WARNING is
|
||||||
|
important or not, but as a general rule do not ignore warnings that
|
||||||
|
you not understand. A WARNING message that is generated in the middle
|
||||||
|
of a run is only printed to the screen, not to the logfile, to avoid
|
||||||
|
cluttering up thermodynamic output. If LAMMPS crashes or hangs
|
||||||
|
without generating an error message first then it could be a bug
|
||||||
|
(see :doc:`this section <Errors_bugs>`).
|
||||||
|
|
||||||
Illegal arithmetic can cause LAMMPS to run slow or crash. This is
|
- LAMMPS runs in the available memory a processor allows to be
|
||||||
typically due to invalid physics and numerics that your simulation is
|
allocated. Most reasonable MD runs are compute limited, not memory
|
||||||
computing. If you see wild thermodynamic values or NaN values in your
|
limited, so this should not be a bottleneck on most platforms. Almost
|
||||||
LAMMPS output, something is wrong with your simulation. If you
|
all large memory allocations in the code are done via C-style malloc's
|
||||||
suspect this is happening, it is a good idea to print out
|
which will generate an error message if you run out of memory.
|
||||||
thermodynamic info frequently (e.g. every timestep) via the
|
Smaller chunks of memory are allocated via C++ "new" statements. If
|
||||||
:doc:`thermo <thermo>` so you can monitor what is happening.
|
you are unlucky you could run out of memory just when one of these
|
||||||
Visualizing the atom movement is also a good idea to ensure your model
|
small requests is made, in which case the code will crash or hang (in
|
||||||
is behaving as you expect.
|
parallel).
|
||||||
|
|
||||||
In parallel, one way LAMMPS can hang is due to how different MPI
|
- Illegal arithmetic can cause LAMMPS to run slow or crash. This is
|
||||||
implementations handle buffering of messages. If the code hangs
|
typically due to invalid physics and numerics that your simulation is
|
||||||
without an error message, it may be that you need to specify an MPI
|
computing. If you see wild thermodynamic values or NaN values in your
|
||||||
setting or two (usually via an environment variable) to enable
|
LAMMPS output, something is wrong with your simulation. If you
|
||||||
buffering or boost the sizes of messages that can be buffered.
|
suspect this is happening, it is a good idea to print out
|
||||||
|
thermodynamic info frequently (e.g. every timestep) via the
|
||||||
|
:doc:`thermo <thermo>` so you can monitor what is happening.
|
||||||
|
Visualizing the atom movement is also a good idea to ensure your model
|
||||||
|
is behaving as you expect.
|
||||||
|
|
||||||
|
- When running in parallel with MPI, one way LAMMPS can hang is because
|
||||||
|
LAMMPS has come across an error condition, but only on one or a few
|
||||||
|
MPI processes and not all of them. LAMMPS has two different "stop
|
||||||
|
with an error message" functions and the correct one has to be called
|
||||||
|
or else it will hang.
|
||||||
|
|||||||
Reference in New Issue
Block a user