Merge branch 'master' into lammps-icms

Resolved Conflicts:
	lib/meam/Makefile.gfortran
	lib/poems/Makefile.g++
	lib/reax/Makefile.gfortran
	python/lammps.py
	src/USER-CUDA/cuda.cpp
Author: Axel Kohlmeyer
Date:   2012-08-15 08:38:04 -04:00
67 changed files with 3370 additions and 1354 deletions


@ -14,8 +14,8 @@
<P>This section describes how to build and use LAMMPS via a Python <P>This section describes how to build and use LAMMPS via a Python
interface. interface.
</P> </P>
<UL><LI>11.1 <A HREF = "#py_1">Setting necessary environment variables</A> <UL><LI>11.1 <A HREF = "#py_1">Building LAMMPS as a shared library</A>
<LI>11.2 <A HREF = "#py_2">Building LAMMPS as a shared library</A> <LI>11.2 <A HREF = "#py_2">Installing the Python wrapper into Python</A>
<LI>11.3 <A HREF = "#py_3">Extending Python with MPI to run in parallel</A> <LI>11.3 <A HREF = "#py_3">Extending Python with MPI to run in parallel</A>
<LI>11.4 <A HREF = "#py_4">Testing the Python-LAMMPS interface</A> <LI>11.4 <A HREF = "#py_4">Testing the Python-LAMMPS interface</A>
<LI>11.5 <A HREF = "#py_5">Using LAMMPS from Python</A> <LI>11.5 <A HREF = "#py_5">Using LAMMPS from Python</A>
@ -76,109 +76,97 @@ check which version of Python you have installed, by simply typing
<HR> <HR>
<A NAME = "py_1"></A><H4>11.1 Setting necessary environment variables <A NAME = "py_1"></A><H4>11.1 Building LAMMPS as a shared library
</H4>
<P>For Python to use the LAMMPS interface, it needs to find two files.
The paths to these files need to be added to two environment variables
that Python checks.
</P>
<P>The first is the environment variable PYTHONPATH. It needs
to include the directory where the python/lammps.py file is.
</P>
<P>For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
</P>
<PRE>setenv PYTHONPATH $<I>PYTHONPATH</I>:/home/sjplimp/lammps/python
</PRE>
<P>The second is the environment variable LD_LIBRARY_PATH, which is used
by the operating system to find dynamic shared libraries when it loads
them. It needs to include the directory where the shared LAMMPS
library will be. Normally this is the LAMMPS src dir, as explained in
the following section.
</P>
<P>For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
</P>
<PRE>setenv LD_LIBRARY_PATH $<I>LD_LIBRARY_PATH</I>:/home/sjplimp/lammps/src
</PRE>
<P>As discussed below, if your LAMMPS build includes auxiliary libraries,
they must also be available as shared libraries for Python to
successfully load LAMMPS. If they are not in default places where the
operating system can find them, then you also have to add their paths
to the LD_LIBRARY_PATH environment variable.
</P>
<P>For example, if you are using the dummy MPI library provided in
src/STUBS, you need to add something like this to your ~/.cshrc file:
</P>
<PRE>setenv LD_LIBRARY_PATH $<I>LD_LIBRARY_PATH</I>:/home/sjplimp/lammps/src/STUBS
</PRE>
<P>If you are using the LAMMPS USER-ATC package, you need to add
something like this to your ~/.cshrc file:
</P>
<PRE>setenv LD_LIBRARY_PATH $<I>LD_LIBRARY_PATH</I>:/home/sjplimp/lammps/lib/atc
</PRE>
<HR>
<A NAME = "py_2"></A><H4>11.2 Building LAMMPS as a shared library
</H4> </H4>
<P>Instructions on how to build LAMMPS as a shared library are given in <P>Instructions on how to build LAMMPS as a shared library are given in
<A HREF = "Section_start.html#start_5">Section_start 5</A>. A shared library is one <A HREF = "Section_start.html#start_5">Section_start 5</A>. A shared library is one
that is dynamically loadable, which is what Python requires. On Linux that is dynamically loadable, which is what Python requires. On Linux
this is a library file that ends in ".so", not ".a". this is a library file that ends in ".so", not ".a".
</P> </P>
<P>>From the src directory, type <P>From the src directory, type
</P> </P>
<P>make makeshlib <PRE>make makeshlib
make -f Makefile.shlib foo make -f Makefile.shlib foo
</P>
<P>where foo is the machine target name, such as linux or g++ or serial.
This should create the file liblmp_foo.so in the src directory, as
well as a soft link liblmp.so which is what the Python wrapper will
load by default. If you are building multiple machine versions of the
shared library, the soft link is always set to the most recently built
version.
</P>
<P>Note that as discussed in below, a LAMMPS build may depend on several
auxiliary libraries, which are specified in your low-level
src/Makefile.foo file. For example, an MPI library, the FFTW library,
a JPEG library, etc. Depending on what LAMMPS packages you have
installed, the build may also require additional libraries from the
lib directories, such as lib/atc/libatc.so or lib/reax/libreax.so.
</P>
<P>You must insure that each of these libraries exist in shared library
form (*.so file for Linux systems), or either the LAMMPS shared
library build or the Python load of the library will fail. For the
load to be successful all the shared libraries must also be in
directories that the operating system checks. See the discussion in
the preceding section about the LD_LIBRARY_PATH environment variable
for how to insure this.
</P>
<P>Note that some system libraries, such as MPI, if you installed it
yourself, may not be built by default as shared libraries. The build
instructions for the library should tell you how to do this.
</P>
<P>For example, here is how to build and install the <A HREF = "http://www-unix.mcs.anl.gov/mpi">MPICH
library</A>, a popular open-source version of MPI, distributed by
Argonne National Labs, as a shared library in the default
/usr/local/lib location:
</P>
<PRE>./configure --enable-shared
make
make install
</PRE> </PRE>
<P>You may need to use "sudo make install" in place of the last line if <P>where foo is the machine target name, such as linux or g++ or serial.
you do not have write priveleges for /usr/local/lib. The end result This should create the file liblammps_foo.so in the src directory, as
should be the file /usr/local/lib/libmpich.so. well as a soft link liblammps.so, which is what the Python wrapper will
load by default. Note that if you are building multiple machine
versions of the shared library, the soft link is always set to the
most recently built version.
</P> </P>
<P>Note that not all of the auxiliary libraries provided with LAMMPS have <P>If this fails, see <A HREF = "Section_start.html#start_5">Section_start 5</A> for
shared-library Makefiles in their lib directories. Typically this more details, especially if your LAMMPS build uses auxiliary libraries
simply requires a Makefile.foo that adds a -fPIC switch when files are like MPI or FFTW which may not be built as shared libraries on your
compiled and a "-fPIC -shared" switches when the library is linked system.
with a C++ (or Fortran) compiler, as well as an output target that </P>
ends in ".so", like libatc.o. As we or others create and contribute <HR>
these Makefiles, we will add them to the LAMMPS distribution.
<A NAME = "py_2"></A><H4>11.2 Installing the Python wrapper into Python
</H4>
<P>For Python to invoke LAMMPS, there are 2 files it needs to know about:
</P>
<UL><LI>python/lammps.py
<LI>src/liblammps.so
</UL>
<P>Lammps.py is the Python wrapper on the LAMMPS library interface.
Liblammps.so is the shared LAMMPS library that Python loads, as
described above.
</P>
<P>You can insure Python can find these files in one of two ways:
</P>
<UL><LI>set two environment variables
<LI>run the python/install.py script
</UL>
<P>If you set the paths to these files as environment variables, you only
have to do it once. For the csh or tcsh shells, add something like
this to your ~/.cshrc file, one line for each of the two files:
</P>
<PRE>setenv PYTHONPATH $<I>PYTHONPATH</I>:/home/sjplimp/lammps/python
setenv LD_LIBRARY_PATH $<I>LD_LIBRARY_PATH</I>:/home/sjplimp/lammps/src
</PRE>
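<P>As a quick check that these settings took effect, you can ask Python
whether it can locate both files. This is a minimal sketch, not part of
the LAMMPS distribution; it only verifies that the wrapper module and
the shared library can be found:
</P>
<PRE>import ctypes

# lammps.py must be in a directory listed in PYTHONPATH
import lammps
print("found wrapper: %s" % lammps.__file__)

# liblammps.so must be in a directory listed in LD_LIBRARY_PATH
ctypes.CDLL("liblammps.so")
print("loaded liblammps.so")
</PRE>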
<P>If you use the python/install.py script, you need to invoke it every
time you rebuild LAMMPS (as a shared library) or make changes to the
python/lammps.py file.
</P>
<P>You can invoke install.py from the python directory as
</P>
<PRE>% python install.py <B>libdir</B> <B>pydir</B>
</PRE>
<P>The optional libdir is where to copy the LAMMPS shared library to; the
default is /usr/local/lib. The optional pydir is where to copy the
lammps.py file to; the default is the site-packages directory of the
version of Python that is running the install script.
</P>
<P>Note that libdir must be a location that is in your default
LD_LIBRARY_PATH, like /usr/local/lib or /usr/lib. And pydir must be a
location that Python looks in by default for imported modules, like
its site-packages dir. If you want to copy these files to
non-standard locations, such as within your own user space, you will
need to set your PYTHONPATH and LD_LIBRARY_PATH environment variables
accordingly, as above.
</P>
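<P>If you are unsure which site-packages directory a particular Python
interpreter treats as its default (i.e. the default pydir above), you
can query it directly. This is only an illustrative check; install.py
may determine the location in its own way:
</P>
<PRE>from distutils import sysconfig

# default third-party module directory of the running interpreter
print(sysconfig.get_python_lib())
</PRE>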
<P>If the install.py script does not allow you to copy files into system
directories, prefix the python command with "sudo". If you do this,
make sure that the Python that root runs is the same as the Python you
run. E.g. you may need to do something like
</P>
<PRE>% sudo /usr/local/bin/python install.py <B>libdir</B> <B>pydir</B>
</PRE>
<P>You can also invoke install.py from the make command in the src
directory as
</P>
<PRE>% make install-python
</PRE>
<P>In this mode you cannot append optional arguments. Again, you may
need to prefix this with "sudo". In this mode you cannot control
which Python is invoked by root.
</P>
<P>Note that if you want Python to be able to load different versions of
the LAMMPS shared library (see <A HREF = "#py_5">this section</A> below), you will
need to manually copy files like liblammps_g++.so into the appropriate
system directory. This is not needed if you set the LD_LIBRARY_PATH
environment variable as described above.
</P> </P>
<HR> <HR>
@ -197,13 +185,12 @@ as a library and allow MPI functions to be called from Python.
<LI><A HREF = "http://code.google.com/p/maroonmpi/">maroonmpi</A> <LI><A HREF = "http://code.google.com/p/maroonmpi/">maroonmpi</A>
<LI><A HREF = "http://code.google.com/p/mpi4py/">mpi4py</A> <LI><A HREF = "http://code.google.com/p/mpi4py/">mpi4py</A>
<LI><A HREF = "http://nbcr.sdsc.edu/forum/viewtopic.php?t=89&sid=c997fefc3933bd66204875b436940f16">myMPI</A> <LI><A HREF = "http://nbcr.sdsc.edu/forum/viewtopic.php?t=89&sid=c997fefc3933bd66204875b436940f16">myMPI</A>
<LI><A HREF = "http://datamining.anu.edu.au/~ole/pypar">Pypar</A> <LI><A HREF = "http://code.google.com/p/pypar">Pypar</A>
</UL> </UL>
<P>All of these except pyMPI work by wrapping the MPI library (which must <P>All of these except pyMPI work by wrapping the MPI library and
be available on your system as a shared library, as discussed above), exposing (some portion of) its interface to your Python script. This
and exposing (some portion of) its interface to your Python script. means Python cannot be used interactively in parallel, since they do
This means Python cannot be used interactively in parallel, since they not address the issue of interactive input to multiple instances of
do not address the issue of interactive input to multiple instances of
Python running on different processors. The one exception is pyMPI, Python running on different processors. The one exception is pyMPI,
which alters the Python interpreter to address this issue, and (I which alters the Python interpreter to address this issue, and (I
believe) creates a new alternate executable (in place of "python" believe) creates a new alternate executable (in place of "python"
@ -233,17 +220,17 @@ sudo python setup.py install
<P>The "sudo" is only needed if required to copy Numpy files into your <P>The "sudo" is only needed if required to copy Numpy files into your
Python distribution's site-packages directory. Python distribution's site-packages directory.
</P> </P>
<P>To install Pypar (version pypar-2.1.0_66 as of April 2009), unpack it <P>To install Pypar (version pypar-2.1.4_94 as of Aug 2012), unpack it
and from its "source" directory, type and from its "source" directory, type
</P> </P>
<PRE>python setup.py build <PRE>python setup.py build
sudo python setup.py install sudo python setup.py install
</PRE> </PRE>
<P>Again, the "sudo" is only needed if required to copy PyPar files into <P>Again, the "sudo" is only needed if required to copy Pypar files into
your Python distribution's site-packages directory. your Python distribution's site-packages directory.
</P> </P>
<P>If you have successfully installed Pypar, you should be able to run <P>If you have successfully installed Pypar, you should be able to run
python serially and type Python and type
</P> </P>
<PRE>import pypar <PRE>import pypar
</PRE> </PRE>
@ -259,6 +246,19 @@ print "Proc %d out of %d procs" % (pypar.rank(),pypar.size())
</PRE> </PRE>
<P>and see one line of output for each processor you run on. <P>and see one line of output for each processor you run on.
</P> </P>
<P>IMPORTANT NOTE: To use Pypar and LAMMPS in parallel from Python, you
must insure both are using the same version of MPI. If you only have
one MPI installed on your system, this is not an issue, but it can be
if you have multiple MPIs. Your LAMMPS build is explicit about which
MPI it is using, since you specify the details in your low-level
src/MAKE/Makefile.foo file. Pypar uses the "mpicc" command to find
information about the MPI it uses to build against. And it tries to
load "libmpi.so" from the LD_LIBRARY_PATH. This may or may not find
the MPI library that LAMMPS is using. If you have problems running
both Pypar and LAMMPS together, this is an issue you may need to
address, e.g. by moving other MPI installations so that Pypar finds
the right one.
</P>
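<P>One simple way to see whether Pypar and the LAMMPS shared library
agree on an MPI installation is to load both in a single script and run
it under mpirun. This is a hedged sketch, assuming both packages are
already installed; any trivial LAMMPS command will do:
</P>
<PRE>import pypar
from lammps import lammps

lmp = lammps()            # loads liblammps.so, which pulls in LAMMPS's MPI
lmp.command("units lj")   # issue a trivial LAMMPS command
print("Proc %d out of %d procs" % (pypar.rank(),pypar.size()))

lmp.close()
pypar.finalize()
</PRE>
<P>If the two packages were built against different MPI libraries, a
script like this will often fail when the libraries are loaded or when
MPI is initialized, rather than printing one line per processor.
</P>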
<HR> <HR>
<A NAME = "py_4"></A><H4>11.4 Testing the Python-LAMMPS interface <A NAME = "py_4"></A><H4>11.4 Testing the Python-LAMMPS interface
@ -272,24 +272,17 @@ and type:
<P>If you get no errors, you're ready to use LAMMPS from Python. <P>If you get no errors, you're ready to use LAMMPS from Python.
If the load fails, the most common error to see is If the load fails, the most common error to see is
</P> </P>
<P>"CDLL: asdfasdfasdf" <PRE>OSError: Could not load LAMMPS dynamic library
</P> </PRE>
<P>which means Python was unable to load the LAMMPS shared library. This <P>which means Python was unable to load the LAMMPS shared library. This
can occur if it can't find the LAMMPS library; see the environment typically occurs if the system can't find the LAMMPS shared library
variable discussion <A HREF = "#python_1">above</A>. Or if it can't find one of the or one of the auxiliary shared libraries it depends on.
auxiliary libraries that was specified in the LAMMPS build, in a
shared dynamic library format. This includes all libraries needed by
main LAMMPS (e.g. MPI or FFTW or JPEG), system libraries needed by
main LAMMPS (e.g. extra libs needed by MPI), or packages you have
installed that require libraries provided with LAMMPS (e.g. the
USER-ATC package require lib/atc/libatc.so) or system libraries
(e.g. BLAS or Fortran-to-C libraries) listed in the
lib/package/Makefile.lammps file. Again, all of these must be
available as shared libraries, or the Python load will fail.
</P> </P>
<P>Python (actually the operating system) isn't verbose about telling you <P>Python (actually the operating system) isn't verbose about telling you
why the load failed, so go through the steps above and in why the load failed, so carefully go through the steps above regarding
<A HREF = "Section_start.html#start_5">Section_start 5</A> carefully. environment variables, and the instructions in <A HREF = "Section_start.html#start_5">Section_start
5</A> about building a shared library and
about setting the LD_LIBRARY_PATH environment variable.
</P> </P>
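<P>When the plain import gives little detail, loading the library by an
explicit path sometimes produces a more informative message from the
dynamic linker, e.g. naming the auxiliary shared library that cannot be
found. This is only a diagnostic sketch; adjust the path to your own
build:
</P>
<PRE>from ctypes import CDLL

# loading by full path sidesteps the search-path question and reports
# which dependent shared library, if any, is missing
CDLL("/home/sjplimp/lammps/src/liblammps.so")
</PRE>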
<H5><B>Test LAMMPS and Python in serial:</B> <H5><B>Test LAMMPS and Python in serial:</B>
</H5> </H5>
@ -334,10 +327,10 @@ pypar.finalize()
<P>Note that if you leave out the 3 lines from test.py that specify Pypar <P>Note that if you leave out the 3 lines from test.py that specify Pypar
commands you will instantiate and run LAMMPS independently on each of commands you will instantiate and run LAMMPS independently on each of
the P processors specified in the mpirun command. In this case you the P processors specified in the mpirun command. In this case you
should get 4 sets of output, each showing that a run was made on a should get 4 sets of output, each showing that a LAMMPS run was made
single processor, instead of one set of output showing that it ran on on a single processor, instead of one set of output showing that
4 processors. If the 1-processor outputs occur, it means that Pypar LAMMPS ran on 4 processors. If the 1-processor outputs occur, it
is not working correctly. means that Pypar is not working correctly.
</P> </P>
<P>Also note that once you import the PyPar module, Pypar initializes MPI <P>Also note that once you import the PyPar module, Pypar initializes MPI
for you, and you can use MPI calls directly in your Python script, as for you, and you can use MPI calls directly in your Python script, as
@ -345,6 +338,8 @@ described in the Pypar documentation. The last line of your Python
script should be pypar.finalize(), to insure MPI is shut down script should be pypar.finalize(), to insure MPI is shut down
correctly. correctly.
</P> </P>
<H5><B>Running Python scripts:</B>
</H5>
<P>Note that any Python script (not just for LAMMPS) can be invoked in <P>Note that any Python script (not just for LAMMPS) can be invoked in
one of several ways: one of several ways:
</P> </P>
@ -379,25 +374,18 @@ Python on a single processor, not in parallel.
the source code for which is in python/lammps.py, which creates a the source code for which is in python/lammps.py, which creates a
"lammps" object, with a set of methods that can be invoked on that "lammps" object, with a set of methods that can be invoked on that
object. The sample Python code below assumes you have first imported object. The sample Python code below assumes you have first imported
the "lammps" module in your Python script. You can also include its the "lammps" module in your Python script, as follows:
settings as follows, which are useful in test return values from some
of the methods described below:
</P> </P>
<PRE>from lammps import lammps <PRE>from lammps import lammps
from lammps import LMPINT as INT
from lammps import LMPDOUBLE as DOUBLE
from lammps import LMPIPTR as IPTR
from lammps import LMPDPTR as DPTR
from lammps import LMPDPTRPTR as DPTRPTR
</PRE> </PRE>
<P>These are the methods defined by the lammps module. If you look <P>These are the methods defined by the lammps module. If you look
at the file src/library.cpp you will see that they correspond at the file src/library.cpp you will see that they correspond
one-to-one with calls you can make to the LAMMPS library from a C++ or one-to-one with calls you can make to the LAMMPS library from a C++ or
C or Fortran program. C or Fortran program.
</P> </P>
<PRE>lmp = lammps() # create a LAMMPS object using the default liblmp.so library <PRE>lmp = lammps() # create a LAMMPS object using the default liblammps.so library
lmp = lammps("g++") # create a LAMMPS object using the liblmp_g++.so library lmp = lammps("g++") # create a LAMMPS object using the liblammps_g++.so library
lmp = lammps("",list) # ditto, with command-line args, list = ["-echo","screen"] lmp = lammps("",list) # ditto, with command-line args, e.g. list = ["-echo","screen"]
lmp = lammps("g++",list) lmp = lammps("g++",list)
</PRE> </PRE>
<PRE>lmp.close() # destroy a LAMMPS object <PRE>lmp.close() # destroy a LAMMPS object
@ -407,11 +395,15 @@ lmp.command(cmd) # invoke a single LAMMPS command, cmd = "run 100"
</PRE> </PRE>
<PRE>xlo = lmp.extract_global(name,type) # extract a global quantity <PRE>xlo = lmp.extract_global(name,type) # extract a global quantity
# name = "boxxlo", "nlocal", etc # name = "boxxlo", "nlocal", etc
# type = INT or DOUBLE # type = 0 = int
# 1 = double
</PRE> </PRE>
<PRE>coords = lmp.extract_atom(name,type) # extract a per-atom quantity <PRE>coords = lmp.extract_atom(name,type) # extract a per-atom quantity
# name = "x", "type", etc # name = "x", "type", etc
# type = IPTR or DPTR or DPTRPTR # type = 0 = vector of ints
# 1 = array of ints
# 2 = vector of doubles
# 3 = array of doubles
</PRE> </PRE>
<PRE>eng = lmp.extract_compute(id,style,type) # extract value(s) from a compute <PRE>eng = lmp.extract_compute(id,style,type) # extract value(s) from a compute
v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix
@ -431,18 +423,23 @@ v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix
# 1 = atom-style variable # 1 = atom-style variable
</PRE> </PRE>
<PRE>natoms = lmp.get_natoms() # total # of atoms as int <PRE>natoms = lmp.get_natoms() # total # of atoms as int
x = lmp.get_coords() # return coords of all atoms in x data = lmp.gather_atoms(name,type,count) # return atom attribute of all atoms gathered into data, ordered by atom ID
lmp.put_coords(x) # set all atom coords via x # name = "x", "charge", "type", etc
# count = # of per-atom values, 1 or 3, etc
lmp.scatter_atoms(name,type,count,data) # scatter atom attribute of all atoms from data, ordered by atom ID
# name = "x", "charge", "type", etc
# count = # of per-atom values, 1 or 3, etc
</PRE> </PRE>
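<P>Taken together, a minimal driver script might look like the
following. This is only a sketch; "in.demo" stands for whatever LAMMPS
input script you have available:
</P>
<PRE>from lammps import lammps

lmp = lammps()           # uses the default liblammps.so
lmp.file("in.demo")      # run an entire LAMMPS input script
lmp.command("run 100")   # then issue further commands one at a time

natoms = lmp.get_natoms()
print("total atoms: %d" % natoms)

lmp.close()
</PRE>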
<HR> <HR>
<P>IMPORTANT NOTE: Currently, the creation of a LAMMPS object does not <P>IMPORTANT NOTE: Currently, the creation of a LAMMPS object from within
take an MPI communicator as an argument. There should be a way to do lammps.py does not take an MPI communicator as an argument. There
this, so that the LAMMPS instance runs on a subset of processors if should be a way to do this, so that the LAMMPS instance runs on a
desired, but I don't know how to do it from Pypar. So for now, it subset of processors if desired, but I don't know how to do it from
runs on MPI_COMM_WORLD, which is all the processors. If someone Pypar. So for now, it runs with MPI_COMM_WORLD, which is all the
figures out how to do this with one or more of the Python wrappers for processors. If someone figures out how to do this with one or more of
MPI, like Pypar, please let us know and we will amend these doc pages. the Python wrappers for MPI, like Pypar, please let us know and we
will amend these doc pages.
</P> </P>
<P>Note that you can create multiple LAMMPS objects in your Python <P>Note that you can create multiple LAMMPS objects in your Python
script, and coordinate and run multiple simulations, e.g. script, and coordinate and run multiple simulations, e.g.
@ -470,8 +467,8 @@ returned, which you can use via normal Python subscripting. See the
extract() method in the src/atom.cpp file for a list of valid names. extract() method in the src/atom.cpp file for a list of valid names.
Again, new names could easily be added. A pointer to a vector of Again, new names could easily be added. A pointer to a vector of
doubles or integers, or a pointer to an array of doubles (double **) doubles or integers, or a pointer to an array of doubles (double **)
is returned. You need to specify the appropriate data type via the or integers (int **) is returned. You need to specify the appropriate
type argument. data type via the type argument.
</P> </P>
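<P>For instance, using the numeric type codes listed above (a brief
sketch, which assumes a system has already been defined in the lmp
object and that this processor owns at least one atom):
</P>
<PRE>boxxlo = lmp.extract_global("boxxlo",1)   # 1 = double
nlocal = lmp.extract_global("nlocal",0)   # 0 = int

x = lmp.extract_atom("x",3)       # 3 = array of doubles, index as x[i][j]
t = lmp.extract_atom("type",0)    # 0 = vector of ints, index as t[i]

print("first local atom: type %d at x = %g" % (t[0],x[0][0]))
</PRE>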
<P>For extract_compute() and extract_fix(), the global, per-atom, or <P>For extract_compute() and extract_fix(), the global, per-atom, or
local data calculated by the compute or fix can be accessed. What is local data calculated by the compute or fix can be accessed. What is
@ -499,58 +496,57 @@ Python subscripting. The values will be zero for atoms not in the
specified group. specified group.
</P> </P>
<P>The get_natoms() method returns the total number of atoms in the <P>The get_natoms() method returns the total number of atoms in the
simulation, as an int. Note that extract_global("natoms") returns the simulation, as an int.
same value, but as a double, which is the way LAMMPS stores it to
allow for systems with more atoms than can be stored in an int (> 2
billion).
</P> </P>
<P>The get_coords() method returns an ctypes vector of doubles of length <P>The gather_atoms() method returns a ctypes vector of ints or doubles
3*natoms, for the coordinates of all the atoms in the simulation, as specified by type, of length count*natoms, for the property of all
ordered by x,y,z and then by atom ID (see code for put_coords() the atoms in the simulation specified by name, ordered by count and
below). The array can be used via normal Python subscripting. If then by atom ID. The vector can be used via normal Python
atom IDs are not consecutively ordered within LAMMPS, a None is subscripting. If atom IDs are not consecutively ordered within
returned as indication of an error. LAMMPS, a None is returned as indication of an error.
</P> </P>
<P>Note that the data structure get_coords() returns is different from <P>Note that the data structure gather_atoms("x") returns is different
the data structure returned by extract_atom("x") in four ways. (1) from the data structure returned by extract_atom("x") in four ways.
Get_coords() returns a vector which you index as x[i]; (1) Gather_atoms() returns a vector which you index as x[i];
extract_atom() returns an array which you index as x[i][j]. (2) extract_atom() returns an array which you index as x[i][j]. (2)
Get_coords() orders the atoms by atom ID while extract_atom() does Gather_atoms() orders the atoms by atom ID while extract_atom() does
not. (3) Get_coords() returns a list of all atoms in the simulation; not. (3) Gather_atoms() returns a list of all atoms in the
extract_atoms() returns just the atoms local to each processor. (4) simulation; extract_atoms() returns just the atoms local to each
Finally, the get_coords() data structure is a copy of the atom coords processor. (4) Finally, the gather_atoms() data structure is a copy
stored internally in LAMMPS, whereas extract_atom returns an array of the atom coords stored internally in LAMMPS, whereas extract_atom()
that points directly to the internal data. This means you can change returns an array that effectively points directly to the internal
values inside LAMMPS from Python by assigning new values to the data. This means you can change values inside LAMMPS from Python by
extract_atom() array. To do this with the get_atoms() vector, you assigning new values to the extract_atom() array. To do this with
need to change values in the vector, then invoke the put_coords() the gather_atoms() vector, you need to change values in the vector,
method. then invoke the scatter_atoms() method.
</P> </P>
<P>The put_coords() method takes a vector of coordinates for all atoms in <P>The scatter_atoms() method takes a vector of ints or doubles as
the simulation, assumed to be ordered by x,y,z and then by atom ID, specified by type, of length count*natoms, for the property of all the
and uses the values to overwrite the corresponding coordinates for atoms in the simulation specified by name, ordered by count and then
each atom inside LAMMPS. This requires LAMMPS to have its "map" by atom ID. It uses the vector of data to overwrite the corresponding
option enabled; see the <A HREF = "atom_modify.html">atom_modify</A> command for properties for each atom inside LAMMPS. This requires LAMMPS to have
details. If it is not or if atom IDs are not consecutively ordered, its "map" option enabled; see the <A HREF = "atom_modify.html">atom_modify</A>
no coordinates are reset, command for details. If it is not, or if atom IDs are not
consecutively ordered, no coordinates are reset.
</P> </P>
<P>The array of coordinates passed to put_coords() must be a ctypes <P>The array of coordinates passed to scatter_atoms() must be a ctypes
vector of doubles, allocated and initialized something like this: vector of ints or doubles, allocated and initialized something like
this:
</P> </P>
<PRE>from ctypes import * <PRE>from ctypes import *
natoms = lmp.get_atoms() natoms = lmp.get_natoms()
n3 = 3*natoms n3 = 3*natoms
x = (c_double*n3)() x = (n3*c_double)()
x<B>0</B> = x coord of atom with ID 1 x<B>0</B> = x coord of atom with ID 1
x<B>1</B> = y coord of atom with ID 1 x<B>1</B> = y coord of atom with ID 1
x<B>2</B> = z coord of atom with ID 1 x<B>2</B> = z coord of atom with ID 1
x<B>3</B> = x coord of atom with ID 2 x<B>3</B> = x coord of atom with ID 2
... ...
x<B>n3-1</B> = z coord of atom with ID natoms x<B>n3-1</B> = z coord of atom with ID natoms
lmp.put_coords(x) lmp.scatter_atoms("x",1,3,x)
</PRE> </PRE>
<P>Alternatively, you can just change values in the vector returned by <P>Alternatively, you can just change values in the vector returned by
get_coords(), since it is a ctypes vector of doubles. gather_atoms("x",1,3), since it is a ctypes vector of doubles.
</P> </P>
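<P>For example, to shift every atom slightly in x and push the new
coordinates back into LAMMPS, following the calls listed above (a
sketch that assumes consecutively ordered atom IDs and the atom_modify
map option, as discussed above):
</P>
<PRE>x = lmp.gather_atoms("x",1,3)     # type 1 = doubles, 3 values per atom
for i in range(lmp.get_natoms()):
    x[3*i] += 0.1                 # displace the x coordinate of each atom
lmp.scatter_atoms("x",1,3,x)      # write the modified coordinates back
</PRE>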
<HR> <HR>


@ -11,8 +11,8 @@
This section describes how to build and use LAMMPS via a Python This section describes how to build and use LAMMPS via a Python
interface. interface.
11.1 "Setting necessary environment variables"_#py_1 11.1 "Building LAMMPS as a shared library"_#py_1
11.2 "Building LAMMPS as a shared library"_#py_2 11.2 "Installing the Python wrapper into Python"_#py_2
11.3 "Extending Python with MPI to run in parallel"_#py_3 11.3 "Extending Python with MPI to run in parallel"_#py_3
11.4 "Testing the Python-LAMMPS interface"_#py_4 11.4 "Testing the Python-LAMMPS interface"_#py_4
11.5 "Using LAMMPS from Python"_#py_5 11.5 "Using LAMMPS from Python"_#py_5
@ -72,109 +72,97 @@ check which version of Python you have installed, by simply typing
:line :line
:line :line
11.1 Setting necessary environment variables :link(py_1),h4 11.1 Building LAMMPS as a shared library :link(py_1),h4
For Python to use the LAMMPS interface, it needs to find two files.
The paths to these files need to be added to two environment variables
that Python checks.
The first is the environment variable PYTHONPATH. It needs
to include the directory where the python/lammps.py file is.
For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
setenv PYTHONPATH ${PYTHONPATH}:/home/sjplimp/lammps/python :pre
The second is the environment variable LD_LIBRARY_PATH, which is used
by the operating system to find dynamic shared libraries when it loads
them. It needs to include the directory where the shared LAMMPS
library will be. Normally this is the LAMMPS src dir, as explained in
the following section.
For the csh or tcsh shells, you could add something like this to your
~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src :pre
As discussed below, if your LAMMPS build includes auxiliary libraries,
they must also be available as shared libraries for Python to
successfully load LAMMPS. If they are not in default places where the
operating system can find them, then you also have to add their paths
to the LD_LIBRARY_PATH environment variable.
For example, if you are using the dummy MPI library provided in
src/STUBS, you need to add something like this to your ~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src/STUBS :pre
If you are using the LAMMPS USER-ATC package, you need to add
something like this to your ~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/lib/atc :pre
:line
11.2 Building LAMMPS as a shared library :link(py_2),h4
Instructions on how to build LAMMPS as a shared library are given in Instructions on how to build LAMMPS as a shared library are given in
"Section_start 5"_Section_start.html#start_5. A shared library is one "Section_start 5"_Section_start.html#start_5. A shared library is one
that is dynamically loadable, which is what Python requires. On Linux that is dynamically loadable, which is what Python requires. On Linux
this is a library file that ends in ".so", not ".a". this is a library file that ends in ".so", not ".a".
>From the src directory, type From the src directory, type
make makeshlib make makeshlib
make -f Makefile.shlib foo make -f Makefile.shlib foo :pre
where foo is the machine target name, such as linux or g++ or serial. where foo is the machine target name, such as linux or g++ or serial.
This should create the file liblmp_foo.so in the src directory, as This should create the file liblammps_foo.so in the src directory, as
well as a soft link liblmp.so which is what the Python wrapper will well as a soft link liblammps.so, which is what the Python wrapper will
load by default. If you are building multiple machine versions of the load by default. Note that if you are building multiple machine
shared library, the soft link is always set to the most recently built versions of the shared library, the soft link is always set to the
version. most recently built version.
Note that as discussed in below, a LAMMPS build may depend on several If this fails, see "Section_start 5"_Section_start.html#start_5 for
auxiliary libraries, which are specified in your low-level more details, especially if your LAMMPS build uses auxiliary libraries
src/Makefile.foo file. For example, an MPI library, the FFTW library, like MPI or FFTW which may not be built as shared libraries on your
a JPEG library, etc. Depending on what LAMMPS packages you have system.
installed, the build may also require additional libraries from the
lib directories, such as lib/atc/libatc.so or lib/reax/libreax.so.
You must insure that each of these libraries exist in shared library :line
form (*.so file for Linux systems), or either the LAMMPS shared
library build or the Python load of the library will fail. For the
load to be successful all the shared libraries must also be in
directories that the operating system checks. See the discussion in
the preceding section about the LD_LIBRARY_PATH environment variable
for how to insure this.
Note that some system libraries, such as MPI, if you installed it 11.2 Installing the Python wrapper into Python :link(py_2),h4
yourself, may not be built by default as shared libraries. The build
instructions for the library should tell you how to do this.
For example, here is how to build and install the "MPICH For Python to invoke LAMMPS, there are 2 files it needs to know about:
library"_mpich, a popular open-source version of MPI, distributed by
Argonne National Labs, as a shared library in the default
/usr/local/lib location:
:link(mpich,http://www-unix.mcs.anl.gov/mpi) python/lammps.py
src/liblammps.so :ul
./configure --enable-shared Lammps.py is the Python wrapper on the LAMMPS library interface.
make Liblammps.so is the shared LAMMPS library that Python loads, as
make install :pre described above.
You may need to use "sudo make install" in place of the last line if You can insure Python can find these files in one of two ways:
you do not have write priveleges for /usr/local/lib. The end result
should be the file /usr/local/lib/libmpich.so.
Note that not all of the auxiliary libraries provided with LAMMPS have set two environment variables
shared-library Makefiles in their lib directories. Typically this run the python/install.py script :ul
simply requires a Makefile.foo that adds a -fPIC switch when files are
compiled and a "-fPIC -shared" switches when the library is linked If you set the paths to these files as environment variables, you only
with a C++ (or Fortran) compiler, as well as an output target that have to do it once. For the csh or tcsh shells, add something like
ends in ".so", like libatc.o. As we or others create and contribute this to your ~/.cshrc file, one line for each of the two files:
these Makefiles, we will add them to the LAMMPS distribution.
setenv PYTHONPATH ${PYTHONPATH}:/home/sjplimp/lammps/python
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src :pre
If you use the python/install.py script, you need to invoke it every
time you rebuild LAMMPS (as a shared library) or make changes to the
python/lammps.py file.
You can invoke install.py from the python directory as
% python install.py [libdir] [pydir] :pre
The optional libdir is where to copy the LAMMPS shared library to; the
default is /usr/local/lib. The optional pydir is where to copy the
lammps.py file to; the default is the site-packages directory of the
version of Python that is running the install script.
Note that libdir must be a location that is in your default
LD_LIBRARY_PATH, like /usr/local/lib or /usr/lib. And pydir must be a
location that Python looks in by default for imported modules, like
its site-packages dir. If you want to copy these files to
non-standard locations, such as within your own user space, you will
need to set your PYTHONPATH and LD_LIBRARY_PATH environment variables
accordingly, as above.
If the install.py script does not allow you to copy files into system
directories, prefix the python command with "sudo". If you do this,
make sure that the Python that root runs is the same as the Python you
run. E.g. you may need to do something like
% sudo /usr/local/bin/python install.py [libdir] [pydir] :pre
You can also invoke install.py from the make command in the src
directory as
% make install-python :pre
In this mode you cannot append optional arguments. Again, you may
need to prefix this with "sudo". In this mode you cannot control
which Python is invoked by root.
Note that if you want Python to be able to load different versions of
the LAMMPS shared library (see "this section"_#py_5 below), you will
need to manually copy files like liblammps_g++.so into the appropriate
system directory. This is not needed if you set the LD_LIBRARY_PATH
environment variable as described above.
:line :line
@ -193,13 +181,12 @@ These include
"maroonmpi"_http://code.google.com/p/maroonmpi/ "maroonmpi"_http://code.google.com/p/maroonmpi/
"mpi4py"_http://code.google.com/p/mpi4py/ "mpi4py"_http://code.google.com/p/mpi4py/
"myMPI"_http://nbcr.sdsc.edu/forum/viewtopic.php?t=89&sid=c997fefc3933bd66204875b436940f16 "myMPI"_http://nbcr.sdsc.edu/forum/viewtopic.php?t=89&sid=c997fefc3933bd66204875b436940f16
"Pypar"_http://datamining.anu.edu.au/~ole/pypar :ul "Pypar"_http://code.google.com/p/pypar :ul
All of these except pyMPI work by wrapping the MPI library (which must All of these except pyMPI work by wrapping the MPI library and
be available on your system as a shared library, as discussed above), exposing (some portion of) its interface to your Python script. This
and exposing (some portion of) its interface to your Python script. means Python cannot be used interactively in parallel, since they do
This means Python cannot be used interactively in parallel, since they not address the issue of interactive input to multiple instances of
do not address the issue of interactive input to multiple instances of
Python running on different processors. The one exception is pyMPI, Python running on different processors. The one exception is pyMPI,
which alters the Python interpreter to address this issue, and (I which alters the Python interpreter to address this issue, and (I
believe) creates a new alternate executable (in place of "python" believe) creates a new alternate executable (in place of "python"
@ -229,17 +216,17 @@ sudo python setup.py install :pre
The "sudo" is only needed if required to copy Numpy files into your The "sudo" is only needed if required to copy Numpy files into your
Python distribution's site-packages directory. Python distribution's site-packages directory.
To install Pypar (version pypar-2.1.0_66 as of April 2009), unpack it To install Pypar (version pypar-2.1.4_94 as of Aug 2012), unpack it
and from its "source" directory, type and from its "source" directory, type
python setup.py build python setup.py build
sudo python setup.py install :pre sudo python setup.py install :pre
Again, the "sudo" is only needed if required to copy PyPar files into Again, the "sudo" is only needed if required to copy Pypar files into
your Python distribution's site-packages directory. your Python distribution's site-packages directory.
If you have successfully installed Pypar, you should be able to run If you have successfully installed Pypar, you should be able to run
python serially and type Python and type
import pypar :pre import pypar :pre
@ -255,6 +242,19 @@ print "Proc %d out of %d procs" % (pypar.rank(),pypar.size()) :pre
and see one line of output for each processor you run on. and see one line of output for each processor you run on.
IMPORTANT NOTE: To use Pypar and LAMMPS in parallel from Python, you
must insure both are using the same version of MPI. If you only have
one MPI installed on your system, this is not an issue, but it can be
if you have multiple MPIs. Your LAMMPS build is explicit about which
MPI it is using, since you specify the details in your low-level
src/MAKE/Makefile.foo file. Pypar uses the "mpicc" command to find
information about the MPI it uses to build against. And it tries to
load "libmpi.so" from the LD_LIBRARY_PATH. This may or may not find
the MPI library that LAMMPS is using. If you have problems running
both Pypar and LAMMPS together, this is an issue you may need to
address, e.g. by moving other MPI installations so that Pypar finds
the right one.
:line :line
11.4 Testing the Python-LAMMPS interface :link(py_4),h4 11.4 Testing the Python-LAMMPS interface :link(py_4),h4
@ -268,24 +268,17 @@ and type:
If you get no errors, you're ready to use LAMMPS from Python. If you get no errors, you're ready to use LAMMPS from Python.
If the load fails, the most common error to see is If the load fails, the most common error to see is
"CDLL: asdfasdfasdf" OSError: Could not load LAMMPS dynamic library :pre
which means Python was unable to load the LAMMPS shared library. This which means Python was unable to load the LAMMPS shared library. This
can occur if it can't find the LAMMPS library; see the environment typically occurs if the system can't find the LAMMPS shared library
variable discussion "above"_#python_1. Or if it can't find one of the or one of the auxiliary shared libraries it depends on.
auxiliary libraries that was specified in the LAMMPS build, in a
shared dynamic library format. This includes all libraries needed by
main LAMMPS (e.g. MPI or FFTW or JPEG), system libraries needed by
main LAMMPS (e.g. extra libs needed by MPI), or packages you have
installed that require libraries provided with LAMMPS (e.g. the
USER-ATC package require lib/atc/libatc.so) or system libraries
(e.g. BLAS or Fortran-to-C libraries) listed in the
lib/package/Makefile.lammps file. Again, all of these must be
available as shared libraries, or the Python load will fail.
Python (actually the operating system) isn't verbose about telling you Python (actually the operating system) isn't verbose about telling you
why the load failed, so go through the steps above and in why the load failed, so carefully go through the steps above regarding
"Section_start 5"_Section_start.html#start_5 carefully. environment variables, and the instructions in "Section_start
5"_Section_start.html#start_5 about building a shared library and
about setting the LD_LIBRARY_PATH environment variable.
[Test LAMMPS and Python in serial:] :h5 [Test LAMMPS and Python in serial:] :h5
@ -330,10 +323,10 @@ and you should see the same output as if you had typed
Note that if you leave out the 3 lines from test.py that specify Pypar Note that if you leave out the 3 lines from test.py that specify Pypar
commands you will instantiate and run LAMMPS independently on each of commands you will instantiate and run LAMMPS independently on each of
the P processors specified in the mpirun command. In this case you the P processors specified in the mpirun command. In this case you
should get 4 sets of output, each showing that a run was made on a should get 4 sets of output, each showing that a LAMMPS run was made
single processor, instead of one set of output showing that it ran on on a single processor, instead of one set of output showing that
4 processors. If the 1-processor outputs occur, it means that Pypar LAMMPS ran on 4 processors. If the 1-processor outputs occur, it
is not working correctly. means that Pypar is not working correctly.
Also note that once you import the PyPar module, Pypar initializes MPI Also note that once you import the PyPar module, Pypar initializes MPI
for you, and you can use MPI calls directly in your Python script, as for you, and you can use MPI calls directly in your Python script, as
@ -341,6 +334,8 @@ described in the Pypar documentation. The last line of your Python
script should be pypar.finalize(), to insure MPI is shut down script should be pypar.finalize(), to insure MPI is shut down
correctly. correctly.
[Running Python scripts:] :h5
Note that any Python script (not just for LAMMPS) can be invoked in Note that any Python script (not just for LAMMPS) can be invoked in
one of several ways: one of several ways:
@ -374,25 +369,18 @@ The Python interface to LAMMPS consists of a Python "lammps" module,
the source code for which is in python/lammps.py, which creates a the source code for which is in python/lammps.py, which creates a
"lammps" object, with a set of methods that can be invoked on that "lammps" object, with a set of methods that can be invoked on that
object. The sample Python code below assumes you have first imported object. The sample Python code below assumes you have first imported
the "lammps" module in your Python script. You can also include its the "lammps" module in your Python script, as follows:
settings as follows, which are useful in test return values from some
of the methods described below:
from lammps import lammps from lammps import lammps :pre
from lammps import LMPINT as INT
from lammps import LMPDOUBLE as DOUBLE
from lammps import LMPIPTR as IPTR
from lammps import LMPDPTR as DPTR
from lammps import LMPDPTRPTR as DPTRPTR :pre
These are the methods defined by the lammps module. If you look These are the methods defined by the lammps module. If you look
at the file src/library.cpp you will see that they correspond at the file src/library.cpp you will see that they correspond
one-to-one with calls you can make to the LAMMPS library from a C++ or one-to-one with calls you can make to the LAMMPS library from a C++ or
C or Fortran program. C or Fortran program.
lmp = lammps() # create a LAMMPS object using the default liblmp.so library lmp = lammps() # create a LAMMPS object using the default liblammps.so library
lmp = lammps("g++") # create a LAMMPS object using the liblmp_g++.so library lmp = lammps("g++") # create a LAMMPS object using the liblammps_g++.so library
lmp = lammps("",list) # ditto, with command-line args, list = \["-echo","screen"\] lmp = lammps("",list) # ditto, with command-line args, e.g. list = \["-echo","screen"\]
lmp = lammps("g++",list) :pre lmp = lammps("g++",list) :pre
lmp.close() # destroy a LAMMPS object :pre lmp.close() # destroy a LAMMPS object :pre
@ -402,11 +390,15 @@ lmp.command(cmd) # invoke a single LAMMPS command, cmd = "run 100" :pre
xlo = lmp.extract_global(name,type) # extract a global quantity xlo = lmp.extract_global(name,type) # extract a global quantity
# name = "boxxlo", "nlocal", etc # name = "boxxlo", "nlocal", etc
# type = INT or DOUBLE :pre # type = 0 = int
# 1 = double :pre
coords = lmp.extract_atom(name,type) # extract a per-atom quantity coords = lmp.extract_atom(name,type) # extract a per-atom quantity
# name = "x", "type", etc # name = "x", "type", etc
# type = IPTR or DPTR or DPTRPTR :pre # type = 0 = vector of ints
# 1 = array of ints
# 2 = vector of doubles
# 3 = array of doubles :pre
eng = lmp.extract_compute(id,style,type) # extract value(s) from a compute eng = lmp.extract_compute(id,style,type) # extract value(s) from a compute
v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix v3 = lmp.extract_fix(id,style,type,i,j) # extract value(s) from a fix
@ -426,18 +418,23 @@ var = lmp.extract_variable(name,group,flag) # extract value(s) from a variable
# 1 = atom-style variable :pre # 1 = atom-style variable :pre
natoms = lmp.get_natoms() # total # of atoms as int natoms = lmp.get_natoms() # total # of atoms as int
x = lmp.get_coords() # return coords of all atoms in x data = lmp.gather_atoms(name,type,count) # return atom attribute of all atoms gathered into data, ordered by atom ID
lmp.put_coords(x) # set all atom coords via x :pre # name = "x", "charge", "type", etc
# count = # of per-atom values, 1 or 3, etc
lmp.scatter_atoms(name,type,count,data) # scatter atom attribute of all atoms from data, ordered by atom ID
# name = "x", "charge", "type", etc
# count = # of per-atom values, 1 or 3, etc :pre
:line :line
IMPORTANT NOTE: Currently, the creation of a LAMMPS object does not IMPORTANT NOTE: Currently, the creation of a LAMMPS object from within
take an MPI communicator as an argument. There should be a way to do lammps.py does not take an MPI communicator as an argument. There
this, so that the LAMMPS instance runs on a subset of processors if should be a way to do this, so that the LAMMPS instance runs on a
desired, but I don't know how to do it from Pypar. So for now, it subset of processors if desired, but I don't know how to do it from
runs on MPI_COMM_WORLD, which is all the processors. If someone Pypar. So for now, it runs with MPI_COMM_WORLD, which is all the
figures out how to do this with one or more of the Python wrappers for processors. If someone figures out how to do this with one or more of
MPI, like Pypar, please let us know and we will amend these doc pages. the Python wrappers for MPI, like Pypar, please let us know and we
will amend these doc pages.
Note that you can create multiple LAMMPS objects in your Python Note that you can create multiple LAMMPS objects in your Python
script, and coordinate and run multiple simulations, e.g. script, and coordinate and run multiple simulations, e.g.
@ -465,8 +462,8 @@ returned, which you can use via normal Python subscripting. See the
extract() method in the src/atom.cpp file for a list of valid names. extract() method in the src/atom.cpp file for a list of valid names.
Again, new names could easily be added. A pointer to a vector of Again, new names could easily be added. A pointer to a vector of
doubles or integers, or a pointer to an array of doubles (double **) doubles or integers, or a pointer to an array of doubles (double **)
is returned. You need to specify the appropriate data type via the or integers (int **) is returned. You need to specify the appropriate
type argument. data type via the type argument.
For extract_compute() and extract_fix(), the global, per-atom, or For extract_compute() and extract_fix(), the global, per-atom, or
local data calculated by the compute or fix can be accessed. What is local data calculated by the compute or fix can be accessed. What is
@ -494,58 +491,57 @@ Python subscripting. The values will be zero for atoms not in the
specified group. specified group.
The get_natoms() method returns the total number of atoms in the The get_natoms() method returns the total number of atoms in the
simulation, as an int. Note that extract_global("natoms") returns the simulation, as an int.
same value, but as a double, which is the way LAMMPS stores it to
allow for systems with more atoms than can be stored in an int (> 2
billion).
The get_coords() method returns an ctypes vector of doubles of length The gather_atoms() method returns a ctypes vector of ints or doubles
3*natoms, for the coordinates of all the atoms in the simulation, as specified by type, of length count*natoms, for the property of all
ordered by x,y,z and then by atom ID (see code for put_coords() the atoms in the simulation specified by name, ordered by count and
below). The array can be used via normal Python subscripting. If then by atom ID. The vector can be used via normal Python
atom IDs are not consecutively ordered within LAMMPS, a None is subscripting. If atom IDs are not consecutively ordered within
returned as indication of an error. LAMMPS, a None is returned as indication of an error.
Note that the data structure get_coords() returns is different from Note that the data structure gather_atoms("x") returns is different
the data structure returned by extract_atom("x") in four ways. (1) from the data structure returned by extract_atom("x") in four ways.
Get_coords() returns a vector which you index as x\[i\]; (1) Gather_atoms() returns a vector which you index as x\[i\];
extract_atom() returns an array which you index as x\[i\]\[j\]. (2) extract_atom() returns an array which you index as x\[i\]\[j\]. (2)
Get_coords() orders the atoms by atom ID while extract_atom() does Gather_atoms() orders the atoms by atom ID while extract_atom() does
not. (3) Get_coords() returns a list of all atoms in the simulation; not. (3) Gather_atoms() returns a list of all atoms in the
extract_atoms() returns just the atoms local to each processor. (4) simulation; extract_atoms() returns just the atoms local to each
Finally, the get_coords() data structure is a copy of the atom coords processor. (4) Finally, the gather_atoms() data structure is a copy
stored internally in LAMMPS, whereas extract_atom returns an array of the atom coords stored internally in LAMMPS, whereas extract_atom()
that points directly to the internal data. This means you can change returns an array that effectively points directly to the internal
values inside LAMMPS from Python by assigning new values to the data. This means you can change values inside LAMMPS from Python by
extract_atom() array. To do this with the get_atoms() vector, you assigning new values to the extract_atom() array. To do this with
need to change values in the vector, then invoke the put_coords() the gather_atoms() vector, you need to change values in the vector,
method. then invoke the scatter_atoms() method.
The put_coords() method takes a vector of coordinates for all atoms in The scatter_atoms() method takes a vector of ints or doubles as
the simulation, assumed to be ordered by x,y,z and then by atom ID, specified by type, of length count*natoms, for the property of all the
and uses the values to overwrite the corresponding coordinates for atoms in the simulation specified by name, ordered by count and then
each atom inside LAMMPS. This requires LAMMPS to have its "map" by atom ID. It uses the vector of data to overwrite the corresponding
option enabled; see the "atom_modify"_atom_modify.html command for properties for each atom inside LAMMPS. This requires LAMMPS to have
details. If it is not or if atom IDs are not consecutively ordered, its "map" option enabled; see the "atom_modify"_atom_modify.html
no coordinates are reset, command for details. If it is not, or if atom IDs are not
consecutively ordered, no coordinates are reset.
The array of coordinates passed to put_coords() must be a ctypes The array of coordinates passed to scatter_atoms() must be a ctypes
vector of doubles, allocated and initialized something like this: vector of ints or doubles, allocated and initialized something like
this:
from ctypes import * from ctypes import *
natoms = lmp.get_atoms() natoms = lmp.get_natoms()
n3 = 3*natoms n3 = 3*natoms
x = (c_double*n3)() x = (n3*c_double)()
x[0] = x coord of atom with ID 1 x[0] = x coord of atom with ID 1
x[1] = y coord of atom with ID 1 x[1] = y coord of atom with ID 1
x[2] = z coord of atom with ID 1 x[2] = z coord of atom with ID 1
x[3] = x coord of atom with ID 2 x[3] = x coord of atom with ID 2
... ...
x[n3-1] = z coord of atom with ID natoms x[n3-1] = z coord of atom with ID natoms
lmp.put_coords(x) :pre lmp.scatter_atoms("x",1,3,x) :pre
Alternatively, you can just change values in the vector returned by Alternatively, you can just change values in the vector returned by
get_coords(), since it is a ctypes vector of doubles. gather_atoms("x",1,3), since it is a ctypes vector of doubles.
:line :line


@ -281,10 +281,11 @@ dummy MPI library provided in src/STUBS, since you don't need a true
MPI library installed on your system. See the MPI library installed on your system. See the
src/MAKE/Makefile.serial file for how to specify the 3 MPI variables src/MAKE/Makefile.serial file for how to specify the 3 MPI variables
in this case. You will also need to build the STUBS library for your in this case. You will also need to build the STUBS library for your
platform before making LAMMPS itself. From the src directory, type platform before making LAMMPS itself. To build from the src
"make stubs", or from the STUBS dir, type "make" and it should create directory, type "make stubs", or from the STUBS dir, type "make".
a libmpi.a suitable for linking to LAMMPS. If this build fails, you This should create a libmpi_stubs.a file suitable for linking to
will need to edit the STUBS/Makefile for your platform. LAMMPS. If the build fails, you will need to edit the STUBS/Makefile
for your platform.
</P> </P>
<P>The file STUBS/mpi.cpp provides a CPU timer function called <P>The file STUBS/mpi.cpp provides a CPU timer function called
MPI_Wtime() that calls gettimeofday() . If your system doesn't MPI_Wtime() that calls gettimeofday() . If your system doesn't
@ -779,24 +780,28 @@ then be called from another application or a scripting language. See
LAMMPS to other codes. See <A HREF = "Section_python.html">this section</A> for LAMMPS to other codes. See <A HREF = "Section_python.html">this section</A> for
more info on wrapping and running LAMMPS from Python. more info on wrapping and running LAMMPS from Python.
</P> </P>
<H5><B>Static library:</B>
</H5>
<P>To build LAMMPS as a static library (*.a file on Linux), type <P>To build LAMMPS as a static library (*.a file on Linux), type
</P> </P>
<PRE>make makelib <PRE>make makelib
make -f Makefile.lib foo make -f Makefile.lib foo
</PRE> </PRE>
<P>where foo is the machine name. This kind of library is typically used <P>where foo is the machine name. This kind of library is typically used
to statically link a driver application to all of LAMMPS, so that you to statically link a driver application to LAMMPS, so that you can
can insure all dependencies are satisfied at compile time. Note that insure all dependencies are satisfied at compile time. Note that
inclusion or exclusion of any desired optional packages should be done inclusion or exclusion of any desired optional packages should be done
before typing "make makelib". The first "make" command will create a before typing "make makelib". The first "make" command will create a
current Makefile.lib with all the file names in your src dir. The 2nd current Makefile.lib with all the file names in your src dir. The
"make" command will use it to build LAMMPS as a static library, using second "make" command will use it to build LAMMPS as a static library,
the ARCHIVE and ARFLAGS settings in src/MAKE/Makefile.foo. The build using the ARCHIVE and ARFLAGS settings in src/MAKE/Makefile.foo. The
will create the file liblmp_foo.a which another application can link build will create the file liblammps_foo.a which another application can
to. link to.
</P> </P>
<H5><B>Shared library:</B>
</H5>
<P>To build LAMMPS as a shared library (*.so file on Linux), which can be <P>To build LAMMPS as a shared library (*.so file on Linux), which can be
dynamically loaded, type dynamically loaded, e.g. from Python, type
</P> </P>
<PRE>make makeshlib <PRE>make makeshlib
make -f Makefile.shlib foo make -f Makefile.shlib foo
@ -806,31 +811,58 @@ wrapping LAMMPS with Python; see <A HREF = "Section_python.html">Section_python<
for details. Again, note that inclusion or exclusion of any desired for details. Again, note that inclusion or exclusion of any desired
optional packages should be done before typing "make makeshlib". The optional packages should be done before typing "make makeshlib". The
first "make" command will create a current Makefile.shlib with all the first "make" command will create a current Makefile.shlib with all the
file names in your src dir. The 2nd "make" command will use it to file names in your src dir. The second "make" command will use it to
build LAMMPS as a shared library, using the SHFLAGS and SHLIBFLAGS build LAMMPS as a shared library, using the SHFLAGS and SHLIBFLAGS
settings in src/MAKE/Makefile.foo. The build will create the file settings in src/MAKE/Makefile.foo. The build will create the file
liblmp_foo.so which another application can link to dynamically, as liblammps_foo.so which another application can link to dynamically. It
well as a soft link liblmp.so, which the Python wrapper uses by will also create a soft link liblammps.so, which the Python wrapper uses
default. by default.
</P> </P>
<P>Note that for a shared library to be usable by a calling program, all <P>Note that for a shared library to be usable by a calling program, all
the auxiliary libraries it depends on must also exist as shared the auxiliary libraries it depends on must also exist as shared
libraries, and be find-able by the operating system. Else you will libraries. This will be the case for libraries included with LAMMPS,
get a run-time error when the shared library is loaded. For LAMMPS, such as the dummy MPI library in src/STUBS or any package libraries in
this includes all libraries needed by main LAMMPS (e.g. MPI or FFTW or lib/packages, since they are always built as shared libraries with the
JPEG), system libraries needed by main LAMMPS (e.g. extra libs needed -fPIC switch. However, if a library like MPI or FFTW does not exist
by MPI), or packages you have installed that require libraries as a shared library, the second make command will generate an error.
provided with LAMMPS (e.g. the USER-ATC package require This means you will need to install a shared library version of the
lib/atc/libatc.so) or system libraries (e.g. BLAS or Fortran-to-C package. The build instructions for the library should tell you how
libraries) listed in the lib/package/Makefile.lammps file. See the to do this.
discussion about the LAMMPS shared library in
<A HREF = "Section_python.html">Section_python</A> for details about how to build
shared versions of these libraries, and how to insure the operating
system can find them, by setting the LD_LIBRARY_PATH environment
variable correctly.
</P> </P>
<P>Either flavor of library allows one or more LAMMPS objects to be <P>As an example, here is how to build and install the <A HREF = "http://www-unix.mcs.anl.gov/mpi">MPICH
instantiated from the calling program. library</A>, a popular open-source version of MPI, distributed by
Argonne National Labs, as a shared library in the default
/usr/local/lib location:
</P>
<PRE>./configure --enable-shared
make
make install
</PRE>
<P>You may need to use "sudo make install" in place of the last line if
you do not have write privileges for /usr/local/lib. The end result
should be the file /usr/local/lib/libmpich.so.
</P>
<H5><B>Additional requirement for using a shared library:</B>
</H5>
<P>The operating system finds shared libraries to load at run-time using
the environment variable LD_LIBRARY_PATH. So you may wish to copy the
file src/liblammps.so or src/liblammps_g++.so (for example) to a place
the system can find it by default, such as /usr/local/lib, or you may
wish to add the lammps src directory to LD_LIBRARY_PATH, so that the
current version of the shared library is always available to programs
that use it.
</P>
<P>For the csh or tcsh shells, you would add something like this to your
~/.cshrc file:
</P>
<PRE>setenv LD_LIBRARY_PATH $<I>LD_LIBRARY_PATH</I>:/home/sjplimp/lammps/src
</PRE>
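<P>A quick way to verify that the operating system can actually locate
the shared library, before trying a full calling program, is to attempt
to load it directly from Python with ctypes. This is only a sketch, not
part of LAMMPS; adjust the library name to whichever file or soft link
you built:
</P>
<PRE>from ctypes import CDLL
try:
    CDLL("liblammps_g++.so")   # or "liblammps.so" via the soft link
    print("found the LAMMPS shared library")
except OSError:
    print("not found: check LD_LIBRARY_PATH")
</PRE>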
<H5><B>Calling the LAMMPS library:</B>
</H5>
<P>Either flavor of library (static or shared) allows one or more LAMMPS
objects to be instantiated from the calling program.
</P> </P>
<P>When used from a C++ program, all of LAMMPS is wrapped in a LAMMPS_NS <P>When used from a C++ program, all of LAMMPS is wrapped in a LAMMPS_NS
namespace; you can safely use any of its classes and methods from namespace; you can safely use any of its classes and methods from
@ -841,17 +873,17 @@ Python, the library has a simple function-style interface, provided in
src/library.cpp and src/library.h. src/library.cpp and src/library.h.
</P> </P>
<P>See the sample codes in examples/COUPLE/simple for examples of C++ and <P>See the sample codes in examples/COUPLE/simple for examples of C++ and
C codes that invoke LAMMPS thru its library interface. There are C and Fortran codes that invoke LAMMPS thru its library interface.
other examples as well in the COUPLE directory which are discussed in There are other examples as well in the COUPLE directory which are
<A HREF = "Section_howto.html#howto_10">Section_howto 10</A> of the manual. See discussed in <A HREF = "Section_howto.html#howto_10">Section_howto 10</A> of the
<A HREF = "Section_python.html">Section_python</A> of the manual for a description manual. See <A HREF = "Section_python.html">Section_python</A> of the manual for a
of the Python wrapper provided with LAMMPS that operates through the description of the Python wrapper provided with LAMMPS that operates
LAMMPS library interface. through the LAMMPS library interface.
</P> </P>
<P>The files src/library.cpp and library.h contain the C-style interface <P>The files src/library.cpp and library.h define the C-style API for
to LAMMPS. See <A HREF = "Section_howto.html#howto_19">Section_howto 19</A> of the using LAMMPS as a library. See <A HREF = "Section_howto.html#howto_19">Section_howto
manual for a description of the interface and how to extend it for 19</A> of the manual for a description of the
your needs. interface and how to extend it for your needs.
</P> </P>
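<P>To illustrate how thin the C-style interface is, the functions in
src/library.h can also be called directly through Python's ctypes
module, without the provided wrapper. The following is only a sketch:
it assumes the shared library liblammps_g++.so is loadable and that an
input script named in.simple exists.
</P>
<PRE>from ctypes import CDLL, c_void_p, byref
lib = CDLL("liblammps_g++.so")
ptr = c_void_p()                            # opaque handle to a LAMMPS instance
lib.lammps_open_no_mpi(0,None,byref(ptr))   # instance runs on MPI_COMM_WORLD
lib.lammps_file(ptr,"in.simple")
lib.lammps_command(ptr,"run 10")
print(lib.lammps_get_natoms(ptr))
lib.lammps_close(ptr)
</PRE>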
<HR> <HR>


@ -275,10 +275,11 @@ dummy MPI library provided in src/STUBS, since you don't need a true
MPI library installed on your system. See the MPI library installed on your system. See the
src/MAKE/Makefile.serial file for how to specify the 3 MPI variables src/MAKE/Makefile.serial file for how to specify the 3 MPI variables
in this case. You will also need to build the STUBS library for your in this case. You will also need to build the STUBS library for your
platform before making LAMMPS itself. From the src directory, type platform before making LAMMPS itself. To build from the src
"make stubs", or from the STUBS dir, type "make" and it should create directory, type "make stubs", or from the STUBS dir, type "make".
a libmpi.a suitable for linking to LAMMPS. If this build fails, you This should create a libmpi_stubs.a file suitable for linking to
will need to edit the STUBS/Makefile for your platform. LAMMPS. If the build fails, you will need to edit the STUBS/Makefile
for your platform.
The file STUBS/mpi.c provides a CPU timer function called The file STUBS/mpi.c provides a CPU timer function called
MPI_Wtime() that calls gettimeofday() . If your system doesn't MPI_Wtime() that calls gettimeofday() . If your system doesn't
@ -773,24 +774,28 @@ then be called from another application or a scripting language. See
LAMMPS to other codes. See "this section"_Section_python.html for LAMMPS to other codes. See "this section"_Section_python.html for
more info on wrapping and running LAMMPS from Python. more info on wrapping and running LAMMPS from Python.
[Static library:] :h5
To build LAMMPS as a static library (*.a file on Linux), type To build LAMMPS as a static library (*.a file on Linux), type
make makelib make makelib
make -f Makefile.lib foo :pre make -f Makefile.lib foo :pre
where foo is the machine name. This kind of library is typically used where foo is the machine name. This kind of library is typically used
to statically link a driver application to all of LAMMPS, so that you to statically link a driver application to LAMMPS, so that you can
can insure all dependencies are satisfied at compile time. Note that insure all dependencies are satisfied at compile time. Note that
inclusion or exclusion of any desired optional packages should be done inclusion or exclusion of any desired optional packages should be done
before typing "make makelib". The first "make" command will create a before typing "make makelib". The first "make" command will create a
current Makefile.lib with all the file names in your src dir. The 2nd current Makefile.lib with all the file names in your src dir. The
"make" command will use it to build LAMMPS as a static library, using second "make" command will use it to build LAMMPS as a static library,
the ARCHIVE and ARFLAGS settings in src/MAKE/Makefile.foo. The build using the ARCHIVE and ARFLAGS settings in src/MAKE/Makefile.foo. The
will create the file liblmp_foo.a which another application can link build will create the file liblammps_foo.a which another application can
to. link to.
[Shared library:] :h5
To build LAMMPS as a shared library (*.so file on Linux), which can be To build LAMMPS as a shared library (*.so file on Linux), which can be
dynamically loaded, type dynamically loaded, e.g. from Python, type
make makeshlib make makeshlib
make -f Makefile.shlib foo :pre make -f Makefile.shlib foo :pre
@ -800,31 +805,58 @@ wrapping LAMMPS with Python; see "Section_python"_Section_python.html
for details. Again, note that inclusion or exclusion of any desired for details. Again, note that inclusion or exclusion of any desired
optional packages should be done before typing "make makeshlib". The optional packages should be done before typing "make makeshlib". The
first "make" command will create a current Makefile.shlib with all the first "make" command will create a current Makefile.shlib with all the
file names in your src dir. The 2nd "make" command will use it to file names in your src dir. The second "make" command will use it to
build LAMMPS as a shared library, using the SHFLAGS and SHLIBFLAGS build LAMMPS as a shared library, using the SHFLAGS and SHLIBFLAGS
settings in src/MAKE/Makefile.foo. The build will create the file settings in src/MAKE/Makefile.foo. The build will create the file
liblmp_foo.so which another application can link to dynamically, as liblammps_foo.so which another application can link to dynamically. It
well as a soft link liblmp.so, which the Python wrapper uses by will also create a soft link liblammps.so, which the Python wrapper uses
default. by default.
Note that for a shared library to be usable by a calling program, all Note that for a shared library to be usable by a calling program, all
the auxiliary libraries it depends on must also exist as shared the auxiliary libraries it depends on must also exist as shared
libraries, and be find-able by the operating system. Else you will libraries. This will be the case for libraries included with LAMMPS,
get a run-time error when the shared library is loaded. For LAMMPS, such as the dummy MPI library in src/STUBS or any package libraries in
this includes all libraries needed by main LAMMPS (e.g. MPI or FFTW or lib/packages, since they are always built as shared libraries with the
JPEG), system libraries needed by main LAMMPS (e.g. extra libs needed -fPIC switch. However, if a library like MPI or FFTW does not exist
by MPI), or packages you have installed that require libraries as a shared library, the second make command will generate an error.
provided with LAMMPS (e.g. the USER-ATC package require This means you will need to install a shared library version of the
lib/atc/libatc.so) or system libraries (e.g. BLAS or Fortran-to-C package. The build instructions for the library should tell you how
libraries) listed in the lib/package/Makefile.lammps file. See the to do this.
discussion about the LAMMPS shared library in
"Section_python"_Section_python.html for details about how to build
shared versions of these libraries, and how to insure the operating
system can find them, by setting the LD_LIBRARY_PATH environment
variable correctly.
Either flavor of library allows one or more LAMMPS objects to be As an example, here is how to build and install the "MPICH
instantiated from the calling program. library"_mpich, a popular open-source version of MPI, distributed by
Argonne National Labs, as a shared library in the default
/usr/local/lib location:
:link(mpich,http://www-unix.mcs.anl.gov/mpi)
./configure --enable-shared
make
make install :pre
You may need to use "sudo make install" in place of the last line if
you do not have write privileges for /usr/local/lib. The end result
should be the file /usr/local/lib/libmpich.so.
[Additional requirement for using a shared library:] :h5
The operating system finds shared libraries to load at run-time using
the environment variable LD_LIBRARY_PATH. So you may wish to copy the
file src/liblammps.so or src/liblammps_g++.so (for example) to a place
the system can find it by default, such as /usr/local/lib, or you may
wish to add the lammps src directory to LD_LIBRARY_PATH, so that the
current version of the shared library is always available to programs
that use it.
For the csh or tcsh shells, you would add something like this to your
~/.cshrc file:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src :pre
[Calling the LAMMPS library:] :h5
Either flavor of library (static or shared) allows one or more LAMMPS
objects to be instantiated from the calling program.
When used from a C++ program, all of LAMMPS is wrapped in a LAMMPS_NS When used from a C++ program, all of LAMMPS is wrapped in a LAMMPS_NS
namespace; you can safely use any of its classes and methods from namespace; you can safely use any of its classes and methods from
@ -835,17 +867,17 @@ Python, the library has a simple function-style interface, provided in
src/library.cpp and src/library.h. src/library.cpp and src/library.h.
See the sample codes in examples/COUPLE/simple for examples of C++ and See the sample codes in examples/COUPLE/simple for examples of C++ and
C codes that invoke LAMMPS thru its library interface. There are C and Fortran codes that invoke LAMMPS thru its library interface.
other examples as well in the COUPLE directory which are discussed in There are other examples as well in the COUPLE directory which are
"Section_howto 10"_Section_howto.html#howto_10 of the manual. See discussed in "Section_howto 10"_Section_howto.html#howto_10 of the
"Section_python"_Section_python.html of the manual for a description manual. See "Section_python"_Section_python.html of the manual for a
of the Python wrapper provided with LAMMPS that operates through the description of the Python wrapper provided with LAMMPS that operates
LAMMPS library interface. through the LAMMPS library interface.
The files src/library.cpp and library.h contain the C-style interface The files src/library.cpp and library.h define the C-style API for
to LAMMPS. See "Section_howto 19"_Section_howto.html#howto_19 of the using LAMMPS as a library. See "Section_howto
manual for a description of the interface and how to extend it for 19"_Section_howto.html#howto_19 of the manual for a description of the
your needs. interface and how to extend it for your needs.
:line :line

View File

@ -327,17 +327,19 @@ direction for xy deformation) from the unstrained orientation.
</P> </P>
<P>The tilt factor T as a function of time will change as <P>The tilt factor T as a function of time will change as
</P> </P>
<PRE>T(t) = T0 + erate*dt <PRE>T(t) = T0 + L0*erate*dt
</PRE> </PRE>
<P>where T0 is the initial tilt factor and dt is the elapsed time (in <P>where T0 is the initial tilt factor, L0 is the original length of the
time units). Thus if <I>erate</I> R is specified as 0.1 and time units are box perpendicular to the shear direction (e.g. y box length for xy
picoseconds, this means the shear strain will increase by 0.1 every deformation), and dt is the elapsed time (in time units). Thus if
picosecond. I.e. if the xy shear strain was initially 0.0, then <I>erate</I> R is specified as 0.1 and time units are picoseconds, this
strain after 1 psec = 0.1, strain after 2 psec = 0.2, etc. Thus the means the shear strain will increase by 0.1 every picosecond. I.e. if
tilt factor would be 0.0 at time 0, 0.1*ybox at 1 psec, 0.2*ybox at 2 the xy shear strain was initially 0.0, then strain after 1 psec = 0.1,
psec, etc, where ybox is the original y box length. R = 1 or 2 means strain after 2 psec = 0.2, etc. Thus the tilt factor would be 0.0 at
the tilt factor will increase by 1 or 2 every picosecond. R = -0.01 time 0, 0.1*ybox at 1 psec, 0.2*ybox at 2 psec, etc, where ybox is the
means a decrease in shear strain by 0.01 every picosecond. original y box length. R = 1 or 2 means the tilt factor will increase
by 1 or 2 every picosecond. R = -0.01 means a decrease in shear
strain by 0.01 every picosecond.
</P> </P>
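<P>As a quick check of the arithmetic above, the formula can be
evaluated directly. This small Python sketch (not LAMMPS input) assumes
T0 = 0.0, an original y box length of 40 distance units, and erate =
0.1 per picosecond:
</P>
<PRE>T0 = 0.0      # initial xy tilt factor
L0 = 40.0     # assumed original y box length (ybox)
erate = 0.1   # engineering shear strain rate per picosecond
for dt in (0.0, 1.0, 2.0):   # elapsed time in picoseconds
    print("t = %g ps  strain = %g  tilt = %g" % (dt, erate*dt, T0 + L0*erate*dt))
# tilt = 0.0 at t = 0, 0.1*ybox = 4.0 at 1 psec, 0.2*ybox = 8.0 at 2 psec
</PRE>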
<P>The <I>trate</I> style changes a tilt factor at a "constant true shear <P>The <I>trate</I> style changes a tilt factor at a "constant true shear
strain rate". Note that this is not an "engineering shear strain strain rate". Note that this is not an "engineering shear strain


@ -317,17 +317,19 @@ direction for xy deformation) from the unstrained orientation.
The tilt factor T as a function of time will change as The tilt factor T as a function of time will change as
T(t) = T0 + erate*dt :pre T(t) = T0 + L0*erate*dt :pre
where T0 is the initial tilt factor and dt is the elapsed time (in where T0 is the initial tilt factor, L0 is the original length of the
time units). Thus if {erate} R is specified as 0.1 and time units are box perpendicular to the shear direction (e.g. y box length for xy
picoseconds, this means the shear strain will increase by 0.1 every deformation), and dt is the elapsed time (in time units). Thus if
picosecond. I.e. if the xy shear strain was initially 0.0, then {erate} R is specified as 0.1 and time units are picoseconds, this
strain after 1 psec = 0.1, strain after 2 psec = 0.2, etc. Thus the means the shear strain will increase by 0.1 every picosecond. I.e. if
tilt factor would be 0.0 at time 0, 0.1*ybox at 1 psec, 0.2*ybox at 2 the xy shear strain was initially 0.0, then strain after 1 psec = 0.1,
psec, etc, where ybox is the original y box length. R = 1 or 2 means strain after 2 psec = 0.2, etc. Thus the tilt factor would be 0.0 at
the tilt factor will increase by 1 or 2 every picosecond. R = -0.01 time 0, 0.1*ybox at 1 psec, 0.2*ybox at 2 psec, etc, where ybox is the
means a decrease in shear strain by 0.01 every picosecond. original y box length. R = 1 or 2 means the tilt factor will increase
by 1 or 2 every picosecond. R = -0.01 means a decrease in shear
strain by 0.01 every picosecond.
The {trate} style changes a tilt factor at a "constant true shear The {trate} style changes a tilt factor at a "constant true shear
strain rate". Note that this is not an "engineering shear strain strain rate". Note that this is not an "engineering shear strain


@ -58,6 +58,11 @@ results from a unitless LJ simulation into physical quantities.
<LI>electric field = force/charge, where E* = E (4 pi perm0 sigma epsilon)^1/2 sigma / epsilon <LI>electric field = force/charge, where E* = E (4 pi perm0 sigma epsilon)^1/2 sigma / epsilon
<LI>density = mass/volume, where rho* = rho sigma^dim <LI>density = mass/volume, where rho* = rho sigma^dim
</UL> </UL>
<P>Note that for LJ units, the default mode of thermodynamic output via
the <A HREF = "thermo_style.html">thermo_style</A> command is to normalize energies
by the number of atoms, i.e. energy/atom. This can be changed via the
<A HREF = "thermo_modify.html">thermo_modify norm</A> command.
</P>
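<P>For example, if you drive LAMMPS through the Python wrapper described
in <A HREF = "Section_python.html">Section_python</A>, switching from the
normalized default to extensive output takes one extra command (a sketch;
lmp is an already-created LAMMPS instance):
</P>
<PRE>lmp.command("units lj")
lmp.command("thermo_modify norm no")   # total energies instead of energy/atom
</PRE>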
<P>For style <I>real</I>, these are the units: <P>For style <I>real</I>, these are the units:
</P> </P>
<UL><LI>mass = grams/mole <UL><LI>mass = grams/mole


@ -55,6 +55,11 @@ dipole = reduced LJ dipole, moment where *mu = mu / (4 pi perm0 sigma^3 epsilon)
electric field = force/charge, where E* = E (4 pi perm0 sigma epsilon)^1/2 sigma / epsilon electric field = force/charge, where E* = E (4 pi perm0 sigma epsilon)^1/2 sigma / epsilon
density = mass/volume, where rho* = rho sigma^dim :ul density = mass/volume, where rho* = rho sigma^dim :ul
Note that for LJ units, the default mode of thermodynamic output via
the "thermo_style"_thermo_style.html command is to normalize energies
by the number of atoms, i.e. energy/atom. This can be changed via the
"thermo_modify norm"_thermo_modify.html command.
For style {real}, these are the units: For style {real}, these are the units:
mass = grams/mole mass = grams/mole


@ -17,7 +17,7 @@ library. Basically, you type something like
make makelib make makelib
make -f Makefile.lib g++ make -f Makefile.lib g++
in the LAMMPS src directory to create liblmp_g++.a in the LAMMPS src directory to create liblammps_g++.a
The library interface to LAMMPS is in src/library.cpp. Routines can The library interface to LAMMPS is in src/library.cpp. Routines can
be easily added to this file so an external program can perform the be easily added to this file so an external program can perform the
@ -34,5 +34,7 @@ library collection of useful inter-code communication routines
simple simple example of driver code calling LAMMPS as library simple simple example of driver code calling LAMMPS as library
fortran a wrapper on the LAMMPS library API that fortran a wrapper on the LAMMPS library API that
can be called from Fortran can be called from Fortran
fortran2 a more sophisticated wrapper on the LAMMPS library API that
can be called from Fortran
Each sub-directory has its own README. Each sub-directory has its own README.


@ -1,9 +1,8 @@
libfwrapper.c is a C file that wraps the LAMMPS library API libfwrapper.c is a C file that wraps the LAMMPS library API
in src/library.h so that it can be called from Fortran. in src/library.h so that it can be called from Fortran.
See the couple/simple/simple.f90 program for an example See the couple/simple/simple.f90 program for an example of a Fortran
of a Fortran code that does this. code that does this.
See the README file in that dir for instructions See the README file in that dir for instructions on how to build a
on how to build a Fortran code that uses this Fortran code that uses this wrapper and links to the LAMMPS library.
wrapper and links to the LAMMPS library.


@ -22,7 +22,7 @@
#include "library.h" /* this is a LAMMPS include file */ #include "library.h" /* this is a LAMMPS include file */
/* wrapper for creating a lammps instance from fortran. /* wrapper for creating a lammps instance from fortran.
since fortran has no simple way to emit a c-compatible since fortran has no simple way to emit a C-compatible
argument array, we don't support it. for simplicity, argument array, we don't support it. for simplicity,
the address of the pointer to the lammps object is the address of the pointer to the lammps object is
stored in a 64-bit integer on all platforms. */ stored in a 64-bit integer on all platforms. */
@ -109,6 +109,8 @@ void lammps_get_natoms_(int64_t *ptr, MPI_Fint *natoms)
/* wrapper to copy coordinates from lammps to fortran */ /* wrapper to copy coordinates from lammps to fortran */
/* NOTE: this is now out-of-date, needs to be updated to lammps_gather_atoms()
void lammps_get_coords_(int64_t *ptr, double *coords) void lammps_get_coords_(int64_t *ptr, double *coords)
{ {
void *obj; void *obj;
@ -117,8 +119,12 @@ void lammps_get_coords_(int64_t *ptr, double *coords)
lammps_get_coords(obj,coords); lammps_get_coords(obj,coords);
} }
*/
/* wrapper to copy coordinates from fortran to lammps */ /* wrapper to copy coordinates from fortran to lammps */
/* NOTE: this is now out-of-date, needs to be updated to lammps_scatter_atoms()
void lammps_put_coords_(int64_t *ptr, double *coords) void lammps_put_coords_(int64_t *ptr, double *coords)
{ {
void *obj; void *obj;
@ -127,3 +133,4 @@ void lammps_put_coords_(int64_t *ptr, double *coords)
lammps_put_coords(obj,coords); lammps_put_coords(obj,coords);
} }
*/


@ -0,0 +1,235 @@
/* -----------------------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
www.cs.sandia.gov/~sjplimp/lammps.html
Steve Plimpton, sjplimp@sandia.gov, Sandia National Laboratories
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
/* ------------------------------------------------------------------------
Contributing author: Karl D. Hammond <karlh@ugcs.caltech.edu>
University of Tennessee, Knoxville (USA), 2012
------------------------------------------------------------------------- */
/* This is set of "wrapper" functions to assist LAMMPS.F90, which itself
provides a (I hope) robust Fortran interface to library.cpp and
library.h. All functions herein COULD be added to library.cpp instead of
including this as a separate file. See the README for instructions. */
#include <mpi.h>
#include "LAMMPS-wrapper.h"
#include <library.h>
#include <lammps.h>
#include <atom.h>
#include <fix.h>
#include <compute.h>
#include <modify.h>
#include <error.h>
using namespace LAMMPS_NS;
void lammps_open_fortran_wrapper (int argc, char **argv,
MPI_Fint communicator, void **ptr)
{
MPI_Comm C_communicator = MPI_Comm_f2c (communicator);
lammps_open (argc, argv, C_communicator, ptr);
}
int lammps_get_ntypes (void *ptr)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
int ntypes = lmp->atom->ntypes;
return ntypes;
}
void lammps_error_all (void *ptr, const char *file, int line, const char *str)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
lmp->error->all (file, line, str);
}
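/* NOTE: in the functions below, the "style" argument follows the same
   convention as lammps_extract_compute() and lammps_extract_fix() in
   library.cpp: 0 = global data, 1 = per-atom data, 2 = local data. */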
int lammps_extract_compute_vectorsize (void *ptr, char *id, int style)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
int icompute = lmp->modify->find_compute(id);
if ( icompute < 0 ) return 0;
class Compute *compute = lmp->modify->compute[icompute];
if ( style == 0 )
{
if ( !compute->vector_flag )
return 0;
else
return compute->size_vector;
}
else if ( style == 1 )
{
return lammps_get_natoms (ptr);
}
else if ( style == 2 )
{
if ( !compute->local_flag )
return 0;
else
return compute->size_local_rows;
}
else
return 0;
}
void lammps_extract_compute_arraysize (void *ptr, char *id, int style,
int *nrows, int *ncols)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
int icompute = lmp->modify->find_compute(id);
if ( icompute < 0 )
{
*nrows = 0;
*ncols = 0;
return;
}
class Compute *compute = lmp->modify->compute[icompute];
if ( style == 0 )
{
if ( !compute->array_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = compute->size_array_rows;
*ncols = compute->size_array_cols;
}
}
else if ( style == 1 )
{
if ( !compute->peratom_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = lammps_get_natoms (ptr);
*ncols = compute->size_peratom_cols;
}
}
else if ( style == 2 )
{
if ( !compute->local_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = compute->size_local_rows;
*ncols = compute->size_local_cols;
}
}
else
{
*nrows = 0;
*ncols = 0;
}
return;
}
int lammps_extract_fix_vectorsize (void *ptr, char *id, int style)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
int ifix = lmp->modify->find_fix(id);
if ( ifix < 0 ) return 0;
class Fix *fix = lmp->modify->fix[ifix];
if ( style == 0 )
{
if ( !fix->vector_flag )
return 0;
else
return fix->size_vector;
}
else if ( style == 1 )
{
return lammps_get_natoms (ptr);
}
else if ( style == 2 )
{
if ( !fix->local_flag )
return 0;
else
return fix->size_local_rows;
}
else
return 0;
}
void lammps_extract_fix_arraysize (void *ptr, char *id, int style,
int *nrows, int *ncols)
{
class LAMMPS *lmp = (class LAMMPS *) ptr;
int ifix = lmp->modify->find_fix(id);
if ( ifix < 0 )
{
*nrows = 0;
*ncols = 0;
return;
}
class Fix *fix = lmp->modify->fix[ifix];
if ( style == 0 )
{
if ( !fix->array_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = fix->size_array_rows;
*ncols = fix->size_array_cols;
}
}
else if ( style == 1 )
{
if ( !fix->peratom_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = lammps_get_natoms (ptr);
*ncols = fix->size_peratom_cols;
}
}
else if ( style == 2 )
{
if ( !fix->local_flag )
{
*nrows = 0;
*ncols = 0;
}
else
{
*nrows = fix->size_local_rows;
*ncols = fix->size_local_cols;
}
}
else
{
*nrows = 0;
*ncols = 0;
}
return;
}
/* vim: set ts=3 sts=3 expandtab: */


@ -0,0 +1,47 @@
/* -----------------------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
www.cs.sandia.gov/~sjplimp/lammps.html
Steve Plimpton, sjplimp@sandia.gov, Sandia National Laboratories
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
/* ------------------------------------------------------------------------
Contributing author: Karl D. Hammond <karlh@ugcs.caltech.edu>
University of Tennessee, Knoxville (USA), 2012
------------------------------------------------------------------------- */
/* This is set of "wrapper" functions to assist LAMMPS.F90, which itself
provides a (I hope) robust Fortran interface to library.cpp and
library.h. All prototypes herein COULD be added to library.h instead of
including this as a separate file. See the README for instructions. */
/* These prototypes probably belong in mpi.h in the src/STUBS directory. */
#ifndef OPEN_MPI
#define MPI_Comm_f2c(a) a
#define MPI_Fint int
#endif
#ifdef __cplusplus
extern "C" {
#endif
/* Prototypes for auxiliary functions */
void lammps_open_fortran_wrapper (int, char**, MPI_Fint, void**);
int lammps_get_ntypes (void*);
int lammps_extract_compute_vectorsize (void*, char*, int);
void lammps_extract_compute_arraysize (void*, char*, int, int*, int*);
int lammps_extract_fix_vectorsize (void*, char*, int);
void lammps_extract_fix_arraysize (void*, char*, int, int*, int*);
void lammps_error_all (void *ptr, const char*, int, const char*);
#ifdef __cplusplus
}
#endif
/* vim: set ts=3 sts=3 expandtab: */

File diff suppressed because it is too large.


@ -0,0 +1,221 @@
LAMMPS.F90 defines a Fortran 2003 module, LAMMPS, which wraps all functions in
src/library.h so they can be used directly from Fortran-encoded programs.
All functions in src/library.h that use and/or return C-style pointers have
Fortran wrapper functions that use Fortran-style arrays, pointers, and
strings; all C-style memory management is handled internally with no user
intervention.
This interface was created by Karl Hammond who you can contact with
questions:
Karl D. Hammond
University of Tennessee, Knoxville
karlh at ugcs.caltech.edu
karlh at utk.edu
-------------------------------------
--COMPILATION--
First, be advised that mixed-language programming is not trivial. It requires
you to link in the required libraries of all languages you use (in this case,
those for Fortran, C, and C++), as well as any other libraries required.
You are also advised to read the --USE-- section below before trying to
compile.
The following steps will work to compile this module (replace ${LAMMPS_SRC}
with the path to your LAMMPS source directory):
(1) Compile LAMMPS as a static library. Call the resulting file ${LAMMPS_LIB},
which will have an actual name like liblammps_openmpi.a. If compiling
using the MPI stubs in ${LAMMPS_SRC}/STUBS, you will need to know where
libmpi.a is as well (I'll call it ${MPI_STUBS} hereafter)
(2) Copy said library to your Fortran program's source directory or include
its location in a -L${LAMMPS_SRC} flag to your compiler.
(3) Compile (but don't link!) LAMMPS.F90. Example:
mpif90 -c LAMMPS.f90
OR
gfortran -c LAMMPS.F90
Copy the LAMMPS.o and lammps.mod (or whatever your compiler calls module
files) to your Fortran program's source directory.
NOTE: you may get a warning such as,
subroutine lammps_open_wrapper (argc, argv, communicator, ptr) &
Variable 'communicator' at (1) is a parameter to the BIND(C)
procedure 'lammps_open_wrapper' but may not be C interoperable
This is normal (see --IMPLEMENTATION NOTES--).
(4) Compile (but don't link) LAMMPS-wrapper.cpp. You will need its header
file as well. You will have to provide the locations of LAMMPS's
header files. For example,
mpicxx -c -I${LAMMPS_SRC} LAMMPS-wrapper.cpp
OR
g++ -c -I${LAMMPS_SRC} -I${LAMMPS_SRC}/STUBS LAMMPS-wrapper.cpp
OR
icpc -c -I${LAMMPS_SRC} -I${LAMMPS_SRC}/STUBS LAMMPS-wrapper.cpp
Copy the resulting object file LAMMPS-wrapper.o to your Fortran program's
source directory.
(4b) OPTIONAL: Make a library so you can carry around two files instead of
three. Example:
ar rs liblammps_fortran.a LAMMPS.o LAMMPS-wrapper.o
This will create the file liblammps_fortran.a that you can use in place
of "LAMMPS.o LAMMPS-wrapper.o" in part (6). Note that you will still
need to have the .mod file from part (3).
It is also possible to add LAMMPS.o and LAMMPS-wrapper.o into the
LAMMPS library (e.g., liblammps_openmpi.a) instead of creating a separate
library, like so:
ar rs ${LAMMPS_LIB} LAMMPS.o LAMMPS-wrapper.o
In this case, you can now use the Fortran wrapper functions as if they
were part of the usual LAMMPS library interface (if you have the module
file visible to the compiler, that is).
(5) Compile your Fortran program. Example:
mpif90 -c myfreeformatfile.f90
mpif90 -c myfixedformatfile.f
OR
gfortran -c myfreeformatfile.f90
gfortran -c myfixedformatfile.f
The object files generated by these steps are collectively referred to
as ${my_object_files} in the next step(s).
IMPORTANT: If the Fortran module from part (3) is not in the current
directory or in one searched by the compiler for module files, you will
need to include that location via the -I flag to the compiler.
(6) Link everything together, including any libraries needed by LAMMPS (such
as the C++ standard library, the C math library, the JPEG library, fftw,
etc.) For example,
mpif90 LAMMPS.o LAMMPS-wrapper.o ${my_object_files} \
${LAMMPS_LIB} -lstdc++ -lm
OR
gfortran LAMMPS.o LAMMPS-wrapper.o ${my_object_files} \
${LAMMPS_LIB} ${MPI_STUBS} -lstdc++ -lm
OR
ifort LAMMPS.o LAMMPS-wrapper.o ${my_object_files} \
${LAMMPS_LIB} ${MPI_STUBS} -cxxlib -limf -lm
Any other required libraries (e.g. -ljpeg, -lfftw) should be added to
the end of this line.
You should now have a working executable.
Steps 3 and 4 above can be accomplished by running make with the attached
makefile, possibly after modifying it for your compilers and paths.
-------------------------------------
--USAGE--
To use this API, your program unit (PROGRAM/SUBROUTINE/FUNCTION/MODULE/etc.)
should look something like this:
program call_lammps
use LAMMPS
! Other modules, etc.
implicit none
type (lammps_instance) :: lmp ! This is a pointer to your LAMMPS instance
double precision :: fix
double precision, dimension(:), allocatable :: fix2
! Rest of declarations
call lammps_open_no_mpi ('lmp -in /dev/null -screen out.lammps',lmp)
! Set up rest of program here
call lammps_file (lmp, 'in.example')
call lammps_extract_fix (fix, lmp, '2', 0, 1, 1, 1)
call lammps_extract_fix (fix2, lmp, '4', 0, 2, 1, 1)
call lammps_close (lmp)
end program call_lammps
Important notes:
* All arguments which are char* variables in library.cpp are character (len=*)
variables here. For example,
call lammps_command (lmp, 'units metal')
will work as expected.
* The public functions (the only ones you can use) have interfaces as
described in the comments at the top of LAMMPS.F90. They are not always
the same as those in library.h, since C strings are replaced by Fortran
strings and the like.
* The module attempts to check whether you have done something stupid (such
as assign a 2D array to a scalar), but it's not perfect. For example, the
command
call lammps_extract_global (nlocal, ptr, 'nlocal')
will give nlocal correctly if nlocal is of type INTEGER, but it will give
the wrong answer if nlocal is of type REAL or DOUBLE PRECISION. This is a
feature of the (void*) type cast in library.cpp. There is no way I can
check this for you!
* You are allowed to use REAL or DOUBLE PRECISION floating-point numbers.
All LAMMPS data (which are of type REAL(C_double)) are rounded off if
placed in single precision variables. It is tacitly assumed that NO C++
variables are of type float; everything is int or double (since this is
all library.cpp currently handles).
* An example of a complete program is offered at the end of this file.
-------------------------------------
--TROUBLESHOOTING--
Compile-time errors probably indicate that your compiler is not new enough to
support Fortran 2003 features. For example, GCC 4.1.2 will not compile this
module, but GCC 4.4.0 will.
If your compiler balks at 'use, intrinsic :: ISO_C_binding,' try removing the
intrinsic part so it looks like an ordinary module. However, it is likely
that such a compiler will also have problems with everything else in the
file as well.
If you get a segfault as soon as the lammps_open call is made, check that you
compiled your program AND LAMMPS-wrapper.cpp using the same MPI headers. Using
the stubs for one and the actual MPI library for the other will cause major
problems.
If you find run-time errors, please pass them along via the LAMMPS Users
mailing list. Please provide a minimal working example along with the names
and versions of the compilers you are using. Please make sure the error is
repeatable and is in MY code, not yours (generating a minimal working example
will usually ensure this anyway).
-------------------------------------
--IMPLEMENTATION NOTES--
The Fortran procedures have the same names as the C procedures, and
their purpose is the same, but they may take different arguments. Here are
some of the important differences:
* lammps_open and lammps_open_no_mpi take a string instead of argc and
argv. This is necessary because C and C++ have a very different way
of treating strings than Fortran.
* All C++ functions that accept char* pointers now accept Fortran-style
strings within this interface instead.
* All of the lammps_extract_[something] functions, which return void*
C-style pointers, have been replaced by generic subroutines that return
Fortran variables (which may be arrays). The first argument houses the
variable to be returned; all other arguments are identical except as
stipulated above. Note that it is not possible to declare generic
functions that are selected based solely on the type/kind/rank (TKR)
signature of the return value, only based on the TKR of the arguments.
* The SHAPE of the first argument to lammps_extract_[something] is checked
against the "shape" of the C array (e.g., double vs. double* vs. double**).
Calling a subroutine with arguments of inappropriate rank will result in an
error at run time.
* All arrays passed to subroutines must be ALLOCATABLE and are REALLOCATED
to fit the shape of the array LAMMPS will be returning.
* The indices i and j in lammps_extract_fix are used the same way they
are in f_ID[i][j] references in LAMMPS (i.e., starting from 1). This is
different than the way library.cpp uses these numbers, but is more
consistent with the way arrays are accessed in LAMMPS and in Fortran.
* The char* pointer normally returned by lammps_command is thrown away
in this version; note also that lammps_command is now a subroutine
instead of a function.
* The pointer to LAMMPS itself is of type(lammps_instance), which is itself
a synonym for type(C_ptr), part of ISO_C_BINDING. Type (C_ptr) is
C's void* data type. This should be the only C data type that needs to
be used by the end user.
* This module will almost certainly generate a compile-time warning,
such as,
subroutine lammps_open_wrapper (argc, argv, communicator, ptr) &
Variable 'communicator' at (1) is a parameter to the BIND(C)
procedure 'lammps_open_wrapper' but may not be C interoperable
This happens because lammps_open_wrapper actually takes a Fortran
INTEGER argument, whose type is defined by the MPI library itself. The
Fortran integer is converted to a C integer by the MPI library (if such
conversion is actually necessary).
* Unlike library.cpp, this module returns COPIES of the data LAMMPS actually
uses. This is done for safety reasons, as you should, in general, not be
overwriting LAMMPS data directly from Fortran. If you require this
functionality, it is possible to write another function that, for example,
returns a Fortran pointer that resolves to the C/C++ data instead of
copying the contents of that pointer to the original array as is done now.


@ -0,0 +1,15 @@
units metal
lattice bcc 3.1656
region simbox block 0 10 0 10 0 10
create_box 2 simbox
create_atoms 1 region simbox
pair_style eam/fs
pair_coeff * * path/to/my_potential.eam.fs A1 A2
mass 1 58.2 # These are made-up numbers
mass 2 28.3
velocity all create 1200.0 7474848 dist gaussian
fix 1 all nve
fix 2 all dt/reset 1 1E-5 1E-3 0.01 units box
fix 4 all ave/histo 10 5 100 0.5 1.5 50 f_2 file temp.histo ave running
thermo_style custom step dt temp press etotal f_4[1][1]
thermo 100


@ -0,0 +1,33 @@
SHELL = /bin/sh
# Path to LAMMPS extraction directory
LAMMPS_ROOT = ../svn-dist
LAMMPS_SRC = $(LAMMPS_ROOT)/src
# Remove the line below if using mpicxx/mpic++ as your C++ compiler
MPI_STUBS = $(LAMMPS_SRC)/STUBS
FC = gfortran # replace with your Fortran compiler
CXX = g++ # replace with your C++ compiler
# Flags for Fortran compiler, C++ compiler, and C preprocessor, respectively
FFLAGS = -O2
CXXFLAGS = -O2
CPPFLAGS =
all : liblammps_fortran.a
liblammps_fortran.a : LAMMPS.o LAMMPS-wrapper.o
$(AR) rs $@ $^
LAMMPS.o lammps.mod : LAMMPS.F90
$(FC) $(CPPFLAGS) $(FFLAGS) -c $<
LAMMPS-wrapper.o : LAMMPS-wrapper.cpp LAMMPS-wrapper.h
$(CXX) $(CPPFLAGS) $(CXXFLAGS) -c $< -I$(LAMMPS_SRC) -I$(MPI_STUBS)
clean :
$(RM) *.o *.mod liblammps_fortran.a
dist :
tar -czf Fortran-interface.tar.gz LAMMPS-wrapper.h LAMMPS-wrapper.cpp LAMMPS.F90 makefile README


@ -0,0 +1,44 @@
program simple
use LAMMPS
implicit none
type (lammps_instance) :: lmp
double precision :: compute, fix, fix2
double precision, dimension(:), allocatable :: compute_v, mass, r
double precision, dimension(:,:), allocatable :: x
real, dimension(:,:), allocatable :: x_r
call lammps_open_no_mpi ('',lmp)
call lammps_file (lmp, 'in.simple')
call lammps_command (lmp, 'run 500')
call lammps_extract_fix (fix, lmp, '2', 0, 1, 1, 1)
print *, 'Fix is ', fix
call lammps_extract_fix (fix2, lmp, '4', 0, 2, 1, 1)
print *, 'Fix 2 is ', fix2
call lammps_extract_compute (compute, lmp, 'thermo_temp', 0, 0)
print *, 'Compute is ', compute
call lammps_extract_compute (compute_v, lmp, 'thermo_temp', 0, 1)
print *, 'Vector is ', compute_v
call lammps_extract_atom (mass, lmp, 'mass')
print *, 'Mass is ', mass
call lammps_extract_atom (x, lmp, 'x')
if ( .not. allocated (x) ) print *, 'x is not allocated'
print *, 'x is ', x(1,:)
call lammps_extract_atom (x_r, lmp, 'x')
if ( .not. allocated (x_r) ) print *, 'x is not allocated'
print *, 'x_r is ', x_r(1,:)
call lammps_get_coords (lmp, r)
print *, 'r is ', r(1:3)
call lammps_close (lmp)
end program simple


@ -35,7 +35,8 @@ gcc -L/home/sjplimp/lammps/src simple.o \
-llmp_g++ -lfftw -lmpich -lmpl -lpthread -lstdc++ -o simpleC -llmp_g++ -lfftw -lmpich -lmpl -lpthread -lstdc++ -o simpleC
This builds the Fortran wrapper and driver with the LAMMPS library This builds the Fortran wrapper and driver with the LAMMPS library
using a Fortran and C compiler: using a Fortran and C compiler, using the wrapper in the fortran
directory:
cp ../fortran/libfwrapper.c . cp ../fortran/libfwrapper.c .
gcc -I/home/sjplimp/lammps/src -c libfwrapper.c gcc -I/home/sjplimp/lammps/src -c libfwrapper.c


@ -99,10 +99,10 @@ int main(int narg, char **arg)
int natoms = lammps_get_natoms(ptr); int natoms = lammps_get_natoms(ptr);
double *x = (double *) malloc(3*natoms*sizeof(double)); double *x = (double *) malloc(3*natoms*sizeof(double));
lammps_get_coords(ptr,x); lammps_gather_atoms(lmp,"x",1,3,x);
double epsilon = 0.1; double epsilon = 0.1;
x[0] += epsilon; x[0] += epsilon;
lammps_put_coords(ptr,x); lammps_scatter_atoms(lmp,"x",1,3,x);
free(x); free(x);
lammps_command(ptr,"run 1"); lammps_command(ptr,"run 1");


@ -23,6 +23,7 @@
#include "stdlib.h" #include "stdlib.h"
#include "string.h" #include "string.h"
#include "mpi.h" #include "mpi.h"
#include "lammps.h" // these are LAMMPS include files #include "lammps.h" // these are LAMMPS include files
#include "input.h" #include "input.h"
#include "atom.h" #include "atom.h"
@ -104,10 +105,10 @@ int main(int narg, char **arg)
int natoms = static_cast<int> (lmp->atom->natoms); int natoms = static_cast<int> (lmp->atom->natoms);
double *x = new double[3*natoms]; double *x = new double[3*natoms];
lammps_get_coords(lmp,x); // no LAMMPS class function for this lammps_gather_atoms(lmp,"x",1,3,x);
double epsilon = 0.1; double epsilon = 0.1;
x[0] += epsilon; x[0] += epsilon;
lammps_put_coords(lmp,x); // no LAMMPS class function for this lammps_scatter_atoms(lmp,"x",1,3,x);
delete [] x; delete [] x;
lmp->input->one("run 1"); lmp->input->one("run 1");


@ -115,9 +115,9 @@ PROGRAM f_driver
CALL lammps_get_natoms(ptr,natoms) CALL lammps_get_natoms(ptr,natoms)
ALLOCATE(x(3*natoms)) ALLOCATE(x(3*natoms))
CALL lammps_get_coords(ptr,x) CALL lammps_gather_atoms(ptr,'x',1,3,x);
x(1) = x(1) + epsilon x(1) = x(1) + epsilon
CALL lammps_put_coords(ptr,x) CALL lammps_scatter_atoms(ptr,'x',1,3,x);
DEALLOCATE(x) DEALLOCATE(x)


@ -98,7 +98,7 @@ OBJ = $(SRC:.cpp=.o)
# the same MPI library that LAMMPS is built with # the same MPI library that LAMMPS is built with
CC = g++ CC = g++
CCFLAGS = -O -g -I../../src -DMPICH_IGNORE_CXX_SEEK CCFLAGS = -O -g -fPIC -I../../src -DMPICH_IGNORE_CXX_SEEK
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -98,7 +98,7 @@ OBJ = $(SRC:.cpp=.o)
# the same MPI library that LAMMPS is built with # the same MPI library that LAMMPS is built with
CC = icc CC = icc
CCFLAGS = -O -g -I../../src -DMPICH_IGNORE_CXX_SEEK CCFLAGS = -O -g -fPIC -I../../src -DMPICH_IGNORE_CXX_SEEK
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -98,7 +98,7 @@ OBJ = $(SRC:.cpp=.o)
# the same MPI library that LAMMPS is built with # the same MPI library that LAMMPS is built with
CC = g++ CC = g++
CCFLAGS = -O -g -I../../src -I../../src/STUBS CCFLAGS = -O -g -fPIC -I../../src -I../../src/STUBS
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -33,7 +33,7 @@ OBJ = $(SRC:.cpp=.o)
# the same MPI library that LAMMPS is built with # the same MPI library that LAMMPS is built with
CC = mpic++ CC = mpic++
CCFLAGS = -O -Isystems/interact/TCP/ -Isystems/interact -Iivutils/include CCFLAGS = -O -fPIC -Isystems/interact/TCP/ -Isystems/interact -Iivutils/include
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -3,7 +3,7 @@
# ------ SETTINGS ------ # ------ SETTINGS ------
CXX = g++ CXX = g++
CXXFLAGS = -O2 -g -funroll-loops # -DCOLVARS_DEBUG CXXFLAGS = -O2 -g -fPIC -funroll-loops # -DCOLVARS_DEBUG
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rscv ARCHFLAG = -rscv
SHELL = /bin/sh SHELL = /bin/sh


@ -27,7 +27,7 @@ OBJ = $(SRC:.f=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
FC = gfortran FC = gfortran
FFLAGS = -O3 -march=native -mpc64 \ FFLAGS = -O3 -fPIC -march=native -mpc64 \
-ffast-math -funroll-loops -fstrict-aliasing -Wall -W -Wno-uninitialized -fno-second-underscore -ffast-math -funroll-loops -fstrict-aliasing -Wall -W -Wno-uninitialized -fno-second-underscore
FFLAGS0 = -O0 -march=native -mpc64 \ FFLAGS0 = -O0 -march=native -mpc64 \
-Wall -W -Wno-uninitialized -fno-second-underscore -Wall -W -Wno-uninitialized -fno-second-underscore


@ -23,7 +23,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = g95 F90 = g95
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -29,7 +29,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = gfortran F90 = gfortran
F90FLAGS = -O2 -ffast-math -ftree-vectorize -fexpensive-optimizations -fno-second-underscore F90FLAGS = -O2 -fPIC -ffast-math -ftree-vectorize -fexpensive-optimizations -fno-second-underscore
#F90FLAGS = -O #F90FLAGS = -O
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc


@ -23,7 +23,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = ifort F90 = ifort
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -23,7 +23,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = pgf90 F90 = pgf90
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -32,7 +32,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = mpif90 F90 = mpif90
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
LINK = g++ LINK = g++


@ -67,7 +67,7 @@ OBJ = $(SRC:.cpp=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
CC = g++ CC = g++
CCFLAGS = -O2 -Wall -W -funroll-loops -ffast-math -fexpensive-optimizations -finline-functions -fno-rtti -fno-exceptions -Wall #-Wno-deprecated CCFLAGS = -O2 -fPIC -Wall -W -funroll-loops -ffast-math -fexpensive-optimizations -finline-functions -fno-rtti -fno-exceptions -Wall #-Wno-deprecated
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -67,7 +67,7 @@ OBJ = $(SRC:.cpp=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
CC = icc CC = icc
CCFLAGS = -O -Wall -Wcheck -wd869,981,1572 CCFLAGS = -O -fPIC -Wall -Wcheck -wd869,981,1572
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -67,7 +67,7 @@ OBJ = $(SRC:.cpp=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
CC = CC CC = CC
CCFLAGS = -O -g -Wall #-Wno-deprecated CCFLAGS = -O -fPIC -g -Wall #-Wno-deprecated
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
DEPFLAGS = -M DEPFLAGS = -M


@ -39,7 +39,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = g77 F90 = g77
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -39,7 +39,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = g95 F90 = g95
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -43,7 +43,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = gfortran F90 = gfortran
F90FLAGS = -O3 -Wall -march=native -mpc64 -ffast-math -funroll-loops -fno-second-underscore F90FLAGS = -O3 -Wall -march=native -mpc64 -ffast-math -funroll-loops -fno-second-underscore -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -43,7 +43,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = ifort F90 = ifort
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -39,7 +39,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = pgf90 F90 = pgf90
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -44,7 +44,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = mpif90 F90 = mpif90
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -43,7 +43,7 @@ OBJ = $(SRC:.F=.o)
# ------ SETTINGS ------ # ------ SETTINGS ------
F90 = mpif90 F90 = mpif90
F90FLAGS = -O F90FLAGS = -O -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = -rc ARCHFLAG = -rc
USRLIB = USRLIB =


@ -3,41 +3,29 @@ and allows the LAMMPS library interface to be invoked from Python,
either from a script or interactively. either from a script or interactively.
Details on the Python interface to LAMMPS and how to build LAMMPS as a Details on the Python interface to LAMMPS and how to build LAMMPS as a
shared library for use with Python are given in shared library, for use with Python, are given in
doc/Section_python.html. doc/Section_python.html and in doc/Section_start.html#start_5.
Basically you need to follow these 3 steps: Basically you need to follow these steps in the src directory:
a) Add paths to environment variables in your shell script % make makeshlib # creates Makefile.shlib
% make -f Makefile.shlib g++ # or whatever machine target you wish
% make install-python # may need to do this via sudo
For example, for csh or tcsh, add something like this to ~/.cshrc: You can replace the last step with running the python/install.py
script directly to give you more control over where two relevant files
are installed, or by setting environment variables in your shell
script. See doc/Section_python.html for details.
setenv PYTHONPATH ${PYTHONPATH}:/home/sjplimp/lammps/python You can then launch Python and instantiate an instance of LAMMPS:
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src
setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:/home/sjplimp/lammps/src/STUBS
The latter is only necessary if you will use the MPI stubs library
instead of an MPI installed on your machine.
b) Build LAMMPS as a dynamic library, including dynamic versions of
any libraries it includes for the packages you have installed,
e.g. STUBS, MPI, FFTW, JPEG, package libs.
From the src directory:
% make makeshlib
% make -f Makefile.shlib g++
If successful, this results in the file src/liblmp_g++.so
c) Launch Python and import the LAMMPS wrapper
% python % python
>>> from lammps import lammps >>> from lammps import lammps
>>> lmp = lammps() >>> lmp = lammps()
If that gives no errors, you have successfully wrapped LAMMPS with If that gives no errors, you have successfully wrapped LAMMPS with
Python. Python. See doc/Section_python.html#py_5 for tests you can then use
to run LAMMPS in serial or parallel through Python.
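As a minimal, hedged illustration of driving LAMMPS from a script rather than interactively (mirroring python/example.py further below), assuming the shared library and wrapper are installed as above; the input file name in.demo is a placeholder:
from lammps import lammps
lmp = lammps()                          # loads liblammps.so and creates a LAMMPS instance
lines = open("in.demo","r").readlines() # feed an existing input script line by line
for line in lines: lmp.command(line)
lmp.command("run 100")                  # issue further commands directly
print("natoms = %d" % lmp.get_natoms()) # query the wrapped library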
------------------------------------------------------------------- -------------------------------------------------------------------

View File

@ -18,6 +18,7 @@ if len(argv) != 2:
infile = sys.argv[1] infile = sys.argv[1]
me = 0 me = 0
# uncomment if running in parallel via Pypar # uncomment if running in parallel via Pypar
#import pypar #import pypar
#me = pypar.rank() #me = pypar.rank()
@ -38,12 +39,11 @@ for line in lines: lmp.command(line)
# run a single step with changed coords # run a single step with changed coords
lmp.command("run 10") lmp.command("run 10")
x = lmp.get_coords() x = lmp.gather_atoms("x",1,3)
epsilon = 0.1 epsilon = 0.1
x[0] += epsilon x[0] += epsilon
lmp.put_coords(x) lmp.scatter_atoms("x",1,3,x)
lmp.command("run 1"); lmp.command("run 1");
lmp.command("run 1")
# uncomment if running in parallel via Pypar # uncomment if running in parallel via Pypar
#print "Proc %d out of %d procs has" % (me,nprocs), lmp #print "Proc %d out of %d procs has" % (me,nprocs), lmp

python/install.py Normal file
View File

@ -0,0 +1,35 @@
#!/usr/local/bin/python
# copy LAMMPS shared library src/liblammps.so and lammps.py to system dirs
# Syntax: python install.py [libdir] [pydir]
# libdir = target dir for src/liblammps.so, default = /usr/local/lib
# pydir = target dir for lammps.py, default = Python site-packages dir
import sys,commands
if len(sys.argv) > 3:
print "Syntax: python install.py [libdir] [pydir]"
sys.exit()
if len(sys.argv) >= 2: libdir = sys.argv[1]
else: libdir = "/usr/local/lib"
if len(sys.argv) == 3: pydir = sys.argv[2]
else:
paths = sys.path
for i,path in enumerate(paths):
index = path.rfind("site-packages")
if index < 0: continue
if index == len(path) - len("site-packages"): break
pydir = paths[i]
str = "cp ../src/liblammps.so %s" % libdir
print str
outstr = commands.getoutput(str)
if len(outstr.strip()): print outstr
str = "cp ../python/lammps.py %s" % pydir
print str
outstr = commands.getoutput(str)
if len(outstr.strip()): print outstr
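Once install.py (or "make install-python") has been run, a quick optional sanity check is the sketch below; it is not part of this patch, and ctypes.util.find_library may miss libraries outside the standard linker paths, in which case LD_LIBRARY_PATH still needs to include the install dir:
import ctypes.util
try:
    import lammps                              # lammps.py found via site-packages or PYTHONPATH
    print("found lammps.py at %s" % lammps.__file__)
except ImportError:
    print("lammps.py is not on the Python path")
lib = ctypes.util.find_library("lammps")       # looks for liblammps.so on the system library path
if lib: print("found shared library %s" % lib)
else: print("liblammps.so not found; check LD_LIBRARY_PATH or the libdir argument")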

View File

@ -17,23 +17,15 @@ import types
from ctypes import * from ctypes import *
import os.path import os.path
LMPINT = 0
LMPDOUBLE = 1
LMPIPTR = 2
LMPDPTR = 3
LMPDPTRPTR = 4
LOCATION = os.path.dirname(__file__)
class lammps: class lammps:
def __init__(self,name="",cmdlineargs=None): def __init__(self,name="",cmdargs=None):
# load liblmp.so by default # load liblammps.so by default
# if name = "g++", load liblmp_g++.so # if name = "g++", load liblammps_g++.so
try: try:
if not name: self.lib = CDLL("liblmp.so") if not name: self.lib = CDLL("liblammps.so")
else: self.lib = CDLL("liblmp_%s.so" % name) else: self.lib = CDLL("liblammps_%s.so" % name)
except: except:
raise OSError,"Could not load LAMMPS dynamic library" raise OSError,"Could not load LAMMPS dynamic library"
@ -42,10 +34,10 @@ class lammps:
# no_mpi call lets LAMMPS use MPI_COMM_WORLD # no_mpi call lets LAMMPS use MPI_COMM_WORLD
# cargs = array of C strings from args # cargs = array of C strings from args
if cmdlineargs: if cmdargs:
cmdlineargs.insert(0,"lammps.py") cmdargs.insert(0,"lammps.py")
narg = len(cmdlineargs) narg = len(cmdargs)
cargs = (c_char_p*narg)(*cmdlineargs) cargs = (c_char_p*narg)(*cmdargs)
self.lmp = c_void_p() self.lmp = c_void_p()
self.lib.lammps_open_no_mpi(narg,cargs,byref(self.lmp)) self.lib.lammps_open_no_mpi(narg,cargs,byref(self.lmp))
else: else:
@ -68,30 +60,26 @@ class lammps:
self.lib.lammps_command(self.lmp,cmd) self.lib.lammps_command(self.lmp,cmd)
def extract_global(self,name,type): def extract_global(self,name,type):
if type == LMPDOUBLE: if type == 0:
self.lib.lammps_extract_global.restype = POINTER(c_double)
ptr = self.lib.lammps_extract_global(self.lmp,name)
return ptr[0]
if type == LMPINT:
self.lib.lammps_extract_global.restype = POINTER(c_int) self.lib.lammps_extract_global.restype = POINTER(c_int)
elif type == 1:
self.lib.lammps_extract_global.restype = POINTER(c_double)
else: return None
ptr = self.lib.lammps_extract_global(self.lmp,name) ptr = self.lib.lammps_extract_global(self.lmp,name)
return ptr[0] return ptr[0]
return None
def extract_atom(self,name,type): def extract_atom(self,name,type):
if type == LMPDPTRPTR: if type == 0:
self.lib.lammps_extract_atom.restype = POINTER(POINTER(c_double))
ptr = self.lib.lammps_extract_atom(self.lmp,name)
return ptr
if type == LMPDPTR:
self.lib.lammps_extract_atom.restype = POINTER(c_double)
ptr = self.lib.lammps_extract_atom(self.lmp,name)
return ptr
if type == LMPIPTR:
self.lib.lammps_extract_atom.restype = POINTER(c_int) self.lib.lammps_extract_atom.restype = POINTER(c_int)
elif type == 1:
self.lib.lammps_extract_atom.restype = POINTER(POINTER(c_int))
elif type == 2:
self.lib.lammps_extract_atom.restype = POINTER(c_double)
elif type == 3:
self.lib.lammps_extract_atom.restype = POINTER(POINTER(c_double))
else: return None
ptr = self.lib.lammps_extract_atom(self.lmp,name) ptr = self.lib.lammps_extract_atom(self.lmp,name)
return ptr return ptr
return None
def extract_compute(self,id,style,type): def extract_compute(self,id,style,type):
if type == 0: if type == 0:
@ -153,18 +141,26 @@ class lammps:
return result return result
return None return None
# return total number of atoms in system
def get_natoms(self): def get_natoms(self):
return self.lib.lammps_get_natoms(self.lmp) return self.lib.lammps_get_natoms(self.lmp)
def get_coords(self): # return vector of atom properties gathered across procs, ordered by atom ID
nlen = 3 * self.lib.lammps_get_natoms(self.lmp)
coords = (c_double*nlen)()
self.lib.lammps_get_coords(self.lmp,coords)
return coords
# assume coords is an array of c_double, as created by get_coords() def gather_atoms(self,name,type,count):
# could check if it is some other Python object and create c_double array? natoms = self.lib.lammps_get_natoms(self.lmp)
# constructor for c_double array can take an arg to use to fill it? if type == 0:
data = ((count*natoms)*c_int)()
self.lib.lammps_gather_atoms(self.lmp,name,type,count,data)
elif type == 1:
data = ((count*natoms)*c_double)()
self.lib.lammps_gather_atoms(self.lmp,name,type,count,data)
else: return None
return data
def put_coords(self,coords): # scatter vector of atom properties across procs, ordered by atom ID
self.lib.lammps_put_coords(self.lmp,coords) # assume vector is of correct type and length, as created by gather_atoms()
def scatter_atoms(self,name,type,count,data):
self.lib.lammps_scatter_atoms(self.lmp,name,type,count,data)
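With the named LMP* constants removed, callers pass the small integer type codes directly. A usage sketch, assuming lmp is an existing lammps instance and that the names "dt", "x", and "type" are among those recognized by the C library interface:
dt    = lmp.extract_global("dt",1)   # type 1 = double scalar (timestep size)
x     = lmp.extract_atom("x",3)      # type 3 = double** (per-atom positions, local to this proc)
atype = lmp.extract_atom("type",0)   # type 0 = int* (per-atom types)
print("dt = %g; atom 0 is type %d at %g %g %g" % (dt,atype[0],x[0][0],x[0][1],x[0][2]))
Note that extract_atom returns pointers into the local per-processor arrays, while gather_atoms/scatter_atoms above work with a full copy ordered by atom ID.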

View File

@ -85,6 +85,9 @@ $(EXE): $(OBJ)
lib: $(OBJ) lib: $(OBJ)
$(ARCHIVE) $(ARFLAGS) $(EXE) $(OBJ) $(ARCHIVE) $(ARFLAGS) $(EXE) $(OBJ)
#shlib: $(OBJ)
# $(ARCHIVE) $(ARFLAGS) $(EXE) $(OBJ)
shlib: $(OBJ) shlib: $(OBJ)
$(CC) $(CCFLAGS) $(SHFLAGS) $(SHLIBFLAGS) $(EXTRA_PATH) -o $(EXE) \ $(CC) $(CCFLAGS) $(SHFLAGS) $(SHLIBFLAGS) $(EXTRA_PATH) -o $(EXE) \
$(OBJ) $(EXTRA_LIB) $(LIB) $(OBJ) $(EXTRA_LIB) $(LIB)

View File

@ -2,25 +2,17 @@
SHELL = /bin/sh SHELL = /bin/sh
# this Makefile builds LAMMPS for RedSky with OpenMPI # This Makefile builds LAMMPS for RedSky with OpenMPI.
# to invoke this Makefile, you need these modules loaded: # To use this Makefile, you need appropriate modules loaded.
# mpi/openmpi-1.4.1_oobpr_intel-11.1-f064-c064 # You can determine which modules are loaded by typing:
# misc/env-openmpi-1.4-oobpr
# compilers/intel-11.1-f064-c064
# libraries/intel-mkl-11.1.064
# libraries/fftw-2.1.5_openmpi-1.4.1_oobpr_intel-11.1-f064-c064
# you can determine which modules are loaded by typing:
# module list # module list
# these modules are not the default ones, but can be enabled by # These modules can be enabled by lines like this in your .cshrc or
# lines like this in your .cshrc or other start-up shell file # other start-up shell file or by typing them before you build LAMMPS:
# or by typing them before you build LAMMPS: # module load mpi/openmpi-1.4.2_oobpr_intel-11.1-f064-c064
# module load mpi/openmpi-1.4.3_oobpr_intel-11.1-f064-c064
# module load misc/env-openmpi-1.4-oobpr
# module load compilers/intel-11.1-f064-c064
# module load libraries/intel-mkl-11.1.064 # module load libraries/intel-mkl-11.1.064
# module load libraries/fftw-2.1.5_openmpi-1.4.3_oobpr_intel-11.1-f064-c064 # module load libraries/fftw-2.1.5_openmpi-1.4.2_oobpr_intel-11.1-f064-c064
# these same modules need to be loaded to submit a LAMMPS job, # These same modules need to be loaded to submit a LAMMPS job,
# either interactively or via a batch script # either interactively or via a batch script.
# IMPORTANT NOTE: # IMPORTANT NOTE:
# to run efficiently on RedSky, use the "numa_wrapper" mpiexec option, # to run efficiently on RedSky, use the "numa_wrapper" mpiexec option,

View File

@ -38,10 +38,15 @@ help:
@echo '' @echo ''
@echo 'make clean-all delete all object files' @echo 'make clean-all delete all object files'
@echo 'make clean-machine delete object files for one machine' @echo 'make clean-machine delete object files for one machine'
@echo 'make tar lmp_src.tar.gz of src dir and packages' @echo 'make tar create lmp_src.tar.gz of src dir and packages'
@echo 'make makelib update Makefile.lib for static library build' @echo 'make makelib create Makefile.lib for static library build'
@echo 'make makeshlib update Makefile.shlib for shared library build' @echo 'make makeshlib create Makefile.shlib for shared library build'
@echo 'make makelist update Makefile.list used by old makes' @echo 'make makelist create Makefile.list used by old makes'
@echo 'make -f Makefile.lib machine build LAMMPS as static library for machine'
@echo 'make -f Makefile.shlib machine build LAMMPS as shared library for machine'
@echo 'make -f Makefile.list machine build LAMMPS from explicit list of files'
@echo 'make stubs build dummy MPI library in STUBS'
@echo 'make install-python install LAMMPS wrapper in Python'
@echo '' @echo ''
@echo 'make package list available packages' @echo 'make package list available packages'
@echo 'make package-status status of all packages' @echo 'make package-status status of all packages'
@ -106,12 +111,12 @@ tar:
@cd STUBS; make @cd STUBS; make
@echo "Created $(ROOT)_src.tar.gz" @echo "Created $(ROOT)_src.tar.gz"
# Make MPI STUBS lib # Make MPI STUBS library
stubs: stubs:
@cd STUBS; make clean; make @cd STUBS; make clean; make
# Update Makefile.lib and Makefile.list # Create Makefile.lib, Makefile.shlib, and Makefile.list
makelib: makelib:
@$(SHELL) Make.sh style @$(SHELL) Make.sh style
@ -125,6 +130,11 @@ makelist:
@$(SHELL) Make.sh style @$(SHELL) Make.sh style
@$(SHELL) Make.sh Makefile.list @$(SHELL) Make.sh Makefile.list
# install LAMMPS shared lib and Python wrapper in Python
install-python:
@python ../python/install.py
# Package management # Package management
package: package:

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@ -482,7 +482,7 @@ void PRD::dynamics()
update->integrate->setup(); update->integrate->setup();
// this may be needed if don't do full init // this may be needed if don't do full init
//modify->addstep_compute_all(update->ntimestep); //modify->addstep_compute_all(update->ntimestep);
int ncalls = neighbor->ncalls; bigint ncalls = neighbor->ncalls;
timer->barrier_start(Timer::LOOP); timer->barrier_start(Timer::LOOP);
update->integrate->run(t_event); update->integrate->run(t_event);

View File

@ -39,8 +39,8 @@ class PRD : protected Pointers {
int equal_size_replicas,natoms; int equal_size_replicas,natoms;
int neigh_every,neigh_delay,neigh_dist_check; int neigh_every,neigh_delay,neigh_dist_check;
int nbuild,ndanger;
int quench_reneighbor; int quench_reneighbor;
bigint nbuild,ndanger;
double time_dephase,time_dynamics,time_quench,time_comm,time_output; double time_dephase,time_dynamics,time_quench,time_comm,time_output;
double time_start; double time_start;

View File

@ -42,8 +42,8 @@ class TAD : protected Pointers {
int event_first; int event_first;
int neigh_every,neigh_delay,neigh_dist_check; int neigh_every,neigh_delay,neigh_dist_check;
int nbuild,ndanger;
int quench_reneighbor; int quench_reneighbor;
bigint nbuild,ndanger;
double time_dynamics,time_quench,time_neb,time_comm,time_output; double time_dynamics,time_quench,time_neb,time_comm,time_output;
double time_start; double time_start;

View File

@ -1,8 +1,7 @@
# Makefile for MPI stubs library # Makefile for MPI stubs library
# Syntax: # Syntax:
# make # build static lib as libmpi_stubs.a # make # build lib as libmpi_stubs.a
# make shlib # build shared lib as libmpi_stubs.so
# make clean # remove *.o and lib files # make clean # remove *.o and lib files
# edit System-specific settings as needed for your platform # edit System-specific settings as needed for your platform
@ -18,34 +17,27 @@ INC = mpi.h
# Definitions # Definitions
EXE = libmpi_stubs.a EXE = libmpi_stubs.a
SHLIB = libmpi_stubs.so
OBJ = $(SRC:.c=.o) OBJ = $(SRC:.c=.o)
# System-specific settings # System-specific settings
CC = g++ CC = g++
CCFLAGS = -O CCFLAGS = -O -fPIC
SHFLAGS = -fPIC
ARCHIVE = ar ARCHIVE = ar
ARCHFLAG = rs ARCHFLAG = rs
SHLIBFLAGS = -shared
# Targets # Targets
lib: $(OBJ) lib: $(OBJ)
$(ARCHIVE) $(ARCHFLAG) $(EXE) $(OBJ) $(ARCHIVE) $(ARCHFLAG) $(EXE) $(OBJ)
shlib: $(OBJ)
$(CC) $(CFLAGS) $(SHFLAGS) $(SHLIBFLAGS) -o $(SHLIB) $(OBJ)
clean: clean:
rm -f *.o libmpi_stubs.a libmpi_stubs.so rm -f *.o libmpi_stubs.a
# Compilation rules # Compilation rules
.c.o: .c.o:
$(CC) $(CCFLAGS) $(SHFLAGS) -c $< $(CC) $(CCFLAGS) -c $<
# Individual dependencies # Individual dependencies

View File

@ -48,30 +48,42 @@ using namespace LAMMPS_NS;
Cuda::Cuda(LAMMPS *lmp) : Pointers(lmp) Cuda::Cuda(LAMMPS* lmp) : Pointers(lmp)
{ {
cuda_exists=true; cuda_exists = true;
lmp->cuda=this; lmp->cuda = this;
if(universe->me==0)
if(universe->me == 0)
printf("# Using LAMMPS_CUDA \n"); printf("# Using LAMMPS_CUDA \n");
shared_data.me=universe->me;
device_set=false; shared_data.me = universe->me;
device_set = false;
Cuda_Cuda_GetCompileSettings(&shared_data); Cuda_Cuda_GetCompileSettings(&shared_data);
if(shared_data.compile_settings.prec_glob!=static_cast<int>(sizeof(CUDA_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: Global Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_glob, static_cast<int>(sizeof(CUDA_FLOAT))/4); if(shared_data.compile_settings.prec_glob != sizeof(CUDA_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: Global Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_glob, sizeof(CUDA_FLOAT) / 4);
if(shared_data.compile_settings.prec_x!=static_cast<int>(sizeof(X_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: X Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_x, static_cast<int>(sizeof(X_FLOAT))/4);
if(shared_data.compile_settings.prec_v!=static_cast<int>(sizeof(V_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: V Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_v, static_cast<int>(sizeof(V_FLOAT))/4);
if(shared_data.compile_settings.prec_f!=static_cast<int>(sizeof(F_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: F Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_f, static_cast<int>(sizeof(F_FLOAT))/4);
if(shared_data.compile_settings.prec_pppm!=static_cast<int>(sizeof(PPPM_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: PPPM Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_pppm, static_cast<int>(sizeof(PPPM_FLOAT))/4);
if(shared_data.compile_settings.prec_fft!=static_cast<int>(sizeof(FFT_FLOAT))/4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: FFT Precision: cuda %i cpp %i\n\n",shared_data.compile_settings.prec_fft, static_cast<int>(sizeof(FFT_FLOAT))/4);
#ifdef FFT_CUFFT
if(shared_data.compile_settings.cufft!=1) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: cufft: cuda %i cpp %i\n\n",shared_data.compile_settings.cufft, 1);
#else
if(shared_data.compile_settings.cufft!=0) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: cufft: cuda %i cpp %i\n\n",shared_data.compile_settings.cufft, 0);
#endif
if(shared_data.compile_settings.arch!=CUDA_ARCH) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: arch: cuda %i cpp %i\n\n",shared_data.compile_settings.cufft, CUDA_ARCH); if(shared_data.compile_settings.prec_x != sizeof(X_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: X Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_x, sizeof(X_FLOAT) / 4);
if(shared_data.compile_settings.prec_v != sizeof(V_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: V Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_v, sizeof(V_FLOAT) / 4);
if(shared_data.compile_settings.prec_f != sizeof(F_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: F Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_f, sizeof(F_FLOAT) / 4);
if(shared_data.compile_settings.prec_pppm != sizeof(PPPM_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: PPPM Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_pppm, sizeof(PPPM_FLOAT) / 4);
if(shared_data.compile_settings.prec_fft != sizeof(FFT_FLOAT) / 4) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: FFT Precision: cuda %i cpp %i\n\n", shared_data.compile_settings.prec_fft, sizeof(FFT_FLOAT) / 4);
#ifdef FFT_CUFFT
if(shared_data.compile_settings.cufft != 1) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: cufft: cuda %i cpp %i\n\n", shared_data.compile_settings.cufft, 1);
#else
if(shared_data.compile_settings.cufft != 0) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: cufft: cuda %i cpp %i\n\n", shared_data.compile_settings.cufft, 0);
#endif
if(shared_data.compile_settings.arch != CUDA_ARCH) printf("\n\n # CUDA WARNING: Compile Settings of cuda and cpp code differ! \n # CUDA WARNING: arch: cuda %i cpp %i\n\n", shared_data.compile_settings.cufft, CUDA_ARCH);
cu_x = 0; cu_x = 0;
cu_v = 0; cu_v = 0;
@ -111,14 +123,14 @@ Cuda::Cuda(LAMMPS *lmp) : Pointers(lmp)
cu_map_array = 0; cu_map_array = 0;
copy_buffer=0; copy_buffer = 0;
copy_buffersize=0; copy_buffersize = 0;
neighbor_decide_by_integrator=0; neighbor_decide_by_integrator = 0;
pinned=true; pinned = true;
debugdata=0; debugdata = 0;
new int[2*CUDA_MAX_DEBUG_SIZE]; new int[2 * CUDA_MAX_DEBUG_SIZE];
finished_setup = false; finished_setup = false;
begin_setup = false; begin_setup = false;
@ -126,16 +138,16 @@ Cuda::Cuda(LAMMPS *lmp) : Pointers(lmp)
setSharedDataZero(); setSharedDataZero();
uploadtime=0; uploadtime = 0;
downloadtime=0; downloadtime = 0;
dotiming=false; dotiming = false;
dotestatom = false; dotestatom = false;
testatom = 0; testatom = 0;
oncpu = true; oncpu = true;
self_comm = 0; self_comm = 0;
MYDBG( printf("# CUDA: Cuda::Cuda Done...\n");) MYDBG(printf("# CUDA: Cuda::Cuda Done...\n");)
//cCudaData<double, float, yx > //cCudaData<double, float, yx >
} }
@ -144,7 +156,7 @@ Cuda::~Cuda()
print_timings(); print_timings();
if(universe->me==0) printf("# CUDA: Free memory...\n"); if(universe->me == 0) printf("# CUDA: Free memory...\n");
delete cu_q; delete cu_q;
delete cu_x; delete cu_x;
@ -178,8 +190,8 @@ Cuda::~Cuda()
delete cu_map_array; delete cu_map_array;
std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin(); std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin();
while(p != neigh_lists.end())
{ while(p != neigh_lists.end()) {
delete p->second; delete p->second;
++p; ++p;
} }
@ -188,75 +200,81 @@ Cuda::~Cuda()
void Cuda::accelerator(int narg, char** arg) void Cuda::accelerator(int narg, char** arg)
{ {
if(device_set) return; if(device_set) return;
if(universe->me==0)
if(universe->me == 0)
printf("# CUDA: Activate GPU \n"); printf("# CUDA: Activate GPU \n");
int* devicelist=NULL; int* devicelist = NULL;
int pppn=2; int pppn = 2;
for(int i=0;i<narg;i++)
{ for(int i = 0; i < narg; i++) {
if(strcmp(arg[i],"gpu/node")==0) if(strcmp(arg[i], "gpu/node") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting a number after 'gpu/node' option.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting a number after 'gpu/node' option.");
pppn=atoi(arg[i]); pppn = atoi(arg[i]);
} }
if(strcmp(arg[i],"gpu/node/special")==0) if(strcmp(arg[i], "gpu/node/special") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting number of GPUs to be used per node after keyword 'gpu/node/special'.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting number of GPUs to be used per node after keyword 'gpu/node/special'.");
pppn=atoi(arg[i]); pppn = atoi(arg[i]);
if(pppn<1) error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting number of GPUs to be used per node after keyword 'gpu/node special'.");
if(i+pppn==narg) if(pppn < 1) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting number of GPUs to be used per node after keyword 'gpu/node special'.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting list of device ids after keyword 'gpu/node special'.");
devicelist=new int[pppn]; if(i + pppn == narg)
for(int k=0;k<pppn;k++) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting list of device ids after keyword 'gpu/node special'.");
{i++;devicelist[k]=atoi(arg[i]);}
devicelist = new int[pppn];
for(int k = 0; k < pppn; k++) {
i++;
devicelist[k] = atoi(arg[i]);
}
} }
if(strcmp(arg[i],"pinned")==0) if(strcmp(arg[i], "pinned") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting a number after 'pinned' option.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting a number after 'pinned' option.");
pinned=atoi(arg[i])==0?false:true; pinned = atoi(arg[i]) == 0 ? false : true;
if((pinned==false)&&(universe->me==0)) printf(" #CUDA: Pinned memory is not used for communication\n");
if((pinned == false) && (universe->me == 0)) printf(" #CUDA: Pinned memory is not used for communication\n");
} }
if(strcmp(arg[i],"timing")==0) if(strcmp(arg[i], "timing") == 0) {
{ dotiming = true;
dotiming=true;
} }
if(strcmp(arg[i],"suffix")==0) if(strcmp(arg[i], "suffix") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting a string after 'suffix' option.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting a string after 'suffix' option.");
strcpy(lmp->suffix,arg[i]); strcpy(lmp->suffix, arg[i]);
} }
if(strcmp(arg[i],"overlap_comm")==0) if(strcmp(arg[i], "overlap_comm") == 0) {
{ shared_data.overlap_comm = 1;
shared_data.overlap_comm=1;
} }
if(strcmp(arg[i],"test")==0) if(strcmp(arg[i], "test") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting a number after 'test' option.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting a number after 'test' option.");
testatom=atof(arg[i]); testatom = atof(arg[i]);
dotestatom=true; dotestatom = true;
} }
if(strcmp(arg[i],"override/bpa")==0) if(strcmp(arg[i], "override/bpa") == 0) {
{ if(++i == narg)
if(++i==narg) error->all(FLERR, "Invalid Options for 'accelerator' command. Expecting a number after 'override/bpa' option.");
error->all(FLERR,"Invalid Options for 'accelerator' command. Expecting a number after 'override/bpa' option.");
shared_data.pair.override_block_per_atom = atoi(arg[i]); shared_data.pair.override_block_per_atom = atoi(arg[i]);
} }
} }
CudaWrapper_Init(0, (char**)0,universe->me,pppn,devicelist); CudaWrapper_Init(0, (char**)0, universe->me, pppn, devicelist);
//if(shared_data.overlap_comm) //if(shared_data.overlap_comm)
CudaWrapper_AddStreams(3); CudaWrapper_AddStreams(3);
cu_x = 0; cu_x = 0;
@ -289,7 +307,7 @@ void Cuda::accelerator(int narg, char** arg)
cu_binned_id = 0; cu_binned_id = 0;
cu_binned_idnew = 0; cu_binned_idnew = 0;
device_set=true; device_set = true;
allocate(); allocate();
delete devicelist; delete devicelist;
} }
@ -328,31 +346,32 @@ void Cuda::setSharedDataZero()
shared_data.buffer_new = 1; shared_data.buffer_new = 1;
shared_data.buffer = NULL; shared_data.buffer = NULL;
shared_data.comm.comm_phase=0; shared_data.comm.comm_phase = 0;
shared_data.overlap_comm=0; shared_data.overlap_comm = 0;
shared_data.comm.buffer = NULL; shared_data.comm.buffer = NULL;
shared_data.comm.buffer_size=0; shared_data.comm.buffer_size = 0;
shared_data.comm.overlap_split_ratio=0; shared_data.comm.overlap_split_ratio = 0;
// setTimingsZero(); // setTimingsZero();
} }
void Cuda::allocate() void Cuda::allocate()
{ {
accelerator(0,NULL); accelerator(0, NULL);
MYDBG(printf("# CUDA: Cuda::allocate ...\n");) MYDBG(printf("# CUDA: Cuda::allocate ...\n");)
if(not cu_virial)
{ if(not cu_virial) {
cu_virial = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.virial , 6); cu_virial = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.virial , 6);
cu_eng_vdwl = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.eng_vdwl ,1); cu_eng_vdwl = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.eng_vdwl , 1);
cu_eng_coul = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.eng_coul ,1); cu_eng_coul = new cCudaData<double, ENERGY_FLOAT, x > (NULL, & shared_data.pair.eng_coul , 1);
cu_extent = new cCudaData<double, double, x> (extent, 6); cu_extent = new cCudaData<double, double, x> (extent, 6);
shared_data.flag = CudaWrapper_AllocCudaData(sizeof(int)); shared_data.flag = CudaWrapper_AllocCudaData(sizeof(int));
int size=2*CUDA_MAX_DEBUG_SIZE; int size = 2 * CUDA_MAX_DEBUG_SIZE;
debugdata = new int[size]; debugdata = new int[size];
cu_debugdata = new cCudaData<int, int, x > (debugdata , size); cu_debugdata = new cCudaData<int, int, x > (debugdata , size);
shared_data.debugdata=cu_debugdata->dev_data(); shared_data.debugdata = cu_debugdata->dev_data();
} }
checkResize(); checkResize();
setSystemParams(); setSystemParams();
MYDBG(printf("# CUDA: Cuda::allocate done...\n");) MYDBG(printf("# CUDA: Cuda::allocate done...\n");)
@ -376,8 +395,8 @@ void Cuda::setDomainParams()
cuda_shared_domain* cu_domain = &shared_data.domain; cuda_shared_domain* cu_domain = &shared_data.domain;
cu_domain->triclinic = domain->triclinic; cu_domain->triclinic = domain->triclinic;
for(short i=0; i<3; ++i)
{ for(short i = 0; i < 3; ++i) {
cu_domain->periodicity[i] = domain->periodicity[i]; cu_domain->periodicity[i] = domain->periodicity[i];
cu_domain->sublo[i] = domain->sublo[i]; cu_domain->sublo[i] = domain->sublo[i];
cu_domain->subhi[i] = domain->subhi[i]; cu_domain->subhi[i] = domain->subhi[i];
@ -385,34 +404,33 @@ void Cuda::setDomainParams()
cu_domain->boxhi[i] = domain->boxhi[i]; cu_domain->boxhi[i] = domain->boxhi[i];
cu_domain->prd[i] = domain->prd[i]; cu_domain->prd[i] = domain->prd[i];
} }
if(domain->triclinic)
{ if(domain->triclinic) {
for(short i=0; i<3; ++i) for(short i = 0; i < 3; ++i) {
{
cu_domain->boxlo_lamda[i] = domain->boxlo_lamda[i]; cu_domain->boxlo_lamda[i] = domain->boxlo_lamda[i];
cu_domain->boxhi_lamda[i] = domain->boxhi_lamda[i]; cu_domain->boxhi_lamda[i] = domain->boxhi_lamda[i];
cu_domain->prd_lamda[i] = domain->prd_lamda[i]; cu_domain->prd_lamda[i] = domain->prd_lamda[i];
} }
cu_domain->xy = domain->xy; cu_domain->xy = domain->xy;
cu_domain->xz = domain->xz; cu_domain->xz = domain->xz;
cu_domain->yz = domain->yz; cu_domain->yz = domain->yz;
} }
for(int i=0;i<6;i++) for(int i = 0; i < 6; i++) {
{ cu_domain->h[i] = domain->h[i];
cu_domain->h[i]=domain->h[i]; cu_domain->h_inv[i] = domain->h_inv[i];
cu_domain->h_inv[i]=domain->h_inv[i]; cu_domain->h_rate[i] = domain->h_rate[i];
cu_domain->h_rate[i]=domain->h_rate[i];
} }
cu_domain->update=2; cu_domain->update = 2;
MYDBG(printf("# CUDA: Cuda::setDomainParams done ...\n");) MYDBG(printf("# CUDA: Cuda::setDomainParams done ...\n");)
} }
void Cuda::checkResize() void Cuda::checkResize()
{ {
MYDBG(printf("# CUDA: Cuda::checkResize ...\n");) MYDBG(printf("# CUDA: Cuda::checkResize ...\n");)
accelerator(0,NULL); accelerator(0, NULL);
cuda_shared_atom* cu_atom = & shared_data.atom; cuda_shared_atom* cu_atom = & shared_data.atom;
cuda_shared_pair* cu_pair = & shared_data.pair; cuda_shared_pair* cu_pair = & shared_data.pair;
cu_atom->q_flag = atom->q_flag; cu_atom->q_flag = atom->q_flag;
@ -422,116 +440,151 @@ void Cuda::checkResize()
cu_atom->nghost = atom->nghost; cu_atom->nghost = atom->nghost;
// do we have more atoms to upload than currently allocated memory on device? (also true if nothing yet allocated) // do we have more atoms to upload than currently allocated memory on device? (also true if nothing yet allocated)
if(atom->nmax > cu_atom->nmax || cu_tag == NULL) if(atom->nmax > cu_atom->nmax || cu_tag == NULL) {
{ delete cu_x;
delete cu_x; cu_x = new cCudaData<double, X_FLOAT, yx> ((double*)atom->x , & cu_atom->x , atom->nmax, 3,0,true); //cu_x->set_buffer(&(shared_data.buffer),&(shared_data.buffersize),true); cu_x = new cCudaData<double, X_FLOAT, yx> ((double*)atom->x , & cu_atom->x , atom->nmax, 3, 0, true); //cu_x->set_buffer(&(shared_data.buffer),&(shared_data.buffersize),true);
delete cu_v; cu_v = new cCudaData<double, V_FLOAT, yx> ((double*)atom->v, & cu_atom->v , atom->nmax, 3); delete cu_v;
delete cu_f; cu_f = new cCudaData<double, F_FLOAT, yx> ((double*)atom->f, & cu_atom->f , atom->nmax, 3,0,true); cu_v = new cCudaData<double, V_FLOAT, yx> ((double*)atom->v, & cu_atom->v , atom->nmax, 3);
delete cu_tag; cu_tag = new cCudaData<int , int , x > (atom->tag , & cu_atom->tag , atom->nmax ); delete cu_f;
delete cu_type; cu_type = new cCudaData<int , int , x > (atom->type , & cu_atom->type , atom->nmax ); cu_f = new cCudaData<double, F_FLOAT, yx> ((double*)atom->f, & cu_atom->f , atom->nmax, 3, 0, true);
delete cu_mask; cu_mask = new cCudaData<int , int , x > (atom->mask , & cu_atom->mask , atom->nmax ); delete cu_tag;
delete cu_image; cu_image = new cCudaData<int , int , x > (atom->image , & cu_atom->image , atom->nmax ); cu_tag = new cCudaData<int , int , x > (atom->tag , & cu_atom->tag , atom->nmax, 0, true);
delete cu_type;
cu_type = new cCudaData<int , int , x > (atom->type , & cu_atom->type , atom->nmax, 0, true);
delete cu_mask;
cu_mask = new cCudaData<int , int , x > (atom->mask , & cu_atom->mask , atom->nmax, 0, true);
delete cu_image;
cu_image = new cCudaData<int , int , x > (atom->image , & cu_atom->image , atom->nmax, 0, true);
if(atom->rmass) if(atom->rmass) {
{delete cu_rmass; cu_rmass = new cCudaData<double, V_FLOAT, x > (atom->rmass , & cu_atom->rmass , atom->nmax );} delete cu_rmass;
cu_rmass = new cCudaData<double, V_FLOAT, x > (atom->rmass , & cu_atom->rmass , atom->nmax);
if(cu_atom->q_flag)
{delete cu_q; cu_q = new cCudaData<double, F_FLOAT, x > ((double*)atom->q, & cu_atom->q , atom->nmax );}// cu_q->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
if(atom->radius)
{
delete cu_radius; cu_radius = new cCudaData<double, X_FLOAT, x > (atom->radius , & cu_atom->radius , atom->nmax );
delete cu_v_radius; cu_v_radius = new cCudaData<V_FLOAT, V_FLOAT, x> (v_radius , & cu_atom->v_radius , atom->nmax*4);
delete cu_omega_rmass; cu_omega_rmass = new cCudaData<V_FLOAT, V_FLOAT, x> (omega_rmass , & cu_atom->omega_rmass , atom->nmax*4);
} }
if(atom->omega) if(cu_atom->q_flag) {
{delete cu_omega; cu_omega = new cCudaData<double, V_FLOAT, yx > (((double*) atom->omega) , & cu_atom->omega , atom->nmax,3 );} delete cu_q;
cu_q = new cCudaData<double, F_FLOAT, x > ((double*)atom->q, & cu_atom->q , atom->nmax, 0 , true);
}// cu_q->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
if(atom->torque) if(atom->radius) {
{delete cu_torque; cu_torque = new cCudaData<double, F_FLOAT, yx > (((double*) atom->torque) , & cu_atom->torque , atom->nmax,3 );} delete cu_radius;
cu_radius = new cCudaData<double, X_FLOAT, x > (atom->radius , & cu_atom->radius , atom->nmax);
delete cu_v_radius;
cu_v_radius = new cCudaData<V_FLOAT, V_FLOAT, x> (v_radius , & cu_atom->v_radius , atom->nmax * 4);
delete cu_omega_rmass;
cu_omega_rmass = new cCudaData<V_FLOAT, V_FLOAT, x> (omega_rmass , & cu_atom->omega_rmass , atom->nmax * 4);
}
if(atom->omega) {
delete cu_omega;
cu_omega = new cCudaData<double, V_FLOAT, yx > (((double*) atom->omega) , & cu_atom->omega , atom->nmax, 3);
}
if(atom->torque) {
delete cu_torque;
cu_torque = new cCudaData<double, F_FLOAT, yx > (((double*) atom->torque) , & cu_atom->torque , atom->nmax, 3);
}
if(atom->special) {
delete cu_special;
cu_special = new cCudaData<int, int, yx > (((int*) & (atom->special[0][0])) , & cu_atom->special , atom->nmax, atom->maxspecial, 0 , true);
shared_data.atom.maxspecial = atom->maxspecial;
}
if(atom->nspecial) {
delete cu_nspecial;
cu_nspecial = new cCudaData<int, int, yx > (((int*) atom->nspecial) , & cu_atom->nspecial , atom->nmax, 3, 0, true);
}
if(atom->molecule) {
delete cu_molecule;
cu_molecule = new cCudaData<int, int, x > (((int*) atom->molecule) , & cu_atom->molecule , atom->nmax, 0 , true);
}
if(atom->special)
{delete cu_special; cu_special = new cCudaData<int, int, yx > (((int*) &(atom->special[0][0])) , & cu_atom->special , atom->nmax,atom->maxspecial ); shared_data.atom.maxspecial=atom->maxspecial;}
if(atom->nspecial)
{delete cu_nspecial; cu_nspecial = new cCudaData<int, int, yx > (((int*) atom->nspecial) , & cu_atom->nspecial , atom->nmax,3 );}
if(atom->molecule)
{delete cu_molecule; cu_molecule = new cCudaData<int, int, x > (((int*) atom->molecule) , & cu_atom->molecule , atom->nmax );}
shared_data.atom.special_flag = neighbor->special_flag; shared_data.atom.special_flag = neighbor->special_flag;
shared_data.atom.molecular = atom->molecular; shared_data.atom.molecular = atom->molecular;
cu_atom->update_nmax = 2; cu_atom->update_nmax = 2;
cu_atom->nmax = atom->nmax; cu_atom->nmax = atom->nmax;
delete cu_x_type; cu_x_type = new cCudaData<X_FLOAT, X_FLOAT, x> (x_type , & cu_atom->x_type , atom->nmax*4); delete cu_x_type;
cu_x_type = new cCudaData<X_FLOAT, X_FLOAT, x> (x_type , & cu_atom->x_type , atom->nmax * 4);
} }
if(((cu_xhold==NULL)||(cu_xhold->get_dim()[0]<neighbor->maxhold))&&neighbor->xhold) if(((cu_xhold == NULL) || (cu_xhold->get_dim()[0] < neighbor->maxhold)) && neighbor->xhold) {
{ delete cu_xhold;
delete cu_xhold; cu_xhold = new cCudaData<double, X_FLOAT, yx> ((double*)neighbor->xhold, & cu_atom->xhold , neighbor->maxhold, 3); cu_xhold = new cCudaData<double, X_FLOAT, yx> ((double*)neighbor->xhold, & cu_atom->xhold , neighbor->maxhold, 3);
shared_data.atom.maxhold=neighbor->maxhold; shared_data.atom.maxhold = neighbor->maxhold;
}
if(atom->mass && !cu_mass) {
cu_mass = new cCudaData<double, V_FLOAT, x > (atom->mass , & cu_atom->mass , atom->ntypes + 1);
} }
if(atom->mass && !cu_mass)
{cu_mass = new cCudaData<double, V_FLOAT, x > (atom->mass , & cu_atom->mass , atom->ntypes+1);}
cu_atom->mass_host = atom->mass; cu_atom->mass_host = atom->mass;
if(atom->map_style==1) if(atom->map_style == 1) {
{ if((cu_map_array == NULL)) {
if((cu_map_array==NULL)) cu_map_array = new cCudaData<int, int, x > (atom->get_map_array() , & cu_atom->map_array , atom->get_map_size());
{ } else if(cu_map_array->dev_size() / sizeof(int) < atom->get_map_size()) {
cu_map_array = new cCudaData<int, int, x > (atom->get_map_array() , & cu_atom->map_array , atom->get_map_size() );
}
else
if(cu_map_array->dev_size()/sizeof(int)<atom->get_map_size())
{
delete cu_map_array; delete cu_map_array;
cu_map_array = new cCudaData<int, int, x > (atom->get_map_array() , & cu_atom->map_array , atom->get_map_size() ); cu_map_array = new cCudaData<int, int, x > (atom->get_map_array() , & cu_atom->map_array , atom->get_map_size());
} }
} }
// if any of the host pointers have changed (e.g. re-allocated somewhere else), set to correct pointer // if any of the host pointers have changed (e.g. re-allocated somewhere else), set to correct pointer
if(cu_x ->get_host_data() != atom->x) cu_x ->set_host_data((double*) (atom->x)); if(cu_x ->get_host_data() != atom->x) cu_x ->set_host_data((double*)(atom->x));
if(cu_v ->get_host_data() != atom->v) cu_v ->set_host_data((double*) (atom->v));
if(cu_f ->get_host_data() != atom->f) cu_f ->set_host_data((double*) (atom->f)); if(cu_v ->get_host_data() != atom->v) cu_v ->set_host_data((double*)(atom->v));
if(cu_f ->get_host_data() != atom->f) cu_f ->set_host_data((double*)(atom->f));
if(cu_tag ->get_host_data() != atom->tag) cu_tag ->set_host_data(atom->tag); if(cu_tag ->get_host_data() != atom->tag) cu_tag ->set_host_data(atom->tag);
if(cu_type->get_host_data() != atom->type) cu_type->set_host_data(atom->type); if(cu_type->get_host_data() != atom->type) cu_type->set_host_data(atom->type);
if(cu_mask->get_host_data() != atom->mask) cu_mask->set_host_data(atom->mask); if(cu_mask->get_host_data() != atom->mask) cu_mask->set_host_data(atom->mask);
if(cu_image->get_host_data() != atom->image) cu_mask->set_host_data(atom->image); if(cu_image->get_host_data() != atom->image) cu_mask->set_host_data(atom->image);
if(cu_xhold) if(cu_xhold)
if(cu_xhold->get_host_data()!= neighbor->xhold) cu_xhold->set_host_data((double*)(neighbor->xhold)); if(cu_xhold->get_host_data() != neighbor->xhold) cu_xhold->set_host_data((double*)(neighbor->xhold));
if(atom->rmass) if(atom->rmass)
if(cu_rmass->get_host_data() != atom->rmass) cu_rmass->set_host_data((double*) (atom->rmass)); if(cu_rmass->get_host_data() != atom->rmass) cu_rmass->set_host_data((double*)(atom->rmass));
if(cu_atom->q_flag) if(cu_atom->q_flag)
if(cu_q->get_host_data() != atom->q) cu_q->set_host_data((double*) (atom->q)); if(cu_q->get_host_data() != atom->q) cu_q->set_host_data((double*)(atom->q));
if(atom->radius) if(atom->radius)
if(cu_radius->get_host_data() != atom->radius) cu_radius->set_host_data((double*) (atom->radius)); if(cu_radius->get_host_data() != atom->radius) cu_radius->set_host_data((double*)(atom->radius));
if(atom->omega) if(atom->omega)
if(cu_omega->get_host_data() != atom->omega) cu_omega->set_host_data((double*) (atom->omega)); if(cu_omega->get_host_data() != atom->omega) cu_omega->set_host_data((double*)(atom->omega));
if(atom->torque) if(atom->torque)
if(cu_torque->get_host_data() != atom->torque) cu_torque->set_host_data((double*) (atom->torque)); if(cu_torque->get_host_data() != atom->torque) cu_torque->set_host_data((double*)(atom->torque));
if(atom->special) if(atom->special)
if(cu_special->get_host_data() != atom->special) if(cu_special->get_host_data() != atom->special) {
{delete cu_special; cu_special = new cCudaData<int, int, yx > (((int*) atom->special) , & cu_atom->special , atom->nmax,atom->maxspecial ); shared_data.atom.maxspecial=atom->maxspecial;} delete cu_special;
cu_special = new cCudaData<int, int, yx > (((int*) atom->special) , & cu_atom->special , atom->nmax, atom->maxspecial);
shared_data.atom.maxspecial = atom->maxspecial;
}
if(atom->nspecial) if(atom->nspecial)
if(cu_nspecial->get_host_data() != atom->nspecial) cu_nspecial->set_host_data((int*) (atom->nspecial)); if(cu_nspecial->get_host_data() != atom->nspecial) cu_nspecial->set_host_data((int*)(atom->nspecial));
if(atom->molecule) if(atom->molecule)
if(cu_molecule->get_host_data() != atom->molecule) cu_molecule->set_host_data((int*) (atom->molecule)); if(cu_molecule->get_host_data() != atom->molecule) cu_molecule->set_host_data((int*)(atom->molecule));
if(force) if(force)
if(cu_virial ->get_host_data() != force->pair->virial) cu_virial ->set_host_data(force->pair->virial); if(cu_virial ->get_host_data() != force->pair->virial) cu_virial ->set_host_data(force->pair->virial);
if(force) if(force)
if(cu_eng_vdwl ->get_host_data() != &force->pair->eng_vdwl) cu_eng_vdwl ->set_host_data(&force->pair->eng_vdwl); if(cu_eng_vdwl ->get_host_data() != &force->pair->eng_vdwl) cu_eng_vdwl ->set_host_data(&force->pair->eng_vdwl);
if(force) if(force)
if(cu_eng_coul ->get_host_data() != &force->pair->eng_coul) cu_eng_coul ->set_host_data(&force->pair->eng_coul); if(cu_eng_coul ->get_host_data() != &force->pair->eng_coul) cu_eng_coul ->set_host_data(&force->pair->eng_coul);
@ -539,32 +592,32 @@ void Cuda::checkResize()
MYDBG(printf("# CUDA: Cuda::checkResize done...\n");) MYDBG(printf("# CUDA: Cuda::checkResize done...\n");)
} }
void Cuda::evsetup_eatom_vatom(int eflag_atom,int vflag_atom) void Cuda::evsetup_eatom_vatom(int eflag_atom, int vflag_atom)
{ {
if(eflag_atom) if(eflag_atom) {
{
if(not cu_eatom) if(not cu_eatom)
cu_eatom = new cCudaData<double, ENERGY_FLOAT, x > (force->pair->eatom, & (shared_data.atom.eatom) , atom->nmax );// cu_eatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);} cu_eatom = new cCudaData<double, ENERGY_FLOAT, x > (force->pair->eatom, & (shared_data.atom.eatom) , atom->nmax); // cu_eatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
if(cu_eatom->get_dim()[0]!=atom->nmax)
{ if(cu_eatom->get_dim()[0] != atom->nmax) {
//delete cu_eatom; //delete cu_eatom;
//cu_eatom = new cCudaData<double, ENERGY_FLOAT, x > (force->pair->eatom, & (shared_data.atom.eatom) , atom->nmax );// cu_eatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);} //cu_eatom = new cCudaData<double, ENERGY_FLOAT, x > (force->pair->eatom, & (shared_data.atom.eatom) , atom->nmax );// cu_eatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
shared_data.atom.update_nmax=2; shared_data.atom.update_nmax = 2;
} }
cu_eatom->set_host_data(force->pair->eatom); cu_eatom->set_host_data(force->pair->eatom);
cu_eatom->memset_device(0); cu_eatom->memset_device(0);
} }
if(vflag_atom)
{ if(vflag_atom) {
if(not cu_vatom) if(not cu_vatom)
cu_vatom = new cCudaData<double, ENERGY_FLOAT, yx > ((double*)force->pair->vatom, & (shared_data.atom.vatom) , atom->nmax ,6 );// cu_vatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);} cu_vatom = new cCudaData<double, ENERGY_FLOAT, yx > ((double*)force->pair->vatom, & (shared_data.atom.vatom) , atom->nmax , 6);// cu_vatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
if(cu_vatom->get_dim()[0]!=atom->nmax)
{ if(cu_vatom->get_dim()[0] != atom->nmax) {
//delete cu_vatom; //delete cu_vatom;
//cu_vatom = new cCudaData<double, ENERGY_FLOAT, yx > ((double*)force->pair->vatom, & (shared_data.atom.vatom) , atom->nmax ,6 );// cu_vatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);} //cu_vatom = new cCudaData<double, ENERGY_FLOAT, yx > ((double*)force->pair->vatom, & (shared_data.atom.vatom) , atom->nmax ,6 );// cu_vatom->set_buffer(&(copy_buffer),&(copy_buffersize),true);}
shared_data.atom.update_nmax=2; shared_data.atom.update_nmax = 2;
} }
cu_vatom->set_host_data((double*)force->pair->vatom); cu_vatom->set_host_data((double*)force->pair->vatom);
cu_vatom->memset_device(0); cu_vatom->memset_device(0);
} }
@ -576,8 +629,9 @@ void Cuda::uploadAll()
timespec starttime; timespec starttime;
timespec endtime; timespec endtime;
if(atom->nmax!=shared_data.atom.nmax) checkResize(); if(atom->nmax != shared_data.atom.nmax) checkResize();
clock_gettime(CLOCK_REALTIME,&starttime);
clock_gettime(CLOCK_REALTIME, &starttime);
cu_x ->upload(); cu_x ->upload();
cu_v ->upload(); cu_v ->upload();
cu_f ->upload(); cu_f ->upload();
@ -585,25 +639,33 @@ void Cuda::uploadAll()
cu_type->upload(); cu_type->upload();
cu_mask->upload(); cu_mask->upload();
cu_image->upload(); cu_image->upload();
if(shared_data.atom.q_flag) cu_q ->upload(); if(shared_data.atom.q_flag) cu_q ->upload();
if(atom->rmass) cu_rmass->upload(); if(atom->rmass) cu_rmass->upload();
if(atom->radius) cu_radius->upload(); if(atom->radius) cu_radius->upload();
if(atom->omega) cu_omega->upload(); if(atom->omega) cu_omega->upload();
if(atom->torque) cu_torque->upload(); if(atom->torque) cu_torque->upload();
if(atom->special) cu_special->upload(); if(atom->special) cu_special->upload();
if(atom->nspecial) cu_nspecial->upload(); if(atom->nspecial) cu_nspecial->upload();
if(atom->molecule) cu_molecule->upload(); if(atom->molecule) cu_molecule->upload();
if(cu_eatom) cu_eatom->upload(); if(cu_eatom) cu_eatom->upload();
if(cu_vatom) cu_vatom->upload(); if(cu_vatom) cu_vatom->upload();
clock_gettime(CLOCK_REALTIME,&endtime); clock_gettime(CLOCK_REALTIME, &endtime);
uploadtime+=(endtime.tv_sec-starttime.tv_sec+1.0*(endtime.tv_nsec-starttime.tv_nsec)/1000000000); uploadtime += (endtime.tv_sec - starttime.tv_sec + 1.0 * (endtime.tv_nsec - starttime.tv_nsec) / 1000000000);
CUDA_IF_BINNING(Cuda_PreBinning(& shared_data);) CUDA_IF_BINNING(Cuda_PreBinning(& shared_data);)
CUDA_IF_BINNING(Cuda_Binning (& shared_data);) CUDA_IF_BINNING(Cuda_Binning(& shared_data);)
shared_data.atom.triggerneighsq=neighbor->triggersq; shared_data.atom.triggerneighsq = neighbor->triggersq;
MYDBG(printf("# CUDA: Cuda::uploadAll() ... end\n");) MYDBG(printf("# CUDA: Cuda::uploadAll() ... end\n");)
} }
@ -613,10 +675,10 @@ void Cuda::downloadAll()
timespec starttime; timespec starttime;
timespec endtime; timespec endtime;
if(atom->nmax!=shared_data.atom.nmax) checkResize(); if(atom->nmax != shared_data.atom.nmax) checkResize();
CUDA_IF_BINNING( Cuda_ReverseBinning(& shared_data); ) CUDA_IF_BINNING(Cuda_ReverseBinning(& shared_data);)
clock_gettime(CLOCK_REALTIME,&starttime); clock_gettime(CLOCK_REALTIME, &starttime);
cu_x ->download(); cu_x ->download();
cu_v ->download(); cu_v ->download();
cu_f ->download(); cu_f ->download();
@ -629,19 +691,27 @@ void Cuda::downloadAll()
//if(shared_data.atom.need_vatom) cu_vatom->download(); //if(shared_data.atom.need_vatom) cu_vatom->download();
if(shared_data.atom.q_flag) cu_q ->download(); if(shared_data.atom.q_flag) cu_q ->download();
if(atom->rmass) cu_rmass->download(); if(atom->rmass) cu_rmass->download();
if(atom->radius) cu_radius->download(); if(atom->radius) cu_radius->download();
if(atom->omega) cu_omega->download(); if(atom->omega) cu_omega->download();
if(atom->torque) cu_torque->download(); if(atom->torque) cu_torque->download();
if(atom->special) cu_special->download(); if(atom->special) cu_special->download();
if(atom->nspecial) cu_nspecial->download(); if(atom->nspecial) cu_nspecial->download();
if(atom->molecule) cu_molecule->download(); if(atom->molecule) cu_molecule->download();
if(cu_eatom) cu_eatom->download(); if(cu_eatom) cu_eatom->download();
if(cu_vatom) cu_vatom->download(); if(cu_vatom) cu_vatom->download();
clock_gettime(CLOCK_REALTIME,&endtime); clock_gettime(CLOCK_REALTIME, &endtime);
downloadtime+=(endtime.tv_sec-starttime.tv_sec+1.0*(endtime.tv_nsec-starttime.tv_nsec)/1000000000); downloadtime += (endtime.tv_sec - starttime.tv_sec + 1.0 * (endtime.tv_nsec - starttime.tv_nsec) / 1000000000);
MYDBG(printf("# CUDA: Cuda::downloadAll() ... end\n");) MYDBG(printf("# CUDA: Cuda::downloadAll() ... end\n");)
} }
@ -657,12 +727,12 @@ CudaNeighList* Cuda::registerNeighborList(class NeighList* neigh_list)
std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.find(neigh_list); std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.find(neigh_list);
if(p != neigh_lists.end()) return p->second; if(p != neigh_lists.end()) return p->second;
else else {
{
CudaNeighList* neigh_list_cuda = new CudaNeighList(lmp, neigh_list); CudaNeighList* neigh_list_cuda = new CudaNeighList(lmp, neigh_list);
neigh_lists.insert(std::pair<NeighList*, CudaNeighList*>(neigh_list, neigh_list_cuda)); neigh_lists.insert(std::pair<NeighList*, CudaNeighList*>(neigh_list, neigh_list_cuda));
return neigh_list_cuda; return neigh_list_cuda;
} }
MYDBG(printf("# CUDA: Cuda::registerNeighborList() ... end b\n");) MYDBG(printf("# CUDA: Cuda::registerNeighborList() ... end b\n");)
} }
@ -670,14 +740,17 @@ void Cuda::uploadAllNeighborLists()
{ {
MYDBG(printf("# CUDA: Cuda::uploadAllNeighborList() ... start\n");) MYDBG(printf("# CUDA: Cuda::uploadAllNeighborList() ... start\n");)
std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin(); std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin();
while(p != neigh_lists.end())
{ while(p != neigh_lists.end()) {
p->second->nl_upload(); p->second->nl_upload();
if(not (p->second->neigh_list->cuda_list->build_cuda))
for(int i=0;i<atom->nlocal;i++) if(not(p->second->neigh_list->cuda_list->build_cuda))
p->second->sneighlist.maxneighbors=MAX(p->second->neigh_list->numneigh[i],p->second->sneighlist.maxneighbors) ; for(int i = 0; i < atom->nlocal; i++)
p->second->sneighlist.maxneighbors = MAX(p->second->neigh_list->numneigh[i], p->second->sneighlist.maxneighbors) ;
++p; ++p;
} }
MYDBG(printf("# CUDA: Cuda::uploadAllNeighborList() ... done\n");) MYDBG(printf("# CUDA: Cuda::uploadAllNeighborList() ... done\n");)
} }
@ -685,28 +758,29 @@ void Cuda::downloadAllNeighborLists()
{ {
MYDBG(printf("# CUDA: Cuda::downloadAllNeighborList() ... start\n");) MYDBG(printf("# CUDA: Cuda::downloadAllNeighborList() ... start\n");)
std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin(); std::map<NeighList*, CudaNeighList*>::iterator p = neigh_lists.begin();
while(p != neigh_lists.end())
{ while(p != neigh_lists.end()) {
p->second->nl_download(); p->second->nl_download();
++p; ++p;
} }
} }
void Cuda::update_xhold(int &maxhold,double* xhold) void Cuda::update_xhold(int &maxhold, double* xhold)
{ {
if(this->shared_data.atom.maxhold<atom->nmax) if(this->shared_data.atom.maxhold < atom->nmax) {
{
maxhold = atom->nmax; maxhold = atom->nmax;
delete this->cu_xhold; this->cu_xhold = new cCudaData<double, X_FLOAT, yx> ((double*)xhold, & this->shared_data.atom.xhold , maxhold, 3); delete this->cu_xhold;
this->cu_xhold = new cCudaData<double, X_FLOAT, yx> ((double*)xhold, & this->shared_data.atom.xhold , maxhold, 3);
} }
this->shared_data.atom.maxhold=maxhold;
CudaWrapper_CopyData(this->cu_xhold->dev_data(),this->cu_x->dev_data(),3*atom->nmax*sizeof(X_FLOAT)); this->shared_data.atom.maxhold = maxhold;
CudaWrapper_CopyData(this->cu_xhold->dev_data(), this->cu_x->dev_data(), 3 * atom->nmax * sizeof(X_FLOAT));
} }
void Cuda::setTimingsZero() void Cuda::setTimingsZero()
{ {
shared_data.cuda_timings.test1=0; shared_data.cuda_timings.test1 = 0;
shared_data.cuda_timings.test2=0; shared_data.cuda_timings.test2 = 0;
//communication //communication
shared_data.cuda_timings.comm_forward_total = 0; shared_data.cuda_timings.comm_forward_total = 0;
@ -722,7 +796,7 @@ void Cuda::setTimingsZero()
shared_data.cuda_timings.comm_exchange_kernel_pack = 0; shared_data.cuda_timings.comm_exchange_kernel_pack = 0;
shared_data.cuda_timings.comm_exchange_kernel_unpack = 0; shared_data.cuda_timings.comm_exchange_kernel_unpack = 0;
shared_data.cuda_timings.comm_exchange_kernel_fill = 0; shared_data.cuda_timings.comm_exchange_kernel_fill = 0;
shared_data.cuda_timings.comm_exchange_cpu_pack= 0; shared_data.cuda_timings.comm_exchange_cpu_pack = 0;
shared_data.cuda_timings.comm_exchange_upload = 0; shared_data.cuda_timings.comm_exchange_upload = 0;
shared_data.cuda_timings.comm_exchange_download = 0; shared_data.cuda_timings.comm_exchange_download = 0;
@ -763,76 +837,77 @@ void Cuda::setTimingsZero()
void Cuda::print_timings() void Cuda::print_timings()
{ {
if(universe->me!=0) return; if(universe->me != 0) return;
if(not dotiming) return; if(not dotiming) return;
printf("\n # CUDA: Special timings\n\n"); printf("\n # CUDA: Special timings\n\n");
printf("\n Transfer Times\n"); printf("\n Transfer Times\n");
printf(" PCIe Upload: \t %lf s\n",CudaWrapper_CheckUploadTime()); printf(" PCIe Upload: \t %lf s\n", CudaWrapper_CheckUploadTime());
printf(" PCIe Download:\t %lf s\n",CudaWrapper_CheckDownloadTime()); printf(" PCIe Download:\t %lf s\n", CudaWrapper_CheckDownloadTime());
printf(" CPU Tempbbuf Upload: \t %lf \n",CudaWrapper_CheckCPUBufUploadTime()); printf(" CPU Tempbbuf Upload: \t %lf \n", CudaWrapper_CheckCPUBufUploadTime());
printf(" CPU Tempbbuf Download: \t %lf \n",CudaWrapper_CheckCPUBufDownloadTime()); printf(" CPU Tempbbuf Download: \t %lf \n", CudaWrapper_CheckCPUBufDownloadTime());
printf("\n Communication \n"); printf("\n Communication \n");
printf(" Forward Total \t %lf \n",shared_data.cuda_timings.comm_forward_total); printf(" Forward Total \t %lf \n", shared_data.cuda_timings.comm_forward_total);
printf(" Forward MPI Upper Bound \t %lf \n",shared_data.cuda_timings.comm_forward_mpi_upper); printf(" Forward MPI Upper Bound \t %lf \n", shared_data.cuda_timings.comm_forward_mpi_upper);
printf(" Forward MPI Lower Bound \t %lf \n",shared_data.cuda_timings.comm_forward_mpi_lower); printf(" Forward MPI Lower Bound \t %lf \n", shared_data.cuda_timings.comm_forward_mpi_lower);
printf(" Forward Kernel Pack \t %lf \n",shared_data.cuda_timings.comm_forward_kernel_pack); printf(" Forward Kernel Pack \t %lf \n", shared_data.cuda_timings.comm_forward_kernel_pack);
printf(" Forward Kernel Unpack \t %lf \n",shared_data.cuda_timings.comm_forward_kernel_unpack); printf(" Forward Kernel Unpack \t %lf \n", shared_data.cuda_timings.comm_forward_kernel_unpack);
printf(" Forward Kernel Self \t %lf \n",shared_data.cuda_timings.comm_forward_kernel_self); printf(" Forward Kernel Self \t %lf \n", shared_data.cuda_timings.comm_forward_kernel_self);
printf(" Forward Upload \t %lf \n",shared_data.cuda_timings.comm_forward_upload); printf(" Forward Upload \t %lf \n", shared_data.cuda_timings.comm_forward_upload);
printf(" Forward Download \t %lf \n",shared_data.cuda_timings.comm_forward_download); printf(" Forward Download \t %lf \n", shared_data.cuda_timings.comm_forward_download);
printf(" Forward Overlap Split Ratio\t %lf \n",shared_data.comm.overlap_split_ratio); printf(" Forward Overlap Split Ratio\t %lf \n", shared_data.comm.overlap_split_ratio);
printf("\n"); printf("\n");
printf(" Exchange Total \t %lf \n",shared_data.cuda_timings.comm_exchange_total); printf(" Exchange Total \t %lf \n", shared_data.cuda_timings.comm_exchange_total);
printf(" Exchange MPI \t %lf \n",shared_data.cuda_timings.comm_exchange_mpi); printf(" Exchange MPI \t %lf \n", shared_data.cuda_timings.comm_exchange_mpi);
printf(" Exchange Kernel Pack \t %lf \n",shared_data.cuda_timings.comm_exchange_kernel_pack); printf(" Exchange Kernel Pack \t %lf \n", shared_data.cuda_timings.comm_exchange_kernel_pack);
printf(" Exchange Kernel Unpack \t %lf \n",shared_data.cuda_timings.comm_exchange_kernel_unpack); printf(" Exchange Kernel Unpack \t %lf \n", shared_data.cuda_timings.comm_exchange_kernel_unpack);
printf(" Exchange Kernel Fill \t %lf \n",shared_data.cuda_timings.comm_exchange_kernel_fill); printf(" Exchange Kernel Fill \t %lf \n", shared_data.cuda_timings.comm_exchange_kernel_fill);
printf(" Exchange CPU Pack \t %lf \n",shared_data.cuda_timings.comm_exchange_cpu_pack); printf(" Exchange CPU Pack \t %lf \n", shared_data.cuda_timings.comm_exchange_cpu_pack);
printf(" Exchange Upload \t %lf \n",shared_data.cuda_timings.comm_exchange_upload); printf(" Exchange Upload \t %lf \n", shared_data.cuda_timings.comm_exchange_upload);
printf(" Exchange Download \t %lf \n",shared_data.cuda_timings.comm_exchange_download); printf(" Exchange Download \t %lf \n", shared_data.cuda_timings.comm_exchange_download);
printf("\n"); printf("\n");
printf(" Border Total \t %lf \n",shared_data.cuda_timings.comm_border_total); printf(" Border Total \t %lf \n", shared_data.cuda_timings.comm_border_total);
printf(" Border MPI \t %lf \n",shared_data.cuda_timings.comm_border_mpi); printf(" Border MPI \t %lf \n", shared_data.cuda_timings.comm_border_mpi);
printf(" Border Kernel Pack \t %lf \n",shared_data.cuda_timings.comm_border_kernel_pack); printf(" Border Kernel Pack \t %lf \n", shared_data.cuda_timings.comm_border_kernel_pack);
printf(" Border Kernel Unpack \t %lf \n",shared_data.cuda_timings.comm_border_kernel_unpack); printf(" Border Kernel Unpack \t %lf \n", shared_data.cuda_timings.comm_border_kernel_unpack);
printf(" Border Kernel Self \t %lf \n",shared_data.cuda_timings.comm_border_kernel_self); printf(" Border Kernel Self \t %lf \n", shared_data.cuda_timings.comm_border_kernel_self);
printf(" Border Kernel BuildList \t %lf \n",shared_data.cuda_timings.comm_border_kernel_buildlist); printf(" Border Kernel BuildList \t %lf \n", shared_data.cuda_timings.comm_border_kernel_buildlist);
printf(" Border Upload \t %lf \n",shared_data.cuda_timings.comm_border_upload); printf(" Border Upload \t %lf \n", shared_data.cuda_timings.comm_border_upload);
printf(" Border Download \t %lf \n",shared_data.cuda_timings.comm_border_download); printf(" Border Download \t %lf \n", shared_data.cuda_timings.comm_border_download);
printf("\n"); printf("\n");
//pair forces //pair forces
printf(" Pair XType Conversion \t %lf \n",shared_data.cuda_timings.pair_xtype_conversion ); printf(" Pair XType Conversion \t %lf \n", shared_data.cuda_timings.pair_xtype_conversion);
printf(" Pair Kernel \t %lf \n",shared_data.cuda_timings.pair_kernel ); printf(" Pair Kernel \t %lf \n", shared_data.cuda_timings.pair_kernel);
printf(" Pair Virial \t %lf \n",shared_data.cuda_timings.pair_virial ); printf(" Pair Virial \t %lf \n", shared_data.cuda_timings.pair_virial);
printf(" Pair Force Collection \t %lf \n",shared_data.cuda_timings.pair_force_collection ); printf(" Pair Force Collection \t %lf \n", shared_data.cuda_timings.pair_force_collection);
printf("\n"); printf("\n");
//neighbor //neighbor
printf(" Neighbor Binning \t %lf \n",shared_data.cuda_timings.neigh_bin ); printf(" Neighbor Binning \t %lf \n", shared_data.cuda_timings.neigh_bin);
printf(" Neighbor Build \t %lf \n",shared_data.cuda_timings.neigh_build ); printf(" Neighbor Build \t %lf \n", shared_data.cuda_timings.neigh_build);
printf(" Neighbor Special \t %lf \n",shared_data.cuda_timings.neigh_special ); printf(" Neighbor Special \t %lf \n", shared_data.cuda_timings.neigh_special);
printf("\n"); printf("\n");
//pppm //pppm
if(force->kspace) if(force->kspace) {
{ printf(" PPPM Total \t %lf \n", shared_data.cuda_timings.pppm_compute);
printf(" PPPM Total \t %lf \n",shared_data.cuda_timings.pppm_compute ); printf(" PPPM Particle Map \t %lf \n", shared_data.cuda_timings.pppm_particle_map);
printf(" PPPM Particle Map \t %lf \n",shared_data.cuda_timings.pppm_particle_map ); printf(" PPPM Make Rho \t %lf \n", shared_data.cuda_timings.pppm_make_rho);
printf(" PPPM Make Rho \t %lf \n",shared_data.cuda_timings.pppm_make_rho ); printf(" PPPM Brick2fft \t %lf \n", shared_data.cuda_timings.pppm_brick2fft);
printf(" PPPM Brick2fft \t %lf \n",shared_data.cuda_timings.pppm_brick2fft ); printf(" PPPM Poisson \t %lf \n", shared_data.cuda_timings.pppm_poisson);
printf(" PPPM Poisson \t %lf \n",shared_data.cuda_timings.pppm_poisson ); printf(" PPPM Fillbrick \t %lf \n", shared_data.cuda_timings.pppm_fillbrick);
printf(" PPPM Fillbrick \t %lf \n",shared_data.cuda_timings.pppm_fillbrick ); printf(" PPPM Fieldforce \t %lf \n", shared_data.cuda_timings.pppm_fieldforce);
printf(" PPPM Fieldforce \t %lf \n",shared_data.cuda_timings.pppm_fieldforce );
printf("\n"); printf("\n");
} }
printf(" Debug Test 1 \t %lf \n",shared_data.cuda_timings.test1); printf(" Debug Test 1 \t %lf \n", shared_data.cuda_timings.test1);
printf(" Debug Test 2 \t %lf \n",shared_data.cuda_timings.test2); printf(" Debug Test 2 \t %lf \n", shared_data.cuda_timings.test2);
printf("\n"); printf("\n");
} }
View File
@ -23,28 +23,30 @@
#include "group.h" #include "group.h"
#include "memory.h" #include "memory.h"
#include "error.h" #include "error.h"
#include "update.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
enum{NSQ,BIN,MULTI}; // also in neigh_list.cpp enum {NSQ, BIN, MULTI}; // also in neigh_list.cpp
/* ---------------------------------------------------------------------- */ /* ---------------------------------------------------------------------- */
NeighborCuda::NeighborCuda(LAMMPS *lmp) : Neighbor(lmp) NeighborCuda::NeighborCuda(LAMMPS* lmp) : Neighbor(lmp)
{ {
cuda = lmp->cuda; cuda = lmp->cuda;
if(cuda == NULL) if(cuda == NULL)
error->all(FLERR,"You cannot use a /cuda class, without activating 'cuda' acceleration. Provide '-c on' as command-line argument to LAMMPS.."); error->all(FLERR, "You cannot use a /cuda class, without activating 'cuda' acceleration. Provide '-c on' as command-line argument to LAMMPS..");
} }
/* ---------------------------------------------------------------------- */ /* ---------------------------------------------------------------------- */
void NeighborCuda::init() void NeighborCuda::init()
{ {
cuda->set_neighinit(dist_check,0.25*skin*skin); cuda->set_neighinit(dist_check, 0.25 * skin * skin);
cudable = 1; cudable = 1;
Neighbor::init(); Neighbor::init();
@ -55,13 +57,13 @@ void NeighborCuda::init()
any other neighbor build method is unchanged any other neighbor build method is unchanged
------------------------------------------------------------------------- */ ------------------------------------------------------------------------- */
void NeighborCuda::choose_build(int index, NeighRequest *rq) void NeighborCuda::choose_build(int index, NeighRequest* rq)
{ {
Neighbor::choose_build(index,rq); Neighbor::choose_build(index, rq);
if (rq->full && style == NSQ && rq->cudable) if(rq->full && style == NSQ && rq->cudable)
pair_build[index] = (Neighbor::PairPtr) &NeighborCuda::full_nsq_cuda; pair_build[index] = (Neighbor::PairPtr) &NeighborCuda::full_nsq_cuda;
else if (rq->full && style == BIN && rq->cudable) else if(rq->full && style == BIN && rq->cudable)
pair_build[index] = (Neighbor::PairPtr) &NeighborCuda::full_bin_cuda; pair_build[index] = (Neighbor::PairPtr) &NeighborCuda::full_bin_cuda;
} }
@ -69,93 +71,104 @@ void NeighborCuda::choose_build(int index, NeighRequest *rq)
int NeighborCuda::check_distance() int NeighborCuda::check_distance()
{ {
double delx,dely,delz,rsq; double delx, dely, delz, rsq;
double delta,deltasq,delta1,delta2; double delta, deltasq, delta1, delta2;
if (boxcheck) { if(boxcheck) {
if (triclinic == 0) { if(triclinic == 0) {
delx = bboxlo[0] - boxlo_hold[0]; delx = bboxlo[0] - boxlo_hold[0];
dely = bboxlo[1] - boxlo_hold[1]; dely = bboxlo[1] - boxlo_hold[1];
delz = bboxlo[2] - boxlo_hold[2]; delz = bboxlo[2] - boxlo_hold[2];
delta1 = sqrt(delx*delx + dely*dely + delz*delz); delta1 = sqrt(delx * delx + dely * dely + delz * delz);
delx = bboxhi[0] - boxhi_hold[0]; delx = bboxhi[0] - boxhi_hold[0];
dely = bboxhi[1] - boxhi_hold[1]; dely = bboxhi[1] - boxhi_hold[1];
delz = bboxhi[2] - boxhi_hold[2]; delz = bboxhi[2] - boxhi_hold[2];
delta2 = sqrt(delx*delx + dely*dely + delz*delz); delta2 = sqrt(delx * delx + dely * dely + delz * delz);
delta = 0.5 * (skin - (delta1+delta2)); delta = 0.5 * (skin - (delta1 + delta2));
deltasq = delta*delta; deltasq = delta * delta;
} else { } else {
domain->box_corners(); domain->box_corners();
delta1 = delta2 = 0.0; delta1 = delta2 = 0.0;
for (int i = 0; i < 8; i++) {
for(int i = 0; i < 8; i++) {
delx = corners[i][0] - corners_hold[i][0]; delx = corners[i][0] - corners_hold[i][0];
dely = corners[i][1] - corners_hold[i][1]; dely = corners[i][1] - corners_hold[i][1];
delz = corners[i][2] - corners_hold[i][2]; delz = corners[i][2] - corners_hold[i][2];
delta = sqrt(delx*delx + dely*dely + delz*delz); delta = sqrt(delx * delx + dely * dely + delz * delz);
if (delta > delta1) delta1 = delta;
else if (delta > delta2) delta2 = delta; if(delta > delta1) delta1 = delta;
else if(delta > delta2) delta2 = delta;
} }
delta = 0.5 * (skin - (delta1+delta2));
deltasq = delta*delta; delta = 0.5 * (skin - (delta1 + delta2));
deltasq = delta * delta;
} }
} else deltasq = triggersq; } else deltasq = triggersq;
double **x = atom->x; double** x = atom->x;
int nlocal = atom->nlocal; int nlocal = atom->nlocal;
if (includegroup) nlocal = atom->nfirst;
if(includegroup) nlocal = atom->nfirst;
int flag = 0; int flag = 0;
if (not cuda->neighbor_decide_by_integrator) { if(not cuda->neighbor_decide_by_integrator) {
cuda->cu_x_download(); cuda->cu_x_download();
for (int i = 0; i < nlocal; i++) {
for(int i = 0; i < nlocal; i++) {
delx = x[i][0] - xhold[i][0]; delx = x[i][0] - xhold[i][0];
dely = x[i][1] - xhold[i][1]; dely = x[i][1] - xhold[i][1];
delz = x[i][2] - xhold[i][2]; delz = x[i][2] - xhold[i][2];
rsq = delx*delx + dely*dely + delz*delz; rsq = delx * delx + dely * dely + delz * delz;
if (rsq > deltasq) flag = 1;
if(rsq > deltasq) flag = 1;
} }
} } else flag = cuda->shared_data.atom.reneigh_flag;
else flag = cuda->shared_data.atom.reneigh_flag;
int flagall; int flagall;
MPI_Allreduce(&flag,&flagall,1,MPI_INT,MPI_MAX,world); MPI_Allreduce(&flag, &flagall, 1, MPI_INT, MPI_MAX, world);
if (flagall && ago == MAX(every,delay)) ndanger++;
if(flagall && ago == MAX(every, delay)) ndanger++;
return flagall; return flagall;
} }
/* ---------------------------------------------------------------------- */ /* ---------------------------------------------------------------------- */
void NeighborCuda::build() void NeighborCuda::build(int topoflag)
{ {
int i; int i;
ago = 0; ago = 0;
ncalls++; ncalls++;
lastcall = update->ntimestep;
// store current atom positions and box size if needed // store current atom positions and box size if needed
if (dist_check) { if(dist_check) {
if (cuda->decide_by_integrator()) if(cuda->decide_by_integrator())
cuda->update_xhold(maxhold, &xhold[0][0]); cuda->update_xhold(maxhold, &xhold[0][0]);
else { else {
if (cuda->finished_setup) cuda->cu_x_download(); if(cuda->finished_setup) cuda->cu_x_download();
double **x = atom->x; double** x = atom->x;
int nlocal = atom->nlocal; int nlocal = atom->nlocal;
if (includegroup) nlocal = atom->nfirst;
if (nlocal > maxhold) { if(includegroup) nlocal = atom->nfirst;
if(nlocal > maxhold) {
maxhold = atom->nmax; maxhold = atom->nmax;
memory->destroy(xhold); memory->destroy(xhold);
memory->create(xhold,maxhold,3,"neigh:xhold"); memory->create(xhold, maxhold, 3, "neigh:xhold");
} }
for (i = 0; i < nlocal; i++) {
for(i = 0; i < nlocal; i++) {
xhold[i][0] = x[i][0]; xhold[i][0] = x[i][0];
xhold[i][1] = x[i][1]; xhold[i][1] = x[i][1];
xhold[i][2] = x[i][2]; xhold[i][2] = x[i][2];
} }
if (boxcheck) {
if (triclinic == 0) { if(boxcheck) {
if(triclinic == 0) {
boxlo_hold[0] = bboxlo[0]; boxlo_hold[0] = bboxlo[0];
boxlo_hold[1] = bboxlo[1]; boxlo_hold[1] = bboxlo[1];
boxlo_hold[2] = bboxlo[2]; boxlo_hold[2] = bboxlo[2];
@ -165,7 +178,8 @@ void NeighborCuda::build()
} else { } else {
domain->box_corners(); domain->box_corners();
corners = domain->corners; corners = domain->corners;
for (i = 0; i < 8; i++) {
for(i = 0; i < 8; i++) {
corners_hold[i][0] = corners[i][0]; corners_hold[i][0] = corners[i][0];
corners_hold[i][1] = corners[i][1]; corners_hold[i][1] = corners[i][1];
corners_hold[i][2] = corners[i][2]; corners_hold[i][2] = corners[i][2];
@ -175,9 +189,10 @@ void NeighborCuda::build()
} }
} }
if (not cudable && cuda->finished_setup && atom->avec->cudable) if(not cudable && cuda->finished_setup && atom->avec->cudable)
cuda->downloadAll(); cuda->downloadAll();
if (cudable && (not cuda->finished_setup)) {
if(cudable && (not cuda->finished_setup)) {
cuda->checkResize(); cuda->checkResize();
cuda->uploadAll(); cuda->uploadAll();
} }
@ -187,37 +202,39 @@ void NeighborCuda::build()
// else only invoke grow() if nlocal exceeds previous list size // else only invoke grow() if nlocal exceeds previous list size
// only done for lists with growflag set and which are perpetual // only done for lists with growflag set and which are perpetual
if (anyghostlist && atom->nlocal+atom->nghost > maxatom) { if(anyghostlist && atom->nlocal + atom->nghost > maxatom) {
maxatom = atom->nmax; maxatom = atom->nmax;
for (i = 0; i < nglist; i++) lists[glist[i]]->grow(maxatom);
} else if (atom->nlocal > maxatom) { for(i = 0; i < nglist; i++) lists[glist[i]]->grow(maxatom);
} else if(atom->nlocal > maxatom) {
maxatom = atom->nmax; maxatom = atom->nmax;
for (i = 0; i < nglist; i++) lists[glist[i]]->grow(maxatom);
for(i = 0; i < nglist; i++) lists[glist[i]]->grow(maxatom);
} }
// extend atom bin list if necessary // extend atom bin list if necessary
if (style != NSQ && atom->nmax > maxbin) { if(style != NSQ && atom->nmax > maxbin) {
maxbin = atom->nmax; maxbin = atom->nmax;
memory->destroy(bins); memory->destroy(bins);
memory->create(bins,maxbin,"bins"); memory->create(bins, maxbin, "bins");
} }
// check that neighbor list with special bond flags will not overflow // check that neighbor list with special bond flags will not overflow
if (atom->nlocal+atom->nghost > NEIGHMASK) if(atom->nlocal + atom->nghost > NEIGHMASK)
error->one(FLERR,"Too many local+ghost atoms for neighbor list"); error->one(FLERR, "Too many local+ghost atoms for neighbor list");
// invoke building of pair and molecular neighbor lists // invoke building of pair and molecular neighbor lists
// only for pairwise lists with buildflag set // only for pairwise lists with buildflag set
for (i = 0; i < nblist; i++) for(i = 0; i < nblist; i++)
(this->*pair_build[blist[i]])(lists[blist[i]]); (this->*pair_build[blist[i]])(lists[blist[i]]);
if (atom->molecular) { if(atom->molecular && topoflag) {
if (force->bond) (this->*bond_build)(); if(force->bond)(this->*bond_build)();
if (force->angle) (this->*angle_build)(); if(force->angle)(this->*angle_build)();
if (force->dihedral) (this->*dihedral_build)(); if(force->dihedral)(this->*dihedral_build)();
if (force->improper) (this->*improper_build)(); if(force->improper)(this->*improper_build)();
} }
} }
View File
@ -23,7 +23,7 @@ class NeighborCuda : public Neighbor {
NeighborCuda(class LAMMPS *); NeighborCuda(class LAMMPS *);
void init(); void init();
int check_distance(); int check_distance();
void build(); void build(int do_build_bonded=1);
private: private:
class Cuda *cuda; class Cuda *cuda;
View File
@ -52,6 +52,9 @@
#include "cuda.h" #include "cuda.h"
#include <ctime> #include <ctime>
#include <cmath> #include <cmath>
#ifdef _OPENMP
#include "omp.h"
#endif
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
@ -834,11 +837,6 @@ void VerletCuda::run(int n)
cuda->shared_data.buffer_new = 2; cuda->shared_data.buffer_new = 2;
if(atom->molecular) {
cuda->cu_molecule->download();
cuda->cu_x->download();
}
MYDBG(printf("# CUDA VerletCuda::iterate: neighbor build\n");) MYDBG(printf("# CUDA VerletCuda::iterate: neighbor build\n");)
timer->stamp(TIME_COMM); timer->stamp(TIME_COMM);
clock_gettime(CLOCK_REALTIME, &endtime); clock_gettime(CLOCK_REALTIME, &endtime);
@ -847,21 +845,19 @@ void VerletCuda::run(int n)
//rebuild neighbor list //rebuild neighbor list
test_atom(testatom, "Pre Neighbor"); test_atom(testatom, "Pre Neighbor");
neighbor->build(); neighbor->build(0);
timer->stamp(TIME_NEIGHBOR); timer->stamp(TIME_NEIGHBOR);
MYDBG(printf("# CUDA VerletCuda::iterate: neighbor done\n");) MYDBG(printf("# CUDA VerletCuda::iterate: neighbor done\n");)
//if bonded interactions are used (in this case collect_forces_later is true), transfer data which only changes upon exchange/border routines from GPU to CPU //if bonded interactions are used (in this case collect_forces_later is true), transfer data which only changes upon exchange/border routines from GPU to CPU
if(cuda->shared_data.pair.collect_forces_later) { if(cuda->shared_data.pair.collect_forces_later) {
if(cuda->cu_molecule) cuda->cu_molecule->download(); if(cuda->cu_molecule) cuda->cu_molecule->downloadAsync(2);
cuda->cu_tag->download(); cuda->cu_tag->downloadAsync(2);
cuda->cu_type->download(); cuda->cu_type->downloadAsync(2);
cuda->cu_mask->download(); cuda->cu_mask->downloadAsync(2);
if(cuda->cu_q) cuda->cu_q->download(); if(cuda->cu_q) cuda->cu_q->downloadAsync(2);
} }
cuda->shared_data.comm.comm_phase = 3; cuda->shared_data.comm.comm_phase = 3;
} }
@ -949,6 +945,11 @@ void VerletCuda::run(int n)
timer->stamp(TIME_PAIR); timer->stamp(TIME_PAIR);
if(neighbor->lastcall == update->ntimestep) {
neighbor->build_topology();
timer->stamp(TIME_NEIGHBOR);
}
test_atom(testatom, "pre bond force"); test_atom(testatom, "pre bond force");
if(force->bond) force->bond->compute(eflag, vflag); if(force->bond) force->bond->compute(eflag, vflag);
View File
@ -1384,14 +1384,25 @@ void Atom::update_callback(int ifix)
void *Atom::extract(char *name) void *Atom::extract(char *name)
{ {
if (strcmp(name,"mass") == 0) return (void *) mass;
if (strcmp(name,"id") == 0) return (void *) tag; if (strcmp(name,"id") == 0) return (void *) tag;
if (strcmp(name,"type") == 0) return (void *) type; if (strcmp(name,"type") == 0) return (void *) type;
if (strcmp(name,"mask") == 0) return (void *) mask; if (strcmp(name,"mask") == 0) return (void *) mask;
if (strcmp(name,"image") == 0) return (void *) image;
if (strcmp(name,"x") == 0) return (void *) x; if (strcmp(name,"x") == 0) return (void *) x;
if (strcmp(name,"v") == 0) return (void *) v; if (strcmp(name,"v") == 0) return (void *) v;
if (strcmp(name,"f") == 0) return (void *) f; if (strcmp(name,"f") == 0) return (void *) f;
if (strcmp(name,"mass") == 0) return (void *) mass; if (strcmp(name,"molecule") == 0) return (void *) molecule;
if (strcmp(name,"q") == 0) return (void *) q;
if (strcmp(name,"mu") == 0) return (void *) mu;
if (strcmp(name,"omega") == 0) return (void *) omega;
if (strcmp(name,"amgmom") == 0) return (void *) angmom;
if (strcmp(name,"torque") == 0) return (void *) torque;
if (strcmp(name,"radius") == 0) return (void *) radius;
if (strcmp(name,"rmass") == 0) return (void *) rmass; if (strcmp(name,"rmass") == 0) return (void *) rmass;
if (strcmp(name,"vfrac") == 0) return (void *) vfrac;
if (strcmp(name,"s0") == 0) return (void *) s0;
return NULL; return NULL;
} }
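The extended Atom::extract() above simply maps a per-atom quantity name to a raw pointer into the corresponding atom array, or NULL for names it does not know. A minimal, hedged sketch of calling code, assuming access to a LAMMPS instance lmp; the casts mirror the table above:

    // cast the returned void* according to the quantity that was requested
    double **x   = (double **) lmp->atom->extract((char *) "x");     // per-atom coords (3 per atom)
    int    *type = (int *)     lmp->atom->extract((char *) "type");  // per-atom type
    double *q    = (double *)  lmp->atom->extract((char *) "q");     // NULL if the atom style has no charge
    if (x && type) {
      // first local atom: type and position
      printf("atom 0: type %d at (%g,%g,%g)\n", type[0], x[0][0], x[0][1], x[0][2]);
    }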
View File
@ -649,8 +649,10 @@ void Finish::end(int flag)
if (atom->molecular && atom->natoms > 0) if (atom->molecular && atom->natoms > 0)
fprintf(screen,"Ave special neighs/atom = %g\n", fprintf(screen,"Ave special neighs/atom = %g\n",
nspec_all/atom->natoms); nspec_all/atom->natoms);
fprintf(screen,"Neighbor list builds = %d\n",neighbor->ncalls); fprintf(screen,"Neighbor list builds = " BIGINT_FORMAT "\n",
fprintf(screen,"Dangerous builds = %d\n",neighbor->ndanger); neighbor->ncalls);
fprintf(screen,"Dangerous builds = " BIGINT_FORMAT "\n",
neighbor->ndanger);
} }
if (logfile) { if (logfile) {
if (nall < 2.0e9) if (nall < 2.0e9)
@ -662,8 +664,10 @@ void Finish::end(int flag)
if (atom->molecular && atom->natoms > 0) if (atom->molecular && atom->natoms > 0)
fprintf(logfile,"Ave special neighs/atom = %g\n", fprintf(logfile,"Ave special neighs/atom = %g\n",
nspec_all/atom->natoms); nspec_all/atom->natoms);
fprintf(logfile,"Neighbor list builds = %d\n",neighbor->ncalls); fprintf(logfile,"Neighbor list builds = " BIGINT_FORMAT "\n",
fprintf(logfile,"Dangerous builds = %d\n",neighbor->ndanger); neighbor->ncalls);
fprintf(logfile,"Dangerous builds = " BIGINT_FORMAT "\n",
neighbor->ndanger);
} }
} }
} }
View File
@ -30,6 +30,7 @@
#include "modify.h" #include "modify.h"
#include "compute.h" #include "compute.h"
#include "fix.h" #include "fix.h"
#include "memory.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
@ -157,11 +158,19 @@ void *lammps_extract_atom(void *ptr, char *name)
id = compute ID id = compute ID
style = 0 for global data, 1 for per-atom data, 2 for local data style = 0 for global data, 1 for per-atom data, 2 for local data
type = 0 for scalar, 1 for vector, 2 for array type = 0 for scalar, 1 for vector, 2 for array
for global data, returns a pointer to the
compute's internal data structure for the entity
caller should cast it to (double *) for a scalar or vector
caller should cast it to (double **) for an array
for per-atom or local data, returns a pointer to the
compute's internal data structure for the entity
caller should cast it to (double *) for a vector
caller should cast it to (double **) for an array
returns a void pointer to the compute's internal data structure returns a void pointer to the compute's internal data structure
for the entity which the caller can cast to the proper data type for the entity which the caller can cast to the proper data type
returns a NULL if id is not recognized or style/type not supported returns a NULL if id is not recognized or style/type not supported
IMPORTANT: if the compute is not current it will be invoked IMPORTANT: if the compute is not current it will be invoked
LAMMPS cannot easily check if it is valid to invoke the compute, LAMMPS cannot easily check here if it is valid to invoke the compute,
so caller must ensure that it is OK so caller must ensure that it is OK
------------------------------------------------------------------------- */ ------------------------------------------------------------------------- */
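A hedged usage sketch of the casting rules spelled out above; the compute ID "thermo_temp" (the default temperature compute) is an assumption, and, per the IMPORTANT note, the caller must make sure it is valid to invoke the compute at this point:

    // global scalar: style = 0 (global data), type = 0 (scalar)
    double *temp = (double *) lammps_extract_compute(lmp, (char *) "thermo_temp", 0, 0);
    if (temp) printf("T = %g\n", *temp);
    // per-atom vector: style = 1 (per-atom data), type = 1 (vector)
    // double *peratom = (double *) lammps_extract_compute(lmp, (char *) "my_compute", 1, 1);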
@ -236,7 +245,8 @@ void *lammps_extract_compute(void *ptr, char *id, int style, int type)
which the caller can cast to a (double *) which points to the value which the caller can cast to a (double *) which points to the value
for per-atom or local data, returns a pointer to the for per-atom or local data, returns a pointer to the
fix's internal data structure for the entity fix's internal data structure for the entity
which the caller can cast to the proper data type caller should cast it to (double *) for a vector
caller should cast it to (double **) for an array
returns a NULL if id is not recognized or style/type not supported returns a NULL if id is not recognized or style/type not supported
IMPORTANT: for global data, IMPORTANT: for global data,
this function allocates a double to store the value in, this function allocates a double to store the value in,
@ -244,7 +254,7 @@ void *lammps_extract_compute(void *ptr, char *id, int style, int type)
double *dptr = (double *) lammps_extract_fix(); double *dptr = (double *) lammps_extract_fix();
double value = *dptr; double value = *dptr;
free(dptr); free(dptr);
IMPORTANT: LAMMPS cannot easily check when info extracted from IMPORTANT: LAMMPS cannot easily check here when info extracted from
the fix is valid, so caller must ensure that it is OK the fix is valid, so caller must ensure that it is OK
------------------------------------------------------------------------- */ ------------------------------------------------------------------------- */
@ -300,7 +310,7 @@ void *lammps_extract_fix(void *ptr, char *id, int style, int type,
which the caller can cast to a (double *) which points to the value which the caller can cast to a (double *) which points to the value
for atom-style variable, returns a pointer to the for atom-style variable, returns a pointer to the
vector of per-atom values on each processor, vector of per-atom values on each processor,
which the caller can cast to the proper data type which the caller can cast to a (double *) which points to the values
returns a NULL if name is not recognized or not equal-style or atom-style returns a NULL if name is not recognized or not equal-style or atom-style
IMPORTANT: for both equal-style and atom-style variables, IMPORTANT: for both equal-style and atom-style variables,
this function allocates memory to store the variable data in this function allocates memory to store the variable data in
@ -313,7 +323,7 @@ void *lammps_extract_fix(void *ptr, char *id, int style, int type,
double *vector = (double *) lammps_extract_variable(); double *vector = (double *) lammps_extract_variable();
use the vector values use the vector values
free(vector); free(vector);
IMPORTANT: LAMMPS cannot easily check when it is valid to evaluate IMPORTANT: LAMMPS cannot easily check here when it is valid to evaluate
the variable or any fixes or computes or thermodynamic info it references, the variable or any fixes or computes or thermodynamic info it references,
so caller must ensure that it is OK so caller must ensure that it is OK
------------------------------------------------------------------------- */ ------------------------------------------------------------------------- */
@ -343,7 +353,10 @@ void *lammps_extract_variable(void *ptr, char *name, char *group)
return NULL; return NULL;
} }
/* ---------------------------------------------------------------------- */ /* ----------------------------------------------------------------------
return the total number of atoms in the system
useful before a call to lammps_gather_atoms() so the caller can pre-allocate the data vector
------------------------------------------------------------------------- */
int lammps_get_natoms(void *ptr) int lammps_get_natoms(void *ptr)
{ {
@ -353,9 +366,18 @@ int lammps_get_natoms(void *ptr)
return natoms; return natoms;
} }
/* ---------------------------------------------------------------------- */ /* ----------------------------------------------------------------------
gather the named atom-based entity across all processors
name = desired quantity, e.g. x or charge
type = 0 for integer values, 1 for double values
count = # of per-atom values, e.g. 1 for type or charge, 3 for x or f
return atom-based values in data, ordered by count, then by atom ID
e.g. x[0][0],x[0][1],x[0][2],x[1][0],x[1][1],x[1][2],x[2][0],...
data must be pre-allocated by caller to correct length
------------------------------------------------------------------------- */
void lammps_get_coords(void *ptr, double *coords) void lammps_gather_atoms(void *ptr, char *name,
int type, int count, void *data)
{ {
LAMMPS *lmp = (LAMMPS *) ptr; LAMMPS *lmp = (LAMMPS *) ptr;
@ -365,47 +387,135 @@ void lammps_get_coords(void *ptr, double *coords)
if (lmp->atom->natoms > MAXSMALLINT) return; if (lmp->atom->natoms > MAXSMALLINT) return;
int natoms = static_cast<int> (lmp->atom->natoms); int natoms = static_cast<int> (lmp->atom->natoms);
double *copy = new double[3*natoms];
for (int i = 0; i < 3*natoms; i++) copy[i] = 0.0;
double **x = lmp->atom->x; int i,j,offset;
void *vptr = lmp->atom->extract(name);
// copy = Natom length vector of per-atom values
// use atom ID to insert each atom's values into copy
// MPI_Allreduce with MPI_SUM to merge into data, ordered by atom ID
if (type == 0) {
int *vector = NULL;
int **array = NULL;
if (count == 1) vector = (int *) vptr;
else array = (int **) vptr;
int *copy;
lmp->memory->create(copy,count*natoms,"lib/gather:copy");
for (i = 0; i < count*natoms; i++) copy[i] = 0;
int *tag = lmp->atom->tag; int *tag = lmp->atom->tag;
int nlocal = lmp->atom->nlocal; int nlocal = lmp->atom->nlocal;
int id,offset; if (count == 1)
for (int i = 0; i < nlocal; i++) { for (i = 0; i < nlocal; i++)
id = tag[i]; copy[tag[i]-1] = vector[i];
offset = 3*(id-1); else
copy[offset+0] = x[i][0]; for (i = 0; i < nlocal; i++) {
copy[offset+1] = x[i][1]; offset = count*(tag[i]-1);
copy[offset+2] = x[i][2]; for (j = 0; j < count; j++)
        copy[offset++] = array[i][j];
} }
MPI_Allreduce(copy,coords,3*natoms,MPI_DOUBLE,MPI_SUM,lmp->world); MPI_Allreduce(copy,data,count*natoms,MPI_INT,MPI_SUM,lmp->world);
delete [] copy; lmp->memory->destroy(copy);
} else {
double *vector = NULL;
double **array = NULL;
if (count == 1) vector = (double *) vptr;
else array = (double **) vptr;
double *copy;
lmp->memory->create(copy,count*natoms,"lib/gather:copy");
for (i = 0; i < count*natoms; i++) copy[i] = 0.0;
int *tag = lmp->atom->tag;
int nlocal = lmp->atom->nlocal;
if (count == 1) {
for (i = 0; i < nlocal; i++)
copy[tag[i]-1] = vector[i];
} else {
for (i = 0; i < nlocal; i++) {
offset = count*(tag[i]-1);
for (j = 0; j < count; j++)
copy[offset++] = array[i][j];
}
}
MPI_Allreduce(copy,data,count*natoms,MPI_DOUBLE,MPI_SUM,lmp->world);
lmp->memory->destroy(copy);
}
} }
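A short, hedged example of the new call, which replaces the old lammps_get_coords(); the buffer is sized with lammps_get_natoms() as the comment block above suggests, and the LAMMPS handle lmp is assumed to exist already:

    // gather coordinates of all atoms, ordered by atom ID:
    // name = "x", type = 1 (double values), count = 3 values per atom
    int natoms = lammps_get_natoms(lmp);
    double *x = (double *) malloc(3 * natoms * sizeof(double));
    lammps_gather_atoms(lmp, (char *) "x", 1, 3, x);
    // x[3*(id-1)+0 .. 3*(id-1)+2] now hold the coords of the atom with tag id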
/* ---------------------------------------------------------------------- */ /* ----------------------------------------------------------------------
scatter the named atom-based entity across all processors
name = desired quantity, e.g. x or charge
type = 0 for integer values, 1 for double values
count = # of per-atom values, e.g. 1 for type or charge, 3 for x or f
data = atom-based values in data, ordered by count, then by atom ID
e.g. x[0][0],x[0][1],x[0][2],x[1][0],x[1][1],x[1][2],x[2][0],...
------------------------------------------------------------------------- */
void lammps_put_coords(void *ptr, double *coords) void lammps_scatter_atoms(void *ptr, char *name,
int type, int count, void *data)
{ {
LAMMPS *lmp = (LAMMPS *) ptr; LAMMPS *lmp = (LAMMPS *) ptr;
// error if no map defined by LAMMPS // error if tags are not defined or not consecutive
if (lmp->atom->map_style == 0) return; if (lmp->atom->tag_enable == 0 || lmp->atom->tag_consecutive() == 0) return;
if (lmp->atom->natoms > MAXSMALLINT) return; if (lmp->atom->natoms > MAXSMALLINT) return;
int natoms = static_cast<int> (lmp->atom->natoms); int natoms = static_cast<int> (lmp->atom->natoms);
double **x = lmp->atom->x;
int m,offset; int i,j,m,offset;
for (int i = 0; i < natoms; i++) { void *vptr = lmp->atom->extract(name);
  // data = count*Natom values, ordered by atom ID
  // use atom->map() on each ID to find the local index, if this proc owns the atom
  // copy that atom's values from data into the local per-atom arrays
if (type == 0) {
int *vector = NULL;
int **array = NULL;
if (count == 1) vector = (int *) vptr;
else array = (int **) vptr;
int *dptr = (int *) data;
if (count == 1)
for (i = 0; i < natoms; i++)
if ((m = lmp->atom->map(i+1)) >= 0)
vector[m] = dptr[i];
else
for (i = 0; i < natoms; i++)
if ((m = lmp->atom->map(i+1)) >= 0) { if ((m = lmp->atom->map(i+1)) >= 0) {
offset = 3*i; offset = count*i;
x[m][0] = coords[offset+0]; for (j = 0; j < count; j++)
x[m][1] = coords[offset+1]; array[m][j] = dptr[offset++];
x[m][2] = coords[offset+2]; }
} else {
double *vector = NULL;
double **array = NULL;
if (count == 1) vector = (double *) vptr;
else array = (double **) vptr;
double *dptr = (double *) data;
if (count == 1) {
for (i = 0; i < natoms; i++)
if ((m = lmp->atom->map(i+1)) >= 0)
vector[m] = dptr[i];
} else {
for (i = 0; i < natoms; i++) {
if ((m = lmp->atom->map(i+1)) >= 0) {
offset = count*i;
for (j = 0; j < count; j++)
array[m][j] = dptr[offset++];
}
}
} }
} }
} }
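Continuing the sketch above, lammps_scatter_atoms() pushes modified values back into the per-atom arrays; as the checks at the top of the function show, atom IDs must be defined and consecutive, otherwise the call silently returns:

    // shift every atom by 0.1 in x and write the coordinates back
    for (int i = 0; i < natoms; i++) x[3*i] += 0.1;
    lammps_scatter_atoms(lmp, (char *) "x", 1, 3, x);
    free(x);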
View File
@ -38,12 +38,13 @@ void *lammps_extract_fix(void *, char *, int, int, int, int);
void *lammps_extract_variable(void *, char *, char *); void *lammps_extract_variable(void *, char *, char *);
int lammps_get_natoms(void *); int lammps_get_natoms(void *);
void lammps_get_coords(void *, double *); void lammps_gather_atoms(void *, char *, int, int, void *);
void lammps_put_coords(void *, double *); void lammps_scatter_atoms(void *, char *, int, int, void *);
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif
/* ERROR/WARNING messages: /* ERROR/WARNING messages:
*/ */
View File
@ -1259,14 +1259,16 @@ int Neighbor::check_distance()
/* ---------------------------------------------------------------------- /* ----------------------------------------------------------------------
build all perpetual neighbor lists every few timesteps build all perpetual neighbor lists every few timesteps
pairwise & topology lists are created as needed pairwise & topology lists are created as needed
topology lists only built if topoflag = 1
------------------------------------------------------------------------- */ ------------------------------------------------------------------------- */
void Neighbor::build() void Neighbor::build(int topoflag)
{ {
int i; int i;
ago = 0; ago = 0;
ncalls++; ncalls++;
lastcall = update->ntimestep;
// store current atom positions and box size if needed // store current atom positions and box size if needed
@ -1336,12 +1338,20 @@ void Neighbor::build()
for (i = 0; i < nblist; i++) for (i = 0; i < nblist; i++)
(this->*pair_build[blist[i]])(lists[blist[i]]); (this->*pair_build[blist[i]])(lists[blist[i]]);
if (atom->molecular) { if (atom->molecular && topoflag) build_topology();
}
/* ----------------------------------------------------------------------
build all topology neighbor lists every few timesteps
normally built with pair lists, but USER-CUDA separates them
------------------------------------------------------------------------- */
void Neighbor::build_topology()
{
if (force->bond) (this->*bond_build)(); if (force->bond) (this->*bond_build)();
if (force->angle) (this->*angle_build)(); if (force->angle) (this->*angle_build)();
if (force->dihedral) (this->*dihedral_build)(); if (force->dihedral) (this->*dihedral_build)();
if (force->improper) (this->*improper_build)(); if (force->improper) (this->*improper_build)();
}
} }
/* ---------------------------------------------------------------------- /* ----------------------------------------------------------------------
View File
@ -38,8 +38,9 @@ class Neighbor : protected Pointers {
double cutneighmax; // max neighbor cutoff for all type pairs double cutneighmax; // max neighbor cutoff for all type pairs
double *cuttype; // for each type, max neigh cut w/ others double *cuttype; // for each type, max neigh cut w/ others
int ncalls; // # of times build has been called bigint ncalls; // # of times build has been called
int ndanger; // # of dangerous builds bigint ndanger; // # of dangerous builds
bigint lastcall; // timestep of last neighbor::build() call
int nrequest; // requests for pairwise neighbor lists int nrequest; // requests for pairwise neighbor lists
class NeighRequest **requests; // from Pair, Fix, Compute, Command classes class NeighRequest **requests; // from Pair, Fix, Compute, Command classes
@ -70,7 +71,8 @@ class Neighbor : protected Pointers {
int decide(); // decide whether to build or not int decide(); // decide whether to build or not
virtual int check_distance(); // check max distance moved since last build virtual int check_distance(); // check max distance moved since last build
void setup_bins(); // setup bins based on box and cutoff void setup_bins(); // setup bins based on box and cutoff
virtual void build(); // create all neighbor lists (pair,bond) virtual void build(int topoflag=1); // create all neighbor lists (pair,bond)
virtual void build_topology(); // create all topology neighbor lists
void build_one(int); // create a single neighbor list void build_one(int); // create a single neighbor list
void set(int, char **); // set neighbor style and skin distance void set(int, char **); // set neighbor style and skin distance
void modify_params(int, char**); // modify parameters that control builds void modify_params(int, char**); // modify parameters that control builds
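Together with the Neighbor::build()/build_topology() split in neighbor.cpp, these declarations let a run style build the pairwise lists first and defer the topology lists, which is exactly what VerletCuda::run() above does. A condensed sketch of that call pattern (not a complete run loop):

    neighbor->build(0);                  // pairwise lists only; records lastcall
    // ... compute pair forces, possibly overlapped with GPU work ...
    if (neighbor->lastcall == update->ntimestep) {
      neighbor->build_topology();        // bond/angle/dihedral/improper lists
      timer->stamp(TIME_NEIGHBOR);
    }
    if (force->bond) force->bond->compute(eflag, vflag);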
View File
@ -1 +1 @@
#define LAMMPS_VERSION "16 Aug 2012" #define LAMMPS_VERSION "21 Aug 2012"