remove references to Make.py and USER-CUDA

Axel Kohlmeyer
2017-07-18 13:24:32 -04:00
parent a351977c59
commit bdd2f3a6b2
4 changed files with 29 additions and 158 deletions

View File

@@ -1,55 +1,21 @@
These are input scripts used to run versions of several of the
-benchmarks in the top-level bench directory using the GPU and
-USER-CUDA accelerator packages. The results of running these scripts
-on two different machines (a desktop with 2 Tesla GPUs and the ORNL
-Titan supercomputer) are shown on the "GPU (Fermi)" section of the
-Benchmark page of the LAMMPS WWW site: lammps.sandia.gov/bench.
+benchmarks in the top-level bench directory using the GPU accelerator
+package. The results of running these scripts on two different machines
+(a desktop with 2 Tesla GPUs and the ORNL Titan supercomputer) are shown
+on the "GPU (Fermi)" section of the Benchmark page of the LAMMPS WWW
+site: lammps.sandia.gov/bench.
Examples are shown below of how to run these scripts. This assumes
-you have built 3 executables with both the GPU and USER-CUDA packages
+you have built 3 executables with the GPU package
installed, e.g.
lmp_linux_single
lmp_linux_mixed
lmp_linux_double
-The precision (single, mixed, double) refers to the GPU and USER-CUDA
-package precision. See the README files in the lib/gpu and lib/cuda
-directories for instructions on how to build the packages with
-different precisions. The GPU and USER-CUDA sub-sections of the
-doc/Section_accelerate.html file also describe this process.
-Make.py -d ~/lammps -j 16 -p #all orig -m linux -o cpu -a exe
-Make.py -d ~/lammps -j 16 -p #all opt orig -m linux -o opt -a exe
-Make.py -d ~/lammps -j 16 -p #all omp orig -m linux -o omp -a exe
-Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
-    -gpu mode=double arch=20 -o gpu_double -a libs exe
-Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
-    -gpu mode=mixed arch=20 -o gpu_mixed -a libs exe
-Make.py -d ~/lammps -j 16 -p #all gpu orig -m linux \
-    -gpu mode=single arch=20 -o gpu_single -a libs exe
-Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
-    -cuda mode=double arch=20 -o cuda_double -a libs exe
-Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
-    -cuda mode=mixed arch=20 -o cuda_mixed -a libs exe
-Make.py -d ~/lammps -j 16 -p #all cuda orig -m linux \
-    -cuda mode=single arch=20 -o cuda_single -a libs exe
-Make.py -d ~/lammps -j 16 -p #all intel orig -m linux -o intel_cpu -a exe
-Make.py -d ~/lammps -j 16 -p #all kokkos orig -m linux -o kokkos_omp -a exe
-Make.py -d ~/lammps -j 16 -p #all kokkos orig -kokkos cuda arch=20 \
-    -m cuda -o kokkos_cuda -a exe
-Make.py -d ~/lammps -j 16 -p #all opt omp gpu cuda intel kokkos orig \
-    -gpu mode=double arch=20 -cuda mode=double arch=20 -m linux \
-    -o all -a libs exe
-Make.py -d ~/lammps -j 16 -p #all opt omp gpu cuda intel kokkos orig \
-    -kokkos cuda arch=20 -gpu mode=double arch=20 \
-    -cuda mode=double arch=20 -m cuda -o all_cuda -a libs exe
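With the Make.py tool retired, one way to produce the three GPU-package
executables named above is the conventional make build. This is only a
sketch: the precision is selected inside the lib/gpu makefile and the exact
makefile names depend on your system, so check lib/gpu/README for details.

cd lammps/lib/gpu
make -f Makefile.linux    # edit this makefile first to choose single, mixed, or double precision
cd ../../src
make yes-gpu              # install the GPU package
make mpi                  # builds lmp_mpi; use your own machine makefile if you want lmp_linux_* names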
------------------------------------------------------------------------
-To run on just CPUs (without using the GPU or USER-CUDA styles),
+To run on just CPUs (without using the GPU styles),
do something like the following:
mpirun -np 1 lmp_linux_double -v x 8 -v y 8 -v z 8 -v t 100 < in.lj
@@ -81,23 +47,5 @@ node via a "-ppn" setting.
------------------------------------------------------------------------
-To run with the USER-CUDA package, do something like the following:
-mpirun -np 1 lmp_linux_single -c on -sf cuda -v x 16 -v y 16 -v z 16 -v t 100 < in.lj
-mpirun -np 2 lmp_linux_double -c on -sf cuda -pk cuda 2 -v x 32 -v y 64 -v z 64 -v t 100 < in.eam
-The "xyz" settings determine the problem size. The "t" setting
-determines the number of timesteps. The "np" setting determines how
-many MPI tasks (per node) the problem will run on. The numeric
-argument to the "-pk" setting is the number of GPUs (per node); 1 GPU
-is the default. Note that the number of MPI tasks must equal the
-number of GPUs (both per node) with the USER-CUDA package.
-These mpirun commands run on a single node. To run on multiple nodes,
-scale up the "-np" setting, and control the number of MPI tasks per
-node via a "-ppn" setting.
------------------------------------------------------------------------
If the script has "titan" in its name, it was run on the Titan
supercomputer at ORNL.

View File

@@ -71,49 +71,33 @@ integration
----------------------------------------------------------------------
-Here is a src/Make.py command which will perform a parallel build of a
-LAMMPS executable "lmp_mpi" with all the packages needed by all the
-examples. This assumes you have an MPI installed on your machine so
-that "mpicxx" can be used as the wrapper compiler. It also assumes
-you have an Intel compiler to use as the base compiler. You can leave
-off the "-cc mpi wrap=icc" switch if that is not the case. You can
-also leave off the "-fft fftw3" switch if you do not have the FFTW
-(v3) installed as an FFT package, in which case the default KISS FFT
-library will be used.
-cd src
-Make.py -j 16 -p none molecule manybody kspace granular rigid orig \
-    -cc mpi wrap=icc -fft fftw3 -a file mpi
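With Make.py removed, a comparable lmp_mpi with the packages these input
scripts need can be built with the conventional make procedure. A minimal
sketch, assuming mpicxx is available; to use FFTW3 instead of the default
KISS FFT, set FFT_INC = -DFFT_FFTW3 and FFT_LIB = -lfftw3 in the machine
makefile (e.g. src/MAKE/Makefile.mpi) before building.

cd src
make yes-molecule yes-manybody yes-kspace yes-granular yes-rigid
make -j 16 mpi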
----------------------------------------------------------------------
Here is how to run each problem, assuming the LAMMPS executable is
named lmp_mpi, and you are using the mpirun command to launch parallel
runs:
Serial (one processor runs):
-lmp_mpi < in.lj
-lmp_mpi < in.chain
-lmp_mpi < in.eam
-lmp_mpi < in.chute
-lmp_mpi < in.rhodo
+lmp_mpi -in in.lj
+lmp_mpi -in in.chain
+lmp_mpi -in in.eam
+lmp_mpi -in in.chute
+lmp_mpi -in in.rhodo
Parallel fixed-size runs (on 8 procs in this case):
-mpirun -np 8 lmp_mpi < in.lj
-mpirun -np 8 lmp_mpi < in.chain
-mpirun -np 8 lmp_mpi < in.eam
-mpirun -np 8 lmp_mpi < in.chute
-mpirun -np 8 lmp_mpi < in.rhodo
+mpirun -np 8 lmp_mpi -in in.lj
+mpirun -np 8 lmp_mpi -in in.chain
+mpirun -np 8 lmp_mpi -in in.eam
+mpirun -np 8 lmp_mpi -in in.chute
+mpirun -np 8 lmp_mpi -in in.rhodo
Parallel scaled-size runs (on 16 procs in this case):
-mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.lj
-mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.chain.scaled
-mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.eam
-mpirun -np 16 lmp_mpi -var x 4 -var y 4 < in.chute.scaled
-mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 < in.rhodo.scaled
+mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.lj
+mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.chain.scaled
+mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.eam
+mpirun -np 16 lmp_mpi -var x 4 -var y 4 -in in.chute.scaled
+mpirun -np 16 lmp_mpi -var x 2 -var y 2 -var z 4 -in in.rhodo.scaled
For each of the scaled-size runs you must set 3 variables as -var
command line switches. The variables x,y,z are used in the input

View File

@@ -105,20 +105,11 @@ tad: temperature-accelerated dynamics of vacancy diffusion in bulk Si
vashishta: models using the Vashishta potential
voronoi: Voronoi tessellation via compute voronoi/atom command
-Here is a src/Make.py command which will perform a parallel build of a
-LAMMPS executable "lmp_mpi" with all the packages needed by all the
-examples, with the exception of the accelerate sub-directory. See the
-accelerate/README for Make.py commands suitable for its example
-scripts.
-cd src
-Make.py -j 16 -p none std no-lib reax meam poems reaxc orig -a lib-all mpi
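A rough make-based equivalent is sketched below; the library makefile names
under lib/ are assumptions, so check the README in each lib/ sub-directory
for the makefile that matches your compilers.

cd lib/meam && make -f Makefile.gfortran    # build the MEAM library; build lib/reax and lib/poems the same way
cd ../../src
make yes-std no-lib                         # all standard packages that need no external library
make yes-meam yes-reax yes-poems yes-user-reaxc
make -j 16 mpi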
Here is how you might run and visualize one of the sample problems:
cd indent
cp ../../src/lmp_mpi . # copy LAMMPS executable to this dir
-lmp_mpi < in.indent                     # run the problem
+lmp_mpi -in in.indent                   # run the problem
Running the simulation produces the files {dump.indent} and
{log.lammps}. You can visualize the dump file as follows:

View File

@@ -1,14 +1,11 @@
These are example scripts that can be run with any of
the accelerator packages in LAMMPS:
-USER-CUDA, GPU, USER-INTEL, KOKKOS, USER-OMP, OPT
+GPU, USER-INTEL, KOKKOS, USER-OMP, OPT
The easiest way to build LAMMPS with these packages
-is via the src/Make.py tool described in Section 2.4
-of the manual. You can also type "Make.py -h" to see
-its options. The easiest way to run these scripts
-is by using the appropriate
+is via the flags described in Section 4 of the manual.
+The easiest way to run these scripts is by using the appropriate
Details on the individual accelerator packages
can be found in doc/Section_accelerate.html.
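For example, with the conventional make build an accelerator package is
installed before compiling; a minimal sketch, assuming an MPI compiler is
available (the GPU and KOKKOS packages additionally need the library and
architecture setup described in the manual):

cd src
make yes-user-omp      # or yes-opt, yes-user-intel, yes-kokkos, yes-gpu
make mpi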
@@ -16,21 +13,6 @@ can be found in doc/Section_accelerate.html.
Build LAMMPS with one or more of the accelerator packages
-The following command will invoke the src/Make.py tool with one of the
-command-lines from the Make.list file:
-../../src/Make.py -r Make.list target
-target = one or more of the following:
-  cpu, omp, opt
-  cuda_double, cuda_mixed, cuda_single
-  gpu_double, gpu_mixed, gpu_single
-  intel_cpu, intel_phi
-  kokkos_omp, kokkos_cuda, kokkos_phi
-If successful, the build will produce the file lmp_target in this
-directory.
Note that in addition to any accelerator packages, these packages also
need to be installed to run all of the example scripts: ASPHERE,
MOLECULE, KSPACE, RIGID.
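With the conventional make build these can be installed in one step before
compiling, e.g.:

cd src
make yes-asphere yes-molecule yes-kspace yes-rigid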
@@ -38,39 +20,11 @@ MOLECULE, KSPACE, RIGID.
These two targets will build a single LAMMPS executable with all the
CPU accelerator packages installed (USER-INTEL for CPU, KOKKOS for
OMP, USER-OMP, OPT) or all the GPU accelerator packages installed
-(USER-CUDA, GPU, KOKKOS for CUDA):
+(GPU, KOKKOS for CUDA):
target = all_cpu, all_gpu
-Note that the Make.py commands in Make.list assume an MPI environment
-exists on your machine and use mpicxx as the wrapper compiler with
-whatever underlying compiler it wraps by default. If you add "-cc mpi
-wrap=g++" or "-cc mpi wrap=icc" after the target, you can choose the
-underlying compiler for mpicxx to invoke. E.g.
-../../src/Make.py -r Make.list intel_cpu -cc mpi wrap=icc
-You should do this for any build that includes the USER-INTEL
-package, since it will perform best with the Intel compilers.
-Note that for kokkos_cuda, it needs to be "-cc nvcc" instead of "mpi",
-since a KOKKOS for CUDA build requires NVIDIA nvcc as the wrapper
-compiler.
-Also note that the Make.py commands in Make.list use the default
-FFT support which is via the KISS library. If you want to
-build with another FFT library, e.g. FFTW3, then you can add
-"-fft fftw3" after the target, e.g.
-../../src/Make.py -r Make.list gpu -fft fftw3
-For any build with USER-CUDA, GPU, or KOKKOS for CUDA, be sure to set
+For any build with GPU or KOKKOS for CUDA, be sure to set
the arch=XX setting to the appropriate value for the GPUs and Cuda
-environment on your system. What is defined in the Make.list file is
-arch=21 for older Fermi GPUs. This can be overridden as follows,
-e.g. for Kepler GPUs:
-../../src/Make.py -r Make.list gpu_double -gpu mode=double arch=35
+environment on your system.
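For example, as a sketch (the variable and makefile names below follow the
stock files shipped with LAMMPS and should be treated as assumptions; adjust
them for your GPUs):

# GPU package: set the compute capability in the lib/gpu makefile before
# building the library, e.g. for Kepler GPUs
CUDA_ARCH = -arch=sm_35

# KOKKOS for CUDA: select the architecture when compiling, e.g.
make kokkos_cuda_mpi KOKKOS_ARCH=Kepler35   # assumes the kokkos_cuda_mpi machine makefile in src/MAKE/OPTIONS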
---------------------
@@ -118,12 +72,6 @@ Note that when running in.lj.5.0 (which has a long cutoff) with the
GPU package, the "-pk tpa" setting should be > 1 (e.g. 8) for best
performance.
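For example, something like this (a sketch; the executable name and GPU
count are placeholders, and 8 threads per atom are requested via tpa):

mpirun -np 2 lmp_machine -sf gpu -pk gpu 2 tpa 8 < in.lj.5.0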
-** USER-CUDA package
-lmp_machine -c on -sf cuda < in.lj
-mpirun -np 1 lmp_machine -c on -sf cuda < in.lj              # 1 MPI, 1 MPI/GPU
-mpirun -np 2 lmp_machine -c on -sf cuda -pk cuda 2 < in.lj   # 2 MPI, 1 MPI/GPU
** KOKKOS package for OMP
lmp_kokkos_omp -k on t 1 -sf kk -pk kokkos neigh half < in.lj