git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@12592 f3b2605a-c512-4ea7-a41b-209d697bcdaa

2014-10-07 15:08:33 +00:00
parent fcad656d92
commit 16dcead2d2
16 changed files with 1085 additions and 583 deletions
--- a/doc/Section_accelerate.html
+++ b/doc/Section_accelerate.html
@ -165,8 +165,8 @@ coprocessors.
 </P>
 <P>All of these commands are in packages provided with LAMMPS.  An
 overview of packages is given in <A HREF = "Section_packages.html">Section
-packages</A>.  Currently, there are 6 accelerator
+packages</A>.  These are the accelerator packages
-packages in LAMMPS, either as standard or user packages:
+currently in LAMMPS, either as standard or user packages:
 </P>
 <DIV ALIGN=center><TABLE  BORDER=1 >
 <TR><TD ><A HREF = "accelerate_cuda.html">USER-CUDA</A> </TD><TD > for NVIDIA GPUs</TD></TR>
@ -201,8 +201,8 @@ for that style.
 </P>
 <P>To use an accelerator package in LAMMPS, and one or more of the styles
 it provides, follow these general steps.  Details vary from package to
-package and are explained in the individual accelerator sub-section
+package and are explained in the individual accelerator doc pages,
-doc pages, listed above:
+listed above:
 </P>
 <DIV ALIGN=center><TABLE  BORDER=1 >
 <TR><TD >build the accelerator library </TD><TD >  only for USER-CUDA and GPU packages </TD></TR>
@ -215,26 +215,46 @@ doc pages, listed above:
 <TR><TD >use accelerated styles in your input script </TD><TD >  via "-sf" <A HREF = "Section_start.html#start_7">command-line switch</A> or  <A HREF = "suffix.html">suffix</A> command 
 </TD></TR></TABLE></DIV>
-<P>The first 4 steps typically only need to be done once, to create an
+<P>The first 4 steps can be done as a single command, using the
-executable that uses one or more accelerator packages.  We are working
+src/Make.py tool.  The Make.py tool is discussed in <A HREF = "Section_start.html#start_4">Section
-to create a "make" tool that will perform all these 4 steps in a
+2.4</A> of the manual, and its use is
-single command.
+illustrated in the individual accelerator sections.  Typically these
 steps only need to be done once, to create an executable that uses one
 or more accelerator packages.
 </P>
 <P>The last 4 steps can all be done from the command-line when LAMMPS is
-launched, without changing your input script.  Or you can add
+launched, without changing your input script, as illustrated in the
 individual accelerator sections.  Or you can add
 <A HREF = "package.html">package</A> and <A HREF = "suffix.html">suffix</A> commands to your input
 script.
 </P>
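A minimal sketch of the command-line route described above. It only composes and prints a launch line using the "-sf" switch; it does not run LAMMPS. The executable name lmp_mpi and the input file name in.script are placeholders, and the helper name lammps_cmd is hypothetical rather than part of LAMMPS.

```shell
#!/bin/sh
# Sketch only: build (but do not execute) a LAMMPS launch line that applies
# an accelerator suffix through the "-sf" command-line switch.
# lammps_cmd, lmp_mpi and in.script are illustrative names, not LAMMPS API.
lammps_cmd() {
    exe=$1      # LAMMPS executable produced by the build
    suffix=$2   # accelerator suffix, e.g. omp, gpu, cuda, intel, kk, opt
    input=$3    # input script passed with the "-in" switch
    printf '%s -sf %s -in %s\n' "$exe" "$suffix" "$input"
}

lammps_cmd lmp_mpi omp in.script
```

The alternative is to leave the launch line alone and put the suffix and package commands in the input script itself; package-specific switches are covered on the individual accelerator pages.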
-<P>The examples directory has several sub-directories with scripts and
+<P>IMPORTANT NOTE: With a few exceptions, you can build a single LAMMPS
-README files for how to use the following accelerator packages:
+executable with all its accelerator packages installed.  Note that the
 USER-INTEL and KOKKOS packages require you to choose one of their
 options when building: CPU or Phi for USER-INTEL; OpenMP, Cuda,
 or Phi for KOKKOS.  Here are the exceptions; you cannot build a single
 executable with:
 </P>
-<UL><LI>examples/cuda for USER-CUDA package
+<UL><LI>both the USER-INTEL Phi and KOKKOS Phi options
-<LI>examples/gpu for GPU package
+<LI>the USER-INTEL Phi or KOKKOS Phi option, and either the USER-CUDA or GPU packages 
 <LI>examples/intel for USER-INTEL package
 <LI>examples/kokkos for KOKKOS package 
 </UL>
-<P>Likewise, the bench directory has FERMI and KEPLER sub-directories
+<P>See the examples/accelerate/README and make.list files for sample
-with scripts and README files for using all the accelerator packages.
+Make.py commands that build LAMMPS with any or all of the accelerator
 packages.  As an example, here is a command that builds with all the
 GPU related packages installed (USER-CUDA, GPU, KOKKOS with Cuda),
 including settings to build the needed auxiliary USER-CUDA and GPU
 libraries for Kepler GPUs:
 </P>
 <PRE>Make.py -j 16 -p omp gpu cuda kokkos -cc nvcc wrap=mpi   -cuda mode=double arch=35 -gpu mode=double arch=35 \  -kokkos cuda arch=35 lib-all file mpi 
 </PRE>
 <P>The examples/accelerate directory also has input scripts that can be
 used with all of the accelerator packages.  See its README file for
 details.
 </P>
 <P>Likewise, the bench directory has FERMI and KEPLER and PHI
 sub-directories with Make.py commands and input scripts for using all
 the accelerator packages on various machines.  See the README files in
 those dirs.
 </P>
 <P>As mentioned above, the <A HREF = "http://lammps.sandia.gov/bench.html">Benchmark
 page</A> of the LAMMPS web site gives
@ -243,23 +263,25 @@ of the standard LAMMPS benchmark problems, as a function of problem
 size and number of compute nodes, on different hardware platforms.
 </P>
 <P>Here is a brief summary of what the various packages provide.  Details
-are in the individual package sub-sections listed above.
+are in the individual accelerator sections.
 </P>
 <UL><LI>Styles with a "cuda" or "gpu" suffix are part of the USER-CUDA or GPU
 packages, and can be run on NVIDIA GPUs.  The speed-up on a GPU
-depends on a variety of factors, as discussed below. 
+depends on a variety of factors, discussed in the accelerator
 sections. 
 <LI>Styles with an "intel" suffix are part of the USER-INTEL
 package. These styles support vectorized single and mixed precision
 calculations, in addition to full double precision.  In extreme cases,
 this can provide speedups over 3.5x on CPUs.  The package also
-supports acceleration with offload to Intel(R) Xeon Phi(TM)
+supports acceleration in "offload" mode to Intel(R) Xeon Phi(TM)
 coprocessors.  This can result in additional speedup over 2x depending
 on the hardware configuration. 
 <LI>Styles with a "kk" suffix are part of the KOKKOS package, and can be
-run using OpenMP, on an NVIDIA GPU, or on an Intel Xeon Phi.  The
+run using OpenMP on multicore CPUs, on an NVIDIA GPU, or on an Intel
-speed-up depends on a variety of factors, as discussed below. 
+Xeon Phi in "native" mode.  The speed-up depends on a variety of
 factors, as discussed on the KOKKOS accelerator page. 
 <LI>Styles with an "omp" suffix are part of the USER-OMP package and allow
 a pair-style to be run in multi-threaded mode using OpenMP.  This can
@ -272,7 +294,7 @@ overload the available bandwidth for communication.
 speed-up the pairwise calculations of your simulation by 5-25% on a
 CPU. 
 </UL>
-<P>The individual accelerator package sub-sections explain:
+<P>The individual accelerator package doc pages explain:
 </P>
 <UL><LI>what hardware and software the accelerated package requires
 <LI>how to build LAMMPS with the accelerated package
--- a/doc/Section_accelerate.txt
+++ b/doc/Section_accelerate.txt
@ -152,8 +152,8 @@ coprocessors.
 All of these commands are in packages provided with LAMMPS.  An
 overview of packages is given in "Section
-packages"_Section_packages.html.  Currently, there are 6 accelerator
+packages"_Section_packages.html.  These are the accelerator packages
-packages in LAMMPS, either as standard or user packages:
+currently in LAMMPS, either as standard or user packages:
 "USER-CUDA"_accelerate_cuda.html : for NVIDIA GPUs
 "GPU"_accelerate_gpu.html : for NVIDIA GPUs as well as OpenCL support
@ -186,8 +186,8 @@ for that style.
 To use an accelerator package in LAMMPS, and one or more of the styles
 it provides, follow these general steps.  Details vary from package to
-package and are explained in the individual accelerator sub-section
+package and are explained in the individual accelerator doc pages,
-doc pages, listed above:
+listed above:
 build the accelerator library |
  only for USER-CUDA and GPU packages |
@ -211,26 +211,48 @@ use accelerated styles in your input script |
  via "-sf" "command-line switch"_Section_start.html#start_7 or
  "suffix"_suffix.html command :tb(c=2,s=|)
-The first 4 steps typically only need to be done once, to create an
+The first 4 steps can be done as a single command, using the
-executable that uses one or more accelerator packages.  We are working
+src/Make.py tool.  The Make.py tool is discussed in "Section
-to create a "make" tool that will perform all these 4 steps in a
+2.4"_Section_start.html#start_4 of the manual, and its use is
-single command.
+illustrated in the individual accelerator sections.  Typically these
 steps only need to be done once, to create an executable that uses one
 or more accelerator packages.
 The last 4 steps can all be done from the command-line when LAMMPS is
-launched, without changing your input script.  Or you can add
+launched, without changing your input script, as illustrated in the
 individual accelerator sections.  Or you can add
 "package"_package.html and "suffix"_suffix.html commands to your input
 script.
-The examples directory has several sub-directories with scripts and
+IMPORTANT NOTE: With a few exceptions, you can build a single LAMMPS
-README files for how to use the following accelerator packages:
+executable with all its accelerator packages installed.  Note that the
 USER-INTEL and KOKKOS packages require you to choose one of their
 options when building: CPU or Phi for USER-INTEL; OpenMP, Cuda,
 or Phi for KOKKOS.  Here are the exceptions; you cannot build a single
 executable with:
-examples/cuda for USER-CUDA package
+both the USER-INTEL Phi and KOKKOS Phi options
-examples/gpu for GPU package
+the USER-INTEL Phi or KOKKOS Phi option, and either the USER-CUDA or GPU packages :ul
 examples/intel for USER-INTEL package
 examples/kokkos for KOKKOS package :ul
-Likewise, the bench directory has FERMI and KEPLER sub-directories
+See the examples/accelerate/README and make.list files for sample
-with scripts and README files for using all the accelerator packages.
+Make.py commands that build LAMMPS with any or all of the accelerator
 packages.  As an example, here is a command that builds with all the
 GPU related packages installed (USER-CUDA, GPU, KOKKOS with Cuda),
 including settings to build the needed auxiliary USER-CUDA and GPU
 libraries for Kepler GPUs:
 Make.py -j 16 -p omp gpu cuda kokkos -cc nvcc wrap=mpi \
  -cuda mode=double arch=35 -gpu mode=double arch=35 \\
  -kokkos cuda arch=35 lib-all file mpi :pre
 The examples/accelerate directory also has input scripts that can be
 used with all of the accelerator packages.  See its README file for
 details.
 Likewise, the bench directory has FERMI and KEPLER and PHI
 sub-directories with Make.py commands and input scripts for using all
 the accelerator packages on various machines.  See the README files in
 those dirs.
 As mentioned above, the "Benchmark
 page"_http://lammps.sandia.gov/bench.html of the LAMMPS web site gives
@ -239,23 +261,25 @@ of the standard LAMMPS benchmark problems, as a function of problem
 size and number of compute nodes, on different hardware platforms.
 Here is a brief summary of what the various packages provide.  Details
-are in the individual package sub-sections listed above.
+are in the individual accelerator sections.
 Styles with a "cuda" or "gpu" suffix are part of the USER-CUDA or GPU
 packages, and can be run on NVIDIA GPUs.  The speed-up on a GPU
-depends on a variety of factors, as discussed below. :ulb,l
+depends on a variety of factors, discussed in the accelerator
 sections. :ulb,l
 Styles with an "intel" suffix are part of the USER-INTEL
 package. These styles support vectorized single and mixed precision
 calculations, in addition to full double precision.  In extreme cases,
 this can provide speedups over 3.5x on CPUs.  The package also
-supports acceleration with offload to Intel(R) Xeon Phi(TM)
+supports acceleration in "offload" mode to Intel(R) Xeon Phi(TM)
 coprocessors.  This can result in additional speedup over 2x depending
 on the hardware configuration. :l
 Styles with a "kk" suffix are part of the KOKKOS package, and can be
-run using OpenMP, on an NVIDIA GPU, or on an Intel Xeon Phi.  The
+run using OpenMP on multicore CPUs, on an NVIDIA GPU, or on an Intel
-speed-up depends on a variety of factors, as discussed below. :l
+Xeon Phi in "native" mode.  The speed-up depends on a variety of
 factors, as discussed on the KOKKOS accelerator page. :l
 Styles with an "omp" suffix are part of the USER-OMP package and allow
 a pair-style to be run in multi-threaded mode using OpenMP.  This can
@ -268,7 +292,7 @@ Styles with an "opt" suffix are part of the OPT package and typically
 speed-up the pairwise calculations of your simulation by 5-25% on a
 CPU. :l,ule
-The individual accelerator package sub-sections explain:
+The individual accelerator package doc pages explain:
 what hardware and software the accelerated package requires
 how to build LAMMPS with the accelerated package
--- a/doc/Section_start.html
+++ b/doc/Section_start.html
@ -85,14 +85,27 @@ launch a LAMMPS Windows executable on a Windows box.
 <A NAME = "start_2_1"></A><B><I>Read this first:</I></B> 
-<P>If you want to avoid building LAMMPS, read the preceeding section
+<P>If you want to avoid building LAMMPS yourself, read the preceding
-about options available for downloading and installing executables.
+section about options available for downloading and installing
-Details are discussed on the <A HREF = "download">download</A> page.
+executables.  Details are discussed on the <A HREF = "download">download</A> page.
 </P>
-<P>Building LAMMPS can be simple or not-so-simple.  If MPI is already
+<P>Building LAMMPS can be simple or not-so-simple.  If all you need are
-installed on your machine (or you just want to run LAMMPS in serial)
+the default packages installed in LAMMPS, and MPI is already installed
-and you can use one of the provided machine Makefiles and the build
+on your machine, or you just want to run LAMMPS in serial, then you
-works on your platform, then it's simple.
+can typically use the Makefile.mpi or Makefile.serial files in
 src/MAKE and type one of these lines (from the src dir):
 </P>
 <PRE>make mpi
 make serial 
 </PRE>
 <P>Or if one of the other Makefile.machine files in the src/MAKE
 sub-directories matches your system (type "make" to see a list), you
 can use it as-is by typing (for example):
 </P>
 <PRE>make stampede 
 </PRE>
 <P>If any of these builds with an existing Makefile.machine works on your
 system, then you're done!
 </P>
 <P>If you want to do one of these:
 </P>
@ -105,7 +118,14 @@ works on your platform, then it's simple.
 auxiliary libraries exist on your machine or install them if they
 don't.  You may need to build additional libraries that are part of
 the LAMMPS package, before building LAMMPS.  You may need to edit a
-machine Makefile to make it compatible with your system.
+Makefile.machine file to make it compatible with your system.
 </P>
 <P>Note that there is a Make.py tool in the src directory that automates
 several of these steps, but you still have to know what you are doing.
 <A HREF = "#start_4">Section 2.4</A> below describes the tool.  It is a convenient
 way to work with installing/un-installing various packages, the
 Makefile.machine changes required by some packages, and the auxiliary
 libraries some of them use.
 </P>
 <P>Please read the following sections carefully.  If you are not
 comfortable with makefiles, or building codes on a Unix platform, or
@ -121,7 +141,7 @@ please post the issue to the <A HREF = "http://lammps.sandia.gov/mail.html">LAMM
 list</A>.
 </P>
 <P>If you succeed in building LAMMPS on a new kind of machine, for which
-there isn't a similar machine Makefile included in the src/MAKE
+there isn't a similar machine Makefile included in the src/MAKE/MORE
 directory, then send it to the developers and we can include it in the
 LAMMPS distribution.
 </P>
@ -133,21 +153,43 @@ LAMMPS distribution.
 </P>
 <P>The src directory contains the C++ source and header files for LAMMPS.
 It also contains a top-level Makefile and a MAKE sub-directory with
-low-level Makefile.* files for many machines.  From within the src
+low-level Makefile.* files for many systems and machines.  See the
-directory, type "make" or "gmake".  You should see a list of available
+src/MAKE/README file for a quick overview of what files are available
-choices.  If one of those is the machine and options you want, you can
+and what sub-directories they are in.
 type a command like:
 </P>
-<PRE>make linux
+<P>The src/MAKE dir has a few files that should work as-is on many
 platforms.  The src/MAKE/OPTIONS dir has more that invoke additional
 compiler, MPI, and other setting options commonly used by LAMMPS, to
 illustrate their syntax.  The src/MAKE/MACHINES dir has many more that
 have been tweaked or optimized for specific machines.  These files are
 all good starting points if you find you need to change them for your
 machine.  Put any file you edit into the src/MAKE/MINE directory and
 it will never be touched by any LAMMPS updates.
 </P>
 <P>From within the src directory, type "make" or "gmake".  You should see
 a list of available choices from src/MAKE and all of its
 sub-directories.  If one of those has the options you want or is the
 machine you want, you can type a command like:
 </P>
 <PRE>make mpi
 or
 make serial_icc
 or
 gmake mac 
 </PRE>
 <P>Note that the corresponding Makefile.machine can exist in src/MAKE or
 any of its sub-directories.  If a file with the same name appears in
 multiple places (not a good idea), the order they are used is as
 follows: src/MAKE/MINE, src/MAKE, src/MAKE/OPTIONS, src/MAKE/MACHINES.
 This gives preference to a file you have created/edited and put in
 src/MAKE/MINE.
 </P>
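The lookup order stated above can be sketched as a small shell helper that reports which Makefile.machine would be preferred. This is an illustration of the documented precedence only, not how the top-level Makefile is implemented; the function name find_machine_makefile and its optional second argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: print the first Makefile.<machine> found, searching in the
# order src/MAKE/MINE, src/MAKE, src/MAKE/OPTIONS, src/MAKE/MACHINES.
# find_machine_makefile is an illustrative helper, not part of LAMMPS.
find_machine_makefile() {
    machine=$1
    make_root=${2:-src/MAKE}   # root of the MAKE tree; defaults to src/MAKE
    for dir in "$make_root/MINE" "$make_root" \
               "$make_root/OPTIONS" "$make_root/MACHINES"; do
        if [ -f "$dir/Makefile.$machine" ]; then
            printf '%s\n' "$dir/Makefile.$machine"
            return 0
        fi
    done
    return 1   # no matching file in any of the four locations
}

find_machine_makefile mpi || echo "no Makefile.mpi found under src/MAKE"
```

Running it from the top of a LAMMPS checkout before and after copying an edited file into src/MAKE/MINE should show the MINE copy taking precedence.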
 <P>Note that on a multi-processor or multi-core platform you can launch a
 parallel make, by using the "-j" switch with the make command, which
 will build LAMMPS more quickly.
 </P>
-<P>If you get no errors and an executable like lmp_linux or lmp_mac is
+<P>If you get no errors and an executable like lmp_mpi or lmp_g++_serial
-produced, you're done; it's your lucky day.
+or lmp_mac is produced, then you're done; it's your lucky day.
 </P>
 <P>Note that by default only a few of LAMMPS optional packages are
 installed.  To build LAMMPS with optional packages, see <A HREF = "#start_3">this
@ -157,43 +199,47 @@ section</A> below.
 </P>
 <P>If Step 0 did not work, you will need to create a low-level Makefile
 for your machine, like Makefile.foo.  You should make a copy of an
-existing src/MAKE/Makefile.* as a starting point.  The only portions
+existing Makefile.* in src/MAKE or one of its sub-directories as a
-of the file you need to edit are the first line, the "compiler/linker
+starting point.  The only portions of the file you need to edit are
-settings" section, and the "LAMMPS-specific settings" section.
+the first line, the "compiler/linker settings" section, and the
 "LAMMPS-specific settings" section.  When it works, put the edited
 file in src/MAKE/MINE and it will not be altered by any future LAMMPS
 updates.
 </P>
 <P><B>Step 2</B>
 </P>
-<P>Change the first line of src/MAKE/Makefile.foo to list the word "foo"
+<P>Change the first line of Makefile.foo to list the word "foo" after the
-after the "#", and whatever other options it will set.  This is the
+"#", and whatever other options it will set.  This is the line you
-line you will see if you just type "make".
+will see if you just type "make".
 </P>
 <P><B>Step 3</B>
 </P>
 <P>The "compiler/linker settings" section lists compiler and linker
 settings for your C++ compiler, including optimization flags.  You can
 use g++, the open-source GNU compiler, which is available on all Unix
-systems.  You can also use mpicc which will typically be available if
+systems.  You can also use mpicxx which will typically be available if
 MPI is installed on your system, though you should check which actual
 compiler it wraps.  Vendor compilers often produce faster code.  On
-boxes with Intel CPUs, we suggest using the commercial Intel icc
+boxes with Intel CPUs, we suggest using the Intel icc compiler, which
-compiler, which can be downloaded from <A HREF = "http://www.intel.com/software/products/noncom">Intel's compiler site</A>.
+can be downloaded from <A HREF = "http://www.intel.com/software/products/noncom">Intel's compiler site</A>.
 </P>
 <P>If building a C++ code on your machine requires additional libraries,
-then you should list them as part of the LIB variable.
+then you should list them as part of the LIB variable.  You should
 not need to do this if you use mpicxx.
 </P>
 <P>The DEPFLAGS setting is what triggers the C++ compiler to create a
 dependency list for a source file.  This speeds re-compilation when
 source (*.cpp) or header (*.h) files are edited.  Some compilers do
 not support dependency file creation, or may use a different switch
-than -D.  GNU g++ works with -D.  If your compiler can't create
+than -D.  GNU g++ and Intel icc work with -D.  If your compiler can't
-dependency files, then you'll need to create a Makefile.foo patterned
+create dependency files, then you'll need to create a Makefile.foo
-after Makefile.storm, which uses different rules that do not involve
+patterned after Makefile.storm, which uses different rules that do not
-dependency files.  Note that when you build LAMMPS for the first time
+involve dependency files.  Note that when you build LAMMPS for the
-on a new platform, a long list of *.d files will be printed out
+first time on a new platform, a long list of *.d files will be printed
-rapidly.  This is not an error; it is the Makefile doing its normal
+out rapidly.  This is not an error; it is the Makefile doing its
-creation of dependencies.
+normal creation of dependencies.
 </P>
 <P><B>Step 4</B>
 </P>
@ -277,20 +323,21 @@ Step 6 below for info about building LAMMPS with an FFT library.
 <P><B>Step 5</B>
 </P>
 <P>The 3 MPI variables are used to specify an MPI library to build LAMMPS
-with. 
+with.  Note that you do not need to set these if you use the MPI
 compiler mpicxx for your CC and LINK setting in the section above.
 The MPI wrapper knows where to find the needed files.
 </P>
 <P>If you want LAMMPS to run in parallel, you must have an MPI library
-installed on your platform.  If you use an MPI-wrapped compiler, such
+installed on your platform.  If MPI is installed on your system in the
-as "mpicc" to build LAMMPS, you should be able to leave these 3
+usual place (under /usr/local), you also may not need to specify these
-variables blank; the MPI wrapper knows where to find the needed files.
+3 variables, assuming /usr/local is in your path.  On some large
-If not, and MPI is installed on your system in the usual place (under
+parallel machines which use "modules" for their compile/link
-/usr/local), you also may not need to specify these 3 variables.  On
+environments, you may simply need to include the correct module in
-some large parallel machines which use "modules" for their
+your build environment, before building LAMMPS.  Or the parallel
-compile/link environements, you may simply need to include the correct
+machine may have a vendor-provided MPI which the compiler has no
-module in your build environment.  Or the parallel machine may have a
+trouble finding.
 vendor-provided MPI which the compiler has no trouble finding.
 </P>
-<P>Failing this, with these 3 variables you can specify where the mpi.h
+<P>Failing this, these 3 variables can be used to specify where the mpi.h
 file (MPI_INC) and the MPI library file (MPI_PATH) are found and the
 name of the library file (MPI_LIB).
 </P>
@ -310,20 +357,22 @@ arise when linking LAMMPS to the MPI library.
 </P>
 <P>If you just want to run LAMMPS on a single processor, you can use the
 dummy MPI library provided in src/STUBS, since you don't need a true
-MPI library installed on your system.  See the
+MPI library installed on your system.  See src/MAKE/Makefile.serial
-src/MAKE/Makefile.serial file for how to specify the 3 MPI variables
+for how to specify the 3 MPI variables in this case.  You will also
-in this case.  You will also need to build the STUBS library for your
+need to build the STUBS library for your platform before making LAMMPS
-platform before making LAMMPS itself.  To build from the src
+itself.  Note that if you are building with src/MAKE/Makefile.serial,
-directory, type "make stubs", or from the STUBS dir, type "make".
+e.g. by typing "make serial", then the STUBS library is built for you.
 This should create a libmpi_stubs.a file suitable for linking to
 LAMMPS.  If the build fails, you will need to edit the STUBS/Makefile
 for your platform.
 </P>
-<P>The file STUBS/mpi.c provides a CPU timer function called
+<P>To build the STUBS library from the src directory, type "make stubs",
-MPI_Wtime() that calls gettimeofday() .  If your system doesn't
+or from the src/STUBS dir, type "make".  This should create a
-support gettimeofday() , you'll need to insert code to call another
+libmpi_stubs.a file suitable for linking to LAMMPS.  If the build
-timer.  Note that the ANSI-standard function clock() rolls over after
+fails, you will need to edit the STUBS/Makefile for your platform.
-an hour or so, and is therefore insufficient for timing long LAMMPS
+</P>
 <P>The file STUBS/mpi.c provides a CPU timer function called MPI_Wtime()
 that calls gettimeofday() .  If your system doesn't support
 gettimeofday() , you'll need to insert code to call another timer.
 Note that the ANSI-standard function clock() rolls over after an hour
 or so, and is therefore insufficient for timing long LAMMPS
 simulations.
 </P>
 <P><B>Step 6</B>
@ -410,11 +459,9 @@ section</A> below, before proceeding to Step 9.
 </P>
 <P><B>Step 9</B>
 </P>
-<P>That's it.  Once you have a correct Makefile.foo, you have installed
+<P>That's it.  Once you have a correct Makefile.foo, and you have
-the optional LAMMPS packages you want to include in your build, and
+pre-built any other needed libraries (e.g. MPI, FFT, etc.), all you need
-you have pre-built any other needed libraries (e.g. MPI, FFT, package
+to do from the src directory is type something like this:
 libraries), all you need to do from the src directory is type
 something like this:
 </P>
 <PRE>make foo
 or
@ -530,7 +577,7 @@ neighbor lists and would run very slowly in terms of CPU secs/timestep.
 <A NAME = "start_2_5"></A><B><I>Building for a Mac:</I></B> 
 <P>OS X is BSD Unix, so it should just work.  See the
-src/MAKE/Makefile.mac file.
+src/MAKE/MACHINES/Makefile.mac and Makefile.mac_mpi files.
 </P>
 <HR>
@ -559,7 +606,7 @@ excluded, you can build it yourself.
 </P>
 <P>One way to do this is install and use cygwin to build LAMMPS with a
 standard unix style make program, just as you would on a Linux box;
-see src/MAKE/Makefile.cygwin.
+see src/MAKE/MACHINES/Makefile.cygwin.
 </P>
 <P>The other way to do this is using Visual Studio and project files.
 See the src/WINDOWS directory and its README.txt file for instructions
@ -574,8 +621,13 @@ on both a basic build and a customized build with pacakges you select.
 <UL><LI><A HREF = "#start_3_1">Package basics</A>
 <LI><A HREF = "#start_3_2">Including/excluding packages</A>
 <LI><A HREF = "#start_3_3">Packages that require extra libraries</A>
-<LI><A HREF = "#start_3_4">Packages that use make variable settings</A> 
+<LI><A HREF = "#start_3_4">Packages that require Makefile.machine settings</A> 
 </UL>
 <P>Note that the following <A HREF = "#start_4">Section 2.4</A> describes the Make.py
 tool which can be used to install/un-install packages and build the
 auxiliary libraries which some of them use.  It can also auto-edit a
 Makefile.machine to add settings needed by some packages.
 </P>
 <HR>
 <A NAME = "start_3_1"></A><B><I>Package basics:</I></B> 
@ -583,9 +635,11 @@ on both a basic build and a customized build with pacakges you select.
 <P>The source code for LAMMPS is structured as a set of core files which
 are always included, plus optional packages.  Packages are groups of
 files that enable a specific set of features.  For example, force
-fields for molecular systems or granular systems are in packages.  You
+fields for molecular systems or granular systems are in packages.
-can see the list of all packages by typing "make package" from within
+</P>
-the src directory of the LAMMPS distribution.
+<P>You can see the list of all packages by typing "make package" from
 within the src directory of the LAMMPS distribution.  This also lists
 various make commands that can be used to manipulate packages.
 </P>
 <P>If you use a command in a LAMMPS input script that is specific to a
 particular package, you must have built LAMMPS with that package, else
@ -652,10 +706,11 @@ I.e. individual files are only included if their dependencies are
 already included.  Likewise, if a package is excluded, other files
 dependent on that package are also excluded.
 </P>
-<P>The reason to exclude packages is if you will never run certain kinds
+<P>If you will never run simulations that use the features in a
-of simulations.  For some packages, this will keep you from having to
+particular package, there is no reason to include it in your build.
-build auxiliary libraries (see below), and will also produce a smaller
+For some packages, this will keep you from having to build auxiliary
-executable which may run a bit faster.
+libraries (see below), and will also produce a smaller executable
 which may run a bit faster.
 </P>
 <P>When you download a LAMMPS tarball, these packages are pre-installed
 in the src directory: KSPACE, MANYBODY,MOLECULE.  When you download
@ -666,9 +721,10 @@ pre-installed.
 no-name", where "name" is the name of the package in lower-case, e.g.
 name = kspace for the KSPACE package or name = user-atc for the
 USER-ATC package.  You can also type "make yes-standard", "make
-no-standard", "make yes-user", "make no-user", "make yes-all" or "make
+no-standard", "make yes-std", "make no-std", "make yes-user", "make
-no-all" to include/exclude various sets of packages.  Type "make
+no-user", "make yes-all" or "make no-all" to include/exclude various
-package" to see the all of the package-related make options.
+sets of packages.  Type "make package" to see all of the
 package-related make options.
 </P>
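As a small illustration of the naming rule above (the make target uses the package name in lower-case), the following sketch derives the target from a package name. The helper pkg_target is hypothetical; it only prints the target name and does not invoke make.

```shell
#!/bin/sh
# Sketch only: map a package name to its "yes-name" / "no-name" make target
# by lower-casing the name. pkg_target is an illustrative helper, not part
# of the LAMMPS build system.
pkg_target() {
    action=$1    # "yes" to include the package, "no" to exclude it
    package=$2   # package name as written in the docs, e.g. KSPACE
    name=$(printf '%s' "$package" | tr '[:upper:]' '[:lower:]')
    printf '%s-%s\n' "$action" "$name"
}

pkg_target yes KSPACE
pkg_target no USER-ATC
```

The printed target would then be passed to make from the src directory, for example via `make "$(pkg_target yes KSPACE)"`.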
 <P>IMPORTANT NOTE: Inclusion/exclusion of a package works by simply
 moving files back and forth between the main src directory and
@ -682,18 +738,19 @@ sub-directories.  You do not normally need to use these commands
 unless you are editing LAMMPS files or have downloaded a patch from
 the LAMMPS WWW site.
 </P>
-<P>Typing "make package-update" will overwrite src files with files from
+<P>Typing "make package-update" or "make pu" will overwrite src files
-the package sub-directories if the package has been included.  It
+with files from the package sub-directories if the package has been
-should be used after a patch is installed, since patches only update
+included.  It should be used after a patch is installed, since patches
-the files in the package sub-directory, but not the src files.  Typing
+only update the files in the package sub-directory, but not the src
-"make package-overwrite" will overwrite files in the package
+files.  Typing "make package-overwrite" will overwrite files in the
-sub-directories with src files.
+package sub-directories with src files.
 </P>
-<P>Typing "make package-status" will show which packages are currently
+<P>Typing "make package-status" or "make ps" will show which packages are
-included. Of those that are included, it will list files that are
+currently included. Of those that are included, it will list files
-different in the src directory and package sub-directory.  Typing
+that are different in the src directory and package sub-directory.
-"make package-diff" lists all differences between these files.  Again,
+Typing "make package-diff" lists all differences between these files.
-type "make package" to see all of the package-related make options.
+Again, type "make package" to see all of the package-related make
 options.
 </P>
 <HR>
@ -705,16 +762,16 @@ you get a LAMMPS build error about a missing library, this is likely
 the reason.  See the <A HREF = "Section_packages.html">Section_packages</A> doc page
 for a list of packages that have auxiliary libraries.
 </P>
-<P>Code for some of these auxiliary libraries is included in the LAMMPS
+<P>Code for most of these auxiliary libraries is included in the LAMMPS
 distribution under the lib directory.  Examples are the USER-ATC and
-MEAM packages.  Some auxiliary libraries are NOT included with LAMMPS;
+MEAM packages.  A few auxiliary libraries are NOT included with
-to use the associated package you must download and install the
+LAMMPS; to use the associated package you must download and install
-auxiliary library yourself.  Examples are the KIM and VORONOI and
+the auxiliary library yourself.  Examples are the KIM and VORONOI and
 USER-MOLFILE packages.
 </P>
-<P>For libraries with provided source code, each lib directory has a
+<P>For provided libraries, each lib directory has a README file
-README file (e.g. lib/reax/README) with instructions on how to build
+(e.g. lib/reax/README) with instructions on how to build that library.
-that library.  Typically this is done by typing something like:
+Typically this is done by typing something like:
 </P>
 <PRE>make -f Makefile.g++ 
 </PRE>
@ -746,168 +803,205 @@ is built with, typically requires additional Fortran-to-C libraries be
 included in the link.  Another example is the BLAS and LAPACK
 libraries needed to use the USER-ATC or USER-AWPMD packages.
 </P>
-<P>For libraries without provided source code, see the
+<P>For libraries without provided source code, the file
-src/package/Makefile.lammps file for information on where to find the
+src/package/README has information on where to find the library and
-library and how to build it.  E.g. the file src/KIM/Makefile.lammps or
+how to build it, e.g. src/VORONOI/README.  There is also a
-src/VORONOI/Makefile.lammps or src/UESR-MOLFILE/Makefile.lammps.
+Makefile.lammps file in the src/package directory.  E.g. files
-These files serve the same purpose as the lib/package/Makefile.lammps
+src/KIM/Makefile.lammps or src/VORONOI/Makefile.lammps or
-files described above.  The files have settings needed when LAMMPS is
+src/USER-MOLFILE/Makefile.lammps.  These files serve the same purpose
-built to link with the corresponding auxiliary library.
+as the lib/package/Makefile.lammps files described above.  The files
 have settings needed when LAMMPS is built to link with the
 corresponding auxiliary library.
 </P>
 <P>Again, you must ensure that the settings in
 src/package/Makefile.lammps are appropriate for your system and where
 you installed the auxiliary library.  If they are not, the LAMMPS
-build will fail.
+build will typically fail.
 </P>
 <HR>
-<A NAME = "start_3_4"></A><B><I>Packages that use make variable settings</I></B> 
+<A NAME = "start_3_4"></A><B><I>Packages that require Makefile.machine settings</I></B> 
-<P>One package, the KOKKOS package, allows its build options to be
+<P>A few packages require specific settings in Makefile.machine, to
-specified by setting variables via the "make" command, rather than by
+either build or use the package effectively.  These are the
-first building an auxiliary library and editing a Makefile.lammps
+USER-INTEL, KOKKOS, USER-OMP, and OPT packages.  The details of what
-file, as discussed in the previous sub-section for other packages.
+flags to add or what variables to define are given on the doc pages
-This is for convenience since it is common to want to experiment with
+that describe each of these accelerator packages in detail:
 different Kokkos library options.  Using variables enables a direct
 re-build of LAMMPS and its Kokkos dependencies, so that a benchmark
 test with different Kokkos options can be quickly performed.
 </P>
-<P>The syntax for setting make variables is as follows.  You must
+<UL><LI><A HREF = "accelerate_intel.html">USER-INTEL package</A>
-use a GNU-compatible make command for this to work.  Try "gmake"
+<LI><A HREF = "accelerate_kokkos.html">KOKKOS package</A>
-if your system's standard make complains.
+<LI><A HREF = "accelerate_omp.html">USER-OMP package</A>
-</P>
+<LI><A HREF = "accelerate_opt.html">OPT package</A> 
 <PRE>make yes-kokkos
 make g++ VAR1=value VAR2=value ... 
 </PRE>
 <P>The first line installs the KOKKOS package, which only needs to be
 done once.  The second line builds LAMMPS with src/MAKE/Makefile.g++
 and optionally sets one or more variables that affect the build.  Each
 variable is specified in upper-case; its value follows an equal sign
 with no spaces.  The second line can be repeated with different
 variable settings, though a "clean" must be done before the rebuild.
 Type "make clean" to see options for this operation.
 </P>
 <P>These are the variables that can be specified.  Each takes a value of
 <I>yes</I> or <I>no</I>.  The default value is listed, which is set in the
 lib/kokkos/Makefile.lammps file.  See <A HREF = "Section_accelerate.html#acc_8">this
 section</A> for a discussion of what is
 meant by "host" and "device" in the Kokkos context.
 </P>
 <UL><LI>OMP, default = <I>yes</I>
 <LI>CUDA, default = <I>no</I>
 <LI>HWLOC, default = <I>no</I>
 <LI>AVX, default = <I>no</I>
 <LI>MIC, default = <I>no</I>
 <LI>LIBRT, default = <I>no</I>
 <LI>DEBUG, default = <I>no</I> 
 </UL>
-<P>OMP sets the parallelization method used for Kokkos code (within
+<P>Here is a brief summary of what Makefile.machine changes are needed.
-LAMMPS) that runs on the host.  OMP=yes means that OpenMP will be
+Note that the Make.py tool, described in the next <A HREF = "#start_4">Section
-used.  OMP=no means that pthreads will be used.
+2.4</A> can automatically add the needed info to an existing
 machine Makefile, using simple command-line arguments.
 </P>
-<P>CUDA sets the parallelization method used for Kokkos code (within
+<P>In src/MAKE/OPTIONS see the following Makefiles for examples of the
-LAMMPS) that runs on the device.  CUDA=yes means an NVIDIA GPU running
+changes described below:
 CUDA will be used.  CUDA=no means that the OMP=yes or OMP=no setting
 will be used for the device as well as the host.
 </P>
-<P>If CUDA=yes, then the lo-level Makefile in the src/MAKE directory must
+<UL><LI>Makefile.intel_cpu
-use "nvcc" as its compiler, via its CC setting.  For best performance
+<LI>Makefile.intel_phi
-its CCFLAGS setting should use -O3 and have an -arch setting that
+<LI>Makefile.kokkos_omp
-matches the compute capability of your NVIDIA hardware and software
+<LI>Makefile.kokkos_cuda
-installation, e.g. -arch=sm_20.  Generally Fermi Generation GPUs are
+<LI>Makefile.kokkos_phi
-sm_20, while Kepler generation GPUs are sm_30 or sm_35 and Maxwell
+<LI>Makefile.omp 
-cards are sm_50.  A complete list can be found on
+</UL>
-<A HREF = "http://en.wikipedia.org/wiki/CUDA#Supported_GPUs">wikipedia</A>. You can
+<P>For the USER-INTEL package, you have 2 choices when building.  You can
-also use the deviceQuery tool that comes with the CUDA samples.  Note
+build with CPU or Phi support.  The latter uses Xeon Phi chips in
-the minimal required compute capability is 2.0, but this will give
+"offload" mode.  Each of these modes requires additional settings in
-signicantly reduced performance compared to Kepler generation GPUs
+your Makefile.machine for CCFLAGS and LINKFLAGS.
 with compute capability 3.x.  For the LINK setting, "nvcc" should not
 be used; instead use g++ or another compiler suitable for linking C++
 applications.  Often you will want to use your MPI compiler wrapper
 for this setting (i.e. mpicxx).  Finally, the lo-level Makefile must
 also have a "Compilation rule" for creating *.o files from *.cu files.
 See src/Makefile.cuda for an example of a lo-level Makefile with all
 of these settings.
 </P>
-<P>HWLOC binds threads to hardware cores, so they do not migrate during a
+<P>For CPU mode (if using an Intel compiler):
 simulation.  HWLOC=yes should always be used if running with OMP=no
 for pthreads.  It is not necessary for OMP=yes for OpenMP, because
 OpenMP provides alternative methods via environment variables for
 binding threads to hardware cores.  More info on binding threads to
 cores is given in <A HREF = "Section_accelerate.html#acc_8">this section</A>.
 </P>
-<P>AVX enables Intel advanced vector extensions when compiling for an
+<UL><LI>CCFLAGS: add -fopenmp, -DLAMMPS_MEMALIGN=64, -restrict, -xHost, -fno-alias, -ansi-alias, -override-limits
-Intel-compatible chip.  AVX=yes should only be set if your host
+<LI>LINKFLAGS: add -fopenmp 
-hardware supports AVX.  If it does not support it, this will cause a
+</UL>
-run-time crash.
+<P>For Phi mode add the following in addition to the CPU mode flags:
 </P>
-<P>MIC enables compiler switches needed when compling for an Intel Phi
+<UL><LI>CCFLAGS: add -DLMP_INTEL_OFFLOAD and 
-processor.
+<LI>LINKFLAGS: add -offload 
 </UL>
 <P>And also add this to CCFLAGS:
 </P>
-<P>LIBRT enables use of a more accurate timer mechanism on most Unix
+<PRE>-offload-option,mic,compiler,"-fp-model fast=2 -mGLOB_default_function_attrs=\"gather_scatter_loop_unroll=4\"" 
-platforms.  This library is not available on all platforms.
+</PRE>
 <P>For the KOKKOS package, you have 3 choices when building.  You can
 build with OMP or Cuda or Phi support.  Phi support uses Xeon Phi
 chips in "native" mode.  This can be done by setting the following
 variables in your Makefile.machine:
 </P>
-<P>DEBUG is only useful when developing a Kokkos-enabled style within
+<UL><LI>for OMP support, set OMP = yes
-LAMMPS.  DEBUG=yes enables printing of run-time debugging information
+<LI>for Cuda support, set OMP = yes and CUDA = yes
-that can be useful.  It also enables runtime bounds checking on Kokkos
+<LI>for Phi support, set OMP = yes and MIC = yes 
-data structures.
+</UL>
 <P>These can also be set as additional arguments to the make command, e.g.
 </P>
 <PRE>make g++ OMP=yes MIC=yes 
 </PRE>
 <P>Building the KOKKOS package with CUDA support requires a machine
 Makefile that uses the NVIDIA "nvcc" compiler, as well as an "arch"
 setting appropriate to the GPU hardware and NVIDIA
 software you have on your machine.  See
 src/MAKE/OPTIONS/Makefile.kokkos_cuda for an example of such a machine
 Makefile.
 </P>
 <P>For the USER-OMP package, your Makefile.machine needs additional
 settings for CCFLAGS and LINKFLAGS.
 </P>
 <UL><LI>CCFLAGS: add -fopenmp and -restrict
 <LI>LINKFLAGS: add -fopenmp 
 </UL>
 <P>For the OPT package, your Makefile.machine needs an additional
 setting for CCFLAGS.
 </P>
 <UL><LI>CCFLAGS: add -restrict 
 </UL>
 <HR>
 <H4><A NAME = "start_4"></A>2.4 Building LAMMPS via the Make.py script 
 </H4>
-<P>The src directory includes a Make.py script, written
+<P>The src directory includes a Make.py script, written in Python, which
-in Python, which can be used to automate various steps
+can be used to automate various steps of the build process.  It is
-of the build process.
+particularly useful for working with the accelerator packages, as well
 as other packages which require auxiliary libraries to be built.
 </P>
-<P>You can run the script from the src directory by typing either:
+<P>You can run Make.py from the src directory by typing either:
 </P>
-<PRE>Make.py
+<PRE>Make.py -h
-python Make.py 
+python Make.py -h 
 </PRE>
-<P>which will give you info about the tool.  For the former to work, you
+<P>which will give you help info about the tool.  For the former to work,
-may need to edit the 1st line of the script to point to your local
+you may need to edit the first line of Make.py to point to your local
 Python.  And you may need to ensure the script is executable:
 </P>
 <PRE>chmod +x Make.py 
 </PRE>
-<P>The following options are supported as switches:
+<P>Here are examples of build tasks you can perform with Make.py:
 </P>
-<UL><LI>-i file1 file2 ...
+<DIV ALIGN=center><TABLE  BORDER=1 >
-<LI>-p package1 package2 ...
+<TR><TD >Install/uninstall packages</TD><TD > Make.py -p no-lib kokkos omp intel</TD></TR>
-<LI>-u package1 package2 ...
+<TR><TD >Build specific auxiliary libs</TD><TD > Make.py lib-atc lib-meam</TD></TR>
-<LI>-e package1 arg1 arg2 package2 ...
+<TR><TD >Build libs for all installed packages</TD><TD > Make.py -p cuda gpu -gpu mode=double arch=31 lib-all</TD></TR>
-<LI>-o dir
+<TR><TD >Create a Makefile from scratch with a compiler and MPI</TD><TD > Make.py -m none -cc g++ -mpi mpich file</TD></TR>
-<LI>-b machine
+<TR><TD >Augment Makefile.serial with settings for installed packages</TD><TD > Make.py -p intel -intel cpu -m serial file</TD></TR>
-<LI>-s suffix1 suffix2 ...
+<TR><TD >Add JPG and FFTW support to Makefile.mpi</TD><TD > Make.py -m mpi -jpg -fft fftw file</TD></TR>
-<LI>-l dir
+<TR><TD >Build LAMMPS with a parallel make using Makefile.mpi</TD><TD > Make.py -j 16 -m mpi exe</TD></TR>
-<LI>-j N
+<TR><TD >Build LAMMPS and libs it needs using Makefile.serial with accelerator settings</TD><TD > Make.py -p gpu intel -intel cpu lib-all file serial 
-<LI>-h switch1 switch2 ... 
+</TD></TR></TABLE></DIV>
 <P>The bench and examples directories give Make.py commands that can be
 used to build LAMMPS with the various packages and options needed to
 run all the benchmark and example input scripts.  See these files for
 more details:
 </P>
 <UL><LI>bench/README
 <LI>bench/FERMI/README
 <LI>bench/KEPLER/README
 <LI>bench/PHI/README
 <LI>examples/README
 <LI>examples/accelerate/README
 <LI>examples/accelerate/make.list 
 </UL>
-<P>Help on any switch can be listed by using -h, e.g.
+<P>All of the Make.py options and syntax help can be accessed by using
 the "-h" switch.
 </P>
-<PRE>Make.py -h -i -p 
+<P>E.g. typing "Make.py -h" gives
 </P>
 <PRE>Syntax: Make.py switch args ... <I>action1</I> <I>action2</I> ...
  actions:
    lib-all, lib-dir, clean, file, exe or machine
    zero or more actions, in any order (machine must be last)
  switches:
    -d (dir), -j (jmake), -m (makefile), -o (output),
    -p (packages), -r (redo), -s (settings), -v (verbose)
  switches for libs:
    -atc, -awpmd, -colvars, -cuda
    -gpu, -meam, -poems, -qmmm, -reax
  switches for build and makefile options:
    -intel, -kokkos, -cc, -mpi, -fft, -jpg, -png 
 </PRE>
-<P>At a hi-level, these are the kinds of package management
+<P>Using the "-h" switch with other switches and actions gives additional
-and build tasks that can be performed easily, using
+info on all the other specified switches or actions.  The "-h" can be
-the Make.py tool:
+anywhere in the command-line and the other switches do not need their
 arguments.  E.g. typing "Make.py -h -d -atc -intel" will print:
 </P>
-<UL><LI>install/uninstall packages and build the associated external libs (use -p and -u and -e)
+<PRE>-d dir
-<LI>install packages needed for one or more input scripts (use -i and -p)
+  dir = LAMMPS home dir
-<LI>build LAMMPS, either in the src dir or new dir (use -b)
+  if -d not specified, working dir must be lammps/src 
-<LI>create a new dir with only the source code needed for one or more input scripts (use -i and -o) 
+</PRE>
-</UL>
+<PRE>-atc make=suffix lammps=suffix2
-<P>The last bullet can be useful when you wish to build a stripped-down
+  all args are optional and can be in any order
-version of LAMMPS to run a specific script(s).  Or when you wish to
+  make = use Makefile.suffix (def = g++)
-move the minimal amount of files to another platform for a remote
+  lammps = use Makefile.lammps.suffix2 (def = EXTRAMAKE in makefile) 
-LAMMPS build.
+</PRE>
 <PRE>-intel mode
  mode = cpu or phi (def = cpu)
    build Intel package for CPU or Xeon Phi 
 </PRE>
 <P>Note that Make.py never overwrites an existing Makefile.machine.
 Instead, it creates src/MAKE/MINE/Makefile.auto, which you can save or
 rename if desired.  Likewise it creates an executable named
 src/lmp_auto, which you can rename using the -o switch if desired.
 </P>
-<P>Note that using Make.py is not a substitute for insuring you have a
+<P>The most recently executed Make.py command is saved in
-valid src/MAKE/Makefile.foo for your system, or that external library
+src/Make.py.last.  You can use the "-r" switch (for redo) to re-invoke
-Makefiles in any lib/* directories you use are also valid for your
+the last command, or you can save a sequence of one or more Make.py
-system.  But once you have done that, you can use Make.py to quickly
+commands to a file and invoke the file of commands using "-r".  You
-include/exclude the packages and external libraries needed by your
+can also label the commands in the file and invoke one or more of them
-input scripts.
+by name.
 </P>
 <P>A typical use of Make.py is to start with a valid Makefile.machine for
 your system, that works for a vanilla LAMMPS build, i.e. when optional
 packages are not installed.  You can then use Make.py to add various
 settings (FFT, JPG, PNG) to the Makefile.machine as well as change its
 compiler and MPI options.  You can also add additional packages to the
 build, as well as build the needed supporting libraries.
 </P>
 <P>You can also use Make.py to create a new Makefile.machine from
 scratch, using the "-m none" switch, if you also specify what compiler
 and MPI options to use, via the "-cc" and "-mpi" switches.
 </P>
 <HR>
--- a/doc/Section_start.txt
+++ b/doc/Section_start.txt
@ -79,14 +79,27 @@ This section has the following sub-sections:
 [{Read this first:}] :link(start_2_1)
-If you want to avoid building LAMMPS, read the preceeding section
+If you want to avoid building LAMMPS yourself, read the preceding
-about options available for downloading and installing executables.
+section about options available for downloading and installing
-Details are discussed on the "download"_download page.
+executables.  Details are discussed on the "download"_download page.
-Building LAMMPS can be simple or not-so-simple.  If MPI is already
+Building LAMMPS can be simple or not-so-simple.  If all you need are
-installed on your machine (or you just want to run LAMMPS in serial)
+the default packages installed in LAMMPS, and MPI is already installed
-and you can use one of the provided machine Makefiles and the build
+on your machine, or you just want to run LAMMPS in serial, then you
-works on your platform, then it's simple.
+can typically use the Makefile.mpi or Makefile.serial files in
 src/MAKE and type one of these lines (from the src dir):
 make mpi
 make serial :pre
 Or if one of the other Makefile.machine files in the src/MAKE
 sub-directories matches your system (type "make" to see a list), you
 can use it as-is by typing (for example):
 make stampede :pre
 If any of these builds with an existing Makefile.machine works on your
 system, then you're done!
 If you want to do one of these:
@ -99,7 +112,14 @@ then building LAMMPS is more complicated.  You may need to find where
 auxiliary libraries exist on your machine or install them if they
 don't.  You may need to build additional libraries that are part of
 the LAMMPS package, before building LAMMPS.  You may need to edit a
-machine Makefile to make it compatible with your system.
+Makefile.machine file to make it compatible with your system.
 Note that there is a Make.py tool in the src directory that automates
 several of these steps, but you still have to know what you are doing.
 "Section 2.4"_#start_4 below describes the tool.  It is a convenient
 way to work with installing/un-installing various packages, the
 Makefile.machine changes required by some packages, and the auxiliary
 libraries some of them use.
 Please read the following sections carefully.  If you are not
 comfortable with makefiles, or building codes on a Unix platform, or
@ -115,7 +135,7 @@ please post the issue to the "LAMMPS mail
 list"_http://lammps.sandia.gov/mail.html.
 If you succeed in building LAMMPS on a new kind of machine, for which
-there isn't a similar machine Makefile included in the src/MAKE
+there isn't a similar machine Makefile included in the src/MAKE/MORE
 directory, then send it to the developers and we can include it in the
 LAMMPS distribution.
@ -127,21 +147,43 @@ LAMMPS distribution.
 The src directory contains the C++ source and header files for LAMMPS.
 It also contains a top-level Makefile and a MAKE sub-directory with
-low-level Makefile.* files for many machines.  From within the src
+low-level Makefile.* files for many systems and machines.  See the
-directory, type "make" or "gmake".  You should see a list of available
+src/MAKE/README file for a quick overview of what files are available
-choices.  If one of those is the machine and options you want, you can
+and what sub-directories they are in.
 type a command like:
-make linux
+The src/MAKE dir has a few files that should work as-is on many
 platforms.  The src/MAKE/OPTIONS dir has more that invoke additional
 compiler, MPI, and other setting options commonly used by LAMMPS, to
 illustrate their syntax.  The src/MAKE/MACHINES dir has many more that
 have been tweaked or optimized for specific machines.  These files are
 all good starting points if you find you need to change them for your
 machine.  Put any file you edit into the src/MAKE/MINE directory and
 it will never be touched by any LAMMPS updates.
 From within the src directory, type "make" or "gmake".  You should see
 a list of available choices from src/MAKE and all of its
 sub-directories.  If one of those has the options you want or is the
 machine you want, you can type a command like:
 make mpi
 or
 make serial_icc
 or
 gmake mac :pre
 Note that the corresponding Makefile.machine can exist in src/MAKE or
 any of its sub-directories.  If a file with the same name appears in
 multiple places (not a good idea), the order they are used is as
 follows: src/MAKE/MINE, src/MAKE, src/MAKE/OPTIONS, src/MAKE/MACHINES.
 This gives preference to a file you have created/edited and put in
 src/MAKE/MINE.
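The lookup order just described can be sketched in Python.  This is
only an illustration of the stated preference order, not the logic the
top-level LAMMPS Makefile actually runs; the function name
find_machine_makefile and its make_dir argument are hypothetical.

```python
import os

# Search order stated in the text: MINE first, then src/MAKE itself
# (the empty entry), then OPTIONS, then MACHINES.
SEARCH_ORDER = ["MINE", "", "OPTIONS", "MACHINES"]


def find_machine_makefile(machine, make_dir="MAKE"):
    """Return the first Makefile.<machine> found in the search order, or None."""
    for sub in SEARCH_ORDER:
        candidate = os.path.join(make_dir, sub, "Makefile." + machine)
        if os.path.isfile(candidate):
            return os.path.normpath(candidate)
    return None


if __name__ == "__main__":
    # Run from the src directory to see which file "make mpi" would prefer.
    print(find_machine_makefile("mpi"))
```

A copy placed in src/MAKE/MINE therefore shadows a file of the same
name anywhere else, which is why edited files are best kept there.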
 Note that on a multi-processor or multi-core platform you can launch a
 parallel make, by using the "-j" switch with the make command, which
 will build LAMMPS more quickly.
-If you get no errors and an executable like lmp_linux or lmp_mac is
+If you get no errors and an executable like lmp_mpi or lmp_g++_serial
-produced, you're done; it's your lucky day.
+or lmp_mac is produced, then you're done; it's your lucky day.
 Note that by default only a few of LAMMPS optional packages are
 installed.  To build LAMMPS with optional packages, see "this
@ -151,43 +193,47 @@ section"_#start_3 below.
 If Step 0 did not work, you will need to create a low-level Makefile
 for your machine, like Makefile.foo.  You should make a copy of an
-existing src/MAKE/Makefile.* as a starting point.  The only portions
+existing Makefile.* in src/MAKE or one of its sub-directories as a
-of the file you need to edit are the first line, the "compiler/linker
+starting point.  The only portions of the file you need to edit are
-settings" section, and the "LAMMPS-specific settings" section.
+the first line, the "compiler/linker settings" section, and the
 "LAMMPS-specific settings" section.  When it works, put the edited
 file in src/MAKE/MINE and it will not be altered by any future LAMMPS
 updates.
 [Step 2]
-Change the first line of src/MAKE/Makefile.foo to list the word "foo"
+Change the first line of Makefile.foo to list the word "foo" after the
-after the "#", and whatever other options it will set.  This is the
+"#", and whatever other options it will set.  This is the line you
-line you will see if you just type "make".
+will see if you just type "make".
 [Step 3]
 The "compiler/linker settings" section lists compiler and linker
 settings for your C++ compiler, including optimization flags.  You can
 use g++, the open-source GNU compiler, which is available on all Unix
-systems.  You can also use mpicc which will typically be available if
+systems.  You can also use mpicxx which will typically be available if
 MPI is installed on your system, though you should check which actual
 compiler it wraps.  Vendor compilers often produce faster code.  On
-boxes with Intel CPUs, we suggest using the commercial Intel icc
+boxes with Intel CPUs, we suggest using the Intel icc compiler, which
-compiler, which can be downloaded from "Intel's compiler site"_intel.
+can be downloaded from "Intel's compiler site"_intel.
 :link(intel,http://www.intel.com/software/products/noncom)
 If building a C++ code on your machine requires additional libraries,
-then you should list them as part of the LIB variable.
+then you should list them as part of the LIB variable.  You should
 not need to do this if you use mpicxx.
 The DEPFLAGS setting is what triggers the C++ compiler to create a
 dependency list for a source file.  This speeds re-compilation when
 source (*.cpp) or header (*.h) files are edited.  Some compilers do
 not support dependency file creation, or may use a different switch
-than -D.  GNU g++ works with -D.  If your compiler can't create
+than -M.  GNU g++ and Intel icc work with -M.  If your compiler can't
-dependency files, then you'll need to create a Makefile.foo patterned
+create dependency files, then you'll need to create a Makefile.foo
-after Makefile.storm, which uses different rules that do not involve
+patterned after Makefile.storm, which uses different rules that do not
-dependency files.  Note that when you build LAMMPS for the first time
+involve dependency files.  Note that when you build LAMMPS for the
-on a new platform, a long list of *.d files will be printed out
+first time on a new platform, a long list of *.d files will be printed
-rapidly.  This is not an error; it is the Makefile doing its normal
+out rapidly.  This is not an error; it is the Makefile doing its
-creation of dependencies.
+normal creation of dependencies.
 [Step 4]
@ -271,20 +317,21 @@ Step 6 below for info about building LAMMPS with an FFT library.
 [Step 5]
 The 3 MPI variables are used to specify an MPI library to build LAMMPS
-with. 
+with.  Note that you do not need to set these if you use the MPI
 compiler mpicxx for your CC and LINK setting in the section above.
 The MPI wrapper knows where to find the needed files.
 If you want LAMMPS to run in parallel, you must have an MPI library
-installed on your platform.  If you use an MPI-wrapped compiler, such
+installed on your platform.  If MPI is installed on your system in the
-as "mpicc" to build LAMMPS, you should be able to leave these 3
+usual place (under /usr/local), you also may not need to specify these
-variables blank; the MPI wrapper knows where to find the needed files.
+3 variables, assuming /usr/local is in your path.  On some large
-If not, and MPI is installed on your system in the usual place (under
+parallel machines which use "modules" for their compile/link
-/usr/local), you also may not need to specify these 3 variables.  On
+environments, you may simply need to include the correct module in
-some large parallel machines which use "modules" for their
+your build environment, before building LAMMPS.  Or the parallel
-compile/link environements, you may simply need to include the correct
+machine may have a vendor-provided MPI which the compiler has no
-module in your build environment.  Or the parallel machine may have a
+trouble finding.
 vendor-provided MPI which the compiler has no trouble finding.
-Failing this, with these 3 variables you can specify where the mpi.h
+Failing this, these 3 variables can be used to specify where the mpi.h
 file (MPI_INC) and the MPI library file (MPI_PATH) are found and the
 name of the library file (MPI_LIB).
@ -304,20 +351,22 @@ arise when linking LAMMPS to the MPI library.
 If you just want to run LAMMPS on a single processor, you can use the
 dummy MPI library provided in src/STUBS, since you don't need a true
-MPI library installed on your system.  See the
+MPI library installed on your system.  See src/MAKE/Makefile.serial
-src/MAKE/Makefile.serial file for how to specify the 3 MPI variables
+for how to specify the 3 MPI variables in this case.  You will also
-in this case.  You will also need to build the STUBS library for your
+need to build the STUBS library for your platform before making LAMMPS
-platform before making LAMMPS itself.  To build from the src
+itself.  Note that if you are building with src/MAKE/Makefile.serial,
-directory, type "make stubs", or from the STUBS dir, type "make".
+e.g. by typing "make serial", then the STUBS library is built for you.
 This should create a libmpi_stubs.a file suitable for linking to
 LAMMPS.  If the build fails, you will need to edit the STUBS/Makefile
 for your platform.
-The file STUBS/mpi.c provides a CPU timer function called
+To build the STUBS library from the src directory, type "make stubs",
-MPI_Wtime() that calls gettimeofday() .  If your system doesn't
+or from the src/STUBS dir, type "make".  This should create a
-support gettimeofday() , you'll need to insert code to call another
+libmpi_stubs.a file suitable for linking to LAMMPS.  If the build
-timer.  Note that the ANSI-standard function clock() rolls over after
+fails, you will need to edit the STUBS/Makefile for your platform.
-an hour or so, and is therefore insufficient for timing long LAMMPS
+
 The file STUBS/mpi.c provides a CPU timer function called MPI_Wtime()
 that calls gettimeofday() .  If your system doesn't support
 gettimeofday() , you'll need to insert code to call another timer.
 Note that the ANSI-standard function clock() rolls over after an hour
 or so, and is therefore insufficient for timing long LAMMPS
 simulations.
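The "hour or so" figure can be checked with a little arithmetic.  The
sketch below assumes a 32-bit clock_t and CLOCKS_PER_SEC = 1,000,000
(the value POSIX requires on XSI-conformant systems).  Neither
assumption comes from the text above, and a system with a 64-bit
clock_t will not roll over this quickly.

```python
CLOCKS_PER_SEC = 1_000_000  # assumed tick rate, not taken from LAMMPS


def rollover_minutes(bits, signed):
    """Minutes of CPU time before a clock_t of the given width wraps."""
    usable_bits = bits - 1 if signed else bits
    return (2 ** usable_bits) / CLOCKS_PER_SEC / 60.0


if __name__ == "__main__":
    print(round(rollover_minutes(32, signed=True), 1))   # prints 35.8
    print(round(rollover_minutes(32, signed=False), 1))  # prints 71.6
```

Under these assumptions the counter wraps after roughly 36 to 72
minutes of CPU time, which matches the warning above.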
 [Step 6]
@ -404,11 +453,9 @@ section"_#start_3 below, before proceeding to Step 9.
 [Step 9]
-That's it.  Once you have a correct Makefile.foo, you have installed
+That's it.  Once you have a correct Makefile.foo, and you have
-the optional LAMMPS packages you want to include in your build, and
+pre-built any other needed libraries (e.g. MPI, FFT, etc) all you need
-you have pre-built any other needed libraries (e.g. MPI, FFT, package
+to do from the src directory is type something like this:
 libraries), all you need to do from the src directory is type
 something like this:
 make foo
 or
@ -524,7 +571,7 @@ neighbor lists and would run very slowly in terms of CPU secs/timestep.
 [{Building for a Mac:}] :link(start_2_5)
 OS X is BSD Unix, so it should just work.  See the
-src/MAKE/Makefile.mac file.
+src/MAKE/MACHINES/Makefile.mac and Makefile.mac_mpi files.
 :line
@ -553,7 +600,7 @@ excluded, you can build it yourself.
 One way to do this is install and use cygwin to build LAMMPS with a
 standard unix style make program, just as you would on a Linux box;
-see src/MAKE/Makefile.cygwin.
+see src/MAKE/MACHINES/Makefile.cygwin.
 The other way to do this is using Visual Studio and project files.
 See the src/WINDOWS directory and its README.txt file for instructions
@ -568,7 +615,12 @@ This section has the following sub-sections:
 "Package basics"_#start_3_1
 "Including/excluding packages"_#start_3_2
 "Packages that require extra libraries"_#start_3_3
-"Packages that use make variable settings"_#start_3_4 :ul
+"Packages that require Makefile.machine settings"_#start_3_4 :ul
 Note that the following "Section 2.4"_#start_4 describes the Make.py
 tool which can be used to install/un-install packages and build the
 auxiliary libraries which some of them use.  It can also auto-edit a
 Makefile.machine to add settings needed by some packages.
 :line
@ -577,9 +629,11 @@ This section has the following sub-sections:
 The source code for LAMMPS is structured as a set of core files which
 are always included, plus optional packages.  Packages are groups of
 files that enable a specific set of features.  For example, force
-fields for molecular systems or granular systems are in packages.  You
+fields for molecular systems or granular systems are in packages.
-can see the list of all packages by typing "make package" from within
+
-the src directory of the LAMMPS distribution.
+You can see the list of all packages by typing "make package" from
 within the src directory of the LAMMPS distribution.  This also lists
 various make commands that can be used to manipulate packages.
 If you use a command in a LAMMPS input script that is specific to a
 particular package, you must have built LAMMPS with that package, else
@ -646,10 +700,11 @@ I.e. individual files are only included if their dependencies are
 already included.  Likewise, if a package is excluded, other files
 dependent on that package are also excluded.
-The reason to exclude packages is if you will never run certain kinds
+If you will never run simulations that use the features in a
-of simulations.  For some packages, this will keep you from having to
+particular package, there is no reason to include it in your build.
-build auxiliary libraries (see below), and will also produce a smaller
+For some packages, this will keep you from having to build auxiliary
-executable which may run a bit faster.
+libraries (see below), and will also produce a smaller executable
 which may run a bit faster.
 When you download a LAMMPS tarball, these packages are pre-installed
 in the src directory: KSPACE, MANYBODY,MOLECULE.  When you download
@ -660,9 +715,10 @@ Packages are included or excluded by typing "make yes-name" or "make
 no-name", where "name" is the name of the package in lower-case, e.g.
 name = kspace for the KSPACE package or name = user-atc for the
 USER-ATC package.  You can also type "make yes-standard", "make
-no-standard", "make yes-user", "make no-user", "make yes-all" or "make
+no-standard", "make yes-std", "make no-std", "make yes-user", "make
-no-all" to include/exclude various sets of packages.  Type "make
+no-user", "make yes-all" or "make no-all" to include/exclude various
-package" to see the all of the package-related make options.
+sets of packages.  Type "make package" to see all of the
 package-related make options.
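
As a rough illustration of the package commands above, the following shell sketch prints (and optionally runs) a few of them.  The "run" helper and the DRY_RUN variable are hypothetical conveniences, not part of LAMMPS; the make targets are the ones named in this section.  With the default DRY_RUN=1 nothing is executed, so the sketch is safe to try outside a LAMMPS tree.

```shell
#!/bin/sh
# run: hypothetical helper (not part of LAMMPS). It prints a command and
# executes it only when DRY_RUN is set to something other than 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = 1 ] || "$@"
}

# Intended to be invoked from the LAMMPS src directory.
run make yes-kspace       # include the KSPACE package
run make no-user-atc      # exclude the USER-ATC package
run make package-status   # show which packages are currently included
```

Setting DRY_RUN=0 in the environment before running the script from the src directory would execute the same commands instead of only printing them.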
 IMPORTANT NOTE: Inclusion/exclusion of a package works by simply
 moving files back and forth between the main src directory and
@ -676,18 +732,19 @@ sub-directories.  You do not normally need to use these commands
 unless you are editing LAMMPS files or have downloaded a patch from
 the LAMMPS WWW site.
-Typing "make package-update" will overwrite src files with files from
+Typing "make package-update" or "make pu" will overwrite src files
-the package sub-directories if the package has been included.  It
+with files from the package sub-directories if the package has been
-should be used after a patch is installed, since patches only update
+included.  It should be used after a patch is installed, since patches
-the files in the package sub-directory, but not the src files.  Typing
+only update the files in the package sub-directory, but not the src
-"make package-overwrite" will overwrite files in the package
+files.  Typing "make package-overwrite" will overwrite files in the
-sub-directories with src files.
+package sub-directories with src files.
-Typing "make package-status" will show which packages are currently
+Typing "make package-status" or "make ps" will show which packages are
-included. Of those that are included, it will list files that are
+currently included. Of those that are included, it will list files
-different in the src directory and package sub-directory.  Typing
+that are different in the src directory and package sub-directory.
-"make package-diff" lists all differences between these files.  Again,
+Typing "make package-diff" lists all differences between these files.
-type "make package" to see all of the package-related make options.
+Again, type "make package" to see all of the package-related make
 options.
 :line
@ -699,16 +756,16 @@ you get a LAMMPS build error about a missing library, this is likely
 the reason.  See the "Section_packages"_Section_packages.html doc page
 for a list of packages that have auxiliary libraries.
-Code for some of these auxiliary libraries is included in the LAMMPS
+Code for most of these auxiliary libraries is included in the LAMMPS
 distribution under the lib directory.  Examples are the USER-ATC and
-MEAM packages.  Some auxiliary libraries are NOT included with LAMMPS;
+MEAM packages.  A few auxiliary libraries are NOT included with
-to use the associated package you must download and install the
+LAMMPS; to use the associated package you must download and install
-auxiliary library yourself.  Examples are the KIM and VORONOI and
+the auxiliary library yourself.  Examples are the KIM and VORONOI and
 USER-MOLFILE packages.
-For libraries with provided source code, each lib directory has a
+For provided libraries, each lib directory has a README file
-README file (e.g. lib/reax/README) with instructions on how to build
+(e.g. lib/reax/README) with instructions on how to build that library.
-that library.  Typically this is done by typing something like:
+Typically this is done by typing something like:
 make -f Makefile.g++ :pre
@ -740,168 +797,203 @@ is built with, typically requires additional Fortran-to-C libraries be
 included in the link.  Another example are the BLAS and LAPACK
 libraries needed to use the USER-ATC or USER-AWPMD packages.
-For libraries without provided source code, see the
+For libraries without provided source code, the file
-src/package/Makefile.lammps file for information on where to find the
+src/package/README has information on where to find the library and
-library and how to build it.  E.g. the file src/KIM/Makefile.lammps or
+how to build it, e.g. src/VORONOI/README.  There is also a
-src/VORONOI/Makefile.lammps or src/UESR-MOLFILE/Makefile.lammps.
+Makefile.lammps file in the src/package directory.  E.g. files
-These files serve the same purpose as the lib/package/Makefile.lammps
+src/KIM/Makefile.lammps or src/VORONOI/Makefile.lammps or
-files described above.  The files have settings needed when LAMMPS is
+src/USER-MOLFILE/Makefile.lammps.  These files serve the same purpose
-built to link with the corresponding auxiliary library.
+as the lib/package/Makefile.lammps files described above.  The files
 have settings needed when LAMMPS is built to link with the
 corresponding auxiliary library.
 Again, you must ensure that the settings in
 src/package/Makefile.lammps are appropriate for your system and where
 you installed the auxiliary library.  If they are not, the LAMMPS
-build will fail.
+build will typically fail.
 :line
-[{Packages that use make variable settings}] :link(start_3_4)
+[{Packages that require Makefile.machine settings}] :link(start_3_4)
-One package, the KOKKOS package, allows its build options to be
+A few packages require specific settings in Makefile.machine, to
-specified by setting variables via the "make" command, rather than by
+either build or use the package effectively.  These are the
-first building an auxiliary library and editing a Makefile.lammps
+USER-INTEL, KOKKOS, USER-OMP, and OPT packages.  The details of what
-file, as discussed in the previous sub-section for other packages.
+flags to add or what variables to define are given on the doc pages
-This is for convenience since it is common to want to experiment with
+that describe each of these accelerator packages in detail:
 different Kokkos library options.  Using variables enables a direct
 re-build of LAMMPS and its Kokkos dependencies, so that a benchmark
 test with different Kokkos options can be quickly performed.
-The syntax for setting make variables is as follows.  You must
+"USER-INTEL package"_accelerate_intel.html
-use a GNU-compatible make command for this to work.  Try "gmake"
+"KOKKOS package"_accelerate_kokkos.html
-if your system's standard make complains.
+"USER-OMP package"_accelerate_omp.html
 "OPT package"_accelerate_opt.html :ul
-make yes-kokkos
+Here is a brief summary of what Makefile.machine changes are needed.
-make g++ VAR1=value VAR2=value ... :pre
+Note that the Make.py tool, described in the next "Section
 2.4"_#start_4, can automatically add the needed info to an existing
 machine Makefile, using simple command-line arguments.
-The first line installs the KOKKOS package, which only needs to be
+In src/MAKE/OPTIONS see the following Makefiles for examples of the
-done once.  The second line builds LAMMPS with src/MAKE/Makefile.g++
+changes described below:
 and optionally sets one or more variables that affect the build.  Each
 variable is specified in upper-case; its value follows an equal sign
 with no spaces.  The second line can be repeated with different
 variable settings, though a "clean" must be done before the rebuild.
 Type "make clean" to see options for this operation.
-These are the variables that can be specified.  Each takes a value of
+Makefile.intel_cpu
-{yes} or {no}.  The default value is listed, which is set in the
+Makefile.intel_phi
-lib/kokkos/Makefile.lammps file.  See "this
+Makefile.kokkos_omp
-section"_Section_accelerate.html#acc_8 for a discussion of what is
+Makefile.kokkos_cuda
-meant by "host" and "device" in the Kokkos context.
+Makefile.kokkos_phi
 Makefile.omp :ul
-OMP, default = {yes}
+For the USER-INTEL package, you have 2 choices when building.  You can
-CUDA, default = {no}
+build with CPU or Phi support.  The latter uses Xeon Phi chips in
-HWLOC, default = {no}
+"offload" mode.  Each of these modes requires additional settings in
-AVX, default = {no}
+your Makefile.machine for CCFLAGS and LINKFLAGS.
 MIC, default = {no}
 LIBRT, default = {no}
 DEBUG, default = {no} :ul
-OMP sets the parallelization method used for Kokkos code (within
+For CPU mode (if using an Intel compiler):
 LAMMPS) that runs on the host.  OMP=yes means that OpenMP will be
 used.  OMP=no means that pthreads will be used.
-CUDA sets the parallelization method used for Kokkos code (within
+CCFLAGS: add -fopenmp, -DLAMMPS_MEMALIGN=64, -restrict, -xHost, -fno-alias, -ansi-alias, -override-limits
-LAMMPS) that runs on the device.  CUDA=yes means an NVIDIA GPU running
+LINKFLAGS: add -fopenmp :ul
 CUDA will be used.  CUDA=no means that the OMP=yes or OMP=no setting
 will be used for the device as well as the host.
-If CUDA=yes, then the lo-level Makefile in the src/MAKE directory must
+For Phi mode add the following in addition to the CPU mode flags:
 use "nvcc" as its compiler, via its CC setting.  For best performance
 its CCFLAGS setting should use -O3 and have an -arch setting that
 matches the compute capability of your NVIDIA hardware and software
 installation, e.g. -arch=sm_20.  Generally Fermi Generation GPUs are
 sm_20, while Kepler generation GPUs are sm_30 or sm_35 and Maxwell
 cards are sm_50.  A complete list can be found on
 "wikipedia"_http://en.wikipedia.org/wiki/CUDA#Supported_GPUs. You can
 also use the deviceQuery tool that comes with the CUDA samples.  Note
 the minimal required compute capability is 2.0, but this will give
 significantly reduced performance compared to Kepler generation GPUs
 with compute capability 3.x.  For the LINK setting, "nvcc" should not
 be used; instead use g++ or another compiler suitable for linking C++
 applications.  Often you will want to use your MPI compiler wrapper
 for this setting (i.e. mpicxx).  Finally, the lo-level Makefile must
 also have a "Compilation rule" for creating *.o files from *.cu files.
 See src/Makefile.cuda for an example of a lo-level Makefile with all
 of these settings.
-HWLOC binds threads to hardware cores, so they do not migrate during a
+CCFLAGS: add -DLMP_INTEL_OFFLOAD
-simulation.  HWLOC=yes should always be used if running with OMP=no
+LINKFLAGS: add -offload :ul
 for pthreads.  It is not necessary for OMP=yes for OpenMP, because
 OpenMP provides alternative methods via environment variables for
 binding threads to hardware cores.  More info on binding threads to
 cores is given in "this section"_Section_accelerate.html#acc_8.
-AVX enables Intel advanced vector extensions when compiling for an
+And also add this to CCFLAGS:
 Intel-compatible chip.  AVX=yes should only be set if your host
 hardware supports AVX.  If it does not support it, this will cause a
 run-time crash.
-MIC enables compiler switches needed when compling for an Intel Phi
+-offload-option,mic,compiler,"-fp-model fast=2 -mGLOB_default_function_attrs=\"gather_scatter_loop_unroll=4\"" :pre
 processor.
-LIBRT enables use of a more accurate timer mechanism on most Unix
+For the KOKKOS package, you have 3 choices when building.  You can
-platforms.  This library is not available on all platforms.
+build with OMP or Cuda or Phi support.  Phi support uses Xeon Phi
 chips in "native" mode.  This can be done by setting the following
 variables in your Makefile.machine:
-DEBUG is only useful when developing a Kokkos-enabled style within
+for OMP support, set OMP = yes
-LAMMPS.  DEBUG=yes enables printing of run-time debugging information
+for Cuda support, set OMP = yes and CUDA = yes
-that can be useful.  It also enables runtime bounds checking on Kokkos
+for Phi support, set OMP = yes and MIC = yes :ul
-data structures.
+
 These can also be set as additional arguments to the make command, e.g.
 make g++ OMP=yes MIC=yes :pre
 Building the KOKKOS package with CUDA support requires a Makefile
 machine that uses the NVIDIA "nvcc" compiler, as well as an
 "arch" setting appropriate to the GPU hardware and NVIDIA
 software you have on your machine.  See
 src/MAKE/OPTIONS/Makefile.kokkos_cuda for an example of such a machine
 Makefile.
 For the USER-OMP package, your Makefile.machine needs additional
 settings for CCFLAGS and LINKFLAGS.
 CCFLAGS: add -fopenmp and -restrict
 LINKFLAGS: add -fopenmp :ul
 For the OPT package, your Makefile.machine needs an additional
 setting for CCFLAGS.
 CCFLAGS: add -restrict :ul
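
One way to check that a machine Makefile already carries the flags listed above is a small grep-based test.  This is only a sketch: the "has_flag" helper and the MAKEFILE variable are hypothetical names introduced here, the check does not follow backslash-continued lines, and the flag is treated as a regular expression, so flags containing special characters would need escaping.

```shell
#!/bin/sh
# has_flag: hypothetical helper (not part of LAMMPS).
# Succeeds if FILE has a line "VAR = ..." containing FLAG as a whole word.
has_flag() {
    file=$1
    var=$2
    flag=$3
    grep -Eq -e "^[[:space:]]*${var}[[:space:]]*=(.*[[:space:]])?${flag}([[:space:]]|\$)" -- "$file"
}

# Example: check the USER-OMP requirements in a machine Makefile.
# MAKEFILE is an assumed path; point it at your own Makefile.machine.
MAKEFILE=${MAKEFILE:-MAKE/Makefile.machine}
if [ -f "$MAKEFILE" ]; then
    for v in CCFLAGS LINKFLAGS; do
        has_flag "$MAKEFILE" "$v" -fopenmp || echo "$v is missing -fopenmp"
    done
    has_flag "$MAKEFILE" CCFLAGS -restrict || echo "CCFLAGS is missing -restrict"
fi
```

If the file named by MAKEFILE does not exist, the script prints nothing; otherwise it reports each required flag that it could not find.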
 :line
 2.4 Building LAMMPS via the Make.py script :h4,link(start_4)
-The src directory includes a Make.py script, written
+The src directory includes a Make.py script, written in Python, which
-in Python, which can be used to automate various steps
+can be used to automate various steps of the build process.  It is
-of the build process.
+particularly useful for working with the accelerator packages, as well
 as other packages which require auxiliary libraries to be built.
-You can run the script from the src directory by typing either:
+You can run Make.py from the src directory by typing either:
-Make.py
+Make.py -h
-python Make.py :pre
+python Make.py -h :pre
-which will give you info about the tool.  For the former to work, you
+which will give you help info about the tool.  For the former to work,
-may need to edit the 1st line of the script to point to your local
+you may need to edit the first line of Make.py to point to your local
 Python.  And you may need to ensure the script is executable:
 chmod +x Make.py :pre
-The following options are supported as switches:
+Here are examples of build tasks you can perform with Make.py:
-i file1 file2 ...
+Install/uninstall packages: Make.py -p no-lib kokkos omp intel
-p package1 package2 ...
+Build specific auxiliary libs: Make.py lib-atc lib-meam
-u package1 package2 ...
+Build libs for all installed packages: Make.py -p cuda gpu -gpu mode=double arch=31 lib-all
-e package1 arg1 arg2 package2 ...
+Create a Makefile from scratch with a compiler and MPI: Make.py -m none -cc g++ -mpi mpich file
-o dir
+Augment Makefile.serial with settings for installed packages: Make.py -p intel -intel cpu -m serial file
-b machine
+Add JPG and FFTW support to Makefile.mpi: Make.py -m mpi -jpg -fft fftw file
-s suffix1 suffix2 ...
+Build LAMMPS with a parallel make using Makefile.mpi: Make.py -j 16 -m mpi exe
-l dir
+Build LAMMPS and libs it needs using Makefile.serial with accelerator settings: Make.py -p gpu intel -intel cpu lib-all file serial :tb(s=:)
 -j N
 -h switch1 switch2 ... :ul
-Help on any switch can be listed by using -h, e.g.
+The bench and examples directories give Make.py commands that can be
 used to build LAMMPS with the various packages and options needed to
 run all the benchmark and example input scripts.  See these files for
 more details:
-Make.py -h -i -p :pre
+bench/README
 bench/FERMI/README
 bench/KEPLER/README
 bench/PHI/README
 examples/README
 examples/accelerate/README
 examples/accelerate/make.list :ul
-At a hi-level, these are the kinds of package management
+All of the Make.py options and syntax help can be accessed by using
-and build tasks that can be performed easily, using
+the "-h" switch.
 the Make.py tool:
-install/uninstall packages and build the associated external libs (use -p and -u and -e)
+E.g. typing "Make.py -h" gives
 install packages needed for one or more input scripts (use -i and -p)
 build LAMMPS, either in the src dir or new dir (use -b)
 create a new dir with only the source code needed for one or more input scripts (use -i and -o) :ul
-The last bullet can be useful when you wish to build a stripped-down
+Syntax: Make.py switch args ... {action1} {action2} ...
-version of LAMMPS to run a specific script(s).  Or when you wish to
+  actions:
-move the minimal amount of files to another platform for a remote
+    lib-all, lib-dir, clean, file, exe or machine
-LAMMPS build.
+    zero or more actions, in any order (machine must be last)
  switches:
    -d (dir), -j (jmake), -m (makefile), -o (output),
    -p (packages), -r (redo), -s (settings), -v (verbose)
  switches for libs:
    -atc, -awpmd, -colvars, -cuda
    -gpu, -meam, -poems, -qmmm, -reax
  switches for build and makefile options:
    -intel, -kokkos, -cc, -mpi, -fft, -jpg, -png :pre
-Note that using Make.py is not a substitute for insuring you have a
+Using the "-h" switch with other switches and actions gives additional
-valid src/MAKE/Makefile.foo for your system, or that external library
+info on all the other specified switches or actions.  The "-h" can be
-Makefiles in any lib/* directories you use are also valid for your
+anywhere in the command-line and the other switches do not need their
-system.  But once you have done that, you can use Make.py to quickly
+arguments.  E.g. typing "Make.py -h -d -atc -intel" will print:
-include/exclude the packages and external libraries needed by your
+
-input scripts.
+-d dir
  dir = LAMMPS home dir
  if -d not specified, working dir must be lammps/src :pre
 -atc make=suffix lammps=suffix2
  all args are optional and can be in any order
  make = use Makefile.suffix (def = g++)
  lammps = use Makefile.lammps.suffix2 (def = EXTRAMAKE in makefile) :pre
 -intel mode
  mode = cpu or phi (def = cpu)
    build Intel package for CPU or Xeon Phi :pre
 Note that Make.py never overwrites an existing Makefile.machine.
 Instead, it creates src/MAKE/MINE/Makefile.auto, which you can save or
 rename if desired.  Likewise it creates an executable named
 src/lmp_auto, which you can rename using the -o switch if desired.
 The most recently executed Make.py command is saved in
 src/Make.py.last.  You can use the "-r" switch (for redo) to re-invoke
 the last command, or you can save a sequence of one or more Make.py
 commands to a file and invoke the file of commands using "-r".  You
 can also label the commands in the file and invoke one or more of them
 by name.
 A typical use of Make.py is to start with a valid Makefile.machine for
 your system, that works for a vanilla LAMMPS build, i.e. when optional
 packages are not installed.  You can then use Make.py to add various
 settings (FFT, JPG, PNG) to the Makefile.machine as well as change its
 compiler and MPI options.  You can also add additional packages to the
 build, as well as build the needed supporting libraries.
 You can also use Make.py to create a new Makefile.machine from
 scratch, using the "-m none" switch, if you also specify what compiler
 and MPI options to use, via the "-cc" and "-mpi" switches.
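
The typical use described above can be sketched as a short shell script.  The two Make.py command lines are taken from the examples earlier in this section; the "run" helper and the DRY_RUN variable are hypothetical and only keep the sketch from executing anything by default.

```shell
#!/bin/sh
# run: hypothetical helper (not part of LAMMPS). It prints a command and
# executes it only when DRY_RUN is set to something other than 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = 1 ] || "$@"
}

# Intended to be invoked from the LAMMPS src directory.
# Add JPG and FFTW support to Makefile.mpi:
run python Make.py -m mpi -jpg -fft fftw file
# Build LAMMPS with a parallel make using Makefile.mpi:
run python Make.py -j 16 -m mpi exe
```

As noted above, Make.py does not overwrite Makefile.mpi itself; it writes src/MAKE/MINE/Makefile.auto and names the executable src/lmp_auto unless the -o switch is used.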
 :line
--- a/doc/accelerate_cuda.html
+++ b/doc/accelerate_cuda.html
@ -74,16 +74,30 @@ projects can be compiled without problems.
 <P>This requires two steps (a,b): build the USER-CUDA library, then build
 LAMMPS with the USER-CUDA package.
 </P>
 <P>You can do both these steps in one line, using the src/Make.py script,
 described in <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.
 Type "Make.py -h" for help.  If run from the src directory, this
 command will create src/lmp_cuda using src/MAKE/Makefile.mpi as the
 starting Makefile.machine:
 </P>
 <PRE>Make.py -p cuda -cuda mode=single arch=20 -o cuda lib-cuda file mpi 
 </PRE>
 <P>Or you can follow these two (a,b) steps:
 </P>
 <P>(a) Build the USER-CUDA library
 </P>
 <P>The USER-CUDA library is in lammps/lib/cuda.  If your <I>CUDA</I> toolkit
 is not installed in the default system directory <I>/usr/local/cuda</I>, edit
 the file <I>lib/cuda/Makefile.common</I> accordingly.
 </P>
-<P>To set options for the library build, type "make OPTIONS", where
+<P>To build the library with the settings in lib/cuda/Makefile.defaults,
 simply type:
 </P>
 <PRE>make 
 </PRE>
 <P>To set options when the library is built, type "make OPTIONS", where
 <I>OPTIONS</I> are one or more of the following. The settings will be
-written to the <I>lib/cuda/Makefile.defaults</I> and used when
+written to the <I>lib/cuda/Makefile.defaults</I> before the build.
 the library is built.
 </P>
 <PRE><I>precision=N</I> to set the precision level
  N = 1 for single precision (default)
@ -107,11 +121,8 @@ the library is built.
  0 = no CUFFT support (default)
  in the future other CUDA-enabled FFT libraries might be supported 
 </PRE>
-<P>To build the library, simply type:
+<P>If the build is successful, it will produce the files liblammpscuda.a and
-</P>
+Makefile.lammps.
 <PRE>make 
 </PRE>
 <P>If successful, it will produce the files libcuda.a and Makefile.lammps.
 </P>
 <P>Note that if you change any of the options (like precision), you need
 to re-build the entire library.  Do a "make clean" first, followed by
@ -123,8 +134,7 @@ to re-build the entire library.  Do a "make clean" first, followed by
 make yes-user-cuda
 make machine 
 </PRE>
-<P>No additional compile/link flags are needed in your Makefile.machine
+<P>No additional compile/link flags are needed in Makefile.machine.
 in src/MAKE.
 </P>
 <P>Note that if you change the USER-CUDA library precision (discussed
 above) and rebuild the USER-CUDA library, then you also need to
--- a/doc/accelerate_cuda.txt
+++ b/doc/accelerate_cuda.txt
@ -71,16 +71,30 @@ projects can be compiled without problems.
 This requires two steps (a,b): build the USER-CUDA library, then build
 LAMMPS with the USER-CUDA package.
 You can do both these steps in one line, using the src/Make.py script,
 described in "Section 2.4"_Section_start.html#start_4 of the manual.
 Type "Make.py -h" for help.  If run from the src directory, this
 command will create src/lmp_cuda using src/MAKE/Makefile.mpi as the
 starting Makefile.machine:
 Make.py -p cuda -cuda mode=single arch=20 -o cuda lib-cuda file mpi :pre
 Or you can follow these two (a,b) steps:
 (a) Build the USER-CUDA library
 The USER-CUDA library is in lammps/lib/cuda.  If your {CUDA} toolkit
 is not installed in the default system directory {/usr/local/cuda}, edit
 the file {lib/cuda/Makefile.common} accordingly.
-To set options for the library build, type "make OPTIONS", where
+To build the library with the settings in lib/cuda/Makefile.defaults,
 simply type:
 make :pre
 To set options when the library is built, type "make OPTIONS", where
 {OPTIONS} are one or more of the following. The settings will be
-written to the {lib/cuda/Makefile.defaults} and used when
+written to the {lib/cuda/Makefile.defaults} before the build.
 the library is built.
 {precision=N} to set the precision level
  N = 1 for single precision (default)
@ -104,11 +118,8 @@ the library is built.
  0 = no CUFFT support (default)
  in the future other CUDA-enabled FFT libraries might be supported :pre
-To build the library, simply type:
+If the build is successful, it will produce the files liblammpscuda.a and
-
+Makefile.lammps.
 make :pre
 If successful, it will produce the files libcuda.a and Makefile.lammps.
 Note that if you change any of the options (like precision), you need
 to re-build the entire library.  Do a "make clean" first, followed by
@ -120,8 +131,7 @@ cd lammps/src
 make yes-user-cuda
 make machine :pre
-No additional compile/link flags are needed in your Makefile.machine
+No additional compile/link flags are needed in Makefile.machine.
 in src/MAKE.
 Note that if you change the USER-CUDA library precision (discussed
 above) and rebuild the USER-CUDA library, then you also need to
--- a/doc/accelerate_gpu.html
+++ b/doc/accelerate_gpu.html
@ -76,6 +76,16 @@ install the NVIDIA Cuda software on your system:
 <P>This requires two steps (a,b): build the GPU library, then build
 LAMMPS with the GPU package.
 </P>
 <P>You can do both these steps in one line, using the src/Make.py script,
 described in <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.
 Type "Make.py -h" for help.  If run from the src directory, this
 command will create src/lmp_gpu using src/MAKE/Makefile.mpi as the
 starting Makefile.machine:
 </P>
 <PRE>Make.py -p gpu -gpu mode=single arch=31 -o gpu lib-gpu file mpi 
 </PRE>
 <P>Or you can follow these two (a,b) steps:
 </P>
 <P>(a) Build the GPU library
 </P>
 <P>The GPU library is in lammps/lib/gpu.  Select a Makefile.machine (in
@ -120,8 +130,7 @@ Makefile.linux clean", followed by the make command above.
 make yes-gpu
 make machine 
 </PRE>
-<P>No additional compile/link flags are needed in your Makefile.machine
+<P>No additional compile/link flags are needed in Makefile.machine.
 in src/MAKE.
 </P>
 <P>Note that if you change the GPU library precision (discussed above)
 and rebuild the GPU library, then you also need to re-install the GPU
--- a/doc/accelerate_gpu.txt
+++ b/doc/accelerate_gpu.txt
@ -73,6 +73,16 @@ Run lammps/lib/gpu/nvc_get_devices (after building the GPU library, see below) t
 This requires two steps (a,b): build the GPU library, then build
 LAMMPS with the GPU package.
 You can do both these steps in one line, using the src/Make.py script,
 described in "Section 2.4"_Section_start.html#start_4 of the manual.
 Type "Make.py -h" for help.  If run from the src directory, this
 command will create src/lmp_gpu using src/MAKE/Makefile.mpi as the
 starting Makefile.machine:
 Make.py -p gpu -gpu mode=single arch=31 -o gpu lib-gpu file mpi :pre
 Or you can follow these two (a,b) steps:
 (a) Build the GPU library
 The GPU library is in lammps/lib/gpu.  Select a Makefile.machine (in
@ -117,8 +127,7 @@ cd lammps/src
 make yes-gpu
 make machine :pre
-No additional compile/link flags are needed in your Makefile.machine
+No additional compile/link flags are needed in Makefile.machine.
 in src/MAKE.
 Note that if you change the GPU library precision (discussed above)
 and rebuild the GPU library, then you also need to re-install the GPU
--- a/doc/accelerate_intel.html
+++ b/doc/accelerate_intel.html
@ -41,6 +41,10 @@ suffix to "omp" so that styles from the USER-OMP package will be used
 if available, after first testing if a style from the USER-INTEL
 package is available.
 </P>
 <P>When using the USER-INTEL package, you must choose at build time
 whether you are building for CPU-only acceleration or for using the
 Xeon Phi in offload mode.
 </P>
 <P>Here is a quick overview of how to use the USER-INTEL package
 for CPU-only acceleration:
 </P>
@ -50,6 +54,9 @@ for CPU-only acceleration:
 <LI>specify how many OpenMP threads per MPI task to use
 <LI>use USER-INTEL and (optionally) USER-OMP styles in your input script 
 </UL>
 <P>Note that many of these settings can only be used with the Intel
 compiler, as discussed below.
 </P>
 <P>Using the USER-INTEL package to offload work to the Intel(R)
 Xeon Phi(TM) coprocessor is the same except for these additional
 steps:
@ -74,25 +81,41 @@ Phi(TM) coprocessors.
 Intel(R) compiler.  Use of other compilers may not result in
 vectorization or give poor performance.
 </P>
-<P>Use of an Intel C++ compiler is reccommended, but not required.  The
+<P>Use of an Intel C++ compiler is recommended, but not required (though
-compiler must support the OpenMP interface.
+g++ will not recognize some of the settings, so they cannot be used).
 The compiler must support the OpenMP interface.
 </P>
 <P><B>Building LAMMPS with the USER-INTEL package:</B>
 </P>
-<P>Include the package(s) and build LAMMPS:  
+<P>You must choose at build time whether to build for CPU acceleration or
 to use the Xeon Phi in offload mode.
 </P>
 <P>You can do either in one line, using the src/Make.py script, described
 in <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.  Type
 "Make.py -h" for help.  If run from the src directory, these commands
 will create src/lmp_intel_cpu and lmp_intel_phi using
 src/MAKE/Makefile.mpi as the starting Makefile.machine:
 </P>
 <PRE>Make.py -p intel omp -intel cpu -o intel_cpu -cc icc file mpi 
 Make.py -p intel omp -intel phi -o intel_phi -cc icc file mpi 
 </PRE>
 <P>Note that this assumes that your MPI and its mpicxx wrapper
 is using the Intel compiler.  If it is not, you should
 leave off the "-cc icc" switch.
 </P>
 <P>Or you can follow these steps:
 </P>
 <PRE>cd lammps/src
 make yes-user-intel
 make yes-user-omp (if desired)
 make machine 
 </PRE>
-<P>If the USER-OMP package is also installed, you can use styles from
+<P>Note that if the USER-OMP package is also installed, you can use
-both packages, as described below.
+styles from both packages, as described below.
 </P>
-<P>The lo-level src/MAKE/Makefile.machine needs a flag for OpenMP support
+<P>The Makefile.machine needs a "-fopenmp" flag for OpenMP support in
-in both the CCFLAGS and LINKFLAGS variables, which is <I>-openmp</I> for
+both the CCFLAGS and LINKFLAGS variables.  You also need to add
-Intel compilers.  You also need to add -DLAMMPS_MEMALIGN=64 and
+-DLAMMPS_MEMALIGN=64 and -restrict to CCFLAGS.
 -restrict to CCFLAGS.
 </P>
 <P>If you are compiling on the same architecture that will be used for
 the runs, adding the flag <I>-xHost</I> to CCFLAGS will enable
@ -102,10 +125,10 @@ vectorization with the Intel(R) compiler.
 coprocessor, the flag <I>-offload</I> should be added to the LINKFLAGS line
 and the flag -DLMP_INTEL_OFFLOAD should be added to the CCFLAGS line.
 </P>
-<P>Note that the machine makefiles Makefile.intel and
+<P>Example makefiles Makefile.intel_cpu and Makefile.intel_phi are
-Makefile.intel_offload are included in the src/MAKE directory with
+included in the src/MAKE/OPTIONS directory with settings that perform
-options that perform well with the Intel(R) compiler. The latter file
+well with the Intel(R) compiler. The latter file has support for
-has support for offload to coprocessors; the former does not.
+offload to coprocessors; the former does not.
 </P>
 <P>If using an Intel compiler, it is recommended that Intel(R) Compiler
 2013 SP1 update 1 be used.  Newer versions have some performance
--- a/doc/accelerate_intel.txt
+++ b/doc/accelerate_intel.txt
@ -38,6 +38,10 @@ suffix to "omp" so that styles from the USER-OMP package will be used
 if available, after first testing if a style from the USER-INTEL
 package is available.
 When using the USER-INTEL package, you must choose at build time
 whether you are building for CPU-only acceleration or for using the
 Xeon Phi in offload mode.
 Here is a quick overview of how to use the USER-INTEL package
 for CPU-only acceleration:
@ -47,6 +51,9 @@ include the USER-INTEL package and (optionally) USER-OMP package and build LAMMP
 specify how many OpenMP threads per MPI task to use
 use USER-INTEL and (optionally) USER-OMP styles in your input script :ul
 Note that many of these settings can only be used with the Intel
 compiler, as discussed below.
 Using the USER-INTEL package to offload work to the Intel(R)
 Xeon Phi(TM) coprocessor is the same except for these additional
 steps:
@ -71,25 +78,41 @@ Optimizations for vectorization have only been tested with the
 Intel(R) compiler.  Use of other compilers may not result in
 vectorization or may give poor performance.
-Use of an Intel C++ compiler is reccommended, but not required.  The
+Use of an Intel C++ compiler is recommended, but not required (though
-compiler must support the OpenMP interface.
+g++ will not recognize some of the settings, so they cannot be used).
 The compiler must support the OpenMP interface.
 [Building LAMMPS with the USER-INTEL package:]
-Include the package(s) and build LAMMPS:  
+You must choose at build time whether to build for CPU acceleration or
 to use the Xeon Phi in offload mode.
 You can do either in one line, using the src/Make.py script, described
 in "Section 2.4"_Section_start.html#start_4 of the manual.  Type
 "Make.py -h" for help.  If run from the src directory, these commands
 will create src/lmp_intel_cpu and lmp_intel_phi using
 src/MAKE/Makefile.mpi as the starting Makefile.machine:
 Make.py -p intel omp -intel cpu -o intel_cpu -cc icc file mpi 
 Make.py -p intel omp -intel phi -o intel_phi -cc icc file mpi :pre
 Note that this assumes that the mpicxx wrapper of your MPI installation
 uses the Intel compiler.  If it does not, you should
 leave off the "-cc icc" switch.
 Or you can follow these steps:
 cd lammps/src
 make yes-user-intel
 make yes-user-omp (if desired)
 make machine :pre
-If the USER-OMP package is also installed, you can use styles from
+Note that if the USER-OMP package is also installed, you can use
-both packages, as described below.
+styles from both packages, as described below.
-The lo-level src/MAKE/Makefile.machine needs a flag for OpenMP support
+The Makefile.machine needs a "-fopenmp" flag for OpenMP support in
-in both the CCFLAGS and LINKFLAGS variables, which is {-openmp} for
+both the CCFLAGS and LINKFLAGS variables.  You also need to add
-Intel compilers.  You also need to add -DLAMMPS_MEMALIGN=64 and
+-DLAMMPS_MEMALIGN=64 and -restrict to CCFLAGS.
 -restrict to CCFLAGS.
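 As a minimal sketch of the settings just described (not a complete
 makefile; any flags already present in your Makefile.machine should be
 kept, and -restrict is an Intel compiler option), the two variables
 might look like this:
 CCFLAGS =   -fopenmp -DLAMMPS_MEMALIGN=64 -restrict
 LINKFLAGS = -fopenmp :pre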
 If you are compiling on the same architecture that will be used for
 the runs, adding the flag {-xHost} to CCFLAGS will enable
@ -99,10 +122,10 @@ In order to build with support for an Intel(R) Xeon Phi(TM)
 coprocessor, the flag {-offload} should be added to the LINKFLAGS line
 and the flag -DLMP_INTEL_OFFLOAD should be added to the CCFLAGS line.
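 As a sketch only, combining the offload settings with the OpenMP and
 alignment flags described earlier, and omitting any other flags your
 Makefile.machine already contains, the two lines for an offload build
 might look like this:
 CCFLAGS =   -fopenmp -DLAMMPS_MEMALIGN=64 -restrict -DLMP_INTEL_OFFLOAD
 LINKFLAGS = -fopenmp -offload :pre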
-Note that the machine makefiles Makefile.intel and
+Example makefiles Makefile.intel_cpu and Makefile.intel_phi are
-Makefile.intel_offload are included in the src/MAKE directory with
+included in the src/MAKE/OPTIONS directory with settings that perform
-options that perform well with the Intel(R) compiler. The latter file
+well with the Intel(R) compiler. The latter file has support for
-has support for offload to coprocessors; the former does not.
+offload to coprocessors; the former does not.
 If using an Intel compiler, it is recommended that Intel(R) Compiler
 2013 SP1 update 1 be used.  Newer versions have some performance
--- a/doc/accelerate_kokkos.html
+++ b/doc/accelerate_kokkos.html
@ -61,14 +61,17 @@ one or the other of the two modes.  The first mode is called the
 processor (running in native mode, not offload mode like the
 USER-INTEL package) are supported.  The second mode is called the
 "device" and is an accelerator chip of some kind.  Currently only an
-NVIDIA GPU is supported.  If your compute node does not have a GPU,
+NVIDIA GPU is supported via Cuda.  If your compute node does not have
-then there is only one mode of execution, i.e. the host and device are
+a GPU, then there is only one mode of execution, i.e. the host and
-the same.
+device are the same.
 </P>
-<P>Here is a quick overview of how to use the KOKKOS package
+<P>When using the KOKKOS package, you must choose at build time whether
-for GPU acceleration:
+you are building for OpenMP, GPU, or for using the Xeon Phi in native
 mode.
 </P>
-<UL><LI>specify variables and settings in your Makefile.machine that enable GPU, Phi, or OpenMP support
+<P>Here is a quick overview of how to use the KOKKOS package:
 </P>
 <UL><LI>specify variables and settings in your Makefile.machine that enable OpenMP, GPU, or Phi support
 <LI>include the KOKKOS package and build LAMMPS
 <LI>enable the KOKKOS package and its hardware options via the "-k on" command-line switch
 <LI>use KOKKOS styles in your input script 
@ -105,14 +108,23 @@ and GPU packages for details of how to check and do this.
 </P>
 <P><B>Building LAMMPS with the KOKKOS package:</B>
 </P>
-<P>Unlike other acceleration packages discussed in this section, the
+<P>You must choose at build time whether to build for OpenMP, Cuda, or
-Kokkos library in lib/kokkos does not have to be pre-built before
+Phi.
 building LAMMPS itself.  Instead, options for the Kokkos library are
 specified at compile time, when LAMMPS itself is built.  This can be
 done in one of two ways, as discussed below.
 </P>
-<P>Here are examples of how to build LAMMPS for the different compute-node
+<P>You can do any of these in one line, using the src/Make.py script,
-configurations listed above.
+described in <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.
 Type "Make.py -h" for help.  If run from the src directory, these
 commands will create src/lmp_kokkos_omp, lmp_kokkos_cuda, and
 lmp_kokkos_phi.  The OMP and PHI options use src/MAKE/Makefile.mpi as
 the starting Makefile.machine.  The CUDA option uses
 src/MAKE/OPTIONS/Makefile.cuda since the NVIDIA nvcc compiler is
 required.
 </P>
 <P>Make.py -p kokkos -kokkos omp -o kokkos_omp file mpi 
 Make.py -p kokkos -kokkos cuda arch=31 -o kokkos_cuda file kokkos_cuda
 Make.py -p kokkos -kokkos phi -o kokkos_phi file mpi 
 </P>
 <P>Or you can follow these steps:
 </P>
 <P>CPU-only (run all-MPI or with OpenMP threading):
 </P>
@ -164,15 +176,76 @@ in <A HREF = "Section_start.html#start_3_4">Section 2.3.4</A> of the manual, as
 as other settings that must be included in the machine makefile, if
 you create your own.
 </P>
 <P>There are other allowed options when building with the KOKKOS package.
 As above, they can be set either as variables on the make command line
 or in the machine makefile in the src/MAKE directory.  See <A HREF = "Section_start.html#start_3_4">Section
 2.3.4</A> of the manual for details.
 </P>
 <P>IMPORTANT NOTE: Currently, there are no precision options with the
 KOKKOS package.  All compilation and computation is performed in
 double precision.
 </P>
 <P>There are other allowed options when building with the KOKKOS package.
 As above, they can be set either as variables on the make command line
 or in Makefile.machine.  This is the full list of options, including
 those discussed above.  Each takes a value of <I>yes</I> or <I>no</I>.  The
 default value listed for each option is the one set in the
 lib/kokkos/Makefile.lammps file.
 </P>
 <UL><LI>OMP, default = <I>yes</I>
 <LI>CUDA, default = <I>no</I>
 <LI>HWLOC, default = <I>no</I>
 <LI>AVX, default = <I>no</I>
 <LI>MIC, default = <I>no</I>
 <LI>LIBRT, default = <I>no</I>
 <LI>DEBUG, default = <I>no</I> 
 </UL>
 <P>OMP sets the parallelization method used for Kokkos code (within
 LAMMPS) that runs on the host.  OMP=yes means that OpenMP will be
 used.  OMP=no means that pthreads will be used.
 </P>
 <P>CUDA sets the parallelization method used for Kokkos code (within
 LAMMPS) that runs on the device.  CUDA=yes means an NVIDIA GPU running
 CUDA will be used.  CUDA=no means that the OMP=yes or OMP=no setting
 will be used for the device as well as the host.
 </P>
 <P>If CUDA=yes, then the lo-level Makefile in the src/MAKE directory must
 use "nvcc" as its compiler, via its CC setting.  For best performance
 its CCFLAGS setting should use -O3 and have an -arch setting that
 matches the compute capability of your NVIDIA hardware and software
 installation, e.g. -arch=sm_20.  Generally Fermi generation GPUs are
 sm_20, while Kepler generation GPUs are sm_30 or sm_35 and Maxwell
 cards are sm_50.  A complete list can be found on
 <A HREF = "http://en.wikipedia.org/wiki/CUDA#Supported_GPUs">wikipedia</A>. You can
 also use the deviceQuery tool that comes with the CUDA samples.  Note
 that the minimum required compute capability is 2.0, but this will give
 significantly reduced performance compared to Kepler generation GPUs
 with compute capability 3.x.  For the LINK setting, "nvcc" should not
 be used; instead use g++ or another compiler suitable for linking C++
 applications.  Often you will want to use your MPI compiler wrapper
 for this setting (e.g. mpicxx).  Finally, the lo-level Makefile must
 also have a "Compilation rule" for creating *.o files from *.cu files.
 See src/Makefile.cuda for an example of a lo-level Makefile with all
 of these settings.
 </P>
 <P>HWLOC binds threads to hardware cores, so they do not migrate during a
 simulation.  HWLOC=yes should always be used if running with OMP=no
 for pthreads.  It is not necessary for OMP=yes for OpenMP, because
 OpenMP provides alternative methods via environment variables for
 binding threads to hardware cores.  More info on binding threads to
 cores is given in <A HREF = "Section_accelerate.html#acc_8">this section</A>.
 </P>
 <P>AVX enables Intel advanced vector extensions when compiling for an
 Intel-compatible chip.  AVX=yes should only be set if your host
 hardware supports AVX.  If it does not support it, this will cause a
 run-time crash.
 </P>
 <P>MIC enables compiler switches needed when compiling for an Intel Phi
 processor.
 </P>
 <P>LIBRT enables use of a more accurate timer mechanism on most Unix
 platforms.  This library is not available on all platforms.
 </P>
 <P>DEBUG is only useful when developing a Kokkos-enabled style within
 LAMMPS.  DEBUG=yes enables printing of run-time debugging information
 that can be useful.  It also enables runtime bounds checking on Kokkos
 data structures.
 </P>
 <P><B>Run with the KOKKOS package from the command line:</B>
 </P>
 <P>The mpirun or mpiexec command sets the total number of MPI tasks used
--- a/doc/accelerate_kokkos.txt
+++ b/doc/accelerate_kokkos.txt
@ -58,14 +58,17 @@ one or the other of the two modes.  The first mode is called the
 processor (running in native mode, not offload mode like the
 USER-INTEL package) are supported.  The second mode is called the
 "device" and is an accelerator chip of some kind.  Currently only an
-NVIDIA GPU is supported.  If your compute node does not have a GPU,
+NVIDIA GPU is supported via Cuda.  If your compute node does not have
-then there is only one mode of execution, i.e. the host and device are
+a GPU, then there is only one mode of execution, i.e. the host and
-the same.
+device are the same.
-Here is a quick overview of how to use the KOKKOS package
+When using the KOKKOS package, you must choose at build time whether
-for GPU acceleration:
+you are building for OpenMP, GPU, or for using the Xeon Phi in native
 mode.
-specify variables and settings in your Makefile.machine that enable GPU, Phi, or OpenMP support
+Here is a quick overview of how to use the KOKKOS package:
 specify variables and settings in your Makefile.machine that enable OpenMP, GPU, or Phi support
 include the KOKKOS package and build LAMMPS
 enable the KOKKOS package and its hardware options via the "-k on" command-line switch
 use KOKKOS styles in your input script :ul
@ -102,14 +105,23 @@ and GPU packages for details of how to check and do this.
 [Building LAMMPS with the KOKKOS package:]
-Unlike other acceleration packages discussed in this section, the
+You must choose at build time whether to build for OpenMP, Cuda, or
-Kokkos library in lib/kokkos does not have to be pre-built before
+Phi.
 building LAMMPS itself.  Instead, options for the Kokkos library are
 specified at compile time, when LAMMPS itself is built.  This can be
 done in one of two ways, as discussed below.
-Here are examples of how to build LAMMPS for the different compute-node
+You can do any of these in one line, using the src/Make.py script,
-configurations listed above.
+described in "Section 2.4"_Section_start.html#start_4 of the manual.
 Type "Make.py -h" for help.  If run from the src directory, these
 commands will create src/lmp_kokkos_omp, lmp_kokkos_cuda, and
 lmp_kokkos_phi.  The OMP and PHI options use src/MAKE/Makefile.mpi as
 the starting Makefile.machine.  The CUDA option uses
 src/MAKE/OPTIONS/Makefile.cuda since the NVIDIA nvcc compiler is
 required.
 Make.py -p kokkos -kokkos omp -o kokkos_omp file mpi 
 Make.py -p kokkos -kokkos cuda arch=31 -o kokkos_cuda file kokkos_cuda
 Make.py -p kokkos -kokkos phi -o kokkos_phi file mpi 
 Or you can follow these steps:
 CPU-only (run all-MPI or with OpenMP threading):
@ -161,15 +173,76 @@ in "Section 2.3.4"_Section_start.html#start_3_4 of the manual, as well
 as other settings that must be included in the machine makefile, if
 you create your own.
 There are other allowed options when building with the KOKKOS package.
 As above, they can be set either as variables on the make command line
 or in the machine makefile in the src/MAKE directory.  See "Section
 2.3.4"_Section_start.html#start_3_4 of the manual for details.
 IMPORTANT NOTE: Currently, there are no precision options with the
 KOKKOS package.  All compilation and computation is performed in
 double precision.
 There are other allowed options when building with the KOKKOS package.
 As above, they can be set either as variables on the make command line
 or in Makefile.machine.  This is the full list of options, including
 those discussed above.  Each takes a value of {yes} or {no}.  The
 default value listed for each option is the one set in the
 lib/kokkos/Makefile.lammps file.
 OMP, default = {yes}
 CUDA, default = {no}
 HWLOC, default = {no}
 AVX, default = {no}
 MIC, default = {no}
 LIBRT, default = {no}
 DEBUG, default = {no} :ul
 OMP sets the parallelization method used for Kokkos code (within
 LAMMPS) that runs on the host.  OMP=yes means that OpenMP will be
 used.  OMP=no means that pthreads will be used.
 CUDA sets the parallelization method used for Kokkos code (within
 LAMMPS) that runs on the device.  CUDA=yes means an NVIDIA GPU running
 CUDA will be used.  CUDA=no means that the OMP=yes or OMP=no setting
 will be used for the device as well as the host.
 If CUDA=yes, then the lo-level Makefile in the src/MAKE directory must
 use "nvcc" as its compiler, via its CC setting.  For best performance
 its CCFLAGS setting should use -O3 and have an -arch setting that
 matches the compute capability of your NVIDIA hardware and software
 installation, e.g. -arch=sm_20.  Generally Fermi generation GPUs are
 sm_20, while Kepler generation GPUs are sm_30 or sm_35 and Maxwell
 cards are sm_50.  A complete list can be found on
 "wikipedia"_http://en.wikipedia.org/wiki/CUDA#Supported_GPUs. You can
 also use the deviceQuery tool that comes with the CUDA samples.  Note
 that the minimum required compute capability is 2.0, but this will give
 significantly reduced performance compared to Kepler generation GPUs
 with compute capability 3.x.  For the LINK setting, "nvcc" should not
 be used; instead use g++ or another compiler suitable for linking C++
 applications.  Often you will want to use your MPI compiler wrapper
 for this setting (e.g. mpicxx).  Finally, the lo-level Makefile must
 also have a "Compilation rule" for creating *.o files from *.cu files.
 See src/Makefile.cuda for an example of a lo-level Makefile with all
 of these settings.
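 The settings above can be sketched as the following fragment of a
 lo-level Makefile.  This is an illustration under stated assumptions,
 not a copy of the example makefile shipped with LAMMPS: the sm_35
 value assumes a Kepler GPU, mpicxx is assumed to be your MPI wrapper,
 and the compilation rule is shown in its simplest form, without the
 include paths and other variables a real makefile would also pass to
 the compiler:
 CC =        nvcc
 CCFLAGS =   -O3 -arch=sm_35
 LINK =      mpicxx
 %.o:%.cu
 	$(CC) $(CCFLAGS) -c $< :pre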
 HWLOC binds threads to hardware cores, so they do not migrate during a
 simulation.  HWLOC=yes should always be used if running with OMP=no
 for pthreads.  It is not necessary for OMP=yes for OpenMP, because
 OpenMP provides alternative methods via environment variables for
 binding threads to hardware cores.  More info on binding threads to
 cores is given in "this section"_Section_accelerate.html#acc_8.
 AVX enables Intel advanced vector extensions when compiling for an
 Intel-compatible chip.  AVX=yes should only be set if your host
 hardware supports AVX.  If it does not support it, this will cause a
 run-time crash.
 MIC enables compiler switches needed when compiling for an Intel Phi
 processor.
 LIBRT enables use of a more accurate timer mechanism on most Unix
 platforms.  This library is not available on all platforms.
 DEBUG is only useful when developing a Kokkos-enabled style within
 LAMMPS.  DEBUG=yes enables printing of run-time debugging information
 that can be useful.  It also enables runtime bounds checking on Kokkos
 data structures.
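 As an illustration of the command-line form of these options (a
 sketch; "machine" stands for the name of your machine makefile, and
 the particular combination is only an example that assumes the hwloc
 library is installed), a host build that uses pthreads with thread
 binding might be requested like this:
 make machine OMP=no HWLOC=yes :pre
 Any option not named on the command line keeps the default value
 listed above.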
 [Run with the KOKKOS package from the command line:]
 The mpirun or mpiexec command sets the total number of MPI tasks used
--- a/doc/accelerate_omp.html
+++ b/doc/accelerate_omp.html
@ -42,17 +42,27 @@ MPI task running on a CPU.
 </P>
 <P><B>Building LAMMPS with the USER-OMP package:</B>
 </P>
-<P>Include the package and build LAMMPS:
+<P>To do this in one line, use the src/Make.py script, described in
 <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.  Type "Make.py
 -h" for help.  If run from the src directory, this command will create
 src/lmp_omp using src/MAKE/Makefile.mpi as the starting
 Makefile.machine:
 </P>
 <PRE>Make.py -p omp -o omp file mpi 
 </PRE>
 <P>Or you can follow these steps:
 </P>
 <PRE>cd lammps/src
 make yes-user-omp
 make machine 
 </PRE>
-<P>The CCFLAGS setting in your src/MAKE/Makefile.machine needs "-fopenmp"
+<P>The CCFLAGS setting in Makefile.machine needs "-fopenmp" to add OpenMP
-to add OpenMP support.  This works for both the GNU and Intel
+support.  This works for both the GNU and Intel compilers.  Without
-compilers.  Without this flag the USER-OMP styles will still be
+this flag the USER-OMP styles will still be compiled and work, but
-compiled and work, but will not support multi-threading.  For the
+will not support multi-threading.  For the Intel compilers the CCFLAGS
-Intel compilers the CCFLAGS setting also needs to include "-restrict".
+setting also needs to include "-restrict".
 </P>
 <P><B>Run with the USER-OMP package from the command line:</B>
 </P>
 <P>The mpirun or mpiexec command sets the total number of MPI tasks used
 by LAMMPS (one or multiple per compute node) and the number of MPI
--- a/doc/accelerate_omp.txt
+++ b/doc/accelerate_omp.txt
@ -39,17 +39,27 @@ MPI task running on a CPU.
 [Building LAMMPS with the USER-OMP package:]
-Include the package and build LAMMPS:
+To do this in one line, use the src/Make.py script, described in
 "Section 2.4"_Section_start.html#start_4 of the manual.  Type "Make.py
 -h" for help.  If run from the src directory, this command will create
 src/lmp_omp using src/MAKE/Makefile.mpi as the starting
 Makefile.machine:
 Make.py -p omp -o omp file mpi :pre
 Or you can follow these steps:
 cd lammps/src
 make yes-user-omp
 make machine :pre
-The CCFLAGS setting in your src/MAKE/Makefile.machine needs "-fopenmp"
+The CCFLAGS setting in Makefile.machine needs "-fopenmp" to add OpenMP
-to add OpenMP support.  This works for both the GNU and Intel
+support.  This works for both the GNU and Intel compilers.  Without
-compilers.  Without this flag the USER-OMP styles will still be
+this flag the USER-OMP styles will still be compiled and work, but
-compiled and work, but will not support multi-threading.  For the
+will not support multi-threading.  For the Intel compilers the CCFLAGS
-Intel compilers the CCFLAGS setting also needs to include "-restrict".
+setting also needs to include "-restrict".
 [Run with the USER-OMP package from the command line:]
 The mpirun or mpiexec command sets the total number of MPI tasks used
 by LAMMPS (one or multiple per compute node) and the number of MPI
--- a/doc/accelerate_opt.html
+++ b/doc/accelerate_opt.html
@ -38,12 +38,22 @@ input script.
 </P>
 <P>Include the package and build LAMMPS:
 </P>
 <P>To do this in one line, use the src/Make.py script, described in
 <A HREF = "Section_start.html#start_4">Section 2.4</A> of the manual.  Type "Make.py
 -h" for help.  If run from the src directory, this command will create
 src/lmp_opt using src/MAKE/Makefile.mpi as the starting
 Makefile.machine:
 </P>
 <PRE>Make.py -p opt -o opt file mpi 
 </PRE>
 <P>Or you can follow these steps:
 </P>
 <PRE>cd lammps/src
 make yes-opt
 make machine 
 </PRE>
-<P>If you are using Intel compilers, then the CCFLAGS setting in your
+<P>If you are using Intel compilers, then the CCFLAGS setting in
-src/MAKE/Makefile.machine needs to include "-restrict".
+Makefile.machine needs to include "-restrict".
 </P>
 <P><B>Run with the OPT package from the command line:</B>
 </P>
--- a/doc/accelerate_opt.txt
+++ b/doc/accelerate_opt.txt
@ -35,12 +35,22 @@ None.
 Include the package and build LAMMPS:
 To do this in one line, use the src/Make.py script, described in
 "Section 2.4"_Section_start.html#start_4 of the manual.  Type "Make.py
 -h" for help.  If run from the src directory, this command will create
 src/lmp_opt using src/MAKE/Makefile.mpi as the starting
 Makefile.machine:
 Make.py -p opt -o opt file mpi :pre
 Or you can follow these steps:
 cd lammps/src
 make yes-opt
 make machine :pre
-If you are using Intel compilers, then the CCFLAGS setting in your
+If you are using Intel compilers, then the CCFLAGS setting in
-src/MAKE/Makefile.machine needs to include "-restrict".
+Makefile.machine needs to include "-restrict".
 [Run with the OPT package from the command line:]