diff --git a/doc/Manual.html b/doc/Manual.html index 359798c3cd..857a90c623 100644 --- a/doc/Manual.html +++ b/doc/Manual.html @@ -152,23 +152,23 @@ it gives quick access to documentation for all LAMMPS commands.
  • How-to discussions @@ -417,16 +417,6 @@ it gives quick access to documentation for all LAMMPS commands. - - - - - - - - - - diff --git a/doc/Manual.txt b/doc/Manual.txt index 59ed9b9387..a0aaf08dd6 100644 --- a/doc/Manual.txt +++ b/doc/Manual.txt @@ -120,15 +120,15 @@ it gives quick access to documentation for all LAMMPS commands. 4.2 "User packages"_pkg_2 :ule,b "Accelerating LAMMPS performance"_Section_accelerate.html :l 5.1 "Measuring performance"_acc_1 :ulb,b - 5.2 "General strategies"_acc_2 :b - 5.3 "Packages with optimized styles"_acc_3 :b - 5.4 "OPT package"_acc_4 :b - 5.5 "USER-OMP package"_acc_5 :b - 5.6 "GPU package"_acc_6 :b - 5.7 "USER-CUDA package"_acc_7 :b - 5.8 "KOKKOS package"_acc_8 :b - 5.9 "USER-INTEL package"_acc_9 :b - 5.10 "Comparison of GPU and USER-CUDA packages"_acc_10 :ule,b + 5.2 "Algorithms and code options to boost performance"_acc_2 :b + 5.3 "Accelerator packages with optimized styles"_acc_3 :b + 5.3.1 "USER-CUDA package"_accelerate_cuda.html :ulb,b + 5.3.2 "GPU package"_accelerate_gpu.html :b + 5.3.3 "USER-INTEL package"_accelerate_intel.html :b + 5.3.4 "KOKKOS package"_accelerate_kokkos.html :b + 5.3.5 "USER-OMP package"_accelerate_omp.html :b + 5.3.6 "OPT package"_accelerate_opt.html :ule,b + 5.4 "Comparison of various accelerator packages"_acc_4 :ule,b "How-to discussions"_Section_howto.html :l 6.1 "Restarting a simulation"_howto_1 :ulb,b 6.2 "2d simulations"_howto_2 :b @@ -216,11 +216,6 @@ it gives quick access to documentation for all LAMMPS commands. 
:link(acc_2,Section_accelerate.html#acc_2) :link(acc_3,Section_accelerate.html#acc_3) :link(acc_4,Section_accelerate.html#acc_4) -:link(acc_5,Section_accelerate.html#acc_5) -:link(acc_6,Section_accelerate.html#acc_6) -:link(acc_7,Section_accelerate.html#acc_7) -:link(acc_8,Section_accelerate.html#acc_8) -:link(acc_9,Section_accelerate.html#acc_9) :link(howto_1,Section_howto.html#howto_1) :link(howto_2,Section_howto.html#howto_2) diff --git a/doc/Section_accelerate.html b/doc/Section_accelerate.html index 27b80f3d63..7547b571af 100644 --- a/doc/Section_accelerate.html +++ b/doc/Section_accelerate.html @@ -17,22 +17,38 @@ Section performance for different classes of problems running on different kinds of machines.

    -5.1 Measuring performance
    -5.2 General strategies
    -5.3 Packages with optimized styles
    -5.4 OPT package
    -5.5 USER-OMP package
    -5.6 GPU package
    -5.7 USER-CUDA package
    -5.8 KOKKOS package
    -5.9 USER-INTEL package
    -5.10 Comparison of USER-CUDA, GPU, and KOKKOS packages
    +

    There are two thrusts to the discussion that follows. The +first is to use code options that implement alternate algorithms +that can speed up a simulation. The second is to use one +of the several accelerator packages provided with LAMMPS that +contain code optimized for certain kinds of hardware, including +multi-core CPUs, GPUs, and Intel Xeon Phi coprocessors. +

    +

    The Benchmark page of the LAMMPS web site gives performance results for the various accelerator -packages discussed in this section for several of the standard LAMMPS -benchmarks, as a function of problem size and number of compute nodes, -on different hardware platforms. +packages discussed in Section 5.3, for several of the standard LAMMPS +benchmark problems, as a function of problem size and number of +compute nodes, on different hardware platforms.


    @@ -104,11 +120,9 @@ various options.
  • Staggered PPPM
  • single vs double PPPM
  • partial charge PPPM -
  • verlet/split -
  • processor mapping via processors numa command -
  • load-balancing: balance and fix balance -
  • processor command for layout -
  • OMP when lots of cores +
  • verlet/split run style +
  • processor command for proc layout and numa layout +
  • load-balancing: balance and fix balance

    2-FFT PPPM, also called analytic differentiation or ad PPPM, uses 2 FFTs instead of the 4 FFTs used by the default ik differentiation @@ -146,28 +160,30 @@ such as when using a barostat. fixes, computes, and other commands have been added to LAMMPS, which will typically run faster than the standard non-accelerated versions. Some require appropriate hardware -on your system, e.g. GPUs or Intel Xeon Phi chips. +to be present on your system, e.g. GPUs or Intel Xeon Phi +coprocessors.

    -

    All of these commands are in packages provided with LAMMPS, as -explained here. Currently, there are 6 such -accelerator packages in LAMMPS, either as standard or user packages: +

    All of these commands are in packages provided with LAMMPS. An +overview of packages is given in Section +packages. Currently, there are 6 accelerator +packages in LAMMPS, either as standard or user packages:

    - - - - - - + + + + +
    USER-CUDA for NVIDIA GPUs
    GPU for NVIDIA GPUs as well as OpenCL support
    USER-INTEL for Intel CPUs and Intel Xeon Phi
    KOKKOS for GPUs, Intel Xeon Phi, and OpenMP threading
    USER-OMP for OpenMP threading
    OPT generic CPU optimizations +
    USER-CUDA for NVIDIA GPUs
    GPU for NVIDIA GPUs as well as OpenCL support
    USER-INTEL for Intel CPUs and Intel Xeon Phi
    KOKKOS for GPUs, Intel Xeon Phi, and OpenMP threading
    USER-OMP for OpenMP threading
    OPT generic CPU optimizations

    Any accelerated style has the same name as the corresponding standard style, except that a suffix is appended. Otherwise, the syntax for -the command that specifies the style is identical, their functionality -is the same, and the numerical results it produces should also be the +the command that uses the style is identical, its functionality is +the same, and the numerical results it produces should also be the same, except for precision and round-off effects.
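
    The naming rule described above can be sketched as a small shell helper. This is only an illustration: the function name accelerated_names is hypothetical, and the suffix list (one suffix per accelerator package) is an assumption based on common LAMMPS usage rather than something stated in this diff. Not every style exists with every suffix.

```shell
#!/bin/sh
# Sketch: derive candidate accelerated style names from a base style name.
# ASSUMPTION: one suffix per accelerator package (cuda, gpu, intel, kk,
# omp, opt). A given style may not be implemented for every suffix.
accelerated_names() {
    base=$1
    for suffix in cuda gpu intel kk omp opt; do
        # An accelerated style is the standard name plus "/suffix".
        printf '%s/%s\n' "$base" "$suffix"
    done
}

accelerated_names lj/cut
```

    Whether a particular name is actually available depends on which packages were installed when LAMMPS was built.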

    -

    For example, all of these styles are variants of the basic +

    For example, all of these styles are accelerated variants of the Lennard-Jones pair_style lj/cut:

    -

    Assuming LAMMPS was built with the appropriate package, a simulation -using accelerated styles from the package can be run without modifying -your input script, by specifying command-line -switches. The details of how to do this -vary from package to package and are explained below. There is also a -suffix command and a package command that -accomplish the same thing and can be used within an input script if -preferred. The suffix command allows more precise -control of whether an accelerated or unaccelerated version of a style -is used at various points within an input script. +

    To see what accelerated styles are currently available, see +Section_commands 5 of the manual. The +doc pages for individual commands (e.g. pair lj/cut or +fix nve) also list any accelerated variants available +for that style.

    -

    To see what styles are currently available in each of the accelerated -packages, see Section_commands 5 of the -manual. The doc page for individual commands (e.g. pair -lj/cut or fix nve) also lists any -accelerated variants available for that style. +

    To use an accelerator package in LAMMPS, and one or more of the styles +it provides, follow these general steps. Details vary from package to +package and are explained in the individual accelerator sub-section +doc pages, listed above: +

    +
    + + + + + + + +
    build the accelerator library: only for USER-CUDA and GPU packages
    install the accelerator package: make yes-opt, make yes-user-intel, etc
    add compile/link flags to Makefile.machine in src/MAKE: only for USER-INTEL, KOKKOS, USER-OMP packages
    re-build LAMMPS: make machine
    run a LAMMPS simulation: lmp_machine < in.script
    enable the accelerator package: via "-c on" and "-k on" command-line switches, only for USER-CUDA and KOKKOS packages
    set any needed options for the package: via "-pk" command-line switch or package command, only if defaults need to be changed
    use accelerated styles in your input script: via "-sf" command-line switch or suffix command +
    + +

    The first 4 steps typically only need to be done once, to create an +executable that uses one or more accelerator packages. We are working +to create a "make" tool that will perform all 4 of these steps in a +single command. +

    +

    The last 4 steps can all be done from the command-line when LAMMPS is +launched, without changing your input script. Or you can add +package and suffix commands to your input +script.
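
    As a rough sketch of the command-line route, the helper below composes a launch line from the switches named in the steps above and prints it without running anything. The function name launch_line is hypothetical, lmp_machine and in.script are the placeholder names used in the text, and the example arguments ("omp 4", the opt/omp/kk suffixes) are illustrative assumptions, not values taken from this diff.

```shell
#!/bin/sh
# Sketch: compose (but do not execute) a LAMMPS launch line.
# Usage: launch_line SUFFIX [ENABLE_SWITCH] [PACKAGE_ARGS]
launch_line() {
    suffix=$1    # accelerated-style suffix passed to -sf (assumed value)
    enable=$2    # optional enable switch, e.g. "-k on" for KOKKOS
    pkargs=$3    # optional arguments passed to -pk (assumed values)
    line="lmp_machine"
    if [ -n "$enable" ]; then
        line="$line $enable"
    fi
    if [ -n "$pkargs" ]; then
        line="$line -pk $pkargs"
    fi
    line="$line -sf $suffix"
    # The input script is read from stdin, as in the steps above.
    printf '%s < in.script\n' "$line"
}

launch_line opt
launch_line omp "" "omp 4"
launch_line kk "-k on"
```

    Printing the line instead of executing it keeps the sketch independent of any particular build; the exact options each package accepts are described on the individual accelerator doc pages.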

    The examples directory has several sub-directories with scripts and -README files for using the accelerator packages: +README files for how to use the following accelerator packages: