git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@6697 f3b2605a-c512-4ea7-a41b-209d697bcdaa
@@ -190,55 +190,16 @@ from the GPU package, you can either append "gpu" to the style name
 switch</A>, or use the <A HREF = "suffix.html">suffix</A>
 command.
 </P>
-<P>The <A HREF = "fix_gpu.html">fix gpu</A> command controls the GPU selection and
-initialization steps.
+<P>The <A HREF = "package.html">package gpu</A> command must be used near the beginning
+of your script to control the GPU selection and initialization steps.
+It also enables asynchronous splitting of force computations between
+the CPUs and GPUs.
 </P>
-<P>The format for the fix is:
-</P>
-<PRE>fix fix-ID all gpu <I>mode</I> <I>first</I> <I>last</I> <I>split</I>
-</PRE>
-<P>where fix-ID is the name for the fix. The gpu fix must be the first
-fix specified for a given run, otherwise LAMMPS will exit with an
-error. The gpu fix does not have any effect on runs that do not use
-GPU acceleration, so there should be no problem specifying the fix
-first in any input script.
-</P>
-<P>The <I>mode</I> setting can be either "force" or "force/neigh". In the
-former, neighbor list calculation is performed on the CPU using the
-standard LAMMPS routines. In the latter, the neighbor list calculation
-is performed on the GPU. The GPU neighbor list can be used for better
-performance, however, it cannot not be used with a triclinic box or
-with <A HREF = "pair_hybrid.html">hybrid</A> pair styles.
-</P>
-<P>There are cases when it may be more efficient to select the CPU for
-neighbor list builds. If a non-GPU enabled style (e.g. a fix or
-compute) requires a neighbor list, it will also be built using CPU
-routines. Redundant CPU and GPU neighbor list calculations will
-typically be less efficient.
-</P>
-<P>The <I>first</I> setting is the ID (as reported by
-lammps/lib/gpu/nvc_get_devices) of the first GPU that will be used on
-each node. The <I>last</I> setting is the ID of the last GPU that will be
-used on each node. If you have only one GPU per node, <I>first</I> and
-<I>last</I> will typically both be 0. Selecting a non-sequential set of GPU
-IDs (e.g. 0,1,3) is not currently supported.
-</P>
-<P>The <I>split</I> setting is the fraction of particles whose forces,
-torques, energies, and/or virials will be calculated on the GPU. This
-can be used to perform CPU and GPU force calculations simultaneously,
-e.g. on a hybrid node with a multicore CPU and a GPU(s). If <I>split</I>
-is negative, the software will attempt to calculate the optimal
-fraction automatically every 25 timesteps based on CPU and GPU
-timings. Because the GPU speedups are dependent on the number of
-particles, automatic calculation of the split can be less efficient,
-but typically results in loop times within 20% of an optimal fixed
-split.
-</P>
-<P>As an example, if you have two GPUs per node, 8 CPU cores per node,
+<P>As an example, if you have two GPUs per node and 8 CPU cores per node,
 and would like to run on 4 nodes (32 cores) with dynamic balancing of
-force calculation across CPU and GPU cores, the fix might be
+force calculation across CPU and GPU cores, you could specify
 </P>
-<PRE>fix 0 all gpu force/neigh 0 1 -1
+<PRE>package gpu force/neigh 0 1 -1
 </PRE>
 <P>In this case, all CPU cores and GPU devices on the nodes would be
 utilized. Each GPU device would be shared by 4 CPU cores. The CPU
@@ -246,39 +207,14 @@ cores would perform force calculations for some fraction of the
 particles at the same time the GPUs performed force calculation for
 the other particles.
 </P>
-<P><B>Asynchronous pair computation on GPU and CPU</B>
-</P>
-<P>The GPU accelerated pair styles can perform pair style force
-calculation on the GPU at the same time other force calculations
-within LAMMPS are being performed on the CPU. These include pair,
-bond, angle, etc forces as well as long-range Coulombic forces. This
-is enabled by the <I>split</I> setting in the gpu fix as described above.
-</P>
-<P>With a <I>split</I> setting less than 1.0, a portion of the pair-wise force
-calculations will also be performed on the CPU. When the CPU finishes
-its pair style computations (if any), the next LAMMPS force
-computation will begin (bond, angle, etc), possibly before the GPU has
-finished its pair style computations.
-</P>
-<P>This means that if <I>split</I> is set to 1.0, the GPU will begin the
-LAMMPS force computation immediately. This can be used to run a
-<A HREF = "pair_hybrid.html">hybrid</A> GPU pair style at the same time as a hybrid
-CPU pair style. In this case, the GPU pair style should be first in
-the hybrid command in order to perform simultaneous calculations. This
-also allows <A HREF = "bond_style.html">bond</A>, <A HREF = "angle_style.html">angle</A>,
-<A HREF = "dihedral_style.html">dihedral</A>, <A HREF = "improper_style.html">improper</A>, and
-<A HREF = "kspace_style.html">long-range</A> force computations to run
-simultaneously with the GPU pair style. If all CPU force computations
-complete before the GPU, LAMMPS will block until the GPU has finished
-before continuing the timestep.
-</P>
 <P><B>Timing output:</B>
 </P>
-<P>As noted above, GPU accelerated pair styles can perform computations
-asynchronously with CPU computations. The "Pair" time reported by
-LAMMPS will be the maximum of the time required to complete the CPU
-pair style computations and the time required to complete the GPU pair
-style computations. Any time spent for GPU-enabled pair styles for
+<P>As described by the <A HREF = "package.html">package gpu</A> command, GPU
+accelerated pair styles can perform computations asynchronously with
+CPU computations. The "Pair" time reported by LAMMPS will be the
+maximum of the time required to complete the CPU pair style
+computations and the time required to complete the GPU pair style
+computations. Any time spent for GPU-enabled pair styles for
 computations that run simultaneously with <A HREF = "bond_style.html">bond</A>,
 <A HREF = "angle_style.html">angle</A>, <A HREF = "dihedral_style.html">dihedral</A>,
 <A HREF = "improper_style.html">improper</A>, and <A HREF = "kspace_style.html">long-range</A>
@@ -185,55 +185,16 @@ from the GPU package, you can either append "gpu" to the style name
 switch"_Section_start.html#2_6, or use the "suffix"_suffix.html
 command.

-The "fix gpu"_fix_gpu.html command controls the GPU selection and
-initialization steps.
+The "package gpu"_package.html command must be used near the beginning
+of your script to control the GPU selection and initialization steps.
+It also enables asynchronous splitting of force computations between
+the CPUs and GPUs.

-The format for the fix is:
-
-fix fix-ID all gpu {mode} {first} {last} {split} :pre
-
-where fix-ID is the name for the fix. The gpu fix must be the first
-fix specified for a given run, otherwise LAMMPS will exit with an
-error. The gpu fix does not have any effect on runs that do not use
-GPU acceleration, so there should be no problem specifying the fix
-first in any input script.
-
-The {mode} setting can be either "force" or "force/neigh". In the
-former, neighbor list calculation is performed on the CPU using the
-standard LAMMPS routines. In the latter, the neighbor list calculation
-is performed on the GPU. The GPU neighbor list can be used for better
-performance, however, it cannot not be used with a triclinic box or
-with "hybrid"_pair_hybrid.html pair styles.
-
-There are cases when it may be more efficient to select the CPU for
-neighbor list builds. If a non-GPU enabled style (e.g. a fix or
-compute) requires a neighbor list, it will also be built using CPU
-routines. Redundant CPU and GPU neighbor list calculations will
-typically be less efficient.
-
-The {first} setting is the ID (as reported by
-lammps/lib/gpu/nvc_get_devices) of the first GPU that will be used on
-each node. The {last} setting is the ID of the last GPU that will be
-used on each node. If you have only one GPU per node, {first} and
-{last} will typically both be 0. Selecting a non-sequential set of GPU
-IDs (e.g. 0,1,3) is not currently supported.
-
-The {split} setting is the fraction of particles whose forces,
-torques, energies, and/or virials will be calculated on the GPU. This
-can be used to perform CPU and GPU force calculations simultaneously,
-e.g. on a hybrid node with a multicore CPU and a GPU(s). If {split}
-is negative, the software will attempt to calculate the optimal
-fraction automatically every 25 timesteps based on CPU and GPU
-timings. Because the GPU speedups are dependent on the number of
-particles, automatic calculation of the split can be less efficient,
-but typically results in loop times within 20% of an optimal fixed
-split.
-
-As an example, if you have two GPUs per node, 8 CPU cores per node,
+As an example, if you have two GPUs per node and 8 CPU cores per node,
 and would like to run on 4 nodes (32 cores) with dynamic balancing of
-force calculation across CPU and GPU cores, the fix might be
+force calculation across CPU and GPU cores, you could specify

-fix 0 all gpu force/neigh 0 1 -1 :pre
+package gpu force/neigh 0 1 -1 :pre

 In this case, all CPU cores and GPU devices on the nodes would be
 utilized. Each GPU device would be shared by 4 CPU cores. The CPU
@@ -241,39 +202,14 @@ cores would perform force calculations for some fraction of the
 particles at the same time the GPUs performed force calculation for
 the other particles.

-[Asynchronous pair computation on GPU and CPU]
-
-The GPU accelerated pair styles can perform pair style force
-calculation on the GPU at the same time other force calculations
-within LAMMPS are being performed on the CPU. These include pair,
-bond, angle, etc forces as well as long-range Coulombic forces. This
-is enabled by the {split} setting in the gpu fix as described above.
-
-With a {split} setting less than 1.0, a portion of the pair-wise force
-calculations will also be performed on the CPU. When the CPU finishes
-its pair style computations (if any), the next LAMMPS force
-computation will begin (bond, angle, etc), possibly before the GPU has
-finished its pair style computations.
-
-This means that if {split} is set to 1.0, the GPU will begin the
-LAMMPS force computation immediately. This can be used to run a
-"hybrid"_pair_hybrid.html GPU pair style at the same time as a hybrid
-CPU pair style. In this case, the GPU pair style should be first in
-the hybrid command in order to perform simultaneous calculations. This
-also allows "bond"_bond_style.html, "angle"_angle_style.html,
-"dihedral"_dihedral_style.html, "improper"_improper_style.html, and
-"long-range"_kspace_style.html force computations to run
-simultaneously with the GPU pair style. If all CPU force computations
-complete before the GPU, LAMMPS will block until the GPU has finished
-before continuing the timestep.
-
 [Timing output:]

-As noted above, GPU accelerated pair styles can perform computations
-asynchronously with CPU computations. The "Pair" time reported by
-LAMMPS will be the maximum of the time required to complete the CPU
-pair style computations and the time required to complete the GPU pair
-style computations. Any time spent for GPU-enabled pair styles for
+As described by the "package gpu"_package.html command, GPU
+accelerated pair styles can perform computations asynchronously with
+CPU computations. The "Pair" time reported by LAMMPS will be the
+maximum of the time required to complete the CPU pair style
+computations and the time required to complete the GPU pair style
+computations. Any time spent for GPU-enabled pair styles for
 computations that run simultaneously with "bond"_bond_style.html,
 "angle"_angle_style.html, "dihedral"_dihedral_style.html,
 "improper"_improper_style.html, and "long-range"_kspace_style.html
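The two-GPU example in the hunks above (package gpu force/neigh 0 1 -1 on nodes with 8 CPU cores) states that each GPU would be shared by 4 CPU cores. A minimal Python sketch of that arithmetic follows, together with the particle count implied by a fixed split. The function names cores_per_gpu and gpu_particles are hypothetical and are not part of LAMMPS; the sketch assumes cores are spread evenly over the GPU IDs first through last, and that a fixed split acts as a simple fraction of the particle count.

```python
def cores_per_gpu(cores_per_node: int, first: int, last: int) -> float:
    """CPU cores sharing each GPU when GPU IDs first..last (inclusive) are used per node.

    Hypothetical helper for illustration only; assumes an even distribution of cores.
    """
    if last < first:
        raise ValueError("last must be >= first")
    n_gpus = last - first + 1  # the ID range is inclusive
    return cores_per_node / n_gpus


def gpu_particles(n_particles: int, split: float) -> int:
    """Approximate number of particles whose forces go to the GPU for a fixed split.

    Hypothetical helper; a negative split means dynamic balancing, which is
    decided at run time and cannot be computed in advance.
    """
    if not 0.0 < split <= 1.0:
        raise ValueError("fixed split must be in (0, 1]; negative means dynamic balancing")
    return round(n_particles * split)


if __name__ == "__main__":
    # Values taken from the documentation example: 8 cores per node, GPUs 0 and 1.
    print(cores_per_gpu(8, 0, 1))
    print(gpu_particles(1000, 0.75))
```

A negative split is rejected here because, according to the documentation text, the fraction is then recalculated from CPU and GPU timings every 25 timesteps instead of being fixed in the input script.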
@@ -338,15 +338,14 @@ of each style or click on the style itself for a full description:
 <DIV ALIGN=center><TABLE BORDER=1 >
 <TR ALIGN="center"><TD ><A HREF = "fix_adapt.html">adapt</A></TD><TD ><A HREF = "fix_addforce.html">addforce</A></TD><TD ><A HREF = "fix_aveforce.html">aveforce</A></TD><TD ><A HREF = "fix_ave_atom.html">ave/atom</A></TD><TD ><A HREF = "fix_ave_correlate.html">ave/correlate</A></TD><TD ><A HREF = "fix_ave_histo.html">ave/histo</A></TD><TD ><A HREF = "fix_ave_spatial.html">ave/spatial</A></TD><TD ><A HREF = "fix_ave_time.html">ave/time</A></TD></TR>
 <TR ALIGN="center"><TD ><A HREF = "fix_bond_break.html">bond/break</A></TD><TD ><A HREF = "fix_bond_create.html">bond/create</A></TD><TD ><A HREF = "fix_bond_swap.html">bond/swap</A></TD><TD ><A HREF = "fix_box_relax.html">box/relax</A></TD><TD ><A HREF = "fix_deform.html">deform</A></TD><TD ><A HREF = "fix_deposit.html">deposit</A></TD><TD ><A HREF = "fix_drag.html">drag</A></TD><TD ><A HREF = "fix_dt_reset.html">dt/reset</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_efield.html">efield</A></TD><TD ><A HREF = "fix_enforce2d.html">enforce2d</A></TD><TD ><A HREF = "fix_evaporate.html">evaporate</A></TD><TD ><A HREF = "fix_external.html">external</A></TD><TD ><A HREF = "fix_freeze.html">freeze</A></TD><TD ><A HREF = "fix_gpu.html">gpu</A></TD><TD ><A HREF = "fix_gravity.html">gravity</A></TD><TD ><A HREF = "fix_heat.html">heat</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_indent.html">indent</A></TD><TD ><A HREF = "fix_langevin.html">langevin</A></TD><TD ><A HREF = "fix_lineforce.html">lineforce</A></TD><TD ><A HREF = "fix_momentum.html">momentum</A></TD><TD ><A HREF = "fix_move.html">move</A></TD><TD ><A HREF = "fix_msst.html">msst</A></TD><TD ><A HREF = "fix_neb.html">neb</A></TD><TD ><A HREF = "fix_nh.html">nph</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_nph_asphere.html">nph/asphere</A></TD><TD ><A HREF = "fix_nph_sphere.html">nph/sphere</A></TD><TD ><A HREF = "fix_nh.html">npt</A></TD><TD ><A HREF = "fix_npt_asphere.html">npt/asphere</A></TD><TD ><A HREF = "fix_npt_sphere.html">npt/sphere</A></TD><TD ><A HREF = "fix_nve.html">nve</A></TD><TD ><A HREF = "fix_nve_asphere.html">nve/asphere</A></TD><TD ><A HREF = "fix_nve_limit.html">nve/limit</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_nve_noforce.html">nve/noforce</A></TD><TD ><A HREF = "fix_nve_sphere.html">nve/sphere</A></TD><TD ><A HREF = "fix_nh.html">nvt</A></TD><TD ><A HREF = "fix_nvt_asphere.html">nvt/asphere</A></TD><TD ><A HREF = "fix_nvt_sllod.html">nvt/sllod</A></TD><TD ><A HREF = "fix_nvt_sphere.html">nvt/sphere</A></TD><TD ><A HREF = "fix_orient_fcc.html">orient/fcc</A></TD><TD ><A HREF = "fix_planeforce.html">planeforce</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_poems.html">poems</A></TD><TD ><A HREF = "fix_pour.html">pour</A></TD><TD ><A HREF = "fix_press_berendsen.html">press/berendsen</A></TD><TD ><A HREF = "fix_print.html">print</A></TD><TD ><A HREF = "fix_qeq_comb.html">qeq/comb</A></TD><TD ><A HREF = "fix_reax_bonds.html">reax/bonds</A></TD><TD ><A HREF = "fix_recenter.html">recenter</A></TD><TD ><A HREF = "fix_rigid.html">rigid</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_rigid.html">rigid/nve</A></TD><TD ><A HREF = "fix_rigid.html">rigid/nvt</A></TD><TD ><A HREF = "fix_setforce.html">setforce</A></TD><TD ><A HREF = "fix_shake.html">shake</A></TD><TD ><A HREF = "fix_spring.html">spring</A></TD><TD ><A HREF = "fix_spring_rg.html">spring/rg</A></TD><TD ><A HREF = "fix_spring_self.html">spring/self</A></TD><TD ><A HREF = "fix_srd.html">srd</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_store_force.html">store/force</A></TD><TD ><A HREF = "fix_store_state.html">store/state</A></TD><TD ><A HREF = "fix_temp_berendsen.html">temp/berendsen</A></TD><TD ><A HREF = "fix_temp_rescale.html">temp/rescale</A></TD><TD ><A HREF = "fix_thermal_conductivity.html">thermal/conductivity</A></TD><TD ><A HREF = "fix_tmd.html">tmd</A></TD><TD ><A HREF = "fix_ttm.html">ttm</A></TD><TD ><A HREF = "fix_viscosity.html">viscosity</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_viscous.html">viscous</A></TD><TD ><A HREF = "fix_wall.html">wall/colloid</A></TD><TD ><A HREF = "fix_wall_gran.html">wall/gran</A></TD><TD ><A HREF = "fix_wall.html">wall/harmonic</A></TD><TD ><A HREF = "fix_wall.html">wall/lj126</A></TD><TD ><A HREF = "fix_wall.html">wall/lj93</A></TD><TD ><A HREF = "fix_wall_reflect.html">wall/reflect</A></TD><TD ><A HREF = "fix_wall_region.html">wall/region</A></TD></TR>
-<TR ALIGN="center"><TD ><A HREF = "fix_wall_srd.html">wall/srd</A>
+<TR ALIGN="center"><TD ><A HREF = "fix_efield.html">efield</A></TD><TD ><A HREF = "fix_enforce2d.html">enforce2d</A></TD><TD ><A HREF = "fix_evaporate.html">evaporate</A></TD><TD ><A HREF = "fix_external.html">external</A></TD><TD ><A HREF = "fix_freeze.html">freeze</A></TD><TD ><A HREF = "fix_gravity.html">gravity</A></TD><TD ><A HREF = "fix_heat.html">heat</A></TD><TD ><A HREF = "fix_indent.html">indent</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_langevin.html">langevin</A></TD><TD ><A HREF = "fix_lineforce.html">lineforce</A></TD><TD ><A HREF = "fix_momentum.html">momentum</A></TD><TD ><A HREF = "fix_move.html">move</A></TD><TD ><A HREF = "fix_msst.html">msst</A></TD><TD ><A HREF = "fix_neb.html">neb</A></TD><TD ><A HREF = "fix_nh.html">nph</A></TD><TD ><A HREF = "fix_nph_asphere.html">nph/asphere</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_nph_sphere.html">nph/sphere</A></TD><TD ><A HREF = "fix_nh.html">npt</A></TD><TD ><A HREF = "fix_npt_asphere.html">npt/asphere</A></TD><TD ><A HREF = "fix_npt_sphere.html">npt/sphere</A></TD><TD ><A HREF = "fix_nve.html">nve</A></TD><TD ><A HREF = "fix_nve_asphere.html">nve/asphere</A></TD><TD ><A HREF = "fix_nve_limit.html">nve/limit</A></TD><TD ><A HREF = "fix_nve_noforce.html">nve/noforce</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_nve_sphere.html">nve/sphere</A></TD><TD ><A HREF = "fix_nh.html">nvt</A></TD><TD ><A HREF = "fix_nvt_asphere.html">nvt/asphere</A></TD><TD ><A HREF = "fix_nvt_sllod.html">nvt/sllod</A></TD><TD ><A HREF = "fix_nvt_sphere.html">nvt/sphere</A></TD><TD ><A HREF = "fix_orient_fcc.html">orient/fcc</A></TD><TD ><A HREF = "fix_planeforce.html">planeforce</A></TD><TD ><A HREF = "fix_poems.html">poems</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_pour.html">pour</A></TD><TD ><A HREF = "fix_press_berendsen.html">press/berendsen</A></TD><TD ><A HREF = "fix_print.html">print</A></TD><TD ><A HREF = "fix_qeq_comb.html">qeq/comb</A></TD><TD ><A HREF = "fix_reax_bonds.html">reax/bonds</A></TD><TD ><A HREF = "fix_recenter.html">recenter</A></TD><TD ><A HREF = "fix_rigid.html">rigid</A></TD><TD ><A HREF = "fix_rigid.html">rigid/nve</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_rigid.html">rigid/nvt</A></TD><TD ><A HREF = "fix_setforce.html">setforce</A></TD><TD ><A HREF = "fix_shake.html">shake</A></TD><TD ><A HREF = "fix_spring.html">spring</A></TD><TD ><A HREF = "fix_spring_rg.html">spring/rg</A></TD><TD ><A HREF = "fix_spring_self.html">spring/self</A></TD><TD ><A HREF = "fix_srd.html">srd</A></TD><TD ><A HREF = "fix_store_force.html">store/force</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_store_state.html">store/state</A></TD><TD ><A HREF = "fix_temp_berendsen.html">temp/berendsen</A></TD><TD ><A HREF = "fix_temp_rescale.html">temp/rescale</A></TD><TD ><A HREF = "fix_thermal_conductivity.html">thermal/conductivity</A></TD><TD ><A HREF = "fix_tmd.html">tmd</A></TD><TD ><A HREF = "fix_ttm.html">ttm</A></TD><TD ><A HREF = "fix_viscosity.html">viscosity</A></TD><TD ><A HREF = "fix_viscous.html">viscous</A></TD></TR>
+<TR ALIGN="center"><TD ><A HREF = "fix_wall.html">wall/colloid</A></TD><TD ><A HREF = "fix_wall_gran.html">wall/gran</A></TD><TD ><A HREF = "fix_wall.html">wall/harmonic</A></TD><TD ><A HREF = "fix_wall.html">wall/lj126</A></TD><TD ><A HREF = "fix_wall.html">wall/lj93</A></TD><TD ><A HREF = "fix_wall_reflect.html">wall/reflect</A></TD><TD ><A HREF = "fix_wall_region.html">wall/region</A></TD><TD ><A HREF = "fix_wall_srd.html">wall/srd</A>
 </TD></TR></TABLE></DIV>

 <P>These are fix styles contributed by users, which can be used if
@@ -418,7 +418,6 @@ of each style or click on the style itself for a full description:
 "evaporate"_fix_evaporate.html,
 "external"_fix_external.html,
 "freeze"_fix_freeze.html,
-"gpu"_fix_gpu.html,
 "gravity"_fix_gravity.html,
 "heat"_fix_heat.html,
 "indent"_fix_indent.html,
112
doc/fix_gpu.html
@@ -1,112 +0,0 @@
-<HTML>
-<CENTER><A HREF = "http://lammps.sandia.gov">LAMMPS WWW Site</A> - <A HREF = "Manual.html">LAMMPS Documentation</A> - <A HREF = "Section_commands.html#comm">LAMMPS Commands</A>
-</CENTER>
-
-
-
-
-
-
-<HR>
-
-<H3>fix gpu command
-</H3>
-<P><B>Syntax:</B>
-</P>
-<PRE>fix ID group-ID gpu mode first last split
-</PRE>
-<UL><LI>ID, group-ID are documented in <A HREF = "fix.html">fix</A> command
-
-<LI>gpu = style name of this fix command
-
-<LI>mode = force or force/neigh
-
-<LI>first = ID of first GPU to be used on each node
-
-<LI>last = ID of last GPU to be used on each node
-
-<LI>split = fraction of particles assigned to the GPU
-
-
-</UL>
-<P><B>Examples:</B>
-</P>
-<PRE>fix 0 all gpu force 0 0 1.0
-fix 0 all gpu force 0 0 0.75
-fix 0 all gpu force/neigh 0 0 1.0
-fix 0 all gpu force/neigh 0 1 -1.0
-</PRE>
-<P><B>Description:</B>
-</P>
-<P>Select and initialize GPUs to be used for acceleration and configure
-GPU acceleration in LAMMPS. This fix is required in order to use
-any style with GPU acceleration. The fix must be the first fix
-specified for a run or an error will be generated. The fix will not have an
-effect on any LAMMPS computations that do not use GPU acceleration, so there
-should not be any problems with specifying this fix first in input scripts.
-</P>
-<P>The <I>mode</I> setting specifies where neighbor list calculations will be
-performed. If <I>mode</I> is force, neighbor list calculation is performed
-on the CPU. If <I>mode</I> is force/neigh, neighbor list calculation is
-performed on the GPU. GPU neighbor list calculation currently cannot
-be used with a triclinic box. GPU neighbor list calculation currently
-cannot be used with <A HREF = "pair_hybrid.html">hybrid</A> pair styles. GPU
-neighbor lists are not compatible with styles that are not
-GPU-enabled. When a non-GPU enabled style requires a neighbor list,
-it will also be built using CPU routines. In these cases, it will
-typically be more efficient to only use CPU neighbor list builds.
-</P>
-<P>The <I>first</I> and <I>last</I> settings specify the GPUs that will be used for
-simulation. On each node, the GPU IDs in the inclusive range from
-<I>first</I> to <I>last</I> will be used.
-</P>
-<P>The <I>split</I> setting can be used for load balancing force calculation
-work between CPU and GPU cores in GPU-enabled pair styles. If
-0<<I>split</I><1.0, a fixed fraction of particles is offloaded to the GPU
-while force calculation for the other particles occurs simulataneously
-on the CPU. If <I>split</I><0, the optimal fraction (based on CPU and GPU
-timings) is calculated every 25 timesteps. If <I>split</I>=1.0, all force
-calculations for GPU accelerated pair styles are performed on the
-GPU. In this case, <A HREF = "pair_hybrid.html">hybrid</A>, <A HREF = "bond_style.html">bond</A>,
-<A HREF = "angle_style.html">angle</A>, <A HREF = "dihedral_style.html">dihedral</A>,
-<A HREF = "improper_style.html">improper</A>, and <A HREF = "kspace_style.html">long-range</A>
-calculations can be performed on the CPU while the GPU is performing
-force calculations for the GPU-enabled pair style.
-</P>
-<P>In order to use GPU acceleration, a GPU enabled style must be selected
-in the input script in addition to this fix. Currently, this is
-limited to a few <A HREF = "pair_style.html">pair styles</A> and the PPPM <A HREF = "kspace_style.html">kspace
-style</A>.
-</P>
-<P>See <A HREF = "doc/Section_accerate.html">this section</A> of the manual for more
-details about using the GPU package.
-</P>
-<P><B>Restart, fix_modify, output, run start/stop, minimize info:</B>
-</P>
-<P>This fix is part of the "gpu" package. It is only enabled if LAMMPS
-was built with that package. See the <A HREF = "Section_start.html#2_3">Making
-LAMMPS</A> section for more info.
-</P>
-<P>No information about this fix is written to <A HREF = "restart.html">binary restart
-files</A>. None of the <A HREF = "fix_modify.html">fix_modify</A> options
-are relevant to this fix.
-</P>
-<P>No parameter of this fix can be used with the <I>start/stop</I> keywords of
-the <A HREF = "run.html">run</A> command.
-</P>
-<P><B>Restrictions:</B>
-</P>
-<P>The fix must be the first fix specified for a given run. The
-force/neigh <I>mode</I> should not be used with a triclinic box or
-<A HREF = "pair_hybrid.html">hybrid</A> pair styles.
-</P>
-<P>The <I>split</I> setting must be positive when using
-<A HREF = "pair_hybrid.html">hybrid</A> pair styles.
-</P>
-<P>Currently, group-ID must be all.
-</P>
-<P><B>Related commands:</B> none
-</P>
-<P><B>Default:</B> none
-</P>
-</HTML>
102
doc/fix_gpu.txt
@@ -1,102 +0,0 @@
-"LAMMPS WWW Site"_lws - "LAMMPS Documentation"_ld - "LAMMPS Commands"_lc :c
-
-:link(lws,http://lammps.sandia.gov)
-:link(ld,Manual.html)
-:link(lc,Section_commands.html#comm)
-
-:line
-
-fix gpu command :h3
-
-[Syntax:]
-
-fix ID group-ID gpu mode first last split :pre
-
-ID, group-ID are documented in "fix"_fix.html command :ulb,l
-gpu = style name of this fix command :l
-mode = force or force/neigh :l
-first = ID of first GPU to be used on each node :l
-last = ID of last GPU to be used on each node :l
-split = fraction of particles assigned to the GPU :l
-:ule
-
-[Examples:]
-
-fix 0 all gpu force 0 0 1.0
-fix 0 all gpu force 0 0 0.75
-fix 0 all gpu force/neigh 0 0 1.0
-fix 0 all gpu force/neigh 0 1 -1.0 :pre
-
-[Description:]
-
-Select and initialize GPUs to be used for acceleration and configure
-GPU acceleration in LAMMPS. This fix is required in order to use
-any style with GPU acceleration. The fix must be the first fix
-specified for a run or an error will be generated. The fix will not have an
-effect on any LAMMPS computations that do not use GPU acceleration, so there
-should not be any problems with specifying this fix first in input scripts.
-
-The {mode} setting specifies where neighbor list calculations will be
-performed. If {mode} is force, neighbor list calculation is performed
-on the CPU. If {mode} is force/neigh, neighbor list calculation is
-performed on the GPU. GPU neighbor list calculation currently cannot
-be used with a triclinic box. GPU neighbor list calculation currently
-cannot be used with "hybrid"_pair_hybrid.html pair styles. GPU
-neighbor lists are not compatible with styles that are not
-GPU-enabled. When a non-GPU enabled style requires a neighbor list,
-it will also be built using CPU routines. In these cases, it will
-typically be more efficient to only use CPU neighbor list builds.
-
-The {first} and {last} settings specify the GPUs that will be used for
-simulation. On each node, the GPU IDs in the inclusive range from
-{first} to {last} will be used.
-
-The {split} setting can be used for load balancing force calculation
-work between CPU and GPU cores in GPU-enabled pair styles. If
-0<{split}<1.0, a fixed fraction of particles is offloaded to the GPU
-while force calculation for the other particles occurs simulataneously
-on the CPU. If {split}<0, the optimal fraction (based on CPU and GPU
-timings) is calculated every 25 timesteps. If {split}=1.0, all force
-calculations for GPU accelerated pair styles are performed on the
-GPU. In this case, "hybrid"_pair_hybrid.html, "bond"_bond_style.html,
-"angle"_angle_style.html, "dihedral"_dihedral_style.html,
-"improper"_improper_style.html, and "long-range"_kspace_style.html
-calculations can be performed on the CPU while the GPU is performing
-force calculations for the GPU-enabled pair style.
-
-In order to use GPU acceleration, a GPU enabled style must be selected
-in the input script in addition to this fix. Currently, this is
-limited to a few "pair styles"_pair_style.html and the PPPM "kspace
-style"_kspace_style.html.
-
-See "this section"_doc/Section_accerate.html of the manual for more
-details about using the GPU package.
-
-[Restart, fix_modify, output, run start/stop, minimize info:]
-
-This fix is part of the "gpu" package. It is only enabled if LAMMPS
-was built with that package. See the "Making
-LAMMPS"_Section_start.html#2_3 section for more info.
-
-No information about this fix is written to "binary restart
-files"_restart.html. None of the "fix_modify"_fix_modify.html options
-are relevant to this fix.
-
-No parameter of this fix can be used with the {start/stop} keywords of
-the "run"_run.html command.
-
-[Restrictions:]
-
-The fix must be the first fix specified for a given run. The
-force/neigh {mode} should not be used with a triclinic box or
-"hybrid"_pair_hybrid.html pair styles.
-
-The {split} setting must be positive when using
-"hybrid"_pair_hybrid.html pair styles.
-
-Currently, group-ID must be all.
-
-[Related commands:] none
-
-[Default:] none
-
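For input scripts that still use the removed fix gpu command, the change in this commit amounts to rewriting one line: the documented example fix 0 all gpu force/neigh 0 1 -1 becomes package gpu force/neigh 0 1 -1. Below is a hedged Python sketch of that rewrite. The name fix_gpu_to_package is a hypothetical helper, not a LAMMPS tool, and the sketch assumes the four arguments (mode, first, last, split) carry over unchanged, as the examples in this diff suggest.

```python
import re

# Matches the old syntax "fix ID all gpu mode first last split".
# The removed documentation states that group-ID must be "all".
_FIX_GPU = re.compile(
    r"^\s*fix\s+\S+\s+all\s+gpu\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s*$"
)


def fix_gpu_to_package(line: str) -> str:
    """Rewrite an old-style fix gpu line as a package gpu line.

    Hypothetical helper for illustration; lines that are not
    fix gpu commands are returned unchanged.
    """
    match = _FIX_GPU.match(line)
    if match is None:
        return line
    mode, first, last, split = match.groups()
    if mode not in ("force", "force/neigh"):
        raise ValueError(f"unknown gpu mode: {mode!r}")
    return f"package gpu {mode} {first} {last} {split}"


if __name__ == "__main__":
    print(fix_gpu_to_package("fix 0 all gpu force/neigh 0 1 -1"))
```

The sketch only rewrites the command itself. The old text required the fix to be the first fix in a run, while the new text asks for package gpu near the beginning of the script, so the position of the rewritten line still needs a manual check.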
121
doc/package.html
@ -15,39 +15,136 @@
|
||||
</P>
<PRE>package style args
</PRE>
<UL><LI>style = <I>cuda</I>
<UL><LI>style = <I>gpu</I> or <I>cuda</I> or <I>omp</I>

<LI>args = 0 or more args specific to the style
<LI>args = arguments specific to the style

<PRE> <I>cuda</I> args = to be determined
<LI> <I>gpu</I> args = mode first last split
mode = force or force/neigh

<LI> first = ID of first GPU to be used on each node

<LI> last = ID of last GPU to be used on each node

<LI> split = fraction of particles assigned to the GPU

<PRE> <I>cuda</I> args = to be determined
<I>omp</I> args = Nthreads
</PRE>
<PRE> Nthreads = # of OpenMP threads to associate with each MPI process
</PRE>

</UL>
<P><B>Examples:</B>
</P>
<PRE>package cuda blah
<PRE>package gpu force 0 0 1.0
package gpu force 0 0 0.75
package gpu force/neigh 0 0 1.0
package gpu force/neigh 0 1 -1.0
package cuda blah
package omp 4
</PRE>
<P><B>Description:</B>
</P>
<P>This command invokes package-specific settings. Currently only the
USER-CUDA package uses it.
<P>This command invokes package-specific settings. Currently the
following packages use it: GPU, USER-CUDA, and USER-OMP.
</P>
<P>See <A HREF = "doc/Section_accerate.html">this section</A> of the manual for more
details about using these various packages for accelerating
a LAMMPS calculation.
</P>
<HR>

<P>The <I>gpu</I> style invokes options associated with the use of the GPU
package. It allows you to select and initialize GPUs to be used for
acceleration via this package and configure how the GPU acceleration
is performed. These settings are required in order to use any style
with GPU acceleration.
</P>
<P>The <I>mode</I> setting specifies where neighbor list calculations will be
performed. If <I>mode</I> is force, neighbor list calculation is performed
on the CPU. If <I>mode</I> is force/neigh, neighbor list calculation is
performed on the GPU. GPU neighbor list calculation currently cannot
be used with a triclinic box. GPU neighbor list calculation currently
cannot be used with <A HREF = "pair_hybrid.html">hybrid</A> pair styles. GPU
neighbor lists are not compatible with styles that are not
GPU-enabled. When a non-GPU enabled style requires a neighbor list,
it will also be built using CPU routines. In these cases, it will
typically be more efficient to only use CPU neighbor list builds.
</P>
<P>The <I>first</I> and <I>last</I> settings specify the GPUs that will be used for
simulation. On each node, the GPU IDs in the inclusive range from
<I>first</I> to <I>last</I> will be used.
</P>
<P>The <I>split</I> setting can be used for load balancing force calculation
work between CPU and GPU cores in GPU-enabled pair styles. If 0 <
<I>split</I> < 1.0, a fixed fraction of particles is offloaded to the GPU
while force calculation for the other particles occurs simultaneously
on the CPU. If <I>split</I><0, the optimal fraction (based on CPU and GPU
timings) is calculated every 25 timesteps. If <I>split</I> = 1.0, all force
calculations for GPU accelerated pair styles are performed on the
GPU. In this case, <A HREF = "pair_hybrid.html">hybrid</A>, <A HREF = "bond_style.html">bond</A>,
<A HREF = "angle_style.html">angle</A>, <A HREF = "dihedral_style.html">dihedral</A>,
<A HREF = "improper_style.html">improper</A>, and <A HREF = "kspace_style.html">long-range</A>
calculations can be performed on the CPU while the GPU is performing
force calculations for the GPU-enabled pair style. If all CPU force
computations complete before the GPU, LAMMPS will block until the GPU
has finished before continuing the timestep.
</P>
<P>As an example, if you have two GPUs per node and 8 CPU cores per node,
and would like to run on 4 nodes (32 cores) with dynamic balancing of
force calculation across CPU and GPU cores, you could specify
</P>
<PRE>package gpu force/neigh 0 1 -1
</PRE>
<P>In this case, all CPU cores and GPU devices on the nodes would be
utilized. Each GPU device would be shared by 4 CPU cores. The CPU
cores would perform force calculations for some fraction of the
particles at the same time the GPUs performed force calculation for
the other particles.
</P>
<HR>

<P>The <I>cuda</I> style invokes options associated with the use of the
USER-CUDA package. These will be described when the USER-CUDA package
is released with LAMMPS.
USER-CUDA package. These need to be documented.
</P>
<HR>

<P>The <I>omp</I> style invokes options associated with the use of the
USER-OMP package.
</P>
<P>The only setting to make is the number of OpenMP threads to be
allocated for each MPI process. For example, if your system has nodes
with dual quad-core processors, it has a total of 8 cores per node.
You could run MPI on 2 cores on each node (e.g. using options for the
mpirun command), and set the <I>Nthreads</I> setting to 4. This would
effectively use all 8 cores on each node, since each MPI process
would spawn 4 threads (one of which runs as part of the MPI process
itself).
</P>
<P>For performance reasons, you should not set <I>Nthreads</I> to more threads
than there are physical cores, but LAMMPS does not check for this.
</P>
<HR>

<P><B>Restrictions:</B>
</P>
<P>This command cannot be used after the simulation box is defined by a
<A HREF = "read_data.html">read_data</A> or <A HREF = "create_box.html">create_box</A> command.
</P>
<P>The cuda style of this command can only be invoked if LAMMPS was built
with the USER-CUDA package. See the <A HREF = "Section_start.html#2_3">Making
LAMMPS</A> section for more info.
</P>
<P>Obviously, you must have GPU hardware and associated software to build
and use LAMMPS with either the GPU or USER-CUDA packages.
<P>The gpu style of this command can only be invoked if LAMMPS was built
with the GPU package. See the <A HREF = "Section_start.html#2_3">Making LAMMPS</A>
section for more info.
</P>
<P><B>Related commands:</B>
<P>The omp style of this command can only be invoked if LAMMPS was built
with the USER-OMP package. See the <A HREF = "Section_start.html#2_3">Making
LAMMPS</A> section for more info.
</P>
<P><A HREF = "fix_gpu.html">fix gpu</A>
<P><B>Related commands:</B> none
</P>
<P><B>Default:</B> none
</P>

116
doc/package.txt
@ -12,35 +12,127 @@ package command :h3

package style args :pre

style = {cuda} :ulb,l
args = 0 or more args specific to the style :l
{cuda} args = to be determined :pre
style = {gpu} or {cuda} or {omp} :ulb,l
args = arguments specific to the style :l
{gpu} args = mode first last split
mode = force or force/neigh :l
first = ID of first GPU to be used on each node :l
last = ID of last GPU to be used on each node :l
split = fraction of particles assigned to the GPU :l
{cuda} args = to be determined
{omp} args = Nthreads :pre
Nthreads = # of OpenMP threads to associate with each MPI process :pre
:ule

[Examples:]

package cuda blah :pre
package gpu force 0 0 1.0
package gpu force 0 0 0.75
package gpu force/neigh 0 0 1.0
package gpu force/neigh 0 1 -1.0
package cuda blah
package omp 4 :pre

[Description:]

This command invokes package-specific settings. Currently only the
USER-CUDA package uses it.
This command invokes package-specific settings. Currently the
following packages use it: GPU, USER-CUDA, and USER-OMP.

See "this section"_doc/Section_accerate.html of the manual for more
details about using these various packages for accelerating
a LAMMPS calculation.

:line

The {gpu} style invokes options associated with the use of the GPU
package. It allows you to select and initialize GPUs to be used for
acceleration via this package and configure how the GPU acceleration
is performed. These settings are required in order to use any style
with GPU acceleration.

The {mode} setting specifies where neighbor list calculations will be
performed. If {mode} is force, neighbor list calculation is performed
on the CPU. If {mode} is force/neigh, neighbor list calculation is
performed on the GPU. GPU neighbor list calculation currently cannot
be used with a triclinic box. GPU neighbor list calculation currently
cannot be used with "hybrid"_pair_hybrid.html pair styles. GPU
neighbor lists are not compatible with styles that are not
GPU-enabled. When a non-GPU enabled style requires a neighbor list,
it will also be built using CPU routines. In these cases, it will
typically be more efficient to only use CPU neighbor list builds.

The {first} and {last} settings specify the GPUs that will be used for
simulation. On each node, the GPU IDs in the inclusive range from
{first} to {last} will be used.

The {split} setting can be used for load balancing force calculation
work between CPU and GPU cores in GPU-enabled pair styles. If 0 <
{split} < 1.0, a fixed fraction of particles is offloaded to the GPU
while force calculation for the other particles occurs simultaneously
on the CPU. If {split}<0, the optimal fraction (based on CPU and GPU
timings) is calculated every 25 timesteps. If {split} = 1.0, all force
calculations for GPU accelerated pair styles are performed on the
GPU. In this case, "hybrid"_pair_hybrid.html, "bond"_bond_style.html,
"angle"_angle_style.html, "dihedral"_dihedral_style.html,
"improper"_improper_style.html, and "long-range"_kspace_style.html
calculations can be performed on the CPU while the GPU is performing
force calculations for the GPU-enabled pair style. If all CPU force
computations complete before the GPU, LAMMPS will block until the GPU
has finished before continuing the timestep.

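The three {split} regimes described above can be summarized in a small Python sketch. This is only an illustration of the documented rules, not LAMMPS code: the helper names describe_split and fixed_partition are hypothetical, and truncating the particle count toward zero is an assumption, since the text only says that a fixed fraction is offloaded. Values the text does not describe (0 and anything above 1.0) are reported as such rather than guessed.

```python
def describe_split(split: float) -> str:
    """Hypothetical helper: name the regime that a split value selects."""
    if split < 0:
        # Negative values request dynamic balancing from CPU and GPU timings.
        return "dynamic"
    if 0 < split < 1.0:
        # A fixed fraction of particles goes to the GPU, the rest to the CPU.
        return "fixed fraction"
    if split == 1.0:
        # All force work for GPU-accelerated pair styles runs on the GPU.
        return "all on GPU"
    # split == 0 or split > 1.0 is not covered by the description above.
    return "not described"


def fixed_partition(n_particles: int, split: float) -> tuple[int, int]:
    """Return (gpu_count, cpu_count) for a fixed split.

    Assumption: the GPU share is truncated toward zero; the actual
    rounding rule is not stated in the text.
    """
    if not 0 < split < 1.0:
        raise ValueError("fixed_partition only applies when 0 < split < 1.0")
    gpu_count = int(split * n_particles)
    return gpu_count, n_particles - gpu_count


if __name__ == "__main__":
    for value in (-1.0, 0.75, 1.0):
        print(value, describe_split(value))
    print(fixed_partition(1000, 0.75))
```

With {split} = 0.75 and 1000 particles, the sketch assigns 750 particles to the GPU and 250 to the CPU.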
As an example, if you have two GPUs per node and 8 CPU cores per node,
and would like to run on 4 nodes (32 cores) with dynamic balancing of
force calculation across CPU and GPU cores, you could specify

package gpu force/neigh 0 1 -1 :pre

In this case, all CPU cores and GPU devices on the nodes would be
utilized. Each GPU device would be shared by 4 CPU cores. The CPU
cores would perform force calculations for some fraction of the
particles at the same time the GPUs performed force calculation for
the other particles.

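The arithmetic behind this example can be checked with a short Python sketch. It assumes one MPI process per CPU core and an even distribution of cores over the GPUs on a node; neither is stated as a rule in the text, and the function name gpu_layout is hypothetical.

```python
def gpu_layout(nodes: int, cores_per_node: int, first: int, last: int) -> dict:
    """Hypothetical helper: derive the sharing layout for package gpu settings.

    Assumes one MPI process per core and cores spread evenly over the GPUs.
    """
    # first..last is an inclusive range of GPU IDs on each node.
    gpu_ids = list(range(first, last + 1))
    gpus_per_node = len(gpu_ids)
    if gpus_per_node == 0 or cores_per_node % gpus_per_node != 0:
        raise ValueError("cores per node must divide evenly over the GPUs")
    return {
        "gpu_ids": gpu_ids,
        "total_cores": nodes * cores_per_node,
        "total_gpus": nodes * gpus_per_node,
        "cores_per_gpu": cores_per_node // gpus_per_node,
    }


if __name__ == "__main__":
    # 4 nodes, 8 cores per node, GPU IDs 0 through 1 on each node.
    print(gpu_layout(nodes=4, cores_per_node=8, first=0, last=1))
```

For the settings in the example, this gives 32 cores, 8 GPUs in total, and 4 cores sharing each GPU, matching the description above.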
:line

The {cuda} style invokes options associated with the use of the
USER-CUDA package. These will be described when the USER-CUDA package
is released with LAMMPS.
USER-CUDA package. These need to be documented.

:line

The {omp} style invokes options associated with the use of the
USER-OMP package.

The only setting to make is the number of OpenMP threads to be
allocated for each MPI process. For example, if your system has nodes
with dual quad-core processors, it has a total of 8 cores per node.
You could run MPI on 2 cores on each node (e.g. using options for the
mpirun command), and set the {Nthreads} setting to 4. This would
effectively use all 8 cores on each node, since each MPI process
would spawn 4 threads (one of which runs as part of the MPI process
itself).

For performance reasons, you should not set {Nthreads} to more threads
than there are physical cores, but LAMMPS does not check for this.

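Because LAMMPS does not check for oversubscription, a quick check before launching a job can be useful. The Python sketch below is one possible way to do it. The helper names are hypothetical, and it assumes the relevant limit is the total number of threads per node (MPI processes per node times {Nthreads}) compared with the physical cores on that node.

```python
def threads_per_node(mpi_per_node: int, nthreads: int) -> int:
    """Total threads on a node: each MPI process runs nthreads threads,
    one of which is the MPI process itself."""
    return mpi_per_node * nthreads


def oversubscribed(mpi_per_node: int, nthreads: int, physical_cores: int) -> bool:
    """Assumed check: True when the node would run more threads than cores."""
    return threads_per_node(mpi_per_node, nthreads) > physical_cores


if __name__ == "__main__":
    # Dual quad-core node: 2 MPI processes with "package omp 4".
    print(threads_per_node(2, 4))
    print(oversubscribed(2, 4, physical_cores=8))
    print(oversubscribed(2, 8, physical_cores=8))
```

In the dual quad-core example, 2 MPI processes with 4 threads each fill the 8 cores exactly, while 8 threads per process would place 16 threads on 8 cores.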
:line

[Restrictions:]

This command cannot be used after the simulation box is defined by a
"read_data"_read_data.html or "create_box"_create_box.html command.

The cuda style of this command can only be invoked if LAMMPS was built
with the USER-CUDA package. See the "Making
LAMMPS"_Section_start.html#2_3 section for more info.

Obviously, you must have GPU hardware and associated software to build
and use LAMMPS with either the GPU or USER-CUDA packages.
The gpu style of this command can only be invoked if LAMMPS was built
with the GPU package. See the "Making LAMMPS"_Section_start.html#2_3
section for more info.

[Related commands:]
The omp style of this command can only be invoked if LAMMPS was built
with the USER-OMP package. See the "Making
LAMMPS"_Section_start.html#2_3 section for more info.

"fix gpu"_fix_gpu.html
[Related commands:] none

[Default:] none