diff --git a/doc/package.html b/doc/package.html index f31f8db721..2263301ebe 100644 --- a/doc/package.html +++ b/doc/package.html @@ -25,22 +25,24 @@ last = ID of last GPU to be used on each node split = fraction of particles assigned to the GPU zero or more keyword/value pairs may be appended - keywords = threads_per_atom - threads_per_atom value = Nthreads - Nthreads = # of GPU threads used per atom + keywords = threads_per_atom or cellsize + threads_per_atom value = Nthreads + Nthreads = # of GPU threads used per atom + cellsize value = dist + dist = length (distance units) in each dimension for neighbor bins cuda args = keyword value ... one or more keyword/value pairs may be appended keywords = gpu/node or gpu/node/special or timing or test or override/bpa - gpu/node value = N - N = number of GPUs to be used per node - gpu/node/special values = N gpu1 .. gpuN - N = number of GPUs to be used per node - gpu1 .. gpuN = N IDs of the GPUs to use - timing values = none - test values = id - id = atom-ID of a test particle - override/bpa values = flag - flag = 0 for TpA algorithm, 1 for BpA algorithm + gpu/node value = N + N = number of GPUs to be used per node + gpu/node/special values = N gpu1 .. gpuN + N = number of GPUs to be used per node + gpu1 .. gpuN = N IDs of the GPUs to use + timing values = none + test values = id + id = atom-ID of a test particle + override/bpa values = flag + flag = 0 for TpA algorithm, 1 for BpA algorithm omp args = Nthreads mode Nthreads = # of OpenMP threads to associate with each MPI process mode = force or force/neigh (optional) @@ -133,6 +135,18 @@ large cutoffs or with a small number of particles per GPU, increasing the value can improve performance. The number of threads per atom must be a power of 2 and currently cannot be greater than 32.
+The cellsize keyword can be used to control the size of the cells used
+for binning atoms in neighbor list calculations. Setting this value is
+normally not needed; the optimal value is close to the default
+(equal to the cutoff distance for the short range interactions
+plus the neighbor skin). GPUs can perform efficiently with much larger cutoffs
+than CPUs and this can be used to reduce the time required for long-range
+calculations or in some cases to eliminate them with models such as
+coul/wolf or coul/dsf. For very large cutoffs,
+it can be more efficient to use smaller values for cellsize in parallel
+simulations. For example, with a cutoff of 20*sigma and a neighbor skin of
+sigma, a cellsize of 5.25*sigma can be efficient for parallel simulations.
+
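The numbers in the cellsize example above can be checked with a short calculation. This is only a sketch: the helper `subdivided_cellsize` and its `n_bins` parameter are hypothetical names, and the idea that 5.25*sigma was chosen as one quarter of the default bin length (cutoff plus skin) is an assumption read off the arithmetic, not something the text states.

```python
def subdivided_cellsize(cutoff, skin, n_bins):
    """Split the default bin length (cutoff + skin) into n_bins equal parts.

    Hypothetical helper for illustration; not part of LAMMPS.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be a positive integer")
    return (cutoff + skin) / n_bins


sigma = 1.0  # reduced (LJ-style) length unit, assumed for the example

# Default cell size described in the text: cutoff plus neighbor skin.
default_size = subdivided_cellsize(20.0 * sigma, 1.0 * sigma, 1)

# Assumed reading of the example: the default length divided by four.
quarter_size = subdivided_cellsize(20.0 * sigma, 1.0 * sigma, 4)

print(default_size, quarter_size)
```

Under that assumption, a cutoff of 20*sigma with a skin of sigma gives a default bin length of 21*sigma, and dividing by four reproduces the 5.25*sigma quoted in the paragraph.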
The cuda style invokes options associated with the use of the diff --git a/doc/package.txt b/doc/package.txt index c9b6d9681f..fb2431f8ad 100644 --- a/doc/package.txt +++ b/doc/package.txt @@ -20,22 +20,24 @@ args = arguments specific to the style :l last = ID of last GPU to be used on each node split = fraction of particles assigned to the GPU zero or more keyword/value pairs may be appended - keywords = {threads_per_atom} - {threads_per_atom} value = Nthreads - Nthreads = # of GPU threads used per atom + keywords = {threads_per_atom} or {cellsize} + {threads_per_atom} value = Nthreads + Nthreads = # of GPU threads used per atom + {cellsize} value = dist + dist = length (distance units) in each dimension for neighbor bins {cuda} args = keyword value ... one or more keyword/value pairs may be appended keywords = {gpu/node} or {gpu/node/special} or {timing} or {test} or {override/bpa} - {gpu/node} value = N - N = number of GPUs to be used per node - {gpu/node/special} values = N gpu1 .. gpuN - N = number of GPUs to be used per node - gpu1 .. gpuN = N IDs of the GPUs to use - {timing} values = none - {test} values = id - id = atom-ID of a test particle - {override/bpa} values = flag - flag = 0 for TpA algorithm, 1 for BpA algorithm + {gpu/node} value = N + N = number of GPUs to be used per node + {gpu/node/special} values = N gpu1 .. gpuN + N = number of GPUs to be used per node + gpu1 .. gpuN = N IDs of the GPUs to use + {timing} values = none + {test} values = id + id = atom-ID of a test particle + {override/bpa} values = flag + flag = 0 for TpA algorithm, 1 for BpA algorithm {omp} args = Nthreads mode Nthreads = # of OpenMP threads to associate with each MPI process mode = force or force/neigh (optional) :pre @@ -127,6 +129,18 @@ large cutoffs or with a small number of particles per GPU, increasing the value can improve performance. The number of threads per atom must be a power of 2 and currently cannot be greater than 32. 
+The {cellsize} keyword can be used to control the size of the cells used
+for binning atoms in neighbor list calculations. Setting this value is
+normally not needed; the optimal value is close to the default
+(equal to the cutoff distance for the short range interactions
+plus the neighbor skin). GPUs can perform efficiently with much larger cutoffs
+than CPUs and this can be used to reduce the time required for long-range
+calculations or in some cases to eliminate them with models such as
+"coul/wolf"_pair_coul.html or "coul/dsf"_pair_coul.html. For very large cutoffs,
+it can be more efficient to use smaller values for cellsize in parallel
+simulations. For example, with a cutoff of 20*sigma and a neighbor skin of
+sigma, a cellsize of 5.25*sigma can be efficient for parallel simulations.
+
 :line
 
 The {cuda} style invokes options associated with the use of the
diff --git a/doc/pair_coul.html b/doc/pair_coul.html
index 62cab162ac..153cfcb150 100644
--- a/doc/pair_coul.html
+++ b/doc/pair_coul.html
@@ -17,6 +17,10 @@
pair_style coul/cut cutoff pair_style coul/debye kappa cutoff +pair_style coul/dsf alpha cutoff pair_style coul/long cutoff pair_style coul/long/gpu cutoff pair_style coul/wolf alpha cutoff @@ -49,6 +54,9 @@ pair_coeff 2 2 3.5 pair_coeff * * pair_coeff 2 2 3.5+
pair_style coul/dsf 0.05 10.0 +pair_coeff * * +
pair_style coul/long 10.0 pair_coeff * *@@ -75,6 +83,17 @@ Coulombic term, given by
where kappa is the Debye length. This potential is another way to mimic the screening effect of a polar solvent.
+Style coul/dsf computes Coulombic interactions via the damped
+shifted force model described in Fennell, given by:
+
+
+where alpha is the damping parameter and erfc() is the
+complementary error function. The potential corrects issues in the
+Wolf model (described below) to provide consistent forces and energies
+(the Wolf potential is not differentiable at the cutoff) and smooth
+decay to zero.
+
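The behavior described above (forces consistent with energies, and both vanishing smoothly at the cutoff) can be illustrated with a minimal sketch. It assumes the damped shifted force form as published by Fennell and Gezelter, E = q_i q_j [ erfc(alpha r)/r - erfc(alpha r_c)/r_c + ( erfc(alpha r_c)/r_c^2 + (2 alpha/sqrt(pi)) exp(-alpha^2 r_c^2)/r_c ) (r - r_c) ] for r < r_c. The function names `dsf_energy` and `dsf_force` are hypothetical, the Coulomb conversion constant is taken as 1, and this is not the LAMMPS implementation.

```python
import math


def dsf_energy(qi, qj, r, alpha, rc):
    """Pair energy in the assumed damped shifted force form (prefactor = 1)."""
    if r >= rc:
        return 0.0
    pref = 2.0 * alpha / math.sqrt(math.pi)
    e_rc = math.erfc(alpha * rc) / rc  # damped potential at the cutoff
    # Magnitude of the damped force at the cutoff; the linear term built
    # from it is what makes the force, not just the energy, go to zero.
    f_rc = e_rc / rc + pref * math.exp(-(alpha * rc) ** 2) / rc
    return qi * qj * (math.erfc(alpha * r) / r - e_rc + f_rc * (r - rc))


def dsf_force(qi, qj, r, alpha, rc):
    """Radial force -dE/dr for the same form (positive means repulsive)."""
    if r >= rc:
        return 0.0
    pref = 2.0 * alpha / math.sqrt(math.pi)
    f_r = math.erfc(alpha * r) / r**2 + pref * math.exp(-(alpha * r) ** 2) / r
    f_rc = math.erfc(alpha * rc) / rc**2 + pref * math.exp(-(alpha * rc) ** 2) / rc
    return qi * qj * (f_r - f_rc)


# Parameters taken from the pair_style coul/dsf 0.05 10.0 example.
alpha, rc = 0.05, 10.0
for r in (2.0, 5.0, 9.999):
    print(r, dsf_energy(1.0, -1.0, r, alpha, rc), dsf_force(1.0, -1.0, r, alpha, rc))
```

Evaluating both functions just inside the cutoff shows the energy and the force approaching zero together, which is the property the paragraph contrasts with the Wolf potential.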
Style coul/wolf computes Coulombic interactions via the Wolf summation method, described in Wolf, given by:
@@ -193,5 +212,11 @@ hybrid/overlay(Wolf) D. Wolf, P. Keblinski, S. R. Phillpot, J. Eggebrecht, J Chem -Phys, 110, 8254 (1999).
+Phys, 110, 8254 (1999). + + + +(Fennell) C. J. Fennell, J. D. Gezelter, J Chem Phys, 124, +234104 (2006). +