git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@6623 f3b2605a-c512-4ea7-a41b-209d697bcdaa

2011-08-08 19:49:47 +00:00
parent 8745580d5d
commit da3e05ef46
6 changed files with 83 additions and 52 deletions
--- a/doc/Section_commands.html
+++ b/doc/Section_commands.html
@@ -521,7 +521,7 @@ description:
 Kspace solvers.  Click on the style itself for a full description:
 </P>
 <DIV ALIGN=center><TABLE  BORDER=1 >
-<TR ALIGN="center"><TD WIDTH="100"><A HREF = "kspace_style.html">ewald</A></TD><TD WIDTH="100"><A HREF = "kspace_style.html">pppm</A></TD><TD WIDTH="100"><A HREF = "kspace_style.html">pppm/tip4p</A> 
+<TR ALIGN="center"><TD WIDTH="100"><A HREF = "kspace_style.html">ewald</A></TD><TD WIDTH="100"><A HREF = "kspace_style.html">pppm</A></TD><TD WIDTH="100"><A HREF = "kspace_style.html">pppm/cg</A></TD><TD WIDTH="100"><A HREF = "kspace_style.html">pppm/tip4p</A> 
 </TD></TR></TABLE></DIV>

 <P>These are Kspace solvers contributed by users, which can be used if
@@ -536,7 +536,7 @@ built with the <A HREF = "Section_accelerate.html">appropriate accelerated
 package</A>.
 </P>
 <DIV ALIGN=center><TABLE  BORDER=1 >
-<TR ALIGN="center"><TD ><A HREF = "kspace_style.html">pppm/cuda</A></TD><TD ><A HREF = "kspace_style.html">pppm/gpu/single</A></TD><TD ><A HREF = "kspace_style.html">pppm/gpu/double</A> 
+<TR ALIGN="center"><TD ><A HREF = "kspace_style.html">pppm/cuda</A></TD><TD ><A HREF = "kspace_style.html">pppm/gpu</A> 
 </TD></TR></TABLE></DIV>

 </HTML>
--- a/doc/Section_commands.txt
+++ b/doc/Section_commands.txt
@@ -804,6 +804,7 @@ Kspace solvers.  Click on the style itself for a full description:

 "ewald"_kspace_style.html,
 "pppm"_kspace_style.html,
+"pppm/cg"_kspace_style.html,
 "pppm/tip4p"_kspace_style.html :tb(c=4,ea=c,w=100)

 These are Kspace solvers contributed by users, which can be used if
@@ -816,6 +817,6 @@ built with the "appropriate accelerated
 package"_Section_accelerate.html.

 "pppm/cuda"_kspace_style.html,
-"pppm/gpu/single"_kspace_style.html,
-"pppm/gpu/double"_kspace_style.html :tb(c=4,ea=c)
+"pppm/gpu"_kspace_style.html :tb(c=4,ea=c)
+

--- a/doc/Section_start.html
+++ b/doc/Section_start.html
@@ -312,18 +312,16 @@ your lo-level LAMMPS Makefile, called -DFFTW2_SIZE, which will select
 the correct include file.  In this case, For FFT_LIB you still must
 manually specify the correct -lsfftw or -ldfftw.
 </P>
-<P>(3.d) NOTE: this -DFFT_SINGLE option is not yet fully implemented.
-</P>
-<P>The FFT_INC variable also allows for a -DFFT_SINGLE setting that will
-use single-precision FFTs with PPPM, which can speed-up long-range
-calulations, particularly in parallel or on a GPU.  Fourier transform
-and related PPPM operations are somewhat insensitive to floating point
-truncation errors and thus do not always need to be performed in
-double precision.  Using the -DFFT_SINGLE setting trades off a little
-accuracy for reduced memory use and parallel communication costs for
-transposing 3d FFT data.  Note that single precision FFTs have only
-been tested with the FFTW3, FFTW2, MKL, and the internal KISS FFTs
-options.
+<P>(3.d) The FFT_INC variable also allows for a -DFFT_SINGLE setting that
+will use single-precision FFTs with PPPM, which can speed up
+long-range calculations, particularly in parallel or on a GPU.  Fourier
+transform and related PPPM operations are somewhat insensitive to
+floating point truncation errors and thus do not always need to be
+performed in double precision.  Using the -DFFT_SINGLE setting trades
+off a little accuracy for reduced memory use and parallel
+communication costs for transposing 3d FFT data.  Note that single
+precision FFTs have only been tested with the FFTW3, FFTW2, MKL, and
+internal KISS FFT options.
 </P>
 <P>(3.e) The 3 JPG variables are used to specify a JPEG library which
 LAMMPS uses when writing a JPEG file via the <A HREF = "dump_image.html">dump
--- a/doc/Section_start.txt
+++ b/doc/Section_start.txt
@@ -307,18 +307,16 @@ your lo-level LAMMPS Makefile, called -DFFTW2_SIZE, which will select
 the correct include file.  In this case, For FFT_LIB you still must
 manually specify the correct -lsfftw or -ldfftw.

-(3.d) NOTE: this -DFFT_SINGLE option is not yet fully implemented.
-
-The FFT_INC variable also allows for a -DFFT_SINGLE setting that will
-use single-precision FFTs with PPPM, which can speed-up long-range
-calulations, particularly in parallel or on a GPU.  Fourier transform
-and related PPPM operations are somewhat insensitive to floating point
-truncation errors and thus do not always need to be performed in
-double precision.  Using the -DFFT_SINGLE setting trades off a little
-accuracy for reduced memory use and parallel communication costs for
-transposing 3d FFT data.  Note that single precision FFTs have only
-been tested with the FFTW3, FFTW2, MKL, and the internal KISS FFTs
-options.
+(3.d) The FFT_INC variable also allows for a -DFFT_SINGLE setting that
+will use single-precision FFTs with PPPM, which can speed up
+long-range calculations, particularly in parallel or on a GPU.  Fourier
+transform and related PPPM operations are somewhat insensitive to
+floating point truncation errors and thus do not always need to be
+performed in double precision.  Using the -DFFT_SINGLE setting trades
+off a little accuracy for reduced memory use and parallel
+communication costs for transposing 3d FFT data.  Note that single
+precision FFTs have only been tested with the FFTW3, FFTW2, MKL, and
+internal KISS FFT options.

 (3.e) The 3 JPG variables are used to specify a JPEG library which
 LAMMPS uses when writing a JPEG file via the "dump
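To make the memory trade-off of -DFFT_SINGLE described in (3.d) concrete, here is a rough Python estimate of the storage needed for one complex-valued 3d FFT mesh in each precision. The mesh dimensions and the helper name `mesh_bytes` are hypothetical, and a real PPPM run holds several such arrays plus communication buffers, so this is a back-of-the-envelope sketch and not a description of what LAMMPS actually allocates.

```python
import struct


def mesh_bytes(nx, ny, nz, single):
    """Bytes for one complex 3d mesh (hypothetical helper, not a LAMMPS routine)."""
    # "=f" and "=d" give the standard sizes of a 4-byte float and an 8-byte double.
    real_size = struct.calcsize("=f" if single else "=d")
    # Each mesh point stores a real part and an imaginary part.
    return nx * ny * nz * 2 * real_size


# Assumed example mesh; actual PPPM grid sizes depend on the requested accuracy.
nx = ny = nz = 64
double_bytes = mesh_bytes(nx, ny, nz, single=False)
single_bytes = mesh_bytes(nx, ny, nz, single=True)
print(double_bytes, single_bytes, double_bytes / single_bytes)
```

The factor of two applies equally to the data exchanged when the 3d FFT is transposed in parallel, which is where the reduced communication cost comes from.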
--- a/doc/kspace_style.html
+++ b/doc/kspace_style.html
@@ -15,20 +15,21 @@
 </P>
 <PRE>kspace_style style value 
 </PRE>
-<UL><LI>style = <I>none</I> or <I>ewald</I> or <I>pppm</I> or <I>pppm/tip4p</I> or <I>ewald/n</I> or <I>pppm/gpu/single</I> or <I>pppm/gpu/double</I> 
+<UL><LI>style = <I>none</I> or <I>ewald</I> or <I>pppm</I> or <I>pppm/cg</I> or <I>pppm/tip4p</I> or <I>ewald/n</I> or <I>pppm/gpu</I> 

 <PRE>  <I>none</I> value = none
  <I>ewald</I> value = precision
    precision = desired accuracy
  <I>pppm</I> value = precision
    precision = desired accuracy
+  <I>pppm/cg</I> value = precision (smallq)
+    precision = desired accuracy
+    smallq = cutoff for charges to be considered (optional) (charge units)
  <I>pppm/tip4p</I> value = precision
    precision = desired accuracy
  <I>ewald/n</I> value = precision
    precision = desired accuracy
-  <I>pppm/gpu/single</I> value = precision
-    precision = desired accuracy
-  <I>pppm/gpu/double</I> value = precision
+  <I>pppm/gpu</I> value = precision
    precision = desired accuracy 
 </PRE>

@@ -36,6 +37,7 @@
 <P><B>Examples:</B>
 </P>
 <PRE>kspace_style pppm 1.0e-4
+kspace_style pppm/cg 1.0e-5 1.0e-6
 kspace_style none 
 </PRE>
 <P><B>Description:</B>
@@ -60,6 +62,12 @@ N^(3/2) where N is the number of atoms in the system.  The PPPM solver
 scales as Nlog(N) due to the FFTs, so it is almost always a faster
 choice <A HREF = "#Pollock">(Pollock)</A>.
 </P>
+<P>The <I>pppm/cg</I> style is identical to the <I>pppm</I> style except that it
+has an optimization for systems where most particles are uncharged.
+The optional <I>smallq</I> argument defines the cutoff for the absolute
+charge value which determines whether a particle is considered charged
+or not.  Its default value is 1.0e-5.
+</P>
 <P>The <I>pppm/tip4p</I> style is identical to the <I>pppm</I> style except that it
 adds a charge at the massless 4th site in each TIP4P water molecule.
 It should be used with <A HREF = "pair_style.html">pair styles</A> with a
@@ -76,6 +84,15 @@ long-range potentials.
 <P>Currently, only the <I>ewald/n</I> style can be used with non-orthogonal
 (triclinic symmetry) simulation boxes.
 </P>
+<P>Note that the PPPM styles can be used with single-precision FFTs by
+using the compiler switch -DFFT_SINGLE for the FFT_INC setting in your
+lo-level Makefile.  This setting also changes some of the PPPM
+operations (e.g. mapping charge to mesh and interpolating electric
+fields to particles) to be performed in single precision.  This option
+can speed up long-range calculations, particularly in parallel or on a
+GPU.  The use of the -DFFT_SINGLE flag is discussed in <A HREF = "Section_start.html#2_2_4">this
+section</A> of the manual.
+</P>
 <HR>

 <P>When a kspace style is used, a pair style that includes the
@@ -96,20 +113,20 @@ options of the K-space solvers that can be set.
 </P>
 <HR>

-<P>Styles with a <I>cuda</I>, <I>gpu/single</I>, <I>gpu/double</I>, or <I>opt</I> suffix are
+<P>Styles with a <I>cuda</I>, <I>gpu</I>, or <I>opt</I> suffix are
 functionally the same as the corresponding style without the suffix.
 They have been optimized to run faster, depending on your available
 hardware, as discussed in <A HREF = "Section_accelerate.html">this section</A> of
 the manual.  The accelerated styles take the same arguments and should
 produce the same results, except for round-off and precision issues.
 </P>
-<P>More specifically, the <I>pppm/gpu/single</I> style performs single
-precision charge assignment and force interpolation calculations on
-the GPU.  The <I>pppm/gpu/double</I> style performs the mesh calculations
-on the GPU in double precision. In both cases, FFT solves are
-calculated on the CPU.  If either <I>pppm/gpu/single</I> or
-<I>pppm/gpu/double</I> are used with a GPU-enabled pair style, part of the
-PPPM calculation can be performed concurrently on the GPU while other
+<P>More specifically, the <I>pppm/gpu</I> style performs charge assignment and
+force interpolation calculations on the GPU.  These processes are
+performed either in single or double precision, depending on whether
+the -DFFT_SINGLE setting was specified in your lo-level Makefile, as
+discussed above.  The FFTs themselves are still calculated on the CPU.
+If <I>pppm/gpu</I> is used with a GPU-enabled pair style, part of the PPPM
+calculation can be performed concurrently on the GPU while other
 calculations for non-bonded and bonded force calculation are performed
 on the CPU.
 </P>
--- a/doc/kspace_style.txt
+++ b/doc/kspace_style.txt
@@ -12,25 +12,27 @@ kspace_style command :h3

 kspace_style style value :pre

-style = {none} or {ewald} or {pppm} or {pppm/tip4p} or {ewald/n} or {pppm/gpu/single} or {pppm/gpu/double} :ulb,l
+style = {none} or {ewald} or {pppm} or {pppm/cg} or {pppm/tip4p} or {ewald/n} or {pppm/gpu} :ulb,l
  {none} value = none
  {ewald} value = precision
    precision = desired accuracy
  {pppm} value = precision
    precision = desired accuracy
+  {pppm/cg} value = precision (smallq)
+    precision = desired accuracy
+    smallq = cutoff for charges to be considered (optional) (charge units)
  {pppm/tip4p} value = precision
    precision = desired accuracy
  {ewald/n} value = precision
    precision = desired accuracy
-  {pppm/gpu/single} value = precision
-    precision = desired accuracy
-  {pppm/gpu/double} value = precision
+  {pppm/gpu} value = precision
    precision = desired accuracy :pre
 :ule

 [Examples:]

 kspace_style pppm 1.0e-4
+kspace_style pppm/cg 1.0e-5 1.0e-6
 kspace_style none :pre

 [Description:]
@@ -55,6 +57,12 @@ N^(3/2) where N is the number of atoms in the system.  The PPPM solver
 scales as Nlog(N) due to the FFTs, so it is almost always a faster
 choice "(Pollock)"_#Pollock.

+The {pppm/cg} style is identical to the {pppm} style except that it
+has an optimization for systems where most particles are uncharged.
+The optional {smallq} argument defines the cutoff for the absolute
+charge value which determines whether a particle is considered charged
+or not.  Its default value is 1.0e-5.
+
 The {pppm/tip4p} style is identical to the {pppm} style except that it
 adds a charge at the massless 4th site in each TIP4P water molecule.
 It should be used with "pair styles"_pair_style.html with a
@@ -71,6 +79,15 @@ long-range potentials.
 Currently, only the {ewald/n} style can be used with non-orthogonal
 (triclinic symmetry) simulation boxes.

+Note that the PPPM styles can be used with single-precision FFTs by
+using the compiler switch -DFFT_SINGLE for the FFT_INC setting in your
+lo-level Makefile.  This setting also changes some of the PPPM
+operations (e.g. mapping charge to mesh and interpolating electric
+fields to particles) to be performed in single precision.  This option
+can speed up long-range calculations, particularly in parallel or on a
+GPU.  The use of the -DFFT_SINGLE flag is discussed in "this
+section"_Section_start.html#2_2_4 of the manual.
+
 :line

 When a kspace style is used, a pair style that includes the
@@ -91,20 +108,20 @@ options of the K-space solvers that can be set.

 :line

-Styles with a {cuda}, {gpu/single}, {gpu/double}, or {opt} suffix are
+Styles with a {cuda}, {gpu}, or {opt} suffix are
 functionally the same as the corresponding style without the suffix.
 They have been optimized to run faster, depending on your available
 hardware, as discussed in "this section"_Section_accelerate.html of
 the manual.  The accelerated styles take the same arguments and should
 produce the same results, except for round-off and precision issues.

-More specifically, the {pppm/gpu/single} style performs single
-precision charge assignment and force interpolation calculations on
-the GPU.  The {pppm/gpu/double} style performs the mesh calculations
-on the GPU in double precision. In both cases, FFT solves are
-calculated on the CPU.  If either {pppm/gpu/single} or
-{pppm/gpu/double} are used with a GPU-enabled pair style, part of the
-PPPM calculation can be performed concurrently on the GPU while other
+More specifically, the {pppm/gpu} style performs charge assignment and
+force interpolation calculations on the GPU.  These processes are
+performed either in single or double precision, depending on whether
+the -DFFT_SINGLE setting was specified in your lo-level Makefile, as
+discussed above.  The FFTs themselves are still calculated on the CPU.
+If {pppm/gpu} is used with a GPU-enabled pair style, part of the PPPM
+calculation can be performed concurrently on the GPU while other
 calculations for non-bonded and bonded force calculation are performed
 on the CPU.
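
As an illustration of the smallq cutoff that this commit documents for pppm/cg, the following Python sketch picks out the particles that would count as charged. It is not the LAMMPS implementation: the function name `charged_indices`, the sample charges, and the strict `>` comparison are assumptions made for the example. Only the default of 1.0e-5 comes from the documentation text in the diff.

```python
def charged_indices(charges, smallq=1.0e-5):
    """Return indices of particles whose absolute charge exceeds smallq.

    Hypothetical helper; whether the real solver uses > or >= is not
    stated in the documentation, so the strict comparison is an assumption.
    """
    return [i for i, q in enumerate(charges) if abs(q) > smallq]


# Assumed sample system: mostly neutral particles with a few charged ones.
charges = [0.0, 0.417, -0.834, 1.0e-7, 0.0, 0.417]
active = charged_indices(charges)
print(active)
```

Only the selected subset would need to take part in charge assignment and force interpolation, which is why the optimization pays off when most particles are uncharged.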