Merge branch 'master' into USER-DPD_kokkos as of patch 13Feb17

This commit is contained in:
Tim Mattox
2017-02-13 12:29:41 -05:00
38 changed files with 414 additions and 671 deletions

View File

@ -1,7 +1,7 @@
<!-- HTML_ONLY -->
<HEAD>
<TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="26 Jan 2017 version">
<META NAME="docnumber" CONTENT="13 Feb 2017 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
</HEAD>
@ -21,7 +21,7 @@
<H1></H1>
LAMMPS Documentation :c,h3
26 Jan 2017 version :c,h4
13 Feb 2017 version :c,h4
Version info: :h4

View File

@ -969,7 +969,7 @@ KOKKOS, o = USER-OMP, t = OPT.
"lubricateU/poly"_pair_lubricateU.html,
"meam"_pair_meam.html,
"mie/cut (o)"_pair_mie.html,
"morse (got)"_pair_morse.html,
"morse (gkot)"_pair_morse.html,
"nb3b/harmonic (o)"_pair_nb3b_harmonic.html,
"nm/cut (o)"_pair_nm.html,
"nm/cut/coul/cut (o)"_pair_nm.html,

View File

@ -22,7 +22,7 @@ either conceptually, or as printed out by the program.
12.1 Common problems :link(err_1),h4
If two LAMMPS runs do not produce the same answer on different
If two LAMMPS runs do not produce the exact same answer on different
machines or different numbers of processors, this is typically not a
bug. In theory you should get identical answers on any number of
processors and on any machine. In practice, numerical round-off can
@ -80,12 +80,24 @@ order. If you mess this up, LAMMPS will often flag the error, but it
may also simply read a bogus argument and assign a value that is
valid, but not what you wanted. E.g. trying to read the string "abc"
as an integer value of 0. Careful reading of the associated doc page
for the command should allow you to fix these problems. Note that
some commands allow for variables to be specified in place of numeric
constants so that the value can be evaluated and change over the
course of a run. This is typically done with the syntax {v_name} for
a parameter, where name is the name of the variable. This is only
allowed if the command documentation says it is.
for the command should allow you to fix these problems. In most cases,
where LAMMPS expects to read a number, either integer or floating point,
it performs a stringent test on whether the provided input actually
is an integer or floating-point number, respectively, and reject the
input with an error message (for instance, when an integer is required,
but a floating-point number 1.0 is provided):
ERROR: Expected integer parameter in input script or data file :pre
Some commands allow for using variable references in place of numeric
constants so that the value can be evaluated and may change over the
course of a run. This is typically done with the syntax {v_name} for a
parameter, where name is the name of the variable. On the other hand,
immediate variable expansion with the syntax ${name} is performed while
reading the input and before parsing commands,
NOTE: Using a variable reference (i.e. {v_name}) is only allowed if
the documentation of the corresponding command explicitly says it is.
Generally, LAMMPS will print a message to the screen and logfile and
exit gracefully when it encounters a fatal error. Sometimes it will

View File

@ -413,7 +413,7 @@ uses (for performing 1d FFTs) when running the particle-particle
particle-mesh (PPPM) option for long-range Coulombics via the
"kspace_style"_kspace_style.html command.
LAMMPS supports various open-source or vendor-supplied FFT libraries
LAMMPS supports common open-source or vendor-supplied FFT libraries
for this purpose. If you leave these 3 variables blank, LAMMPS will
use the open-source "KISS FFT library"_http://kissfft.sf.net, which is
included in the LAMMPS distribution. This library is portable to all
@ -423,10 +423,9 @@ package in your build, you can also leave the 3 variables blank.
Otherwise, select which kinds of FFTs to use as part of the FFT_INC
setting by a switch of the form -DFFT_XXX. Recommended values for XXX
are: MKL, SCSL, FFTW2, and FFTW3. Legacy options are: INTEL, SGI,
ACML, and T3E. For backward compatability, using -DFFT_FFTW will use
the FFTW2 library. Using -DFFT_NONE will use the KISS library
described above.
are: MKL or FFTW3. FFTW2 and NONE are supported as legacy options.
Selecting -DFFT_FFTW will use the FFTW3 library and -DFFT_NONE will
use the KISS library described above.
You may also need to set the FFT_INC, FFT_PATH, and FFT_LIB variables,
so the compiler and linker can find the needed FFT header and library

View File

@ -16,10 +16,11 @@ ID, group-ID are documented in "compute"_compute.html command :ulb,l
group/group = style name of this compute command :l
group2-ID = group ID of second (or same) group :l
zero or more keyword/value pairs may be appended :l
keyword = {pair} or {kspace} or {boundary} :l
keyword = {pair} or {kspace} or {boundary} or {molecule} :l
{pair} value = {yes} or {no}
{kspace} value = {yes} or {no}
{boundary} value = {yes} or {no} :pre
{boundary} value = {yes} or {no}
{molecule} value = {off} or {inter} or {intra} :pre
:ule
[Examples:]
@ -46,6 +47,13 @@ NOTE: The energies computed by the {pair} keyword do not include tail
corrections, even if they are enabled via the
"pair_modify"_pair_modify.html command.
If the {molecule} keyword is set to {inter} or {intra} than an
additional check is made based on the molecule IDs of the two atoms in
each pair before including their pairwise interaction energy and
force. For the {inter} setting, the two atoms must be in different
molecules. For the {intra} setting, the two atoms must be in the same
molecule.
If the {kspace} keyword is set to {yes}, which is not the default, and
if a "kspace_style"_kspace_style.html is defined, then the interaction
energy will include a Kspace component which is the long-range
@ -66,6 +74,10 @@ affect the force calculation and will be zero if one or both of the
groups are charge neutral. This energy correction term is the same as
that included in the regular Ewald and PPPM routines.
NOTE: The {molecule} setting only affects the group/group
contributions calculated by the {pair} keyword. It does not affect
the group/group contributions calculated by the {kspace} keyword.
This compute does not calculate any bond or angle or dihedral or
improper interactions between atoms in the two groups.
@ -78,6 +90,22 @@ work (FFTs, Ewald summation) as computing long-range forces for the
entire system. Thus it can be costly to invoke this compute too
frequently.
NOTE: If you have a bonded system, then the settings of
"special_bonds"_special_bonds.html command can remove pairwise
interactions between atoms in the same bond, angle, or dihedral. This
is the default setting for the "special_bonds"_special_bonds.html
command, and means those pairwise interactions do not appear in the
neighbor list. Because this compute uses a neighbor list, it also
means those pairs will not be included in the group/group interaction.
This does not apply when using long-range coulomb interactions
({coul/long}, {coul/msm}, {coul/wolf} or similar. One way to get
around this would be to set special_bond scaling factors to very tiny
numbers that are not exactly zero (e.g. 1.0e-50). Another workaround
is to write a dump file, and use the "rerun"_rerun.html command to
compute the group/group interactions for snapshots in the dump file.
The rerun script can use a "special_bonds"_special_bonds.html command
that includes all pairs in the neighbor list.
If you desire a breakdown of the interactions into a pairwise and
Kspace component, simply invoke the compute twice with the appropriate
yes/no settings for the {pair} and {kspace} keywords. This is no more
@ -119,7 +147,8 @@ The {ewald} and {pppm} styles do.
[Default:]
The option defaults are pair = yes, kspace = no, and boundary = yes.
The option defaults are pair = yes, kspace = no, boundary = yes,
molecule = off.
:line

View File

@ -41,14 +41,14 @@ NOTE: If you have a bonded system, then the settings of
interactions between atoms in the same bond, angle, or dihedral. This
is the default setting for the "special_bonds"_special_bonds.html
command, and means those pairwise interactions do not appear in the
neighbor list. Because this fix uses the neighbor list, it also means
neighbor list. Because this fix uses a neighbor list, it also means
those pairs will not be included in the RDF. This does not apply when
using long-range coulomb ({coul/long}, {coul/msm}, {coul/wolf} or
similar. One way to get around this would be to set special_bond
scaling factors to very tiny numbers that are not exactly zero
(e.g. 1.0e-50). Another workaround is to write a dump file, and use
the "rerun"_rerun.html command to compute the RDF for snapshots in the
dump file. The rerun script can use a
using long-range coulomb interactions ({coul/long}, {coul/msm},
{coul/wolf} or similar. One way to get around this would be to set
special_bond scaling factors to very tiny numbers that are not exactly
zero (e.g. 1.0e-50). Another workaround is to write a dump file, and
use the "rerun"_rerun.html command to compute the RDF for snapshots in
the dump file. The rerun script can use a
"special_bonds"_special_bonds.html command that includes all pairs in
the neighbor list.

View File

@ -54,7 +54,8 @@ reset_timestep 0
variable pxy equal pxy
variable pxx equal pxx-press
fix avstress all ave/time $s $p $d v_pxy v_pxx ave one file einstein.dat
fix avstress all ave/time $s $p $d v_pxy v_pxx ave one &
file profile.einstein.2d
# Diagonal components of SS are larger by factor 2-2/d,
# which is 4/3 for d=3, but 1 for d=2.

View File

@ -0,0 +1,189 @@
LAMMPS (26 Jan 2017)
# Testsystem for core-shell model compared to Mitchel and Finchham
# Hendrik Heenen, June 2014
# ------------------------ INITIALIZATION ----------------------------
units metal
dimension 3
boundary p p p
atom_style full
# ----------------------- ATOM DEFINITION ----------------------------
fix csinfo all property/atom i_CSID
read_data data.coreshell fix csinfo NULL CS-Info
orthogonal box = (0 0 0) to (24.096 24.096 24.096)
1 by 2 by 2 MPI processor grid
reading atoms ...
432 atoms
scanning bonds ...
1 = max bonds/atom
reading bonds ...
216 bonds
1 = max # of 1-2 neighbors
0 = max # of 1-3 neighbors
0 = max # of 1-4 neighbors
1 = max # of special neighbors
group cores type 1 2
216 atoms in group cores
group shells type 3 4
216 atoms in group shells
neighbor 2.0 bin
comm_modify vel yes
# ------------------------ FORCE FIELDS ------------------------------
pair_style born/coul/dsf/cs 0.1 20.0 20.0 # A, rho, sigma=0, C, D
pair_coeff * * 0.0 1.000 0.00 0.00 0.00
pair_coeff 3 3 487.0 0.23768 0.00 1.05 0.50 #Na-Na
pair_coeff 3 4 145134.0 0.23768 0.00 6.99 8.70 #Na-Cl
pair_coeff 4 4 405774.0 0.23768 0.00 72.40 145.40 #Cl-Cl
bond_style harmonic
bond_coeff 1 63.014 0.0
bond_coeff 2 25.724 0.0
# ------------------------ Equilibration Run -------------------------------
reset_timestep 0
thermo 50
thermo_style custom step etotal pe ke temp press epair evdwl ecoul elong ebond fnorm fmax vol
compute CSequ all temp/cs cores shells
# output via chunk method
#compute prop all property/atom i_CSID
#compute cs_chunk all chunk/atom c_prop
#compute cstherm all temp/chunk cs_chunk temp internal com yes cdof 3.0
#fix ave_chunk all ave/time 100 1 100 c_cstherm file chunk.dump mode vector
thermo_modify temp CSequ
# velocity bias option
velocity all create 1427 134 dist gaussian mom yes rot no bias yes temp CSequ
Neighbor list info ...
update every 1 steps, delay 10 steps, check yes
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 22
ghost atom cutoff = 22
binsize = 11, bins = 3 3 3
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair born/coul/dsf/cs, half, perpetual
pair build: half/bin/newton
stencil: half/bin/3d/newton
bin: standard
velocity all scale 1427 temp CSequ
fix thermoberendsen all temp/berendsen 1427 1427 0.4
fix nve all nve
fix_modify thermoberendsen temp CSequ
# 2 fmsec timestep
timestep 0.002
run 500
Memory usage per processor = 6.8559 Mbytes
Step TotEng PotEng KinEng Temp Press E_pair E_vdwl E_coul E_long E_bond Fnorm Fmax Volume
0 -635.80596 -675.46362 39.657659 1427 -21302.622 -675.46362 1.6320365 -677.09565 0 0 1.5814015e-14 3.2317898e-15 13990.5
50 -634.07021 -666.11867 32.048452 1153.1982 -4560.945 -668.28236 37.756542 -706.0389 0 2.163691 13.802484 3.022372 13990.5
100 -631.97128 -662.02544 30.054164 1081.4378 -3497.564 -664.61825 39.275003 -703.89325 0 2.5928078 13.956833 2.5417699 13990.5
150 -630.14953 -663.04215 32.892622 1183.5739 -88.43828 -665.63444 46.239965 -711.87441 0 2.5922927 14.667898 2.4964255 13990.5
200 -628.52878 -663.9795 35.45072 1275.6219 -1755.9004 -666.73564 41.758052 -708.49369 0 2.7561421 14.230743 3.0924004 13990.5
250 -627.27102 -662.025 34.753978 1250.5511 -1234.0918 -665.13519 43.170874 -708.30606 0 3.1101887 14.221086 1.941354 13990.5
300 -626.5495 -663.74287 37.193368 1338.3275 -2049.3444 -666.45574 40.476148 -706.93188 0 2.7128711 13.330425 1.7756755 13990.5
350 -625.87313 -665.21855 39.345421 1415.7647 -1543.1723 -667.90872 41.577366 -709.48609 0 2.6901682 13.541311 1.854662 13990.5
400 -625.09344 -661.26404 36.1706 1301.5253 -729.96729 -664.10334 43.468765 -707.57211 0 2.8392963 13.663555 1.9067551 13990.5
450 -624.46214 -660.01362 35.551477 1279.2474 -1617.7158 -663.06571 41.644856 -704.71057 0 3.0520921 14.527005 1.7280213 13990.5
500 -623.49246 -659.2527 35.76024 1286.7593 -935.99238 -662.32953 43.038808 -705.36834 0 3.0768302 14.099593 1.9831106 13990.5
Loop time of 4.09864 on 4 procs for 500 steps with 432 atoms
Performance: 21.080 ns/day, 1.139 hours/ns, 121.992 timesteps/s
99.7% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 3.3804 | 3.568 | 3.8354 | 8.9 | 87.05
Bond | 0.00074339 | 0.00079519 | 0.00087976 | 0.0 | 0.02
Neigh | 0.045851 | 0.046084 | 0.046361 | 0.1 | 1.12
Comm | 0.20413 | 0.47123 | 0.65875 | 24.3 | 11.50
Output | 0.00044298 | 0.00046057 | 0.00051165 | 0.0 | 0.01
Modify | 0.0064909 | 0.0067219 | 0.0069766 | 0.2 | 0.16
Other | | 0.005345 | | | 0.13
Nlocal: 108 ave 114 max 105 min
Histogram: 1 1 1 0 0 0 0 0 0 1
Nghost: 6527 ave 6599 max 6472 min
Histogram: 1 0 1 0 1 0 0 0 0 1
Neighs: 74388.2 ave 75855 max 73680 min
Histogram: 1 2 0 0 0 0 0 0 0 1
Total # of neighbors = 297553
Ave neighs/atom = 688.78
Ave special neighs/atom = 1
Neighbor list builds = 20
Dangerous builds = 0
unfix thermoberendsen
# ------------------------ Dynamic Run -------------------------------
run 1000
Memory usage per processor = 6.85787 Mbytes
Step TotEng PotEng KinEng Temp Press E_pair E_vdwl E_coul E_long E_bond Fnorm Fmax Volume
500 -623.49319 -659.2527 35.759511 1286.7331 -936.04802 -662.32953 43.038808 -705.36834 0 3.0768302 14.099593 1.9831106 13990.5
550 -623.44059 -663.57938 40.138795 1444.3127 -935.73484 -666.2789 42.563337 -708.84224 0 2.6995167 13.918509 2.3189805 13990.5
600 -623.4703 -660.01592 36.545618 1315.0196 1327.3492 -663.08845 47.985462 -711.07391 0 3.0725254 15.192713 2.4098428 13990.5
650 -623.46796 -661.56776 38.099807 1370.9439 457.82439 -664.81976 45.495622 -710.31538 0 3.2519966 15.026057 1.8500226 13990.5
700 -623.50158 -659.5131 36.011523 1295.8012 -460.03772 -663.1078 43.938203 -707.046 0 3.5946908 14.660979 2.4825518 13990.5
750 -623.44787 -661.93353 38.485658 1384.8279 97.429626 -664.9551 45.083146 -710.03825 0 3.0215753 15.10043 2.3433897 13990.5
800 -623.48215 -659.50655 36.024402 1296.2647 1097.3866 -662.61124 47.251998 -709.86324 0 3.1046914 14.556382 2.0543766 13990.5
850 -623.45868 -661.13782 37.679134 1355.8068 -1802.1624 -664.41257 40.70845 -705.12102 0 3.2747525 14.691444 2.2054332 13990.5
900 -623.43556 -663.59137 40.155815 1444.9251 534.99197 -666.71877 45.601619 -712.32039 0 3.127395 14.741411 2.5807895 13990.5
950 -623.51318 -661.57916 38.06598 1369.7267 -678.12625 -664.37535 43.207862 -707.58322 0 2.7961988 14.430307 2.3936105 13990.5
1000 -623.47287 -661.22274 37.749874 1358.3523 634.7979 -664.42973 46.373361 -710.80309 0 3.2069879 15.891192 2.4042765 13990.5
1050 -623.48133 -661.52868 38.047347 1369.0562 -583.15228 -664.6098 43.618772 -708.22857 0 3.081116 14.806856 2.3447613 13990.5
1100 -623.47867 -661.83761 38.358946 1380.2685 -868.9779 -664.8826 42.84846 -707.73106 0 3.044983 14.69567 2.399143 13990.5
1150 -623.44713 -661.21299 37.765857 1358.9274 405.14554 -664.09567 45.578739 -709.6744 0 2.8826753 15.437367 3.1381305 13990.5
1200 -623.46549 -660.91706 37.451568 1347.6183 699.78996 -664.0883 46.36297 -710.45127 0 3.1712473 15.109665 1.8891886 13990.5
1250 -623.49296 -658.2218 34.728838 1249.6464 1061.0154 -661.29052 47.668699 -708.95922 0 3.0687228 14.901367 2.3964137 13990.5
1300 -623.49837 -660.91022 37.411844 1346.1889 226.99512 -664.35989 45.352287 -709.71217 0 3.4496704 15.161542 2.2137993 13990.5
1350 -623.46718 -658.80365 35.336469 1271.5108 1039.6469 -662.16908 47.565671 -709.73475 0 3.3654314 15.892516 2.7888426 13990.5
1400 -623.47124 -661.45375 37.982513 1366.7233 -379.56023 -664.6321 43.788306 -708.42041 0 3.1783497 14.251126 1.7415409 13990.5
1450 -623.46671 -660.17518 36.708464 1320.8792 -374.37056 -662.92706 44.083648 -707.01071 0 2.7518803 15.210167 1.9984277 13990.5
1500 -623.50515 -659.06488 35.559725 1279.5442 260.37822 -662.39548 45.779764 -708.17524 0 3.3306005 14.682396 2.4201107 13990.5
Loop time of 8.26746 on 4 procs for 1000 steps with 432 atoms
Performance: 20.901 ns/day, 1.148 hours/ns, 120.956 timesteps/s
99.7% CPU use with 4 MPI tasks x no OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 6.706 | 7.1568 | 7.6597 | 12.7 | 86.57
Bond | 0.0014617 | 0.0015531 | 0.0016506 | 0.2 | 0.02
Neigh | 0.10511 | 0.10522 | 0.10532 | 0.0 | 1.27
Comm | 0.48547 | 0.98841 | 1.4393 | 34.0 | 11.96
Output | 0.0012085 | 0.0012462 | 0.0013196 | 0.1 | 0.02
Modify | 0.0021446 | 0.0021989 | 0.0022545 | 0.1 | 0.03
Other | | 0.01204 | | | 0.15
Nlocal: 108 ave 114 max 94 min
Histogram: 1 0 0 0 0 0 0 0 1 2
Nghost: 6512.25 ave 6586 max 6456 min
Histogram: 1 0 0 2 0 0 0 0 0 1
Neighs: 74248.2 ave 77441 max 65858 min
Histogram: 1 0 0 0 0 0 0 0 0 3
Total # of neighbors = 296993
Ave neighs/atom = 687.484
Ave special neighs/atom = 1
Neighbor list builds = 46
Dangerous builds = 0
Total wall time: 0:00:12

17
potentials/Ge.tersoff Normal file
View File

@ -0,0 +1,17 @@
# DATE: 2016-12-21 CONTRIBUTOR: Sayyed Jalil Mahdizadh, saja.mahdizadeh@gmail.com CITATION: Mahdizadeh, Akhlamadi, J. Mol. Graph. Model. 72, 1-5 (2017)
# Tersoff parameters for various elements and mixtures
# multiple entries can be added to this file, LAMMPS reads the ones it needs
# these entries are in LAMMPS "metal" units:
# A,B = eV; lambda1,lambda2,lambda3 = 1/Angstroms; R,D = Angstroms
# other quantities are unitless
# This is the Ge reparameterization from the following paper:
# Optimized Tersoff empirical potential for germanene
# Mahdizadeh, Akhlamadi, J. Mol. Graph. Model. 72, 1-5 (2017)
# format of a single entry (one or more lines):
# element 1, element 2, element 3,
# m, gamma, lambda3, c, d, costheta0, n, beta, lambda2, B, R, D, lambda1, A
Ge Ge Ge 3.0 1.0 0.0 1.0643e5 15.2 -0.35 0.75627 5.017e-7 1.71 430.0 2.95 0.15 2.4451 1760.1

View File

@ -278,7 +278,7 @@ void FixWallGranRegion::update_contacts(int i, int nc)
iold = 0;
while (iold < ncontact[i]) {
for (m = 0; m < nc; m++)
if (region->contact[m].iwall = walls[i][iold]) break;
if (region->contact[m].iwall == walls[i][iold]) break;
if (m < nc) {
ilast = ncontact[i]-1;
for (j = 0; j < sheardim; j++)

View File

@ -799,3 +799,14 @@ double PairGranHookeHistory::memory_usage()
double bytes = nmax * sizeof(double);
return bytes;
}
/* ----------------------------------------------------------------------
return ptr to FixShearHistory class
called by Neighbor when setting up neighbor lists
------------------------------------------------------------------------- */
void *PairGranHookeHistory::extract(const char *str, int &dim)
{
if (strcmp(str,"history") == 0) return (void *) fix_history;
return NULL;
}

View File

@ -43,6 +43,7 @@ class PairGranHookeHistory : public Pair {
int pack_forward_comm(int, int *, double *, int, int *);
void unpack_forward_comm(int, int, double *);
double memory_usage();
void *extract(const char *, int &);
protected:
double kn,kt,gamman,gammat,xmu;

View File

@ -314,7 +314,7 @@ void PairTersoffMODKokkos<DeviceType>::operator()(TagPairTersoffMODComputeHalf<N
const X_FLOAT ytmp = x(i,1);
const X_FLOAT ztmp = x(i,2);
const int itype = type(i);
const int tagitag = tag(i);
const tagint itag = tag(i);
F_FLOAT fi[3], fj[3], fk[3];

View File

@ -75,23 +75,10 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
// system specific constants
#if defined(FFT_SCSL)
int isys = 0;
FFT_PREC scalef = 1.0;
#elif defined(FFT_DEC)
char c = 'C';
char f = 'F';
char b = 'B';
int one = 1;
#elif defined(FFT_T3E)
int isys = 0;
double scalef = 1.0;
#elif defined(FFT_ACML)
int info;
#elif defined(FFT_FFTW3)
#if defined(FFT_FFTW3)
FFTW_API(plan) theplan;
#else
// nothing to do for other FFTs.
// nothing to do for other FFTs
#endif
// pre-remap to prepare for 1st FFTs if needed
@ -100,8 +87,8 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
if (plan->pre_plan) {
if (plan->pre_target == 0) copy = out;
else copy = plan->copy;
remap_3d((FFT_SCALAR *) in, (FFT_SCALAR *) copy, (FFT_SCALAR *) plan->scratch,
plan->pre_plan);
remap_3d((FFT_SCALAR *) in, (FFT_SCALAR *) copy,
(FFT_SCALAR *) plan->scratch, plan->pre_plan);
data = copy;
}
else
@ -112,35 +99,11 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
total = plan->total1;
length = plan->length1;
#if defined(FFT_SGI)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,&data[offset],1,plan->coeff1);
#elif defined(FFT_SCSL)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,scalef,&data[offset],&data[offset],plan->coeff1,
plan->work1,&isys);
#elif defined(FFT_ACML)
num=total/length;
FFT_1D(&flag,&num,&length,data,plan->coeff1,&info);
#elif defined(FFT_INTEL)
for (offset = 0; offset < total; offset += length)
FFT_1D(&data[offset],&length,&flag,plan->coeff1);
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
if (flag == -1)
DftiComputeForward(plan->handle_fast,data);
else
DftiComputeBackward(plan->handle_fast,data);
#elif defined(FFT_DEC)
if (flag == -1)
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length,&one);
else
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length,&one);
#elif defined(FFT_T3E)
for (offset = 0; offset < total; offset += length)
FFT_1D(&flag,&length,&scalef,&data[offset],&data[offset],plan->coeff1,
plan->work1,&isys);
#elif defined(FFT_FFTW2)
if (flag == -1)
fftw(plan->plan_fast_forward,total/length,data,1,length,NULL,0,0);
@ -166,8 +129,8 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
if (plan->mid1_target == 0) copy = out;
else copy = plan->copy;
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) copy, (FFT_SCALAR *) plan->scratch,
plan->mid1_plan);
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) copy,
(FFT_SCALAR *) plan->scratch, plan->mid1_plan);
data = copy;
// 1d FFTs along mid axis
@ -175,35 +138,11 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
total = plan->total2;
length = plan->length2;
#if defined(FFT_SGI)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,&data[offset],1,plan->coeff2);
#elif defined(FFT_SCSL)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,scalef,&data[offset],&data[offset],plan->coeff2,
plan->work2,&isys);
#elif defined(FFT_ACML)
num=total/length;
FFT_1D(&flag,&num,&length,data,plan->coeff2,&info);
#elif defined(FFT_INTEL)
for (offset = 0; offset < total; offset += length)
FFT_1D(&data[offset],&length,&flag,plan->coeff2);
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
if (flag == -1)
DftiComputeForward(plan->handle_mid,data);
else
DftiComputeBackward(plan->handle_mid,data);
#elif defined(FFT_DEC)
if (flag == -1)
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length,&one);
else
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length,&one);
#elif defined(FFT_T3E)
for (offset = 0; offset < total; offset += length)
FFT_1D(&flag,&length,&scalef,&data[offset],&data[offset],plan->coeff2,
plan->work2,&isys);
#elif defined(FFT_FFTW2)
if (flag == -1)
fftw(plan->plan_mid_forward,total/length,data,1,length,NULL,0,0);
@ -229,8 +168,8 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
if (plan->mid2_target == 0) copy = out;
else copy = plan->copy;
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) copy, (FFT_SCALAR *) plan->scratch,
plan->mid2_plan);
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) copy,
(FFT_SCALAR *) plan->scratch, plan->mid2_plan);
data = copy;
// 1d FFTs along slow axis
@ -238,35 +177,11 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
total = plan->total3;
length = plan->length3;
#if defined(FFT_SGI)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,&data[offset],1,plan->coeff3);
#elif defined(FFT_SCSL)
for (offset = 0; offset < total; offset += length)
FFT_1D(flag,length,scalef,&data[offset],&data[offset],plan->coeff3,
plan->work3,&isys);
#elif defined(FFT_ACML)
num=total/length;
FFT_1D(&flag,&num,&length,data,plan->coeff3,&info);
#elif defined(FFT_INTEL)
for (offset = 0; offset < total; offset += length)
FFT_1D(&data[offset],&length,&flag,plan->coeff3);
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
if (flag == -1)
DftiComputeForward(plan->handle_slow,data);
else
DftiComputeBackward(plan->handle_slow,data);
#elif defined(FFT_DEC)
if (flag == -1)
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length,&one);
else
for (offset = 0; offset < total; offset += length)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length,&one);
#elif defined(FFT_T3E)
for (offset = 0; offset < total; offset += length)
FFT_1D(&flag,&length,&scalef,&data[offset],&data[offset],plan->coeff3,
plan->work3,&isys);
#elif defined(FFT_FFTW2)
if (flag == -1)
fftw(plan->plan_slow_forward,total/length,data,1,length,NULL,0,0);
@ -291,11 +206,10 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
// destination is always out
if (plan->post_plan)
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) out, (FFT_SCALAR *) plan->scratch,
plan->post_plan);
remap_3d((FFT_SCALAR *) data, (FFT_SCALAR *) out,
(FFT_SCALAR *) plan->scratch, plan->post_plan);
// scaling if required
#if !defined(FFT_T3E) && !defined(FFT_ACML)
if (flag == 1 && plan->scaled) {
norm = plan->norm;
num = plan->normnum;
@ -314,25 +228,6 @@ void fft_3d(FFT_DATA *in, FFT_DATA *out, int flag, struct fft_plan_3d *plan)
#endif
}
}
#endif
#ifdef FFT_T3E
if (flag == 1 && plan->scaled) {
norm = plan->norm;
num = plan->normnum;
for (i = 0; i < num; i++) out[i] *= (norm,norm);
}
#endif
#ifdef FFT_ACML
norm = plan->norm;
num = plan->normnum;
for (i = 0; i < num; i++) {
out[i].re *= norm;
out[i].im *= norm;
}
#endif
}
/* ----------------------------------------------------------------------
@ -373,23 +268,6 @@ struct fft_plan_3d *fft_3d_create_plan(
int out_size,first_size,second_size,third_size,copy_size,scratch_size;
int np1,np2,ip1,ip2;
// system specific variables
#ifdef FFT_SCSL
FFT_DATA dummy_d[5];
FFT_PREC dummy_p[5];
int isign,isys;
FFT_PREC scalef;
#endif
#ifdef FFT_INTEL
FFT_DATA dummy;
#endif
#ifdef FFT_T3E
FFT_DATA dummy[5];
int isign,isys;
double scalef;
#endif
// query MPI info
MPI_Comm_rank(comm,&me);
@ -425,8 +303,7 @@ struct fft_plan_3d *fft_3d_create_plan(
first_klo = in_klo;
first_khi = in_khi;
plan->pre_plan = NULL;
}
else {
} else {
first_ilo = 0;
first_ihi = nfast - 1;
first_jlo = ip1*nmid/np1;
@ -460,7 +337,8 @@ struct fft_plan_3d *fft_3d_create_plan(
first_ilo,first_ihi,first_jlo,first_jhi,
first_klo,first_khi,
second_ilo,second_ihi,second_jlo,second_jhi,
second_klo,second_khi,2,1,0,FFT_PRECISION,usecollective);
second_klo,second_khi,2,1,0,FFT_PRECISION,
usecollective);
if (plan->mid1_plan == NULL) return NULL;
// 1d FFTs along mid axis
@ -487,8 +365,7 @@ struct fft_plan_3d *fft_3d_create_plan(
third_jhi = out_jhi;
third_klo = out_klo;
third_khi = out_khi;
}
else {
} else {
third_ilo = ip1*nfast/np1;
third_ihi = (ip1+1)*nfast/np1 - 1;
third_jlo = ip2*nmid/np2;
@ -537,7 +414,8 @@ struct fft_plan_3d *fft_3d_create_plan(
// configure plan memory pointers and allocate work space
// out_size = amount of memory given to FFT by user
// first/second/third_size = amount of memory needed after pre,mid1,mid2 remaps
// first/second/third_size =
// amount of memory needed after pre,mid1,mid2 remaps
// copy_size = amount needed internally for extra copy of data
// scratch_size = amount needed internally for remap scratch space
// for each remap:
@ -605,155 +483,29 @@ struct fft_plan_3d *fft_3d_create_plan(
// system specific pre-computation of 1d FFT coeffs
// and scaling normalization
#if defined(FFT_SGI)
plan->coeff1 = (FFT_DATA *) malloc((nfast+15)*sizeof(FFT_DATA));
plan->coeff2 = (FFT_DATA *) malloc((nmid+15)*sizeof(FFT_DATA));
plan->coeff3 = (FFT_DATA *) malloc((nslow+15)*sizeof(FFT_DATA));
if (plan->coeff1 == NULL || plan->coeff2 == NULL ||
plan->coeff3 == NULL) return NULL;
FFT_1D_INIT(nfast,plan->coeff1);
FFT_1D_INIT(nmid,plan->coeff2);
FFT_1D_INIT(nslow,plan->coeff3);
if (scaled == 0)
plan->scaled = 0;
else {
plan->scaled = 1;
plan->norm = 1.0/(nfast*nmid*nslow);
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
#elif defined(FFT_SCSL)
plan->coeff1 = (FFT_PREC *) malloc((2*nfast+30)*sizeof(FFT_PREC));
plan->coeff2 = (FFT_PREC *) malloc((2*nmid+30)*sizeof(FFT_PREC));
plan->coeff3 = (FFT_PREC *) malloc((2*nslow+30)*sizeof(FFT_PREC));
if (plan->coeff1 == NULL || plan->coeff2 == NULL ||
plan->coeff3 == NULL) return NULL;
plan->work1 = (FFT_PREC *) malloc((2*nfast)*sizeof(FFT_PREC));
plan->work2 = (FFT_PREC *) malloc((2*nmid)*sizeof(FFT_PREC));
plan->work3 = (FFT_PREC *) malloc((2*nslow)*sizeof(FFT_PREC));
if (plan->work1 == NULL || plan->work2 == NULL ||
plan->work3 == NULL) return NULL;
isign = 0;
scalef = 1.0;
isys = 0;
FFT_1D_INIT(isign,nfast,scalef,dummy_d,dummy_d,plan->coeff1,dummy_p,&isys);
FFT_1D_INIT(isign,nmid,scalef,dummy_d,dummy_d,plan->coeff2,dummy_p,&isys);
FFT_1D_INIT(isign,nslow,scalef,dummy_d,dummy_d,plan->coeff3,dummy_p,&isys);
if (scaled == 0)
plan->scaled = 0;
else {
plan->scaled = 1;
plan->norm = 1.0/(nfast*nmid*nslow);
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
#elif defined(FFT_ACML)
plan->coeff1 = (FFT_DATA *) malloc((3*nfast+100)*sizeof(FFT_DATA));
plan->coeff2 = (FFT_DATA *) malloc((3*nmid+100)*sizeof(FFT_DATA));
plan->coeff3 = (FFT_DATA *) malloc((3*nslow+100)*sizeof(FFT_DATA));
if (plan->coeff1 == NULL || plan->coeff2 == NULL ||
plan->coeff3 == NULL) return NULL;
int isign = 100;
int isys = 1;
int info = 0;
FFT_DATA *dummy = NULL;
FFT_1D(&isign,&isys,&nfast,dummy,plan->coeff1,&info);
FFT_1D(&isign,&isys,&nmid,dummy,plan->coeff2,&info);
FFT_1D(&isign,&isys,&nslow,dummy,plan->coeff3,&info);
if (scaled == 0) {
plan->scaled = 0;
plan->norm = sqrt(nfast*nmid*nslow);
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
} else {
plan->scaled = 1;
plan->norm = sqrt(nfast*nmid*nslow);
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
#elif defined(FFT_INTEL)
flag = 0;
int i,num,fftflag;
int list[50];
num = 0;
factor(nfast,&num,list);
for (i = 0; i < num; i++)
if (list[i] != 2 && list[i] != 3 && list[i] != 5) flag = 1;
num = 0;
factor(nmid,&num,list);
for (i = 0; i < num; i++)
if (list[i] != 2 && list[i] != 3 && list[i] != 5) flag = 1;
num = 0;
factor(nslow,&num,list);
for (i = 0; i < num; i++)
if (list[i] != 2 && list[i] != 3 && list[i] != 5) flag = 1;
MPI_Allreduce(&flag,&fftflag,1,MPI_INT,MPI_MAX,comm);
if (fftflag) {
if (me == 0) printf("ERROR: FFTs are not power of 2,3,5\n");
return NULL;
}
plan->coeff1 = (FFT_DATA *) malloc((3*nfast/2+1)*sizeof(FFT_DATA));
plan->coeff2 = (FFT_DATA *) malloc((3*nmid/2+1)*sizeof(FFT_DATA));
plan->coeff3 = (FFT_DATA *) malloc((3*nslow/2+1)*sizeof(FFT_DATA));
if (plan->coeff1 == NULL || plan->coeff2 == NULL ||
plan->coeff3 == NULL) return NULL;
flag = 0;
FFT_1D_INIT(&dummy,&nfast,&flag,plan->coeff1);
FFT_1D_INIT(&dummy,&nmid,&flag,plan->coeff2);
FFT_1D_INIT(&dummy,&nslow,&flag,plan->coeff3);
if (scaled == 0) {
plan->scaled = 1;
plan->norm = nfast*nmid*nslow;
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
else
plan->scaled = 0;
#elif defined(FFT_MKL)
DftiCreateDescriptor( &(plan->handle_fast), FFT_MKL_PREC, DFTI_COMPLEX, 1, (MKL_LONG)nfast);
DftiSetValue(plan->handle_fast, DFTI_NUMBER_OF_TRANSFORMS, (MKL_LONG)plan->total1/nfast);
#if defined(FFT_MKL)
DftiCreateDescriptor( &(plan->handle_fast), FFT_MKL_PREC, DFTI_COMPLEX, 1,
(MKL_LONG)nfast);
DftiSetValue(plan->handle_fast, DFTI_NUMBER_OF_TRANSFORMS,
(MKL_LONG)plan->total1/nfast);
DftiSetValue(plan->handle_fast, DFTI_PLACEMENT,DFTI_INPLACE);
DftiSetValue(plan->handle_fast, DFTI_INPUT_DISTANCE, (MKL_LONG)nfast);
DftiSetValue(plan->handle_fast, DFTI_OUTPUT_DISTANCE, (MKL_LONG)nfast);
DftiCommitDescriptor(plan->handle_fast);
DftiCreateDescriptor( &(plan->handle_mid), FFT_MKL_PREC, DFTI_COMPLEX, 1, (MKL_LONG)nmid);
DftiSetValue(plan->handle_mid, DFTI_NUMBER_OF_TRANSFORMS, (MKL_LONG)plan->total2/nmid);
DftiCreateDescriptor( &(plan->handle_mid), FFT_MKL_PREC, DFTI_COMPLEX, 1,
(MKL_LONG)nmid);
DftiSetValue(plan->handle_mid, DFTI_NUMBER_OF_TRANSFORMS,
(MKL_LONG)plan->total2/nmid);
DftiSetValue(plan->handle_mid, DFTI_PLACEMENT,DFTI_INPLACE);
DftiSetValue(plan->handle_mid, DFTI_INPUT_DISTANCE, (MKL_LONG)nmid);
DftiSetValue(plan->handle_mid, DFTI_OUTPUT_DISTANCE, (MKL_LONG)nmid);
DftiCommitDescriptor(plan->handle_mid);
DftiCreateDescriptor( &(plan->handle_slow), FFT_MKL_PREC, DFTI_COMPLEX, 1, (MKL_LONG)nslow);
DftiSetValue(plan->handle_slow, DFTI_NUMBER_OF_TRANSFORMS, (MKL_LONG)plan->total3/nslow);
DftiCreateDescriptor( &(plan->handle_slow), FFT_MKL_PREC, DFTI_COMPLEX, 1,
(MKL_LONG)nslow);
DftiSetValue(plan->handle_slow, DFTI_NUMBER_OF_TRANSFORMS,
(MKL_LONG)plan->total3/nslow);
DftiSetValue(plan->handle_slow, DFTI_PLACEMENT,DFTI_INPLACE);
DftiSetValue(plan->handle_slow, DFTI_INPUT_DISTANCE, (MKL_LONG)nslow);
DftiSetValue(plan->handle_slow, DFTI_OUTPUT_DISTANCE, (MKL_LONG)nslow);
@ -768,50 +520,6 @@ struct fft_plan_3d *fft_3d_create_plan(
(out_khi-out_klo+1);
}
#elif defined(FFT_DEC)
if (scaled == 0) {
plan->scaled = 1;
plan->norm = nfast*nmid*nslow;
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
else
plan->scaled = 0;
#elif defined(FFT_T3E)
plan->coeff1 = (double *) malloc((12*nfast)*sizeof(double));
plan->coeff2 = (double *) malloc((12*nmid)*sizeof(double));
plan->coeff3 = (double *) malloc((12*nslow)*sizeof(double));
if (plan->coeff1 == NULL || plan->coeff2 == NULL ||
plan->coeff3 == NULL) return NULL;
plan->work1 = (double *) malloc((8*nfast)*sizeof(double));
plan->work2 = (double *) malloc((8*nmid)*sizeof(double));
plan->work3 = (double *) malloc((8*nslow)*sizeof(double));
if (plan->work1 == NULL || plan->work2 == NULL ||
plan->work3 == NULL) return NULL;
isign = 0;
scalef = 1.0;
isys = 0;
FFT_1D_INIT(&isign,&nfast,&scalef,dummy,dummy,plan->coeff1,dummy,&isys);
FFT_1D_INIT(&isign,&nmid,&scalef,dummy,dummy,plan->coeff2,dummy,&isys);
FFT_1D_INIT(&isign,&nslow,&scalef,dummy,dummy,plan->coeff3,dummy,&isys);
if (scaled == 0)
plan->scaled = 0;
else {
plan->scaled = 1;
plan->norm = 1.0/(nfast*nmid*nslow);
plan->normnum = (out_ihi-out_ilo+1) * (out_jhi-out_jlo+1) *
(out_khi-out_klo+1);
}
#elif defined(FFT_FFTW2)
plan->plan_fast_forward =
@ -948,36 +656,10 @@ void fft_3d_destroy_plan(struct fft_plan_3d *plan)
if (plan->copy) free(plan->copy);
if (plan->scratch) free(plan->scratch);
#if defined(FFT_SGI)
free(plan->coeff1);
free(plan->coeff2);
free(plan->coeff3);
#elif defined(FFT_SCSL)
free(plan->coeff1);
free(plan->coeff2);
free(plan->coeff3);
free(plan->work1);
free(plan->work2);
free(plan->work3);
#elif defined(FFT_ACML)
free(plan->coeff1);
free(plan->coeff2);
free(plan->coeff3);
#elif defined(FFT_INTEL)
free(plan->coeff1);
free(plan->coeff2);
free(plan->coeff3);
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
DftiFreeDescriptor(&(plan->handle_fast));
DftiFreeDescriptor(&(plan->handle_mid));
DftiFreeDescriptor(&(plan->handle_slow));
#elif defined(FFT_T3E)
free(plan->coeff1);
free(plan->coeff2);
free(plan->coeff3);
free(plan->work1);
free(plan->work2);
free(plan->work3);
#elif defined(FFT_FFTW2)
if (plan->plan_slow_forward != plan->plan_fast_forward &&
plan->plan_slow_forward != plan->plan_mid_forward) {
@ -1022,38 +704,31 @@ void factor(int n, int *num, int *list)
{
if (n == 1) {
return;
}
else if (n % 2 == 0) {
} else if (n % 2 == 0) {
*list = 2;
(*num)++;
factor(n/2,num,list+1);
}
else if (n % 3 == 0) {
} else if (n % 3 == 0) {
*list = 3;
(*num)++;
factor(n/3,num,list+1);
}
else if (n % 5 == 0) {
} else if (n % 5 == 0) {
*list = 5;
(*num)++;
factor(n/5,num,list+1);
}
else if (n % 7 == 0) {
} else if (n % 7 == 0) {
*list = 7;
(*num)++;
factor(n/7,num,list+1);
}
else if (n % 11 == 0) {
} else if (n % 11 == 0) {
*list = 11;
(*num)++;
factor(n/11,num,list+1);
}
else if (n % 13 == 0) {
} else if (n % 13 == 0) {
*list = 13;
(*num)++;
factor(n/13,num,list+1);
}
else {
} else {
*list = n;
(*num)++;
return;
@ -1099,23 +774,6 @@ void fft_1d_only(FFT_DATA *data, int nsize, int flag, struct fft_plan_3d *plan)
FFT_SCALAR *data_ptr;
#endif
// system specific constants
#ifdef FFT_SCSL
int isys = 0;
FFT_PREC scalef = 1.0;
#endif
#ifdef FFT_DEC
char c = 'C';
char f = 'F';
char b = 'B';
int one = 1;
#endif
#ifdef FFT_T3E
int isys = 0;
double scalef = 1.0;
#endif
// total = size of data needed in each dim
// length = length of 1d FFT in each dim
// total/length = # of 1d FFTs in each dim
@ -1141,39 +799,7 @@ void fft_1d_only(FFT_DATA *data, int nsize, int flag, struct fft_plan_3d *plan)
// perform 1d FFTs in each of 3 dimensions
// data is just an array of 0.0
#ifdef FFT_SGI
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(flag,length1,&data[offset],1,plan->coeff1);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(flag,length2,&data[offset],1,plan->coeff2);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(flag,length3,&data[offset],1,plan->coeff3);
#elif defined(FFT_SCSL)
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(flag,length1,scalef,&data[offset],&data[offset],plan->coeff1,
plan->work1,&isys);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(flag,length2,scalef,&data[offset],&data[offset],plan->coeff2,
plan->work2,&isys);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(flag,length3,scalef,&data[offset],&data[offset],plan->coeff3,
plan->work3,&isys);
#elif defined(FFT_ACML)
int info=0;
num=total1/length1;
FFT_1D(&flag,&num,&length1,data,plan->coeff1,&info);
num=total2/length2;
FFT_1D(&flag,&num,&length2,data,plan->coeff2,&info);
num=total3/length3;
FFT_1D(&flag,&num,&length3,data,plan->coeff3,&info);
#elif defined(FFT_INTEL)
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(&data[offset],&length1,&flag,plan->coeff1);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(&data[offset],&length2,&flag,plan->coeff2);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(&data[offset],&length3,&flag,plan->coeff3);
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
if (flag == -1) {
DftiComputeForward(plan->handle_fast,data);
DftiComputeForward(plan->handle_mid,data);
@ -1183,32 +809,6 @@ void fft_1d_only(FFT_DATA *data, int nsize, int flag, struct fft_plan_3d *plan)
DftiComputeBackward(plan->handle_mid,data);
DftiComputeBackward(plan->handle_slow,data);
}
#elif defined(FFT_DEC)
if (flag == -1) {
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length1,&one);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length2,&one);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(&c,&c,&f,&data[offset],&data[offset],&length3,&one);
} else {
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length1,&one);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length2,&one);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(&c,&c,&b,&data[offset],&data[offset],&length3,&one);
}
#elif defined(FFT_T3E)
for (int offset = 0; offset < total1; offset += length1)
FFT_1D(&flag,&length1,&scalef,&data[offset],&data[offset],plan->coeff1,
plan->work1,&isys);
for (int offset = 0; offset < total2; offset += length2)
FFT_1D(&flag,&length2,&scalef,&data[offset],&data[offset],plan->coeff2,
plan->work2,&isys);
for (int offset = 0; offset < total3; offset += length3)
FFT_1D(&flag,&length3,&scalef,&data[offset],&data[offset],plan->coeff3,
plan->work3,&isys);
#elif defined(FFT_FFTW2)
if (flag == -1) {
fftw(plan->plan_fast_forward,total1/length1,data,1,0,NULL,0,0);
@ -1257,7 +857,6 @@ void fft_1d_only(FFT_DATA *data, int nsize, int flag, struct fft_plan_3d *plan)
// scaling if required
// limit num to size of data
#ifndef FFT_T3E
if (flag == 1 && plan->scaled) {
norm = plan->norm;
num = MIN(plan->normnum,nsize);
@ -1276,13 +875,4 @@ void fft_1d_only(FFT_DATA *data, int nsize, int flag, struct fft_plan_3d *plan)
#endif
}
}
#endif
#ifdef FFT_T3E
if (flag == 1 && plan->scaled) {
norm = plan->norm;
num = MIN(plan->normnum,nsize);
for (i = 0; i < num; i++) data[i] *= (norm,norm);
}
#endif
}

View File

@ -27,7 +27,7 @@ typedef double FFT_SCALAR;
// set default fftw library. switch to FFT_FFTW3 when convenient.
#ifdef FFT_FFTW
#define FFT_FFTW2
#define FFT_FFTW3
#endif
// -------------------------------------------------------------------------
@ -36,73 +36,11 @@ typedef double FFT_SCALAR;
#if FFT_PRECISION == 1
#if defined(FFT_SGI)
#include "fft.h"
typedef complex FFT_DATA;
#define FFT_1D cfft1d
#define FFT_1D_INIT cfft1di
extern "C" {
int cfft1d(int, int, FFT_DATA *, int, FFT_DATA *);
FFT_DATA *cfft1di(int, FFT_DATA *);
}
#elif defined(FFT_SCSL)
#include <scsl_fft.h>
typedef scsl_complex FFT_DATA;
typedef float FFT_PREC;
#define FFT_1D ccfft
#define FFT_1D_INIT ccfft
extern "C" {
int ccfft(int, int, FFT_PREC, FFT_DATA *, FFT_DATA *,
FFT_PREC *, FFT_PREC *, int *);
}
#elif defined(FFT_ACML)
typedef struct {
float re;
float im;
} FFT_DATA;
#define FFT_1D cfft1m_
extern "C" {
void cfft1m_(int *, int *, int *, FFT_DATA *, FFT_DATA *, int *);
}
#elif defined(FFT_INTEL)
typedef struct {
float re;
float im;
} FFT_DATA;
#define FFT_1D cfft1d_
#define FFT_1D_INIT cfft1d_
extern "C" {
void cfft1d_(FFT_DATA *, int *, int *, FFT_DATA *);
}
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
#include "mkl_dfti.h"
typedef float _Complex FFT_DATA;
#define FFT_MKL_PREC DFTI_SINGLE
#elif defined(FFT_DEC)
typedef struct {
float re;
float im;
} FFT_DATA;
#define FFT_1D cfft_
extern "C" {
void cfft_(char *, char *, char *, FFT_DATA *, FFT_DATA *, int *, int *);
}
#elif defined(FFT_T3E)
#include <complex.h>
typedef complex single FFT_DATA;
#define FFT_1D GGFFT
#define FFT_1D_INIT GGFFT
extern "C" {
void GGFFT(int *, int *, double *, FFT_DATA *, FFT_DATA *,
double *, double *, int *);
}
#elif defined(FFT_FFTW2)
#if defined(FFTW_SIZE)
#include "sfftw.h"
@ -138,73 +76,11 @@ typedef struct kiss_fft_state* kiss_fft_cfg;
#elif FFT_PRECISION == 2
#if defined(FFT_SGI)
#include "fft.h"
typedef zomplex FFT_DATA;
#define FFT_1D zfft1d
#define FFT_1D_INIT zfft1di
extern "C" {
int zfft1d(int, int, FFT_DATA *, int, FFT_DATA *);
FFT_DATA *zfft1di(int, FFT_DATA *);
}
#elif defined(FFT_SCSL)
#include <scsl_fft.h>
typedef scsl_zomplex FFT_DATA;
typedef double FFT_PREC;
#define FFT_1D zzfft
#define FFT_1D_INIT zzfft
extern "C" {
int zzfft(int, int, FFT_PREC, FFT_DATA *, FFT_DATA *,
FFT_PREC *, FFT_PREC *, int *);
}
#elif defined(FFT_ACML)
typedef struct {
double re;
double im;
} FFT_DATA;
#define FFT_1D zfft1m_
extern "C" {
void zfft1m_(int *, int *, int *, FFT_DATA *, FFT_DATA *, int *);
}
#elif defined(FFT_INTEL)
typedef struct {
double re;
double im;
} FFT_DATA;
#define FFT_1D zfft1d_
#define FFT_1D_INIT zfft1d_
extern "C" {
void zfft1d_(FFT_DATA *, int *, int *, FFT_DATA *);
}
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
#include "mkl_dfti.h"
typedef double _Complex FFT_DATA;
#define FFT_MKL_PREC DFTI_DOUBLE
#elif defined(FFT_DEC)
typedef struct {
double re;
double im;
} FFT_DATA;
#define FFT_1D zfft_
extern "C" {
void zfft_(char *, char *, char *, FFT_DATA *, FFT_DATA *, int *, int *);
}
#elif defined(FFT_T3E)
#include <complex.h>
typedef complex double FFT_DATA;
#define FFT_1D CCFFT
#define FFT_1D_INIT CCFFT
extern "C" {
void CCFFT(int *, int *, double *, FFT_DATA *, FFT_DATA *,
double *, double *, int *);
}
#elif defined(FFT_FFTW2)
#if defined(FFTW_SIZE)
#include "dfftw.h"
@ -258,36 +134,10 @@ struct fft_plan_3d {
double norm; // normalization factor for rescaling
// system specific 1d FFT info
#if defined(FFT_SGI)
FFT_DATA *coeff1;
FFT_DATA *coeff2;
FFT_DATA *coeff3;
#elif defined(FFT_SCSL)
FFT_PREC *coeff1;
FFT_PREC *coeff2;
FFT_PREC *coeff3;
FFT_PREC *work1;
FFT_PREC *work2;
FFT_PREC *work3;
#elif defined(FFT_ACML)
FFT_DATA *coeff1;
FFT_DATA *coeff2;
FFT_DATA *coeff3;
#elif defined(FFT_INTEL)
FFT_DATA *coeff1;
FFT_DATA *coeff2;
FFT_DATA *coeff3;
#elif defined(FFT_MKL)
#if defined(FFT_MKL)
DFTI_DESCRIPTOR *handle_fast;
DFTI_DESCRIPTOR *handle_mid;
DFTI_DESCRIPTOR *handle_slow;
#elif defined(FFT_T3E)
double *coeff1;
double *coeff2;
double *coeff3;
double *work1;
double *work2;
double *work3;
#elif defined(FFT_FFTW2)
fftw_plan plan_fast_forward;
fftw_plan plan_fast_backward;

View File

@ -56,7 +56,7 @@ MPI_LIB = -lmpich.rts
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW
FFT_INC = -DFFT_FFTW2
FFT_PATH =
FFT_LIB = -lfftw

View File

@ -8,7 +8,7 @@ SHELL = /bin/bash
# do not edit this section
# select which compiler by editing Makefile.bgq.details
include ../MAKE/Makefile.bgq.details
include ../MAKE/MACHINES/Makefile.bgq.details
include Makefile.package.settings
include Makefile.package

View File

@ -101,7 +101,7 @@ MPI_LIB += #/home/jhammond/OSPRI/branches/marpn/wrap/libmpiarbrpn.a
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -I/soft/libraries/alcf/current/xl/FFTW2/include -DFFT_FFTW -DFFTW_SIZE
FFT_INC = -I/soft/libraries/alcf/current/xl/FFTW2/include -DFFT_FFTW2 -DFFTW_SIZE
FFT_PATH = #/soft/libraries/alcf/current/xl/FFTW2
FFT_LIB = -L/soft/libraries/alcf/current/xl/FFTW2/lib -ldfftw

View File

@ -50,7 +50,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
#FFT_INC = -DFFT_FFTW -I${FFTW_INCLUDE}
#FFT_INC = -DFFT_FFTW2 -I${FFTW_INCLUDE}
#FFT_PATH = -L${FFTW_LIB}
#FFT_LIB = -lfftw

View File

@ -50,7 +50,7 @@ MPI_LIB = -lmpich
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I/cygdrive/c/cygwin/usr/local/include
FFT_INC = -DFFT_FFTW2 -I/cygdrive/c/cygwin/usr/local/include
FFT_PATH = -L/cygdrive/c/cygwin/usr/local/lib
FFT_LIB = -lfftw

View File

@ -67,7 +67,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I${FFTW_INCLUDE}
FFT_INC = -DFFT_FFTW2 -I${FFTW_INCLUDE}
FFT_PATH = -L${FFTW_LIB}
FFT_LIB = -lfftw

View File

@ -50,7 +50,7 @@ MPI_LIB = -lmpi_stubs
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW
FFT_INC = -DFFT_FFTW2
FFT_PATH =
FFT_LIB = -lfftw

View File

@ -53,7 +53,7 @@ MPI_LIB =
FFTW = /usr/local
FFT_INC = -DFFT_FFTW -I${FFTW}/include
FFT_INC = -DFFT_FFTW2 -I${FFTW}/include
FFT_PATH = -L${FFTW}/lib
FFT_LIB = -lfftw

View File

@ -51,7 +51,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I/scr/oppe/LAMMPS/fftw-2.1.5/include
FFT_INC = -DFFT_FFTW2 -I/scr/oppe/LAMMPS/fftw-2.1.5/include
FFT_PATH = -L/scr/oppe/LAMMPS/fftw-2.1.5/lib
FFT_LIB = -lfftw

View File

@ -70,7 +70,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
#FFT_INC = -DFFT_FFTW -I${FFTW_INCLUDE}
#FFT_INC = -DFFT_FFTW2 -I${FFTW_INCLUDE}
#FFT_PATH = -L${FFTW_LIB}
#FFT_LIB = -lfftw

View File

@ -51,7 +51,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I/projects/fftw/fftw-2.1.5/include
FFT_INC = -DFFT_FFTW2 -I/projects/fftw/fftw-2.1.5/include
FFT_PATH =
FFT_LIB = -lfftw

View File

@ -53,7 +53,7 @@ MPI_LIB = -lmpich -lpthread
FFTW_INC = ${TACC_FFTW2_INC}
FFTW_LIB = ${TACC_FFTW2_LIB}
FFT_INC = -DFFT_FFTW -I${FFTW_INC}
FFT_INC = -DFFT_FFTW2 -I${FFTW_INC}
FFT_PATH = -L${FFTW_LIB}
FFT_LIB = ${FFTW_LIB}/libfftw.a

View File

@ -51,7 +51,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW
FFT_INC = -DFFT_FFTW2
FFT_PATH =
FFT_LIB = -lfftw

View File

@ -52,7 +52,7 @@ MPI_LIB = -lmpich -lpthread
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I$(FFTW_INC)
FFT_INC = -DFFT_FFTW2 -I$(FFTW_INC)
FFT_PATH = -L$(FFTW_LIB)
FFT_LIB = -ldfftw

View File

@ -51,7 +51,7 @@ MPI_LIB =
# PATH = path for FFT library
# LIB = name of FFT library
FFT_INC = -DFFT_FFTW -I/home/sjplimp/fftw/fftw
FFT_INC = -DFFT_FFTW2 -I/home/sjplimp/fftw/fftw
FFT_PATH = -L/home/sjplimp/fftw/fftw/.libs
FFT_LIB = -lfftw

View File

@ -180,6 +180,7 @@ void DihedralNHarmonic::compute(int eflag, int vflag)
// force & energy
// p = sum (i=1,n) a_i * c**(i-1)
// pd = dp/dc
c_ = c;
p = this->a[type][0];
pd = this->a[type][1];
@ -276,7 +277,8 @@ void DihedralNHarmonic::coeff(int narg, char **arg)
if (narg < 4 ) error->all(FLERR,"Incorrect args for dihedral coefficients");
int n = force->inumeric(FLERR,arg[1]);
if (narg != n + 2 ) error->all(FLERR,"Incorrect args for dihedral coefficients");
if (narg != n + 2)
error->all(FLERR,"Incorrect args for dihedral coefficients");
if (!allocated) allocate();

View File

@ -39,6 +39,8 @@ using namespace MathConst;
#define SMALL 0.00001
enum{OFF,INTER,INTRA};
/* ---------------------------------------------------------------------- */
ComputeGroupGroup::ComputeGroupGroup(LAMMPS *lmp, int narg, char **arg) :
@ -64,6 +66,7 @@ ComputeGroupGroup::ComputeGroupGroup(LAMMPS *lmp, int narg, char **arg) :
pairflag = 1;
kspaceflag = 0;
boundaryflag = 1;
molflag = OFF;
int iarg = 4;
while (iarg < narg) {
@ -88,6 +91,16 @@ ComputeGroupGroup::ComputeGroupGroup(LAMMPS *lmp, int narg, char **arg) :
else if (strcmp(arg[iarg+1],"no") == 0) boundaryflag = 0;
else error->all(FLERR,"Illegal compute group/group command");
iarg += 2;
} else if (strcmp(arg[iarg],"molecule") == 0) {
if (iarg+2 > narg)
error->all(FLERR,"Illegal compute group/group command");
if (strcmp(arg[iarg+1],"off") == 0) molflag = OFF;
else if (strcmp(arg[iarg+1],"inter") == 0) molflag = INTER;
else if (strcmp(arg[iarg+1],"intra") == 0) molflag = INTRA;
else error->all(FLERR,"Illegal compute group/group command");
if (molflag != OFF && atom->molecule_flag == 0)
error->all(FLERR,"Compute group/group molecule requires molecule IDs");
iarg += 2;
} else error->all(FLERR,"Illegal compute group/group command");
}
@ -203,6 +216,7 @@ void ComputeGroupGroup::pair_contribution()
int *ilist,*jlist,*numneigh,**firstneigh;
double **x = atom->x;
tagint *molecule = atom->molecule;
int *type = atom->type;
int *mask = atom->mask;
int nlocal = atom->nlocal;
@ -243,13 +257,27 @@ void ComputeGroupGroup::pair_contribution()
factor_coul = special_coul[sbmask(j)];
j &= NEIGHMASK;
if (!(mask[j] & groupbit || mask[j] & jgroupbit)) continue; // skip if atom J is not in either group
// skip if atom J is not in either group
if (!(mask[j] & groupbit || mask[j] & jgroupbit)) continue;
// skip if atoms I,J are only in the same group
int ij_flag = 0;
int ji_flag = 0;
if (mask[i] & groupbit && mask[j] & jgroupbit) ij_flag = 1;
if (mask[j] & groupbit && mask[i] & jgroupbit) ji_flag = 1;
if (!ij_flag && !ji_flag) continue; // skip if atoms I,J are only in the same group
if (!ij_flag && !ji_flag) continue;
// skip if molecule IDs of atoms I,J do not satisfy molflag setting
if (molflag != OFF) {
if (molflag == INTER) {
if (molecule[i] == molecule[j]) continue;
} else {
if (molecule[i] != molecule[j]) continue;
}
}
delx = xtmp - x[j][0];
dely = ytmp - x[j][1];

View File

@ -38,7 +38,7 @@ class ComputeGroupGroup : public Compute {
int jgroup,jgroupbit,othergroupbit;
double **cutsq;
double e_self,e_correction;
int pairflag,kspaceflag,boundaryflag;
int pairflag,kspaceflag,boundaryflag,molflag;
class Pair *pair;
class NeighList *list;
class KSpace *kspace;

View File

@ -31,9 +31,10 @@ int Fix::instance_total = 0;
/* ---------------------------------------------------------------------- */
Fix::Fix(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp),
id(NULL), style(NULL), extlist(NULL), vector_atom(NULL), array_atom(NULL),
vector_local(NULL), array_local(NULL), eatom(NULL), vatom(NULL)
Fix::Fix(LAMMPS *lmp, int narg, char **arg) :
Pointers(lmp),
id(NULL), style(NULL), extlist(NULL), vector_atom(NULL), array_atom(NULL),
vector_local(NULL), array_local(NULL), eatom(NULL), vatom(NULL)
{
instance_me = instance_total++;

View File

@ -187,7 +187,8 @@ void PairBuck::settings(int narg, char **arg)
void PairBuck::coeff(int narg, char **arg)
{
if (narg < 5 || narg > 6) error->all(FLERR,"Incorrect args for pair coefficients");
if (narg < 5 || narg > 6)
error->all(FLERR,"Incorrect args for pair coefficients");
if (!allocated) allocate();
int ilo,ihi,jlo,jhi;

View File

@ -1 +1 @@
#define LAMMPS_VERSION "26 Jan 2017"
#define LAMMPS_VERSION "13 Feb 2017"

View File

@ -2,7 +2,7 @@
# polarizer.py - add Drude oscillators to LAMMPS data file.
# Agilio Padua <agilio.padua@univ-bpclermont.fr>
# Alain Dequidt <alain.dequidt@univ-bpclermont.fr>
# version 2015/07/17
# version 2017/02/03
import sys
import argparse
@ -560,17 +560,17 @@ class Data(object):
print("# Commands to include in the LAMMPS input script\n")
print("# adapt the pair_style line as needed")
print("pair_style hybrid/overlay ... coul/long {0:.1f} "\
print("# adapt the pair_style command as needed")
print("pair_style hybrid/overlay ... coul/long/cs {0:.1f} "\
"thole {1:.3f} {0:.1f}\n".format(cutoff, thole))
print("read_data {0}\n".format(outfile))
print("# add interactions between any atoms and Drude particles")
print("pair_coeff * {0}* coul/long".format(att['id']))
print("# add interactions between atoms and Drude particles")
print("pair_coeff * {0:3d}* coul/long/cs".format(att['id']))
# Thole parameters for I,J pairs
print("# add Thole screening if more than 1 Drude per molecule")
print("# add Thole damping if more than 1 Drude per molecule")
ifound = False
for atti in self.atomtypes:
itype = atti['type'].split()[0]
@ -596,9 +596,14 @@ class Data(object):
if ifound and jfound:
alphaij = (alphai * alphaj)**0.5
tholeij = (tholei + tholej) / 2.0
print("pair_coeff {0:4} {1:4} thole {2:7.3f} "\
"{3:7.3f}".format(atti['id'], attj['id'],
alphaij, tholeij))
if tholeij == thole:
print("pair_coeff {0:4} {1:4} thole {2:7.3f}".format(
atti['id'], attj['id'], alphaij))
else:
print("pair_coeff {0:4} {1:4} thole {2:7.3f} "\
"{3:7.3f}".format(atti['id'],attj['id'],
alphaij, tholeij))
jfound = False
ifound = False
print("")
@ -627,11 +632,18 @@ class Data(object):
print("# ATTENTION!")
print("# * special_bonds may need 'extra' keyword, LAMMPS will exit "
"with a message")
print("# * give all I<=J pair interactions, no mixing")
print("# * if using fix shake the group-ID must not include "
"Drude particles")
print("# use group ATOMS for example")
"with a message.")
print("# * If using fix shake the group-ID must not include "
"Drude particles.")
print("# Use group ATOMS for example.")
print("# * Give all I<=J pair interactions, no mixing.")
print("# * Pair style coul/long/cs from CORESHELL package is used "\
"for interactions")
print("# of Drude particles. Alternatively pair lj/cut/thole/long "\
"could be used,")
print("# avoiding hybrid/overlay and allowing mixing. See doc "\
"pages.")
# --------------------------------------