changes to imbalance weight factors

Steve Plimpton
2016-10-05 10:33:39 -06:00
parent 11c2892e54
commit c46be7db62
36 changed files with 241 additions and 302 deletions

View File

@ -1,7 +1,7 @@
<!-- HTML_ONLY --> <!-- HTML_ONLY -->
<HEAD> <HEAD>
<TITLE>LAMMPS Users Manual</TITLE> <TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="30 Sep 2016 version"> <META NAME="docnumber" CONTENT="5 Oct 2016 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories"> <META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License."> <META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
</HEAD> </HEAD>
@ -21,7 +21,7 @@
<H1></H1> <H1></H1>
LAMMPS Documentation :c,h3 LAMMPS Documentation :c,h3
30 Sep 2016 version :c,h4 5 Oct 2016 version :c,h4
Version info: :h4 Version info: :h4

View File

@ -319,24 +319,25 @@ accurately would be impractical and slow down the computation.
Instead the {weight} keyword implements several ways to influence the Instead the {weight} keyword implements several ways to influence the
per-particle weights empirically by properties readily available or per-particle weights empirically by properties readily available or
using the user's knowledge of the system. Note that the absolute using the user's knowledge of the system. Note that the absolute
value of the weights are not important; their ratio is what is used to values of the weights are not important; only their relative ratios
assign particles to processors. A particle with a weight of 2.5 is affect which particle is assigned to which processor. A particle with
assumed to require 5x more computational than a particle with a weight a weight of 2.5 is assumed to require 5x more computation than a
of 0.5. particle with a weight of 0.5. For all the options below the weight
assigned to a particle must be a positive value; an error will be
generated if a weight is <= 0.0.
Below is a list of possible weight options with a short description of Below is a list of possible weight options with a short description of
their usage and some example scenarios where they might be applicable. their usage and some example scenarios where they might be applicable.
It is possible to apply multiple weight flags and the weightins they It is possible to apply multiple weight flags and the weightings they
induce will be combined through multiplication. Most of the time, induce will be combined through multiplication. Most of the time,
however, it is sufficient to use just one method. however, it is sufficient to use just one method.
The {group} weight style assigns weight factors to specified The {group} weight style assigns weight factors to specified
"groups"_group.html of particles. The {group} style keyword is "groups"_group.html of particles. The {group} style keyword is
followed by the number of groups, then pairs of group IDs and the followed by the number of groups, then pairs of group IDs and the
corresponding weight factor. If a particle belongs to none of the corresponding weight factor. If a particle belongs to none of the
specified groups, its weight is not changed. If it belongs to specified groups, its weight is not changed. If it belongs to
multiple groups, its weight is the product of the weight factors. multiple groups, its weight is the product of the weight factors.
The weight factors have to be positive.
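The product rule for the {group} style can be sketched as follows. This is a minimal illustration, not the LAMMPS implementation: the function name apply_group_factors and the use of std::vector are assumptions made for the example, and group membership is modeled as a simple bitmask test.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LAMMPS): multiply a particle's weight by
// the factor of every listed group whose bit is set in the particle's mask.
// A particle in none of the groups keeps its weight unchanged.
double apply_group_factors(double weight, int particle_mask,
                           const std::vector<int> &group_bits,
                           const std::vector<double> &factors)
{
  assert(group_bits.size() == factors.size());
  for (std::size_t j = 0; j < group_bits.size(); ++j) {
    assert(factors[j] > 0.0);  // weight factors must be positive
    if (particle_mask & group_bits[j]) weight *= factors[j];
  }
  return weight;
}
```

With group bits 1 and 2 and factors 2.0 and 3.0, a particle in both groups ends up with 6 times its starting weight, while a particle in neither group is left alone.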
This weight style is useful in combination with pair style This weight style is useful in combination with pair style
"hybrid"_pair_hybrid.html, e.g. when combining a more costly manybody "hybrid"_pair_hybrid.html, e.g. when combining a more costly manybody
@ -347,14 +348,24 @@ the computational cost for each group remains constant over time.
This is a purely empirical weighting, so a series of test runs to tune This is a purely empirical weighting, so a series of test runs to tune
the assigned weight factors for optimal performance is recommended. the assigned weight factors for optimal performance is recommended.
The {neigh} weight style assigns a weight to each particle equal to The {neigh} weight style assigns the same weight to each particle
its number of neighbors divided by the avergage number of neighbors owned by a processor based on the total count of neighbors in the
for all particles. The {factor} setting is then appied as an overall neighbor list owned by that processor. The motivation is that more
scale factor to all the {neigh} weights which allows tuning of the neighbors means a higher computational cost. The style does not use
impact of this style. A {factor} smaller than 1.0 (e.g. 0.8) often neighbors per atom to assign a unique weight to each atom, because
results in the best performance, since the number of neighbors is that value can vary depending on how the neighbor list is built.
likely to overestimate the ideal weight. The factor has to be between
0.0 and 2.0. The {factor} setting is applied as an overall scale factor to the
{neigh} weights which allows adjustment of their impact on the
balancing operation. The specified {factor} value must be positive.
A value > 1.0 will increase the weights so that the ratio of max
weight to min weight increases by {factor}. A value < 1.0 will
decrease the weights so that the ratio of max weight to min weight
decreases by {factor}. In both cases the intermediate weight values
increase/decrease proportionally as well. A value = 1.0 has no effect
on the {neigh} weights. As a rule of thumb, we have found a {factor}
of about 0.8 often results in the best performance, since the number
of neighbors is likely to overestimate the ideal weight.
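The effect of {factor} on the max/min weight ratio can be sketched as a linear remapping of each per-processor weight. This is only an illustration under assumptions: the function name rescale_weight is hypothetical, and wtlo and wthi stand for the smallest and largest weights across all processors, which a real run would have to obtain from a global reduction.

```cpp
#include <cassert>

// Hypothetical helper (not part of LAMMPS): keep the lowest weight fixed,
// move the highest weight to wthi*factor, and shift every weight in between
// proportionally, so the max/min ratio changes by factor.
double rescale_weight(double wt, double wtlo, double wthi, double factor)
{
  assert(factor > 0.0);          // the factor must be positive
  if (wtlo == wthi) return wt;   // all weights equal: nothing to rescale
  const double newhi = wthi * factor;
  return wtlo + ((wt - wtlo) / (wthi - wtlo)) * (newhi - wtlo);
}
```

For weights spanning 1.0 to 4.0 and a factor of 2.0, the lowest weight stays at 1.0, the highest becomes 8.0, and a midpoint weight of 2.5 becomes 4.5. A factor of 1.0 returns every weight unchanged.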
This weight style is useful for systems where there are different This weight style is useful for systems where there are different
cutoffs used for different pairs of interactions, or the density cutoffs used for different pairs of interactions, or the density
@ -370,35 +381,48 @@ weights are computed. Inserting a "run 0 post no"_run.html command
before issuing the {balance} command, may be a workaround for this before issuing the {balance} command, may be a workaround for this
case, as it will induce the neighbor list to be built. case, as it will induce the neighbor list to be built.
The {time} weight style uses "timer data"_timer.html to estimate a The {time} weight style uses "timer data"_timer.html to estimate
weight for each particle. It uses the same information as is used for weights. It assigns the same weight to each particle owned by a
the "MPI task timing breakdown"_Section_start.html#start_8, namely, processor based on the total computational time spent by that
the timings for sections {Pair}, {Bond}, {Kspace}, and {Neigh}. The processor. See details below on what time window is used. It uses
time spent in these sections of the timestep are measured for each MPI the same timing information as is used for the "MPI task timing
rank, summed up, then converted into a cost for each MPI rank relative breakdown"_Section_start.html#start_8, namely, for sections {Pair},
to the average cost over all MPI ranks for the same sections. That {Bond}, {Kspace}, and {Neigh}. The time spent in those portions of
cost then evenly distributed over all the particles owned by that the timestep is measured for each MPI rank, summed, then divided by
rank. Finally, the {factor} setting is then appied as an overall the number of particles owned by that processor. I.e. the weight is
scale factor to all the {time} weights as a way to fine tune the an effective CPU time/particle averaged over the particles on that
impact of this weight style. Good {factor} values to use are processor.
typically between 0.5 and 1.2. Allowed are values between 0.0 and 2.0.
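The per-particle {time} weight described above amounts to an average CPU time per owned particle. A minimal sketch, assuming the four section timings for one MPI rank are already available as plain numbers (the function name time_weight and its parameters are hypothetical, not the LAMMPS timer interface):

```cpp
#include <cassert>

// Hypothetical helper (not part of LAMMPS): sum the time spent in the Pair,
// Bond, Kspace and Neigh sections on one rank and divide by the number of
// particles that rank owns. A rank with no particles gets no weight.
double time_weight(double pair, double bond, double kspace, double neigh,
                   int nlocal)
{
  assert(nlocal >= 0);
  if (nlocal == 0) return 0.0;
  return (pair + bond + kspace + neigh) / nlocal;
}
```

Every particle on the same rank receives the same value, which is why this style cannot distinguish cheap particles from costly ones within one processor.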
For the {balance} command the timing data is taken from the preceding The {factor} setting is applied as an overall scale factor to the
run command, i.e. the timings are for the entire previous run. For {time} weights which allows adjustment of their impact on the
the {fix balance} command the timing data is for only the timesteps balancing operation. The specified {factor} value must be positive.
since the last balancing operation was performed. If timing A value > 1.0 will increase the weights so that the ratio of max
information for the required sections is not available, e.g. at the weight to min weight increases by {factor}. A value < 1.0 will
beginning of a run, or when the "timer"_timer.html command is set to decrease the weights so that the ratio of max weight to min weight
either {loop} or {off}, a warning is issued. In this case no weights decreases by {factor}. In both cases the intermediate weight values
are computed. increase/decrease proportionally as well. A value = 1.0 has no effect
on the {time} weights. As a rule of thumb, effective values to use
are typically between 0.5 and 1.2. Note that the timer quantities
mentioned above can be affected by communication which occurs in the
middle of the operations, e.g. pair styles with intermediate exchange
of data within the force computation, and likewise for KSpace solves.
This weight style is the most generic one, and should be tried first, When using the {time} weight style with the {balance} command, the
if neither the {group} or {neigh} styles are easily applicable. timing data is taken from the preceding run command, i.e. the timings
However, since the computed cost function is averaged over all local are for the entire previous run. For the {fix balance} command the
particles this weight style may not be highly accurate. This style timing data is for only the timesteps since the last balancing
can also be effective as a secondary weight in combination with either operation was performed. If timing information for the required
{group} or {neigh} to offset some of inaccuracies in either of those sections is not available, e.g. at the beginning of a run, or when the
heuristics. "timer"_timer.html command is set to either {loop} or {off}, a warning
is issued. In this case no weights are computed.
NOTE: The {time} weight style is the most generic option, and should
be tried first, unless the {group} style is easily applicable.
However, since the computed cost function is averaged over all
particles on a processor, the weights may not be highly accurate.
This style can also be effective as a secondary weight in combination
with either {group} or {neigh} to offset some of the inaccuracies in
either of those heuristics.
The {var} weight style assigns per-particle weights by evaluating an The {var} weight style assigns per-particle weights by evaluating an
"atom-style variable"_variable.html specified by {name}. This is "atom-style variable"_variable.html specified by {name}. This is

View File

@ -49,8 +49,8 @@ keyword = {append} or {buffer} or {element} or {every} or {fileper} or {first} o
-N = sort per-atom lines in descending order by the Nth column -N = sort per-atom lines in descending order by the Nth column
{thresh} args = attribute operation value {thresh} args = attribute operation value
attribute = same attributes (x,fy,etotal,sxx,etc) used by dump custom style attribute = same attributes (x,fy,etotal,sxx,etc) used by dump custom style
operation = "<" or "<=" or ">" or ">=" or "==" or "!=" operation = "<" or "<=" or ">" or ">=" or "==" or "!=" or "|^"
value = numeric value to compare to value = numeric value to compare to, or LAST
these 3 args can be replaced by the word "none" to turn off thresholding these 3 args can be replaced by the word "none" to turn off thresholding
{unwrap} arg = {yes} or {no} :pre {unwrap} arg = {yes} or {no} :pre
these keywords apply only to the {image} and {movie} "styles"_dump_image.html :l these keywords apply only to the {image} and {movie} "styles"_dump_image.html :l
@ -458,16 +458,59 @@ as well as memory, versus unsorted output.
The {thresh} keyword only applies to the dump {custom}, {cfg}, The {thresh} keyword only applies to the dump {custom}, {cfg},
{image}, and {movie} styles. Multiple thresholds can be specified. {image}, and {movie} styles. Multiple thresholds can be specified.
Specifying "none" turns off all threshold criteria. If thresholds are Specifying {none} turns off all threshold criteria. If thresholds are
specified, only atoms whose attributes meet all the threshold criteria specified, only atoms whose attributes meet all the threshold criteria
are written to the dump file or included in the image. The possible are written to the dump file or included in the image. The possible
attributes that can be tested for are the same as those that can be attributes that can be tested for are the same as those that can be
specified in the "dump custom"_dump.html command, with the exception specified in the "dump custom"_dump.html command, with the exception
of the {element} attribute, since it is not a numeric value. Note of the {element} attribute, since it is not a numeric value. Note
that different attributes can be output by the dump custom command that different attributes can be used than those output by the "dump
than are used as threshold criteria by the dump_modify command. custom"_dump.html command. E.g. you can output the coordinates and
E.g. you can output the coordinates and stress of atoms whose energy stress of atoms whose energy is above some threshold.
is above some threshold.
If an atom-style variable is used as the attribute, then it can
produce continuous numeric values or effective Boolean 0/1 values
which may be useful for the comparison operation. Boolean values can
be generated by variable formulas that use comparison or Boolean math
operators or special functions like gmask() and rmask() and grmask().
See the "variable"_variable.html command doc page for details.
NOTE: The LAST option, discussed below, is not yet implemented. It
will be soon.
The specified value must be a simple numeric value or the word LAST.
If LAST is used, it refers to the value of the attribute the last time
the dump command was invoked to produce a snapshot. This is a way to
only dump atoms whose attribute has changed (or not changed).
Three examples follow.
dump_modify ... thresh ix != LAST :pre
This will dump atoms which have crossed the periodic x boundary of the
simulation box since the last dump. (Note that atoms that crossed
once and then crossed back between the two dump timesteps would not be
included.)
region foo sphere 10 20 10 15
variable inregion atom rmask(foo)
dump_modify ... thresh v_inregion |^ LAST
This will dump atoms which crossed the boundary of the spherical
region since the last dump.
variable charge atom "(q > 0.5) || (q < -0.5)"
dump_modify ... thresh v_charge |^ LAST
This will dump atoms whose charge has changed from an absolute value
less than 1/2 to greater than 1/2 (or vice versa) since the last dump.
E.g. due to reactions and subsequent charge equilibration in a
reactive force field.
The available operations are the usual comparison operators. The XOR
operation (exclusive or) is also included as "|^". In this context,
XOR means that if either the attribute or value is 0.0 and the other
is non-zero, then the result is "true" and the threshold criterion is
met. Otherwise it is not met.
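The XOR criterion described above can be sketched as a small predicate. This is an illustration only: the function name thresh_xor is hypothetical, and it returns true when the threshold is met, whereas the dump code itself works by deselecting atoms that fail the test.

```cpp
// Hypothetical helper (not part of LAMMPS): the criterion is met when
// exactly one of the attribute and the value is 0.0.
bool thresh_xor(double attribute, double value)
{
  return (attribute == 0.0) != (value == 0.0);
}
```

An attribute of 0.0 compared with a value of 1.0 meets the criterion, as does 2.5 compared with 0.0. Two zeros or two non-zero numbers do not.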
:line :line

View File

@ -11,13 +11,19 @@ velocity all create 1.44 87287 loop geom
pair_style body 5.0 pair_style body 5.0
pair_coeff * * 1.0 1.0 pair_coeff * * 1.0 1.0
neighbor 0.3 bin neighbor 0.5 bin
neigh_modify every 1 delay 0 check yes
fix 1 all nve/body fix 1 all nve/body
#fix 1 all nvt/body temp 1.44 1.44 1.0
fix 2 all enforce2d fix 2 all enforce2d
#compute 1 all body/local type 1 2 3 #compute 1 all body/local type 1 2 3
#dump 1 all local 100 dump.body index c_1[1] c_1[2] c_1[3] c_1[4] #dump 1 all local 100 dump.body index c_1[1] c_1[2] c_1[3] c_1[4]
thermo 500 #dump 2 all image 1000 image.*.jpg type type &
# zoom 1.6 adiam 1.5 body type 1.0 0
#dump_modify 2 pad 5
thermo 100
run 10000 run 10000

View File

@ -40,9 +40,6 @@ Angle::Angle(LAMMPS *lmp) : Pointers(lmp)
vatom = NULL; vatom = NULL;
setflag = NULL; setflag = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

View File

@ -29,10 +29,9 @@ class Angle : protected Pointers {
double energy; // accumulated energies double energy; // accumulated energies
double virial[6]; // accumulated virial double virial[6]; // accumulated virial
double *eatom,**vatom; // accumulated per-atom energy/virial double *eatom,**vatom; // accumulated per-atom energy/virial
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
int copymode; int copymode;
@ -51,9 +50,6 @@ class Angle : protected Pointers {
virtual double single(int, int, int, int) = 0; virtual double single(int, int, int, int) = 0;
virtual double memory_usage(); virtual double memory_usage();
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
protected: protected:
int suffix_flag; // suffix compatibility flag int suffix_flag; // suffix compatibility flag

View File

@ -208,9 +208,6 @@ Atom::Atom(LAMMPS *lmp) : Pointers(lmp)
atom_style = NULL; atom_style = NULL;
avec = NULL; avec = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
avec_map = new AtomVecCreatorMap(); avec_map = new AtomVecCreatorMap();
#define ATOM_CLASS #define ATOM_CLASS

View File

@ -124,11 +124,6 @@ class Atom : protected Pointers {
char **iname,**dname; char **iname,**dname;
int nivector,ndvector; int nivector,ndvector;
// used by USER-CUDA to flag used per-atom arrays
unsigned int datamask;
unsigned int datamask_ext;
// atom style and per-atom array existence flags // atom style and per-atom array existence flags
// customize by adding new flag // customize by adding new flag

View File

@ -156,10 +156,6 @@ E: Invalid atom_style command
Self-explanatory. Self-explanatory.
E: USER-CUDA package requires a cuda enabled atom_style
Self-explanatory.
E: KOKKOS package requires a kokkos enabled atom_style E: KOKKOS package requires a kokkos enabled atom_style
Self-explanatory. Self-explanatory.

View File

@ -44,9 +44,6 @@ Bond::Bond(LAMMPS *lmp) : Pointers(lmp)
vatom = NULL; vatom = NULL;
setflag = NULL; setflag = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

View File

@ -29,10 +29,9 @@ class Bond : protected Pointers {
double energy; // accumulated energies double energy; // accumulated energies
double virial[6]; // accumulated virial double virial[6]; // accumulated virial
double *eatom,**vatom; // accumulated per-atom energy/virial double *eatom,**vatom; // accumulated per-atom energy/virial
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
int copymode; int copymode;
@ -51,9 +50,6 @@ class Bond : protected Pointers {
virtual double single(int, double, int, int, double &) = 0; virtual double single(int, double, int, int, double &) = 0;
virtual double memory_usage(); virtual double memory_usage();
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
void write_file(int, char**); void write_file(int, char**);
protected: protected:

View File

@ -155,10 +155,6 @@ class CommTiled : public Comm {
/* ERROR/WARNING messages: /* ERROR/WARNING messages:
E: USER-CUDA package does not yet support comm_style tiled
Self-explanatory.
E: KOKKOS package does not yet support comm_style tiled E: KOKKOS package does not yet support comm_style tiled
Self-explanatory. Self-explanatory.

View File

@ -99,9 +99,6 @@ Compute::Compute(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp),
// data masks // data masks
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

View File

@ -84,9 +84,6 @@ class Compute : protected Pointers {
int comm_reverse; // size of reverse communication (0 if none) int comm_reverse; // size of reverse communication (0 if none)
int dynamic_group_allow; // 1 if can be used with dynamic group, else 0 int dynamic_group_allow; // 1 if can be used with dynamic group, else 0
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
@ -140,9 +137,6 @@ class Compute : protected Pointers {
double, double, double, double, double, double,
double, double, double) {} double, double, double) {}
virtual int unsigned data_mask() {return datamask;}
virtual int unsigned data_mask_ext() {return datamask_ext;}
protected: protected:
int instance_me; // which Compute class instantiation I am int instance_me; // which Compute class instantiation I am

View File

@ -41,9 +41,6 @@ Dihedral::Dihedral(LAMMPS *lmp) : Pointers(lmp)
vatom = NULL; vatom = NULL;
setflag = NULL; setflag = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

View File

@ -29,10 +29,9 @@ class Dihedral : protected Pointers {
double energy; // accumulated energy double energy; // accumulated energy
double virial[6]; // accumulated virial double virial[6]; // accumulated virial
double *eatom,**vatom; // accumulated per-atom energy/virial double *eatom,**vatom; // accumulated per-atom energy/virial
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
int copymode; int copymode;
@ -49,9 +48,6 @@ class Dihedral : protected Pointers {
virtual void write_data(FILE *) {} virtual void write_data(FILE *) {}
virtual double memory_usage(); virtual double memory_usage();
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
protected: protected:
int suffix_flag; // suffix compatibility flag int suffix_flag; // suffix compatibility flag

View File

@ -43,7 +43,7 @@ enum{ID,MOL,PROC,PROCP1,TYPE,ELEMENT,MASS,
OMEGAX,OMEGAY,OMEGAZ,ANGMOMX,ANGMOMY,ANGMOMZ, OMEGAX,OMEGAY,OMEGAZ,ANGMOMX,ANGMOMY,ANGMOMZ,
TQX,TQY,TQZ, TQX,TQY,TQZ,
COMPUTE,FIX,VARIABLE,INAME,DNAME}; COMPUTE,FIX,VARIABLE,INAME,DNAME};
enum{LT,LE,GT,GE,EQ,NEQ}; enum{LT,LE,GT,GE,EQ,NEQ,XOR};
enum{INT,DOUBLE,STRING,BIGINT}; // same as in DumpCFG enum{INT,DOUBLE,STRING,BIGINT}; // same as in DumpCFG
#define INVOKED_PERATOM 8 #define INVOKED_PERATOM 8
@ -947,6 +947,11 @@ int DumpCustom::count()
} else if (thresh_op[ithresh] == NEQ) { } else if (thresh_op[ithresh] == NEQ) {
for (i = 0; i < nlocal; i++, ptr += nstride) for (i = 0; i < nlocal; i++, ptr += nstride)
if (choose[i] && *ptr == value) choose[i] = 0; if (choose[i] && *ptr == value) choose[i] = 0;
} else if (thresh_op[ithresh] == XOR) {
for (i = 0; i < nlocal; i++, ptr += nstride)
if (choose[i] && ((*ptr == 0.0 && value == 0.0) ||
(*ptr != 0.0 && value != 0.0)))
choose[i] = 0;
} }
} }
} }
@ -1835,6 +1840,7 @@ int DumpCustom::modify_param(int narg, char **arg)
else if (strcmp(arg[2],">=") == 0) thresh_op[nthresh] = GE; else if (strcmp(arg[2],">=") == 0) thresh_op[nthresh] = GE;
else if (strcmp(arg[2],"==") == 0) thresh_op[nthresh] = EQ; else if (strcmp(arg[2],"==") == 0) thresh_op[nthresh] = EQ;
else if (strcmp(arg[2],"!=") == 0) thresh_op[nthresh] = NEQ; else if (strcmp(arg[2],"!=") == 0) thresh_op[nthresh] = NEQ;
else if (strcmp(arg[2],"|^") == 0) thresh_op[nthresh] = XOR;
else error->all(FLERR,"Invalid dump_modify threshold operator"); else error->all(FLERR,"Invalid dump_modify threshold operator");
// set threshold value // set threshold value

View File

@ -95,10 +95,7 @@ id(NULL), style(NULL), eatom(NULL), vatom(NULL)
maxeatom = maxvatom = 0; maxeatom = maxvatom = 0;
vflag_atom = 0; vflag_atom = 0;
// CUDA and KOKKOS per-fix data masks // KOKKOS per-fix data masks
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;

View File

@ -99,11 +99,6 @@ class Fix : protected Pointers {
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
// USER-CUDA per-fix data masks
unsigned int datamask;
unsigned int datamask_ext;
Fix(class LAMMPS *, int, char **); Fix(class LAMMPS *, int, char **);
virtual ~Fix(); virtual ~Fix();
void modify_params(int, char **); void modify_params(int, char **);
@ -211,9 +206,6 @@ class Fix : protected Pointers {
virtual double memory_usage() {return 0.0;} virtual double memory_usage() {return 0.0;}
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
protected: protected:
int instance_me; // which Fix class instantiation I am int instance_me; // which Fix class instantiation I am

View File

@ -18,12 +18,11 @@
#include "error.h" #include "error.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
#define SMALL 0.001
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
ImbalanceGroup::ImbalanceGroup(LAMMPS *lmp) : Imbalance(lmp), ImbalanceGroup::ImbalanceGroup(LAMMPS *lmp) : Imbalance(lmp), id(0), factor(0)
id(0), factor(0), num(0) {} {}
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
@ -50,7 +49,7 @@ int ImbalanceGroup::options(int narg, char **arg)
if (id[i] < 0) if (id[i] < 0)
error->all(FLERR,"Unknown group in balance weight command"); error->all(FLERR,"Unknown group in balance weight command");
factor[i] = force->numeric(FLERR,arg[2*i+2]); factor[i] = force->numeric(FLERR,arg[2*i+2]);
if (factor[i] < 0.0) error->all(FLERR,"Illegal balance weight command"); if (factor[i] <= 0.0) error->all(FLERR,"Illegal balance weight command");
} }
return 2*num+1; return 2*num+1;
} }
@ -67,13 +66,10 @@ void ImbalanceGroup::compute(double *weight)
for (int i = 0; i < nlocal; ++i) { for (int i = 0; i < nlocal; ++i) {
const int imask = mask[i]; const int imask = mask[i];
double iweight = weight[i];
for (int j = 0; j < num; ++j) { for (int j = 0; j < num; ++j) {
if (imask & bitmask[id[j]]) if (imask & bitmask[id[j]])
iweight *= factor[j]; weight[i] *= factor[j];
} }
if (iweight < SMALL) weight[i] = SMALL;
else weight[i] = iweight;
} }
} }

View File

@ -22,14 +22,14 @@
#include "error.h" #include "error.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
#define SMALL 0.001
#define BIG 1.0e20
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
ImbalanceNeigh::ImbalanceNeigh(LAMMPS *lmp) : Imbalance(lmp) ImbalanceNeigh::ImbalanceNeigh(LAMMPS *lmp) : Imbalance(lmp)
{ {
did_warn = 0; did_warn = 0;
factor = 1.0;
} }
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
@ -38,8 +38,7 @@ int ImbalanceNeigh::options(int narg, char **arg)
{ {
if (narg < 1) error->all(FLERR,"Illegal balance weight command"); if (narg < 1) error->all(FLERR,"Illegal balance weight command");
factor = force->numeric(FLERR,arg[0]); factor = force->numeric(FLERR,arg[0]);
if ((factor < 0.0) || (factor > 2.0)) if (factor <= 0.0) error->all(FLERR,"Illegal balance weight command");
error->all(FLERR,"Illegal balance weight command");
return 1; return 1;
} }
@ -52,7 +51,7 @@ void ImbalanceNeigh::compute(double *weight)
if (factor == 0.0) return; if (factor == 0.0) return;
// find suitable neighbor list // find suitable neighbor list
// we can only make use of certain (conventional) neighbor lists // can only use certain conventional neighbor lists
for (req = 0; req < neighbor->old_nrequest; ++req) { for (req = 0; req < neighbor->old_nrequest; ++req) {
if ((neighbor->old_requests[req]->half || if ((neighbor->old_requests[req]->half ||
@ -65,37 +64,46 @@ void ImbalanceNeigh::compute(double *weight)
if (req >= neighbor->old_nrequest || neighbor->ago < 0) { if (req >= neighbor->old_nrequest || neighbor->ago < 0) {
if (comm->me == 0 && !did_warn) if (comm->me == 0 && !did_warn)
error->warning(FLERR,"No suitable neighbor list found. " error->warning(FLERR,"Balance weight neigh skipped b/c no list found");
"Neighbor weighted balancing skipped");
did_warn = 1; did_warn = 1;
return; return;
} }
// neighsum = total neigh count for atoms on this proc
// localwt = weight assigned to each owned atom
NeighList *list = neighbor->lists[req]; NeighList *list = neighbor->lists[req];
bigint neighsum = 0;
const int inum = list->inum; const int inum = list->inum;
const int * const ilist = list->ilist; const int * const ilist = list->ilist;
const int * const numneigh = list->numneigh; const int * const numneigh = list->numneigh;
int nlocal = atom->nlocal;
// first pass: get local number of neighbors bigint neighsum = 0;
for (int i = 0; i < inum; ++i) neighsum += numneigh[ilist[i]]; for (int i = 0; i < inum; ++i) neighsum += numneigh[ilist[i]];
double localwt = 0.0;
if (nlocal) localwt = 1.0*neighsum/nlocal;
double allatoms = static_cast <double>(atom->natoms); if (nlocal && localwt <= 0.0) error->one(FLERR,"Balance weight <= 0.0");
if (allatoms == 0.0) allatoms = 1.0;
double allavg;
double myavg = static_cast<double>(neighsum)/allatoms;
MPI_Allreduce(&myavg,&allavg,1,MPI_DOUBLE,MPI_SUM,world);
// second pass: compute and apply weights
double scale = 1.0/allavg; // apply factor if specified != 1.0
for (int ii = 0; ii < inum; ++ii) { // wtlo,wthi = lo/hi values excluding 0.0 due to no atoms on this proc
const int i = ilist[ii]; // lo value does not change
weight[i] *= (1.0-factor) + factor*scale*numneigh[i]; // newhi = new hi value to give hi/lo ratio factor times larger/smaller
if (weight[i] < SMALL) weight[i] = SMALL; // expand/contract all localwt values from lo->hi to lo->newhi
if (factor != 1.0) {
double wtlo,wthi;
if (localwt == 0.0) localwt = BIG;
MPI_Allreduce(&localwt,&wtlo,1,MPI_DOUBLE,MPI_MIN,world);
if (localwt == BIG) localwt = 0.0;
MPI_Allreduce(&localwt,&wthi,1,MPI_DOUBLE,MPI_MAX,world);
if (wtlo == wthi) return;
double newhi = wthi*factor;
localwt = wtlo + ((localwt-wtlo)/(wthi-wtlo)) * (newhi-wtlo);
} }
for (int i = 0; i < nlocal; i++) weight[i] *= localwt;
} }
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */

View File

@ -19,15 +19,16 @@
#include "timer.h" #include "timer.h"
#include "error.h" #include "error.h"
// DEBUG
#include "update.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
#define SMALL 0.001
#define BIG 1.0e20
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
ImbalanceTime::ImbalanceTime(LAMMPS *lmp) : Imbalance(lmp) ImbalanceTime::ImbalanceTime(LAMMPS *lmp) : Imbalance(lmp) {}
{
factor = 1.0;
}
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
@ -35,8 +36,7 @@ int ImbalanceTime::options(int narg, char **arg)
{ {
if (narg < 1) error->all(FLERR,"Illegal balance weight command"); if (narg < 1) error->all(FLERR,"Illegal balance weight command");
factor = force->numeric(FLERR,arg[0]); factor = force->numeric(FLERR,arg[0]);
if ((factor < 0.0) || (factor > 2.0)) if (factor <= 0.0) error->all(FLERR,"Illegal balance weight command");
error->all(FLERR,"Illegal balance weight command");
return 1; return 1;
} }
@ -53,37 +53,60 @@ void ImbalanceTime::init()
void ImbalanceTime::compute(double *weight) void ImbalanceTime::compute(double *weight)
{ {
const int nlocal = atom->nlocal; if (!timer->has_normal()) return;
const bigint natoms = atom->natoms;
if (factor == 0.0) return; // cost = CPU time for relevant timers since last invocation
// localwt = weight assigned to each owned atom
// just return if no time yet tallied
// compute the cost function of based on relevant timers double cost = -last;
cost += timer->get_wall(Timer::PAIR);
if (timer->has_normal()) { cost += timer->get_wall(Timer::NEIGH);
double cost = -last; cost += timer->get_wall(Timer::BOND);
cost += timer->get_wall(Timer::PAIR); cost += timer->get_wall(Timer::KSPACE);
cost += timer->get_wall(Timer::NEIGH);
cost += timer->get_wall(Timer::BOND);
cost += timer->get_wall(Timer::KSPACE);
double allcost; /*
MPI_Allreduce(&cost,&allcost,1,MPI_DOUBLE,MPI_SUM,world); printf("TIME %ld %d %g %g: %g %g %g %g\n",
update->ntimestep,atom->nlocal,last,cost,
timer->get_wall(Timer::PAIR),
timer->get_wall(Timer::NEIGH),
timer->get_wall(Timer::BOND),
timer->get_wall(Timer::KSPACE));
*/
if ((allcost > 0.0) && (nlocal > 0)) { double maxcost;
const double avgcost = allcost/natoms; MPI_Allreduce(&cost,&maxcost,1,MPI_DOUBLE,MPI_MAX,world);
const double localcost = cost/nlocal; if (maxcost <= 0.0) return;
const double scale = (1.0-factor) + factor*localcost/avgcost;
for (int i = 0; i < nlocal; ++i) {
weight[i] *= scale;
if (weight[i] < SMALL) weight[i] = SMALL;
}
}
// record time up to this point int nlocal = atom->nlocal;
double localwt = 0.0;
if (nlocal) localwt = cost/nlocal;
last += cost; if (nlocal && localwt <= 0.0) error->one(FLERR,"Balance weight <= 0.0");
// apply factor if specified != 1.0
// wtlo,wthi = lo/hi values excluding 0.0 due to no atoms on this proc
// lo value does not change
// newhi = new hi value to give hi/lo ratio factor times larger/smaller
// expand/contract all localwt values from lo->hi to lo->newhi
if (factor != 1.0) {
double wtlo,wthi;
if (localwt == 0.0) localwt = BIG;
MPI_Allreduce(&localwt,&wtlo,1,MPI_DOUBLE,MPI_MIN,world);
if (localwt == BIG) localwt = 0.0;
MPI_Allreduce(&localwt,&wthi,1,MPI_DOUBLE,MPI_MAX,world);
if (wtlo == wthi) return;
double newhi = wthi*factor;
localwt = wtlo + ((localwt-wtlo)/(wthi-wtlo)) * (newhi-wtlo);
} }
for (int i = 0; i < nlocal; i++) weight[i] *= localwt;
// record time up to this point
last += cost;
} }
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
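
Reviewer sketch (not part of the commit): the `factor != 1.0` branch in the hunk above keeps the lowest per-processor weight fixed and linearly maps the range lo->hi onto lo->newhi, with newhi = hi*factor. The stand-alone function below illustrates that mapping under stated assumptions. `rescale_weights` is a hypothetical helper name. A plain vector stands in for the values that the two `MPI_Allreduce` calls reduce across processors. Leaving empty processors at 0.0 is a simplification; in the commit their value is irrelevant because they own no atoms.

```cpp
#include <algorithm>
#include <vector>

// Illustrative sketch only: rescale_weights is a hypothetical helper,
// not a LAMMPS function. Each entry stands in for one processor's localwt;
// 0.0 marks a processor that owns no atoms.
std::vector<double> rescale_weights(const std::vector<double> &localwt,
                                    double factor)
{
  std::vector<double> out(localwt);
  if (factor == 1.0) return out;          // nothing to rescale

  const double BIG = 1.0e20;
  double wtlo = BIG, wthi = 0.0;
  for (double w : localwt) {
    if (w > 0.0) wtlo = std::min(wtlo, w); // min over non-empty processors
    wthi = std::max(wthi, w);              // max over all processors
  }
  // no spread to rescale (or no atoms anywhere): leave weights unchanged
  if (wthi <= 0.0 || wtlo == wthi) return out;

  const double newhi = wthi * factor;      // lo is fixed, hi moves to hi*factor
  for (double &w : out) {
    if (w == 0.0) continue;                // assumption: skip empty processors
    w = wtlo + ((w - wtlo) / (wthi - wtlo)) * (newhi - wtlo);
  }
  return out;
}
```

For example, weights 1.0, 2.0, 3.0 with factor 2.0 give lo = 1.0, hi = 3.0 and newhi = 6.0, so the rescaled weights are 1.0, 3.5 and 6.0.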

@ -24,11 +24,10 @@
#include "update.h" #include "update.h"
using namespace LAMMPS_NS; using namespace LAMMPS_NS;
#define SMALL 0.001
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
ImbalanceVar::ImbalanceVar(LAMMPS *lmp) : Imbalance(lmp), name(0), id(0) {} ImbalanceVar::ImbalanceVar(LAMMPS *lmp) : Imbalance(lmp), name(0) {}
/* -------------------------------------------------------------------- */ /* -------------------------------------------------------------------- */
@ -76,10 +75,15 @@ void ImbalanceVar::compute(double *weight)
memory->create(values,nlocal,"imbalance:values"); memory->create(values,nlocal,"imbalance:values");
input->variable->compute_atom(id,all,values,1,0); input->variable->compute_atom(id,all,values,1,0);
for (int i = 0; i < nlocal; ++i) {
weight[i] *= values[i]; int flag = 0;
if (weight[i] < SMALL) weight[i] = SMALL; for (int i = 0; i < nlocal; i++)
} if (values[i] <= 0.0) flag = 1;
int flagall;
MPI_Allreduce(&flag,&flagall,1,MPI_INT,MPI_SUM,world);
if (flagall) error->one(FLERR,"Balance weight <= 0.0");
for (int i = 0; i < nlocal; i++) weight[i] *= values[i];
memory->destroy(values); memory->destroy(values);
} }

@ -38,9 +38,6 @@ Improper::Improper(LAMMPS *lmp) : Pointers(lmp)
vatom = NULL; vatom = NULL;
setflag = NULL; setflag = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

@ -29,10 +29,9 @@ class Improper : protected Pointers {
double energy; // accumulated energies double energy; // accumulated energies
double virial[6]; // accumlated virial double virial[6]; // accumlated virial
double *eatom,**vatom; // accumulated per-atom energy/virial double *eatom,**vatom; // accumulated per-atom energy/virial
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
int copymode; int copymode;
@ -49,9 +48,6 @@ class Improper : protected Pointers {
virtual void write_data(FILE *) {} virtual void write_data(FILE *) {}
virtual double memory_usage(); virtual double memory_usage();
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
protected: protected:
int suffix_flag; // suffix compatibility flag int suffix_flag; // suffix compatibility flag

@ -328,12 +328,6 @@ E: Package command after simulation box is defined
The package command cannot be used afer a read_data, read_restart, or The package command cannot be used afer a read_data, read_restart, or
create_box command. create_box command.
E: Package cuda command without USER-CUDA package enabled
The USER-CUDA package must be installed via "make yes-user-cuda"
before LAMMPS is built, and the "-c on" must be used to enable the
package.
E: Package gpu command without GPU package installed E: Package gpu command without GPU package installed
The GPU package must be installed via "make yes-gpu" before LAMMPS is The GPU package must be installed via "make yes-gpu" before LAMMPS is

@ -88,9 +88,6 @@ KSpace::KSpace(LAMMPS *lmp, int narg, char **arg) : Pointers(lmp)
eatom = NULL; eatom = NULL;
vatom = NULL; vatom = NULL;
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

@ -80,10 +80,8 @@ class KSpace : protected Pointers {
int group_group_enable; // 1 if style supports group/group calculation int group_group_enable; // 1 if style supports group/group calculation
unsigned int datamask;
unsigned int datamask_ext;
// KOKKOS host/device flag and data masks // KOKKOS host/device flag and data masks
ExecutionSpace execution_space; ExecutionSpace execution_space;
unsigned int datamask_read,datamask_modify; unsigned int datamask_read,datamask_modify;
int copymode; int copymode;

@ -168,19 +168,10 @@ E: Cannot use -cuda on and -kokkos on together
This is not allowed since both packages can use GPUs. This is not allowed since both packages can use GPUs.
E: Cannot use -cuda on without USER-CUDA installed
The USER-CUDA package must be installed via "make yes-user-cuda"
before LAMMPS is built.
E: Cannot use -kokkos on without KOKKOS installed E: Cannot use -kokkos on without KOKKOS installed
Self-explanatory. Self-explanatory.
E: Using suffix cuda without USER-CUDA package enabled
Self-explanatory.
E: Using suffix gpu without GPU package installed E: Using suffix gpu without GPU package installed
Self-explanatory. Self-explanatory.

@ -100,9 +100,6 @@ Pair::Pair(LAMMPS *lmp) : Pointers(lmp)
// KOKKOS per-fix data masks // KOKKOS per-fix data masks
datamask = ALL_MASK;
datamask_ext = ALL_MASK;
execution_space = Host; execution_space = Host;
datamask_read = ALL_MASK; datamask_read = ALL_MASK;
datamask_modify = ALL_MASK; datamask_modify = ALL_MASK;

@ -97,9 +97,6 @@ class Pair : protected Pointers {
class NeighList *listmiddle; class NeighList *listmiddle;
class NeighList *listouter; class NeighList *listouter;
unsigned int datamask;
unsigned int datamask_ext;
int allocated; // 0/1 = whether arrays are allocated int allocated; // 0/1 = whether arrays are allocated
// public so external driver can check // public so external driver can check
int compute_flag; // 0 if skip compute() int compute_flag; // 0 if skip compute()
@ -191,9 +188,6 @@ class Pair : protected Pointers {
virtual void min_xf_get(int) {} virtual void min_xf_get(int) {}
virtual void min_x_set(int) {} virtual void min_x_set(int) {}
virtual unsigned int data_mask() {return datamask;}
virtual unsigned int data_mask_ext() {return datamask_ext;}
// management of callbacks to be run from ev_tally() // management of callbacks to be run from ev_tally()
protected: protected:

@ -20,9 +20,8 @@ namespace Suffix {
static const int NONE = 0; static const int NONE = 0;
static const int OPT = 1<<0; static const int OPT = 1<<0;
static const int GPU = 1<<1; static const int GPU = 1<<1;
static const int CUDA = 1<<2; static const int OMP = 1<<2;
static const int OMP = 1<<3; static const int INTEL = 1<<3;
static const int INTEL = 1<<4;
} }
} }

@ -81,14 +81,6 @@ class Update : protected Pointers {
/* ERROR/WARNING messages: /* ERROR/WARNING messages:
E: USER-CUDA mode requires CUDA variant of run style
CUDA mode is enabled, so the run style must include a cuda suffix.
E: USER-CUDA mode requires CUDA variant of min style
CUDA mode is enabled, so the min style must include a cuda suffix.
E: Illegal ... command E: Illegal ... command
Self-explanatory. Check the input script syntax and compare to the Self-explanatory. Check the input script syntax and compare to the

@ -4813,72 +4813,6 @@ double Variable::evaluate_boolean(char *str)
return argstack[0].value; return argstack[0].value;
} }
/* ---------------------------------------------------------------------- */
unsigned int Variable::data_mask(int ivar)
{
if (eval_in_progress[ivar]) return EMPTY_MASK;
eval_in_progress[ivar] = 1;
unsigned int datamask = data_mask(data[ivar][0]);
eval_in_progress[ivar] = 0;
return datamask;
}
/* ---------------------------------------------------------------------- */
unsigned int Variable::data_mask(char *str)
{
unsigned int datamask = EMPTY_MASK;
for (unsigned int i = 0; i < strlen(str)-2; i++) {
int istart = i;
while (isalnum(str[i]) || str[i] == '_') i++;
int istop = i-1;
int n = istop - istart + 1;
char *word = new char[n+1];
strncpy(word,&str[istart],n);
word[n] = '\0';
// ----------------
// compute
// ----------------
if ((strncmp(word,"c_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
if (domain->box_exist == 0)
error->all(FLERR,
"Variable evaluation before simulation box is defined");
int icompute = modify->find_compute(word+2);
if (icompute < 0)
error->all(FLERR,"Invalid compute ID in variable formula");
datamask &= modify->compute[icompute]->data_mask();
}
if ((strncmp(word,"f_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
if (domain->box_exist == 0)
error->all(FLERR,
"Variable evaluation before simulation box is defined");
int ifix = modify->find_fix(word+2);
if (ifix < 0) error->all(FLERR,"Invalid fix ID in variable formula");
datamask &= modify->fix[ifix]->data_mask();
}
if ((strncmp(word,"v_",2) == 0) && (i>0) && (!isalnum(str[i-1]))) {
int ivar = find(word+2);
if (ivar < 0) error->all(FLERR,"Invalid variable name in variable formula");
datamask &= data_mask(ivar);
}
delete [] word;
}
return datamask;
}
/* ---------------------------------------------------------------------- /* ----------------------------------------------------------------------
class to read variable values from a file class to read variable values from a file
for flag = SCALARFILE, reads one value per line for flag = SCALARFILE, reads one value per line

@ -49,9 +49,6 @@ class Variable : protected Pointers {
tagint int_between_brackets(char *&, int); tagint int_between_brackets(char *&, int);
double evaluate_boolean(char *); double evaluate_boolean(char *);
unsigned int data_mask(int ivar);
unsigned int data_mask(char *str);
private: private:
int me; int me;
int nvar; // # of defined variables int nvar; // # of defined variables

@ -1 +1 @@
#define LAMMPS_VERSION "30 Sep 2016" #define LAMMPS_VERSION "5 Oct 2016"