From ca5ad04b019d88df14ff62729b5be809c6e8b8a7 Mon Sep 17 00:00:00 2001
From: sjplimp
Date: Wed, 21 Sep 2016 22:15:17 +0000
Subject: [PATCH] git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@15624
 f3b2605a-c512-4ea7-a41b-209d697bcdaa
---
 doc/src/balance.txt     | 162 ++++++++++------------------------
 doc/src/fix_balance.txt | 161 +++++++++-------------------------
 2 files changed, 74 insertions(+), 249 deletions(-)

diff --git a/doc/src/balance.txt b/doc/src/balance.txt
index 85d0b52521..f375efe604 100644
--- a/doc/src/balance.txt
+++ b/doc/src/balance.txt
@@ -10,7 +10,7 @@ balance command :h3

[Syntax:]

-balance thresh style args ... keyword args ... :pre
+balance thresh style args ... keyword value ... :pre

thresh = imbalance threshold that must be exceeded to perform a re-balance :ulb,l
one style/arg pair can be used (or multiple for {x},{y},{z}) :l
style = {x} or {y} or {z} or {shift} or {rcb} :l
@@ -32,23 +32,9 @@ style = {x} or {y} or {z} or {shift} or {rcb} :l
  Niter = # of times to iterate within each dimension of dimstr sequence
  stopthresh = stop balancing when this imbalance threshold is reached
  {rcb} args = none :pre
-zero or more keyword/arg pairs may be appended :l
-keyword = {weight} or {out} :l
- {weight} style args = use weighted particle counts for the balancing
- {style} = {group} or {neigh} or {time} or {var} or {store}
- {group} args = Ngroup group1 weight1 group2 weight2 ...
- Ngroup = number of groups with assigned weights
- group1, group2, ... = group IDs
- weight1, weight2, ... = corresponding weight factors
- {neigh} factor = compute weight based on number of neighbors
- factor = scaling factor (> 0)
- {time} factor = compute weight based on time spend computing
- factor = scaling factor (> 0)
- {var} name = take weight from atom-style variable
- name = name of the atom-style variable
- {store} name = store weight in custom atom property defined by "fix property/atom"_fix_property_atom.html command
- name = atom property name (without d_ prefix)
- {out} arg = filename
+zero or more keyword/value pairs may be appended :l
+keyword = {out} :l
+ {out} value = filename
 filename = write each processor's sub-domain to a file :pre
:ule

[Examples:]

balance 0.9 x uniform y 0.4 0.5 0.6
balance 1.2 shift xz 5 1.1
balance 1.0 shift xz 5 1.1
balance 1.1 rcb
-balance 1.0 shift x 10 1.1 weight group 2 fast 0.5 slow 2.0
-balance 1.0 shift x 10 1.1 weight time 0.8 weight neigh 0.5 weight store balance
 balance 1.0 shift x 20 1.0 out tmp.balance :pre

[Description:]

This command adjusts the size and shape of processor sub-domains
within the simulation box, to attempt to balance the number of
-particles and thus indirectly the computational cost (load) more
-evenly across processors. The load balancing is "static" in the sense
-that this command performs the balancing once, before or between
-simulations. The processor sub-domains will then remain static during
-the subsequent run. To perform "dynamic" balancing, see the "fix
+particles and thus the computational cost (load) evenly across
+processors. The load balancing is "static" in the sense that this
+command performs the balancing once, before or between simulations.
+The processor sub-domains will then remain static during the
+subsequent run. To perform "dynamic" balancing, see the "fix
 balance"_fix_balance.html command, which can adjust processor
 sub-domain sizes and shapes on-the-fly during a "run"_run.html.

-Load-balancing is typically most useful if the particles in the
-simulation box have a spatially-varying density distribution or when
-the computational cost varies signficantly between different atoms or
-particles. E.g. a model of a vapor/liquid interface, or a solid with
-an irregular-shaped geometry containing void regions, or "hybrid pair
-style simulations"_pair_hybrid.html which combine pair styles with
-different computational cost. In these cases, the LAMMPS default of
+Load-balancing is typically only useful if the particles in the
+simulation box have a spatially-varying density distribution. E.g. a
+model of a vapor/liquid interface, or a solid with an irregular-shaped
+geometry containing void regions. In this case, the LAMMPS default of
 dividing the simulation box volume into a regular-spaced grid of 3d
-bricks, with one equal-volume sub-domain per procesor, may assign
-numbers of particles per processor in a way that the computational
-effort varies significantly. This can lead to poor performance when
-the simulation is run in parallel.
-
-The balancing can be performed with or without per-particle weighting.
-Without any particle weighting, the balancing attempts to assign an
-equal number of particles to each processor. With weighting, the
-balancing attempts to assign an equal weight to each processor, which
-typically means a different number of atoms per processor. Details on
-the various weighting options are given below.
+bricks, with one equal-volume sub-domain per processor, may assign very
+different numbers of particles per processor. This can lead to poor
+performance when the simulation is run in parallel.

Note that the "processors"_processors.html command allows some control
over how the box volume is split across processors. Specifically, for
@@ -105,9 +78,9 @@ sub-domains will still have the same shape and same volume.
The requested load-balancing operation is only performed if the
current "imbalance factor" in particles owned by each processor
exceeds the specified {thresh} parameter. The imbalance factor is
-defined as the maximum number of particles (or weight) owned by any
-processor, divided by the average number of particles (or weight) per
-processor. Thus an imbalance factor of 1.0 is perfect balance.
+defined as the maximum number of particles owned by any processor,
+divided by the average number of particles per processor. Thus an
+imbalance factor of 1.0 is perfect balance.

As an example, for 10000 particles running on 10 processors, if the
most heavily loaded processor has 1200 particles, then the factor is
@@ -135,7 +108,7 @@ defined above. But depending on the method a perfect balance (1.0)
may not be achieved. For example, "grid" methods (defined below) that
create a logical 3d grid cannot achieve perfect balance for many
irregular distributions of particles. Likewise, if a portion of the
-system is a perfect lattice, e.g. the initial system is generated by
+system is a perfect lattice, e.g. the initial system is generated by
the "create_atoms"_create_atoms.html command, then "grid" methods may
be unable to achieve exact balance. This is because entire lattice
planes will be owned or not owned by a single processor.
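
The imbalance factor defined above is easy to check by hand. Below is
a minimal Python sketch of the calculation (an illustration only, not
LAMMPS code); the per-processor counts are hypothetical, chosen to
reproduce the 10-processor example with a factor of 1.2:

    def imbalance_factor(counts):
        # max particles owned by any processor divided by the
        # average particles per processor; 1.0 = perfect balance
        return max(counts) / (sum(counts) / len(counts))

    # hypothetical counts: 10000 particles on 10 processors,
    # 1200 of them on the most heavily loaded processor
    counts = [1200, 1100, 1000, 1000, 1000, 1000, 1000, 950, 900, 850]
    print(imbalance_factor(counts))  # -> 1.2
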
@@ -161,11 +134,11 @@
The {x}, {y}, {z}, and {shift} styles are "grid" methods which produce
a logical 3d grid of processors. They operate by changing the cutting
planes (or lines) between processors in 3d (or 2d), to adjust the
volume (area in 2d) assigned to each processor, as in the following 2d
-diagram where processor sub-domains are shown and particles are
-colored by the processor that owns them. The leftmost diagram is the
-default partitioning of the simulation box across processors (one
-sub-box for each of 16 processors); the middle diagram is after a
-"grid" method has been applied.
+diagram where processor sub-domains are shown and atoms are colored by
+the processor that owns them. The leftmost diagram is the default
+partitioning of the simulation box across processors (one sub-box for
+each of 16 processors); the middle diagram is after a "grid" method
+has been applied.

:image(JPG/balance_uniform_small.jpg,JPG/balance_uniform.jpg),image(JPG/balance_nonuniform_small.jpg,JPG/balance_nonuniform.jpg),image(JPG/balance_rcb_small.jpg,JPG/balance_rcb.jpg) :c

@@ -173,8 +146,8 @@
The {rcb} style is a "tiling" method which does not produce a logical
3d grid of processors. Rather it tiles the simulation domain with
rectangular sub-boxes of varying size and shape in an irregular
-fashion so as to have equal numbers of particles (or weight) in each
-sub-box, as in the rightmost diagram above.
+fashion so as to have equal numbers of particles in each sub-box, as
+in the rightmost diagram above.

The "grid" methods can be used with either of the
"comm_style"_comm_style.html command options, {brick} or {tiled}. The
@@ -257,7 +230,7 @@ counts do not match the target value for the plane, the position of
the cut is adjusted to be halfway between a low and high bound. The
low and high bounds are adjusted on each iteration, using new count
information, so that they become closer together over time. Thus as
-the recursion progresses, the count of particles on either side of the
+the recursion progresses, the count of particles on either side of the
plane gets closer to the target value.

Once the rebalancing is complete and final processor sub-domains
@@ -289,75 +262,21 @@
the longest dimension, leaving one new box on either side of the cut.
All the processors are also partitioned into 2 groups, half assigned
to the box on the lower side of the cut, and half to the box on the
upper side. (If the processor count is odd, one side gets an extra
-processor.) The cut is positioned so that the number of particles in
-the lower box is exactly the number that the processors assigned to
-that box should own for load balance to be perfect. This also makes
-load balance for the upper box perfect. The positioning is done
-iteratively, by a bisectioning method. Note that counting particles
-on either side of the cut requires communication between all
-processors at each iteration.
+processor.) The cut is positioned so that the number of atoms in the
+lower box is exactly the number that the processors assigned to that
+box should own for load balance to be perfect. This also makes load
+balance for the upper box perfect. The positioning is done
+iteratively, by a bisectioning method. Note that counting atoms on
+either side of the cut requires communication between all processors
+at each iteration.

That is the procedure for the first cut. Subsequent cuts are made
recursively, in exactly the same manner. 
The subset of processors
assigned to each box make a new cut in the longest dimension of that
-box, splitting the box, the subset of processsors, and the particles
-in the box in two. The recursion continues until every processor is
-assigned a sub-box of the entire simulation domain, and owns the
-particles in that sub-box.
-
-:line
-
-This sub-section describes how to perform weighted load balancing via
-the {weight} keyword.
-
-One or more weight factors can be assigned to individual or sets of
-particles. By default all particles have an initial weight of 1.0.
-After weighting is applied, a particle with a total weight of 5 will
-be balanced with 5x the computational cost of a particle with the
-default weight of 1.0.
-
-If one or more weight styles are specified, they are processed in the
-order they are specified. Each style computes a factor which
-multiplies the existing factor to produce a cummulative weight on
-individual particles.
-
-The {group} weight style assigns weight factors to specified groups of
-particles. The {group} style keyword is followed by the number of
-groups, then pairs of group IDs and the corresponding weight factor.
-A particle may belong to zero or one or more than one specified group.
-Its final factor is simply the product of all individual weight
-factors for the groups it belongs to.
-
-The {neigh} weight style assigns a weight to each particle equal to
-its number of neighbors divided by the avergage number of neighbors
-for all particles. The {factor} setting is then appied as an overall
-scale factor to all the {neigh} weights. Thus {factor} effectively
-sets a relative impact for this weight style. This weight style will
-use the first suitable neighbor list it finds internally. It will
-print a warning if there is no neighbor list or it is not current,
-e.g. if the balance command is used before a "run"_run.html or
-"minimize"_minimize.html command is used, which can mean that no
-neighbor list has yet been built.
-
-The {time} weight style uses "timer data"_timer.html to calculate a
-weight for each particle. The {factor} setting is then appied as an
-overall scale factor to all the {time} weights. Effectively it sets a
-relative impact for this weight style. Timer information is taken
-from the preceding run. NOTE: Entire run or last portion of run?
-Which sub-timings within the run? How is it normalized? If no such
-information is available, e.g. at the beginning of an input, of when
-the "timer"_timer.html level is set to either {loop} or {off}, this
-style is ignored.
-
-The {var} weight style assigns per-particle weights by evaluating an
-atom-style "variable"_variable.html specified by {name}.
-
-The {store} weight style does not compute a weight factor. Instead it
-stores the current accumulated weights in a custom per-atom property
-specified by {name}. This must be a property defined as {d_name} via
-the "fix property/atom"_fix_property_atom.html command. Note that
-these custom per-atom properties can be output in a "dump"_dump.html
-file, so this is a way to examine per-particle weights.
+box, splitting the box, the subset of processors, and the atoms in
+the box in two. The recursion continues until every processor is
+assigned a sub-box of the entire simulation domain, and owns the atoms
+in that sub-box.

:line

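The recursive bisection just described is compact enough to sketch in
full. The following toy Python version (an illustration of the idea
only, not the LAMMPS implementation; the 2d test data is made up) cuts
the longest dimension of each box so that the lower half receives a
share of particles proportional to its share of processors, then
recurses on both halves:

    import random

    def rcb(particles, nprocs, box):
        # box = [[xlo, xhi], [ylo, yhi], ...]; particles = coordinate tuples
        if nprocs == 1:
            return [(box, particles)]
        # cut the longest dimension of the current box
        dim = max(range(len(box)), key=lambda d: box[d][1] - box[d][0])
        half = nprocs // 2      # upper side gets the extra processor if odd
        particles = sorted(particles, key=lambda p: p[dim])
        k = round(len(particles) * half / nprocs)  # lower side's exact share
        cut = particles[k][dim] if k < len(particles) else box[dim][1]
        lo_box = [list(b) for b in box]
        hi_box = [list(b) for b in box]
        lo_box[dim][1] = hi_box[dim][0] = cut
        return (rcb(particles[:k], half, lo_box) +
                rcb(particles[k:], nprocs - half, hi_box))

    # hypothetical 2d test: 16 processors, unit box, off-center cluster
    random.seed(2)
    pts = [(random.gauss(0.3, 0.1), random.gauss(0.7, 0.1)) for _ in range(4000)]
    domains = rcb(pts, 16, [[0.0, 1.0], [0.0, 1.0]])
    print([len(p) for _, p in domains])  # ~250 particles per processor

Note that the real implementation positions each cut by iterative
bisection and communication among the processors that own the box, as
described above; the sort-based cut here is just the simplest way to
show the recursion.
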
@@ -423,7 +342,6 @@ appear in {dimstr} for the {shift} style.

[Related commands:]

-"group"_group.html, "processors"_processors.html,
-"fix balance"_fix_balance.html
+"processors"_processors.html, "fix balance"_fix_balance.html

[Default:] none
diff --git a/doc/src/fix_balance.txt b/doc/src/fix_balance.txt
index 1941d07e04..c997b7c27e 100644
--- a/doc/src/fix_balance.txt
+++ b/doc/src/fix_balance.txt
@@ -10,7 +10,7 @@ fix balance command :h3

[Syntax:]

-fix ID group-ID balance Nfreq thresh style args keyword args ... :pre
+fix ID group-ID balance Nfreq thresh style args keyword value ... :pre

ID, group-ID are documented in "fix"_fix.html command :ulb,l
balance = style name of this fix command :l
@@ -21,24 +21,10 @@ style = {shift} or {rcb} :l
  dimstr = sequence of letters containing "x" or "y" or "z", each not more than once
  Niter = # of times to iterate within each dimension of dimstr sequence
  stopthresh = stop balancing when this imbalance threshold is reached
- {rcb} args = none :pre
-zero or more keyword/arg pairs may be appended :l
-keyword = {weight} or {out} :l
- {weight} style args = use weighted particle counts for the balancing
- {style} = {group} or {neigh} or {time} or {var} or {store}
- {group} args = Ngroup group1 weight1 group2 weight2 ...
- Ngroup = number of groups with assigned weights
- group1, group2, ... = group IDs
- weight1, weight2, ... = corresponding weight factors
- {neigh} factor = compute weight based on number of neighbors
- factor = scaling factor (> 0)
- {time} factor = compute weight based on time spend computing
- factor = scaling factor (> 0)
- {var} name = take weight from atom-style variable
- name = name of the atom-style variable
- {store} name = store weight in custom atom property defined by "fix property/atom"_fix_property_atom.html command
- name = atom property name (without d_ prefix)
- {out} arg = filename
+ {rcb} args = none :pre
+zero or more keyword/value pairs may be appended :l
+keyword = {out} :l
+ {out} value = filename
 filename = write each processor's sub-domain to a file, at each re-balancing :pre
:ule

[Examples:]

fix 2 all balance 1000 1.05 shift x 10 1.05
fix 2 all balance 100 0.9 shift xy 20 1.1 out tmp.balance
-fix 2 all balance 100 0.9 shift xy 20 1.1 weight group 3 substrate 3.0 solvent 1.0 solute 0.8 out tmp.balance
-fix 2 all balance 100 1.0 shift x 10 1.1 weight time 0.8
-fix 2 all balance 100 1.0 shift xy 5 1.1 weight var myweight weight neigh 0.6 weight store allweight
 fix 2 all balance 1000 1.1 rcb :pre

[Description:]

@@ -61,31 +44,14 @@
rebalancing is performed periodically during the simulation. To
perform "static" balancing, before or between runs, see the
"balance"_balance.html command.

-Load-balancing is typically most useful if the particles in the
-simulation box have a spatially-varying density distribution or
-where the computational cost varies signficantly between different
-atoms. E.g. a model of a vapor/liquid interface, or a solid with
-an irregular-shaped geometry containing void regions, or
-"hybrid pair style simulations"_pair_hybrid.html which combine
-pair styles with different computational cost. In these cases, the
-LAMMPS default of dividing the simulation box volume into a
-regular-spaced grid of 3d bricks, with one equal-volume sub-domain
-per procesor, may assign numbers of particles per processor in a
-way that the computational effort varies significantly. This can
-lead to poor performance when the simulation is run in parallel.
-
-The balancing can be performed with or without per-particle weighting.
-Without any particle weighting, the balancing attempts to assign an
-equal number of particles to each processor. With weighting, the
-balancing attempts to assign an equal weight to each processor, which
-typically means a different number of atoms per processor. Details on
-the various weighting options are given below.
-
-SJP: Need a pointer here to an examples dir that has simple
-examples for where weighting is useful, e.g. rRESPA, pair hybrid,
-other? Also a summary of what weighting can buy you, maybe
-in a small table: e.g. respa = 2x, pair hybrid = 3x, etc.
-All the SJP notes here and below also apply to balance.txt.
+Load-balancing is typically only useful if the particles in the
+simulation box have a spatially-varying density distribution. E.g. a
+model of a vapor/liquid interface, or a solid with an irregular-shaped
+geometry containing void regions. In this case, the LAMMPS default of
+dividing the simulation box volume into a regular-spaced grid of 3d
+bricks, with one equal-volume sub-domain per processor, may assign
+very different numbers of particles per processor. This can lead to
+poor performance when the simulation is run in parallel.

Note that the "processors"_processors.html command allows some control
over how the box volume is split across processors. Specifically, for
@@ -98,9 +64,9 @@ sub-domains will still have the same shape and same volume.
On a particular timestep, a load-balancing operation is only performed
if the current "imbalance factor" in particles owned by each processor
exceeds the specified {thresh} parameter. The imbalance factor is
-defined as the maximum number of particles (or weight) owned by any
-processor, divided by the average number of particles (or weight) per
-processor. Thus an imbalance factor of 1.0 is perfect balance.
+defined as the maximum number of particles owned by any processor,
+divided by the average number of particles per processor. Thus an
+imbalance factor of 1.0 is perfect balance.

As an example, for 10000 particles running on 10 processors, if the
most heavily loaded processor has 1200 particles, then the factor is
@@ -151,8 +117,8 @@ applied.
The {rcb} style is a "tiling" method which does not produce a logical
3d grid of processors. Rather it tiles the simulation domain with
rectangular sub-boxes of varying size and shape in an irregular
-fashion so as to have equal numbers of particles (or weight) in each
-sub-box, as in the rightmost diagram above.
+fashion so as to have equal numbers of particles in each sub-box, as
+in the rightmost diagram above.

The "grid" methods can be used with either of the
"comm_style"_comm_style.html command options, {brick} or {tiled}. The
@@ -173,9 +139,12 @@ from scratch.

:line

-The {group-ID} is ignored. However the impact of balancing on
-different groups of atoms can be affected by using the {group} weight
-style as described below.
+The {group-ID} is currently ignored. In the future it may be used to
+determine what particles are considered for balancing. Normally it
+would only make sense to use the {all} group. But in some cases it
+may be useful to balance on a subset of the particles, e.g. when
+modeling large nanoparticles in a background of small solvent
+particles.

The {Nfreq} setting determines how often a rebalance is performed. If
{Nfreq} > 0, then rebalancing will occur every {Nfreq} steps. Each
@@ -256,7 +225,7 @@ than {Niter} and exit early.

The {rcb} style invokes a "tiled" method for balancing, as described
above. 
It performs a recursive coordinate bisectioning (RCB) of the -simulation domain. The basic idea is as follows. +simulation domain. The basic idea is as follows. The simulation domain is cut into 2 boxes by an axis-aligned cut in the longest dimension, leaving one new box on either side of the cut. @@ -281,72 +250,10 @@ in that sub-box. :line -This sub-section describes how to perform weighted load balancing via -the {weight} keyword. - -SJP: This list of options will be confusing to users. They -need some guidelines here about how to use the weight options. E.g. -try these single options first for these scenarios. Try adding -an option if ... - -One or more weight factors can be assigned to individual or sets of -particles. By default all particles have an initial weight of 1.0. -After weighting is applied, a particle with a total weight of 5 will -be balanced with 5x the computational cost of a particle with the -default weight of 1.0. - -If one or more weight styles are specified, they are processed in the -order they are specified. Each style computes a factor which -multiplies the existing factor to produce a cummulative weight on -individual particles. - -The {group} weight style assigns weight factors to specified groups of -particles. The {group} style keyword is followed by the number of -groups, then pairs of group IDs and the corresponding weight factor. -A particle may belong to zero or one or more than one specified group. -Its final factor is simply the product of all individual weight -factors for the groups it belongs to. - -The {neigh} weight style assigns a weight to each particle equal to -its number of neighbors divided by the avergage number of neighbors -for all particles. The {factor} setting is then appied as an overall -scale factor to all the {neigh} weights. Thus {factor} effectively -sets a relative impact for this weight style. This weight style will -use the first suitable neighbor list it finds internally. It will -print a warning if there is no neighbor list or it is not current, -e.g. if the balance command is used before a "run"_run.html or -"minimize"_minimize.html command is used, which can mean that no -neighbor list has yet been built. - -The {time} weight style uses "timer data"_timer.html to calculate a -weight for each particle. The {factor} setting is then appied as an -overall scale factor to all the {time} weights. Effectively it sets a -relative impact for this weight style. Timer information is taken -from the preceding run. If no such information is available, e.g. at -the beginning of an input, of when the "timer"_timer.html level is set -to either {loop} or {off}, this style is ignored. - -SJP: Not enough details about how timer option works. Entire last run -or last portion of run? (for balance vs fix balance) Which sub-timings -within the run, can user choose those? How is it normalized? Does -the timer command need to be specified in a certain way? - -The {var} weight style assigns per-particle weights by evaluating an -atom-style "variable"_variable.html specified by {name}. - -The {store} weight style does not compute a weight factor. Instead it -stores the current accumulated weights in a custom per-atom property -specified by {name}. This must be a property defined as {d_name} via -the "fix property/atom"_fix_property_atom.html command. Note that -these custom per-atom properties can be output in a "dump"_dump.html -file, so this is a way to examine per-particle weights. 
- -:line - -The {out} keyword writes text to the specified {filename} with the -results of each rebalancing operation. The file contains the bounds -of the sub-domain for each processor after the balancing operation -completes. The format of the file is compatible with the +The {out} keyword writes a text file to the specified {filename} with +the results of each rebalancing operation. The file contains the +bounds of the sub-domain for each processor after the balancing +operation completes. The format of the file is compatible with the "Pizza.py"_pizza {mdump} tool which has support for manipulating and visualizing mesh files. An example is shown here for a balancing by 4 processors for a 2d problem: @@ -414,8 +321,8 @@ values in the vector are as follows: 3 = imbalance factor right before the last rebalance was performed :ul As explained above, the imbalance factor is the ratio of the maximum -number of particles (or total weight) on any processor to the average -number of particles (or total weight) per processor. +number of particles on any processor to the average number of +particles per processor. These quantities can be accessed by various "output commands"_Section_howto.html#howto_15. The scalar and vector values @@ -429,11 +336,11 @@ minimization"_minimize.html. [Restrictions:] -For 2d simulations, the {z} style cannot be used. Nor can a "z" -appear in {dimstr} for the {shift} style. +For 2d simulations, a "z" cannot appear in {dimstr} for the {shift} +style. [Related commands:] -"group"_group.html, "processors"_processors.html, "balance"_balance.html +"processors"_processors.html, "balance"_balance.html [Default:] none
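
For reference, the iterative cut positioning used by the {shift} style
in both the balance command and this fix amounts to a bisection search
on each cutting plane, moving the cut halfway between a low and a high
bound until the particle count on either side approaches its target.
A minimal Python sketch with synthetic coordinates (an illustration of
the idea only, not the LAMMPS implementation):

    import random

    def position_cut(coords, target_frac, lo, hi, niter=10):
        # move the cut halfway between the bounds; tighten the bound on
        # whichever side has the wrong share of particles
        for _ in range(niter):
            cut = 0.5 * (lo + hi)
            below = sum(1 for x in coords if x < cut)
            if below < target_frac * len(coords):
                lo = cut   # too few particles below the cut: move it up
            else:
                hi = cut   # too many particles below the cut: move it down
        return 0.5 * (lo + hi)

    # hypothetical non-uniform density along x, denser near x = 0
    random.seed(1)
    xs = [random.random() ** 2 for _ in range(10000)]
    print(position_cut(xs, 0.5, 0.0, 1.0))  # ~0.25 for this distribution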