- The unsteady adjoint equations are integrated backwards in time. Since
each adjoint time-step requires the primal solution of that time-step
to be known, schemes for managing the storage/retrieval of the entire
flow series are necessary. These are implemented through the
primalStorage class and its derived ones. The latter manipulate a new
class of fields, called compressedGeometricFields, which provide hooks
for compressing/decompressing a field during the time integration of
the primal/adjoint equations. The method used for
compressing/decompressing is run-time selectable.
- The current commit provides the shortGeometricField implementation
which avoids the storage of patchFields that can be retrieved from the
internalField (e.g. coupled, zeroGradient, symmetry, etc.), to cut down on
the storage requirements. More elaborate compression approaches will
be included in the future, during the exaFoam project.
- Two primalStorage options are included: compressedFullStorage and
binomialCheckPointing.
- compressedFullStorage stores the entire flow time-series,
potentially by compressing each time-step (only the
above-mentioned short approach is available for the moment).
- binomialCheckPointing is based on the homonymous algorithm
found in
\verbatim
Wang, Q., Moin, P., & Iaccarino, G.
Minimal Repetition Dynamic Checkpointing Algorithm for
Unsteady Adjoint Calculation (2009).
SIAM Journal on Scientific Computing, 31(4), 2549-2567.
10.1137/080727890
\endverbatim
which stores the solution of the flow equations in a predefined
number of time-steps, named checkpoints. During the
backwards-in-time integration of the adjoint equations, if the
primal solution at a certain time-step is not available, it is
retrieved by re-computing the primal flow field starting from the
closest checkpoint. Checkpoints are optimally distributed
throughout the time-series to invoke the least number of flow
recomputations during the backwards-in-time solution of the
adjoint equations. Binomial checkpointing is the current state of
the art, though its re-computation cost frequently amounts to an
extra solution of the flow equations in medium-to-large cases.
- The adjoint to the PISO and PIMPLE solvers, along with their
solverControl variants, are additionally included.
- Objective functions are integrated in time, through appropriate
entries in the dictionaries defining them.
Authored by Andreas Margetis and reviewed by Vaggelis Papoutsis, with
earlier contributions from Dr. Ioannis Kavvadias.
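The store-and-recompute idea behind checkpointing can be illustrated with a minimal, standalone sketch. It uses uniformly spaced checkpoints rather than the binomial distribution of the cited algorithm, and every name in it (primalStep, fullStorage, UniformCheckpoints) is hypothetical and unrelated to the primalStorage classes:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Hypothetical stand-in for one primal time-step. An integer state keeps
// recomputed values identical to the stored ones.
inline long primalStep(long x)
{
    return (3*x + 1) % 1000003;
}

// Reference: keep every time-step in memory (full storage)
inline std::vector<long> fullStorage(long x0, std::size_t nSteps)
{
    std::vector<long> series(1, x0);
    for (std::size_t n = 0; n < nSteps; ++n)
    {
        series.push_back(primalStep(series.back()));
    }
    return series;
}

// Uniformly spaced checkpoints with on-demand recomputation
struct UniformCheckpoints
{
    std::map<std::size_t, long> stored;  // time index -> primal state
    std::size_t recomputedSteps = 0;     // primal steps repeated so far

    // Forward sweep: keep only every 'interval'-th state
    UniformCheckpoints(long x0, std::size_t nSteps, std::size_t interval)
    {
        assert(interval > 0);
        long x = x0;
        for (std::size_t n = 0; n <= nSteps; ++n)
        {
            if (n % interval == 0) stored[n] = x;
            x = primalStep(x);
        }
    }

    // Backward sweep helper: restart from the closest earlier checkpoint
    long retrieve(std::size_t n)
    {
        // Index 0 is always stored, so a previous entry always exists
        auto cp = std::prev(stored.upper_bound(n));
        long x = cp->second;
        for (std::size_t i = cp->first; i < n; ++i)
        {
            x = primalStep(x);
            ++recomputedSteps;
        }
        return x;
    }
};
```

Calling retrieve(n) for n = nSteps down to 0 mimics the backwards-in-time adjoint sweep. The recomputedSteps counter shows the price paid for the reduced storage, which is the quantity an optimal checkpoint distribution tries to minimise.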
These are used by the adjoint code and are necessary for unsteady
adjoint simulations. Both additions are implemented through optional
arguments with default values, to maintain backwards compatibility for
the rest of the code base.
Since the unsteady adjoint equations are integrated backwards in time,
the -- operator and the reverseEnd and reverseLoop methods were added to
control the flow of time and the ending criteria.
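As a rough illustration of a time loop that can run in both directions, the following toy class pairs an increment/decrement operator with forward and backward loop controls. It is a sketch only: MiniTime and its members are hypothetical and do not reproduce the signatures or semantics of the actual Time class.

```cpp
#include <cassert>
#include <vector>

// Toy time controller (hypothetical; NOT the OpenFOAM Time class)
struct MiniTime
{
    int index;
    int startIndex;
    int endIndex;

    MiniTime& operator++() { ++index; return *this; }
    MiniTime& operator--() { --index; return *this; }

    // Advance by one step, unless the end has been reached
    bool loop()
    {
        if (index >= endIndex) return false;
        ++(*this);
        return true;
    }

    // Step back by one, unless the start has been reached
    bool reverseLoop()
    {
        if (index <= startIndex) return false;
        --(*this);
        return true;
    }
};

// Indices visited while marching forwards (primal solution)
inline std::vector<int> forwardIndices(MiniTime& t)
{
    std::vector<int> visited;
    while (t.loop()) visited.push_back(t.index);
    return visited;
}

// Indices visited while marching backwards (adjoint solution)
inline std::vector<int> backwardIndices(MiniTime& t)
{
    std::vector<int> visited;
    while (t.reverseLoop()) visited.push_back(t.index);
    return visited;
}
```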
For example,
T
{
solver PBiCGStab;
preconditioner DILU;
tolerance 1e-6;
norm none;
}
STYLE: define defaultMaxIter, defaultTolerance directly in lduMatrix
- in situations where the simulation diverges, the ensight writing can
be incomplete. If the case file is updated prior to writing geometry
or fields, the generated case may refer to incomplete entries (which
make loading problematic).
NOTE: if multiple fields are sampled and written, this change cannot
entirely prevent case files addressing corrupt fields. For example,
1a. write U field, update case file with new times/fields
1b. write p field, update case file with new times/fields
2a. write U field, update case file with new times
2b. write p field, but fails
Since 2a already updates the case file with a new time-step entry
(for the U field), the case glob patterns will automatically include
the not-yet-written 'p' field. If this write fails with an
incomplete/corrupt field, the case file will still be addressing it!
- barycentric coordinates in interpolation (instead of x/y/z)
- ease U (velocity) requirement.
Needn't be named in the sampled fields.
- default tracking direction is 'forward'
In movePoints had some duplicated code but did not update the
lower level (polyPatch) areas. This caused scaling to be applied
multiple times (so only 1.0 would not be affected)
- the file removal cleanup, which makes reasonable sense for
redistribute mode, always forced the removal of the reconstructed
lagrangian fields (since all of the non-master fields are empty by
definition)!
Detect reconstruct mode (by using constructSize from the map) to
circumvent this logic.
phaseSystemModels function objects are relocated within
functionObjects in order to enable broader usage.
ENH: multiphaseInterHtcModel: new heatTransferCoeff function object model
COMP: createExternalCoupledPatchGeometry: add new dependencies
COMP: alphaContactAngle: avoid duplicate entries between multiphaseEuler and reactingEuler
TUT: damBreak4Phase: rename alphaContactAngle as multiphaseEuler::alphaContactAngle
thermoTools is a relocation of various existing tools:
- src/TurbulenceModels/compressible/turbulentFluidThermoModels/derivedFvPatchFields/
- src/semiPermeableBaffle/derivedFvPatchFields/
- src/thermophysicalModels/thermophysicalPropertiesFvPatchFields/liquidProperties/
ENH: Allwmake: reordering various compilation steps
Co-authored-by: Kutalmis Bercin <kutalmis.bercin@esi-group.com>
This is on
- incompressible/pimpleFoam/laminar/mixerVesselAMI2D/mixerVesselAMI2D-topologyChange
- redistributePar -reconstruct
where the fvMesh::updateMesh does an early trigger of
mesh.phi() calculation
Specific to the VOF-to-lagrangian FO is to generate particles
which potentially do not relate to the mesh. So here they
are preserved instead of trying to locate them on the
reconstructed mesh. Note: this has the same effect
of actually copying the file...
speciesSorption is a zeroGradient BC which absorbs mass given by a
first-order time derivative, an absorption rate and an equilibrium value
calculated based on internal species values next to the wall.
patchCellsSource is a source fvOption which applies to the corresponding
species and applies the source calculated on the speciesSorption BC.
A new abstract virtual class was created to group BCs which
don't introduce a source to the matrix (e.g. zeroGradient) but calculate
a mass sink/source which should be introduced into the matrix. This
is done through the fvOption patchCellsSource.
- this allows the "relocation" of sampled surfaces. For example,
to reposition into a different coordinate system for importing
into CAD.
- incorporate output scaling for all surface writer types.
This was previously done on an ad hoc basis for different writers,
but is now included in the base-level so that all writers
can automatically use scale + transform.
Example:
formatOptions
{
vtk
{
scale 1000; // m -> mm
transform
{
origin (0.05 0 0);
rotation axisAngle;
axis (0 0 1);
angle -45;
}
}
}
in RASModelVariables were doing this by checking whether the
corresponding pointer was allocated. In some cases, however, even if the
field does not exist, the pointer is not null, leading to the wrong
output. Made the corresponding functions virtual and overrode their
return values in the derived classes. Kept the initial implementation in
base to facilitate the clone function.
in cases with more than one primal or adjoint solvers
TUT: removed all occurrences of useSolverNameForFields
from the optimisation tutorials since it is now set
automatically.
in the sensitivity patches, symmetry::evaluate() needs access to the
internalField which does not exist, leading to wrong memory access.
Fixed by specifying a calculated type fvPatchField for all patches when
creating a boundaryField<Type>
Using a symmetry(Plane) as a sensitivity patch is quite rare and
borderline wrong, but this provides a fix nonetheless.
The multiplier of grad(dxdb) is a volTensorField which, by itself, is
memory consuming. The function computing it though was sloppy in terms
of memory management, constituting the peak memory consumption during an
adjoint optimisation. Initial changes to remedy the problem include the
deallocation of some of the volTensorFields included in the computation
of grad(dxdb) once unneeded, the utilisation of volSymmTensorFields
instead of volTensorFields where possible and avoiding allocating some
unnecessary intermediate fields.
Actions to further reduce memory consumption:
- For historical reasons, the code computes/stores the transpose of
grad(dxdb), which is then transposed when used in the computation of
the FI or the ESI sensitivity derivatives. This redundant
transposition can be avoided, saving the allocation of an additional
volTensorField, but the changes need to permeate a number of places in
the code that contribute to grad(dxdb) (e.g. ATC, adjoint turbulence
models, adjoint MRF, etc).
- Allocation of unnecessary pointers in the objective class should be
avoided.
- ATCstandard, ATCUaGradU:
the ATC is now added as a dimensioned field and not as an fvMatrix
to UaEqn. This gets rid of many unnecessary allocations.
- ATCstandard:
gradU is cached within the class to avoid its re-computation in
every adjoint iteration of the steady state solver.
- Inlined a number of functions within the primal and adjoint solvers.
This probably has a negligible effect since they likely were inlined
by the compiler either way.
- The momentum diffusivity at the boundary, used by the adjoint boundary
conditions, was computed for the entire field and, then, only the
boundary field of each adjoint boundary condition was used. If many
outlet boundaries exist, the entire nuEff field would be computed as
many times as the number of boundaries, leading to an unnecessary
computational overhead.
- Outlet boundary conditions (both pressure and velocity) use the local
patch gradient to compute their fluxes. This patch gradient requires
the computation of the adjacent cell gradient, which is done on the
fly, on a per patch basis. To compute this patch adjacent gradient
however, the field under the grad sign is interpolated on the entire
mesh. If many outlets exist, this leads to a huge computational
overhead. Solved by caching the interpolated field to the database and
re-using it, in a way similar to the caching of gradient fields (see
fvc::grad).
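The caching idea in the last item can be sketched with a small name-keyed store: the expensive whole-mesh operation runs on the first request and every later patch reuses the result. This is a standalone illustration under assumed names (FieldCache, lookupOrCompute, interpolate) and does not use the actual object registry:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Field = std::vector<double>;

// Hypothetical name-keyed cache for derived fields
class FieldCache
{
    std::map<std::string, Field> cache_;
    std::size_t nEvaluations_ = 0;

public:

    // Return the cached field, computing and storing it on first request
    const Field& lookupOrCompute
    (
        const std::string& name,
        const std::function<Field()>& compute
    )
    {
        auto iter = cache_.find(name);
        if (iter == cache_.end())
        {
            ++nEvaluations_;
            iter = cache_.emplace(name, compute()).first;
        }
        return iter->second;
    }

    // Drop all entries, e.g. once the underlying field has changed
    void clear() { cache_.clear(); }

    std::size_t nEvaluations() const { return nEvaluations_; }
};

// Stand-in for an expensive whole-mesh interpolation:
// average of neighbouring cell values
inline Field interpolate(const Field& cells)
{
    Field faces;
    for (std::size_t i = 0; i + 1 < cells.size(); ++i)
    {
        faces.push_back(0.5*(cells[i] + cells[i + 1]));
    }
    return faces;
}
```

With such a cache, the cost of the interpolation no longer scales with the number of outlet patches. The price is that the cache has to be cleared whenever the source field changes.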
WIP: functions returning references to primal and adjoint boundary
fields within boundaryAdjointContributions seem to have a non-negligible
overhead for cases with many patches. No easy work-around here since
these are virtual and cannot be inlined.
WIP: introduced the code structure for caching the contributions to
the adjoint boundary conditions that depend only on the primal fields
and reusing. The process needs to be completed and evaluated, to make
sure that the extra code complexity is justified by gains in
performance.
is now appended by the name of the adjoint solver, if more than one
exist. This was necessary for an accurate continuation since, before
these changes, only the ma field of the last solver was written. As a
result, when restarting the first adjoint solver was reading the ma
field of the last one. No changes are needed in fvSolution and fvSchemes
w.r.t. the previous code version.
as a step towards machine-accuracy continuation of the optimisation
loop.
Additionally, control points are now written under the time/uniform
folder, to be in line with the rest of the code structure for continuation.
As a side-effect, the controlPointsDefinition in
constant/dynamicMeshDict does not need to be changed to 'fromFile'
anymore in order to perform the continuation. The 'fromFile' option is
still valid if the user wants to supply the control points manually but,
as with all other controlPointsDefinitions, it will be disregarded if the
proper file exists under the time/uniform/volumetricBSplines folder.
Before the commit, the sensitivity classes were receiving references of
the (incompressible) primal and adjoint variables. However, if
additional physics was added (energy equation, multiphase, etc), the
infrastructure wasn't convenient for accommodating (new terms in the FI
and E-SI formulations, new terms in the sensitivity map, etc).
Now, the sensitivity classes receive a reference to an
incompressibleAdjointSolver and receive the terms for the FI and
sensitivity maps through there. The latter is still WIP.
Modified adjointSimple to incorporate these changes as well.
Each solver now writes its sensitivity derivatives to its dictionary,
enabling also a binary format. If present, the sensitivities are then
re-read from the dictionary, thus avoiding possible loss of information
due to re-computation.
As a side-effect, sensitivities are computed after the completion of
each adjoint solver, instead of being computed after all adjoint solvers
have been completed.
for incompressible flows. The typical convention of appending the primal
field name with 'a' to form the adjoint field is followed for the
adjoint turbulent kinetic energy (i.e. 'ka') but since this would produce
an ugly variable name for the adjoint to omega (i.e. omegaa), the latter
is abbreviated to 'wa'.
The work is based on
\verbatim
Kavvadias, I., Papoutsis-Kiachagias, E.,
Dimitrakopoulos, G., & Giannakoglou, K. (2014).
The continuous adjoint approach to the k–ω SST turbulence model with
applications in shape optimization
Engineering Optimization, 47(11), 1523-1542.
https://doi.org/10.1080/0305215X.2014.979816
\endverbatim
with changes in the discretisation of
a number of differential operators and the formulation of the adjoint to
the wall functions employed by the primal model.
Regarding the latter, the code assumes (and differentiates) the default
behaviour of nutkWallFunction (i.e. nutWallFunction::blendingType::STEPWISE)
and omegaWallFunction (i.e. omegaWallFunction::blendingType::BINOMIAL2).
Due to the availability of a number of terms required for the
formulation of the wall function for ka, the latter is implemented
within adjointkOmegaSST itself, with contributions from objective functions
implemented within kaqRWallFunction. Wall functions for wa are
implemented within waWallFunction.
The initial implementation of the above-mentioned reference was
performed by Dr. Ioannis Kavvadias.
the Jacobian of an objective function, defined at the boundary, wrt nut
and gradU. Also modified the current objectives that include such
contributions
- update the area-centres processor/processor information as part of
faMesh::init() after all of the global data and geometry data is
setup.
- improve flattenEdgeField helper to properly handle empty patches.
This change removes the false fails when testing edge-centre
redistribution (FULLDEBUG mode).
TUT: add filmPanel (rivulet) tutorial
- include constant/faMesh cleanup (cleanFaMesh) as part of standard
cleanCase
- simplify cleanPolyMesh function to now just warn about old
constant/polyMesh/blockMeshDict but not try to remove anything
- cleanup cellDist.vtu (decomposePar -dry-run) as well
ENH: foamRunTutorials - fallback to Allrun-parallel, Allrun-serial
TUT: call m4 with file argument instead of redirected stdin
TUT: adjust suffixes on decomposeParDict variants
- enables runtime selection of operand coefficients by 'coefficients' entry
- removes binning - now handled using the new 'binField' FO
Co-authored-by: Kutalmis Bercin <kutalmis.bercin@esi-group.com>
The new 'binField' function object calculates binned data,
where specified patches are divided into segments according to
various input bin characteristics, so that spatially-localised
information can be output for each segment.
Co-authored-by: Kutalmis Bercin <kutalmis.bercin@esi-group.com>
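To make the notion of binned data concrete, here is a minimal sketch that sums face values into uniform bins along a single coordinate. The uniform spacing, the summation and the function name binSum are assumptions for illustration only and are not taken from the binField implementation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sum 'values' into nBin uniform segments of [minCoord, maxCoord],
// based on the coordinate of each face (hypothetical helper)
inline std::vector<double> binSum
(
    const std::vector<double>& coords,
    const std::vector<double>& values,
    const double minCoord,
    const double maxCoord,
    const std::size_t nBin
)
{
    assert(coords.size() == values.size());
    assert(nBin > 0 && maxCoord > minCoord);

    std::vector<double> bins(nBin, 0.0);
    const double width = (maxCoord - minCoord)/nBin;

    for (std::size_t i = 0; i < coords.size(); ++i)
    {
        // Ignore faces outside of the binned range
        if (coords[i] < minCoord || coords[i] > maxCoord) continue;

        // Clamp so that coords[i] == maxCoord lands in the last bin
        const std::size_t bini = std::min
        (
            nBin - 1,
            static_cast<std::size_t>((coords[i] - minCoord)/width)
        );
        bins[bini] += values[i];
    }
    return bins;
}
```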
- simpler to write for sampled cutting planes etc.
For example,
slice
{
type cuttingPlane;
point (0 0 0);
normal (0 0 1);
interpolate true;
}
instead of
slice
{
type cuttingPlane;
planeType pointAndNormal;
pointAndNormalDict
{
point (0 0 0);
normal (0 0 1);
}
interpolate true;
}
STYLE: add noexcept to some plane methods
- Previous state of the condition was largely inoperative
due to bugs and lack of functionalities
- New state of the condition is more versatile, elegant, robust and faster
ENH: turbulentDigitalFilter: add new scalar-based synthetic turbulence condition
- Realistic temperature and/or concentration fluctuations
can be generated based on given input statistics
- can specify rotations that are not "axes" in a compact form:
transform
{
origin (0 0 0);
rotation none;
}
transform
{
origin (0 0 0);
rotation axisAngle;
axis (0 0 1);
angle 45;
}
An expanded dictionary form also remains possible:
transform
{
origin (0 0 0);
rotation
{
type axisAngle;
axis (0 0 1);
angle 45;
}
}
STYLE: verbose deprecation for "coordinateRotation" keyword
- the "coordinateRotation" keyword was replaced by the "rotation"
keyword (OpenFOAM-v1812 and later) but was handled silently.
Now elevated to non-silent.
STYLE: alias lookups "axesRotation", "EulerRotation", "STARCDRotation"
- these warn and report the equivalent short form, which aids in
upgrading. Previously had silent lookups.
- append single character
- make append() methods void: methods are never chained anyhow
- refactor digest comparison (code reduction)
COMP: add overflow handling for OSHA1stream
- add overflow() method to the SHA1 streambuf. Previously could rely
on xsputn for adding to sha1 content, but streams now check pptr()
first to test for the buffering range and thus overflow() is needed.
- can be more intuitive to specify for some cases:
rotation
{
type euler;
order rollPitchYaw;
angles (0 20 45);
}
- refactor starcd rotation to reuse Euler ZXY ordering
(code reduction)
ENH: add -rotate-x, -rotate-y, -rotate-z for transformPoints etc
- easier to specify for simple rotations
- aligns calling signatures with wordList, for possible future
replacement
- drop construct from const char** (can use initializer_list instead)
ENH: replace hashedWordList with plain wordList in triSurfaceLoader
- additional hashing optimisation (and overhead) is not worth it for
the comparatively small lists of surfaces used.
- catch extra punctuation tokens in chemical equations
- catch unknown species
- simplify generation of reaction string (output)
ENH: allow access of solid concentrations from sub-classes (#2441)
- ensightWrite, vtkWrite, fv::cellSetOption
ENH: additional topoSet "ignore" action
- this no-op can be used to skip an action step, instead of removing
the entire entry
- this allows more flexibility when defining the location or intensity
of sources.
For example,
{
type scalarSemiImplicitSource;
volumeMode specific;
selectionMode all;
sources
{
tracer0
{
explicit
{
type exprField;
functions<scalar>
{
square
{
type square;
scale 0.0025;
level 0.0025;
frequency 10;
}
}
expression
#{
(hypot(pos().x() + 0.025, pos().y()) < 0.01)
? fn:square(time())
: 0
#};
}
}
}
}
ENH: SemiImplicitSource: handle "sources" with explicit/implicit entries
- essentially the same as injectionRateSuSp with Su/Sp,
but potentially clearer in purpose.
ENH: add Function1 good() method to define if function can be evaluated
- for example, provides a programmatic means of avoiding the 'none'
function
- avoid any operations for zero sources
- explicit sources that are applied to the entire mesh can be added directly,
without an intermediate DimensionedField
- update some legacy faMatrix/fvMatrix methods that used Istream
instead of dictionary or dimensionSet for their parameters.
Simplify handling of tmps.
- align faMatrix methods with their updated fvMatrix counterparts
(eg, DimensionedField instead of GeometricField for sources)
- similar to the geometric decomposition constraint,
allows a compositing selection of cells based on topoSet sources
which also include various searchableSurface mechanisms.
This makes for potentially easier placement of sources without
resorting to defining a cellSet.
ENH: support zone group selection for fv::cellSetOption and fa::faceSetOption
- select motion for the entire mesh, or restrict to a subset
of points based on a specified cellSet or cellZone(s).
Can now combine cellSet and cellZone specifications
(uses an 'or' combination).
- more consistent use of keyType and wordRe to allow regex selection,
possibly using zone groups
STYLE: remove duplicate code in solidBodyMotionSolver
- shorter lookup names for more consistency
ENH: accept point1/point2 as alternative to p1/p2 for sources
- better alignment with searchable specification
- refactor so that cylinderAnnulus sources derive directly from
cylinder sources (which handle an annulus as well).
Accept radius or outerRadius as synonyms.
STYLE: noexcept on topoBitSet access methods
DOC: update description for geometricConstraint
- in various situations with mesh regions it is also useful to
filter out or remove the defaultRegion name (ie, "region0").
Can now do that conveniently from the polyMesh itself or as a static
function. Simply use this
const word& regionDir = polyMesh::regionName(regionName);
OR mesh.regionName()
instead of
const word& regionDir =
(
regionName != polyMesh::defaultRegion
? regionName
: word::null
);
Additionally, since the string '/' join operator filters out empty
strings, the following will work correctly:
(polyMesh::regionName(regionName)/polyMesh::meshSubDir)
(mesh.regionName()/polyMesh::meshSubDir)
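The combined effect of the two behaviours (default region mapped to an empty name, empty components skipped when joining) can be mimicked with plain std::string. The helpers below, regionDir and joinPath, are hypothetical stand-ins and not the polyMesh or fileName API:

```cpp
#include <cassert>
#include <string>

// Name of the default region, as quoted in the text above
const std::string defaultRegion = "region0";

// Map the default region onto an empty name, keep anything else
inline std::string regionDir(const std::string& regionName)
{
    return (regionName == defaultRegion ? std::string() : regionName);
}

// Join two path components, skipping empty ones
inline std::string joinPath(const std::string& a, const std::string& b)
{
    if (a.empty()) return b;
    if (b.empty()) return a;
    return a + '/' + b;
}
```

For example, joinPath(regionDir("region0"), "polyMesh") yields "polyMesh" without a leading slash, whereas a region named "solid" yields "solid/polyMesh".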
Reports cloud information for particles passing through a specified cell
zone.
Example usage:
cloudFunctions
{
particleZoneInfo1
{
type particleZoneInfo;
cellZone leftFluid;
// Optional entries
//writer vtk;
}
}
Results are written to file:
- \<case\>/postProcessing/lagrangian/\<cloudName\>/\<functionName\>/\<time\>
\# cellZone : leftFluid
\# time : 1.0000000000e+00
\#
\# origID origProc (x y z) time0 age d0 d mass0 mass
Where
- origID : particle ID
- origProc : processor ID
- (x y z) : Cartesian co-ordinates
- time0 : time particle enters the cellZone
- age : time spent in the cellZone
- d0 : diameter on entry to the cellZone
- d : current diameter
- mass0 : mass on entry to the cellZone
- mass : current mass
If the optional \c writer entry is supplied, cloud data is written in the
specified format.
During the run, output statistics are reported after the cloud solution,
e.g.:
particleZoneInfo:
Cell zone = leftFluid
Contributions = 257
Here, 'Contributions' refers to the number of incremental particle-move
contributions recorded during this time step. At write times, the output
is extended, e.g.:
particleZoneInfo:
Cell zone = leftFluid
Contributions = 822
Number of particles = 199
Written data to "postProcessing/lagrangian/reactingCloud1/
TUT: filter: add an example for the particleZoneInfo function object
- Previously, the multiFieldValue function object was limited to operating on
lists of fieldValue function objects.
- Any function objects that generate results can now be used, e.g.
pressureAverage
{
type multiFieldValue;
libs (fieldFunctionObjects);
operation average;
functions
{
inlet
{
type surfaceFieldValue;
operation areaAverage;
regionType patch;
name inlet;
fields (p);
writeFields no;
writeToFile no;
log no;
resultFields (areaAverage(inlet,p));
}
outlet
{
type surfaceFieldValue;
operation areaAverage;
regionType patch;
name outlet;
fields (p);
writeFields no;
writeToFile no;
log no;
}
average
{
type valueAverage;
functionObject testSample1;
fields (average(p));
writeToFile no;
log no;
}
}
}
TUT: cavity: add an example for the multiFieldValue function object
- now have compactData(), compactLocalData() and compactRemoteData()
depending on where the compaction information is actually known.
The compactData() performs a consistent union of local and remote
values, which eliminates the danger of mapping to non-existent
locations but does require a double communication to setup.
Typically needed for point maps (for example).
The compactLocalData() and compactRemoteData() work on the
assumption that the source or target values are sufficient for
creating unique compact maps.
Can be used, for example, when compacting cell maps since there is
no possibility of a source cell being represented on different
target processors (ie, each cell is unique and only occurs once).
The existing compact() is equivalent to compactRemoteData()
and is now simply a redirect.
- use bitSet for defining compaction, but the existing compact()
continues to use a boolList (for code compatibility).
BUG: compaction in non-parallel mode didn't compact anything.
STYLE: compact ascii output for procAddressing
- simplify procAddressing read/write
- avoid accessing points in faMeshReconstructor.
Can rely on the patch meshPoints (labelList), which does not need
access to a pointField
- report number of points on decomposed mesh.
Can be useful additional information.
Additional statistics for finite area decomposition
- provide bundled reconstructAllFields for various reconstructors
- remove reconstructPar checks for very old face addressing
(from foam2.0 - ie, older than OpenFOAM itself)
- bundle all reading into fieldsDistributor tools,
where it can be reused by various utilities as required.
- combine decomposition fields as respective fieldsCache
which eliminates most of the clutter from decomposePar
and simplifies reuse in the future.
STYLE: remove old wordHashSet selection (deprecated in 2018)
BUG: incorrect face flip handling for faMeshReconstructor
- a latent bug which is not yet triggered since the faMesh faces are
currently only definable on boundary faces (which never flip)
Geometry calculation scheme that performs geometry updates only in regions
where the mesh has changed, identified by comparing current and old points.
Example usage in fvSchemes:
geometry
{
type solidBody;
// Optional entries
// If set to false, update the entire mesh
partialUpdate yes;
// Cache the motion addressing (changed points, faces, cells etc)
cacheMotion yes;
}
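The detection step described above (compare current and old points, then update only what touches a moved point) might look roughly as follows. This is a standalone sketch: the exact-equality comparison and the names changedPoints and changedFaces are assumptions, not the scheme's actual code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Point = std::array<double, 3>;
using Face = std::vector<std::size_t>;   // point labels of one face

// Labels of the points whose coordinates differ between old and new
inline std::vector<std::size_t> changedPoints
(
    const std::vector<Point>& oldPoints,
    const std::vector<Point>& newPoints
)
{
    assert(oldPoints.size() == newPoints.size());

    std::vector<std::size_t> changed;
    for (std::size_t i = 0; i < oldPoints.size(); ++i)
    {
        if (oldPoints[i] != newPoints[i]) changed.push_back(i);
    }
    return changed;
}

// Labels of the faces using at least one changed point; only these
// would need their centres and areas recalculated
inline std::vector<std::size_t> changedFaces
(
    const std::vector<Face>& faces,
    const std::vector<std::size_t>& changedPts,
    const std::size_t nPoints
)
{
    std::vector<bool> isChanged(nPoints, false);
    for (const std::size_t pointi : changedPts) isChanged[pointi] = true;

    std::vector<std::size_t> result;
    for (std::size_t facei = 0; facei < faces.size(); ++facei)
    {
        for (const std::size_t pointi : faces[facei])
        {
            if (isChanged[pointi])
            {
                result.push_back(facei);
                break;
            }
        }
    }
    return result;
}
```

Storing the two label lists between time-steps corresponds to the idea of caching the motion addressing when the same part of the mesh moves repeatedly.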
The most frequent changes have been as follows.
from:
tmp<scalarField> tuTau(new scalarField(patch().size(), Zero));
scalarField& uTau = tuTau.ref();
to:
auto tuTau = tmp<scalarField>::New(patch().size(), Zero);
auto& uTau = tuTau.ref();
- Other changes involved the addition, wherever appropriate, of:
const
noexcept
auto
Previously, a nutWallFunctionFvPatchScalarField reference had to be
created in the epsilon, k, and omega wall functions to fetch various
common wall-function coefficients necessary to carry out and complete
local operations inside these wall functions.
However, this arrangement required the use of a nut wall function,
even when unnecessary, whenever any of the non-nut wall functions was used.
Therefore, some users were needlessly restricted and
confronted with rather obscure casting-error messages.
Also, the wall-function coefficients Cmu, kappa and E were obtained
from the specified nutWallFunction in order to ensure that each patch
possesses the same set of values for these coefficients.
Although the motivation sounds reasonable, it also put redundant
restraints on users and disregarded the specifics of each wall function.
For example, the variation of epsilon in near-wall regions is usually very
steep and non-monotonic; an expert user may therefore want to use
an epsilon-specific coefficient, and this was not allowed by the previous
arrangement.
This commit introduces a new class (i.e. wallFunctionCoefficients) comprising
all common wall-function coefficients and yPlus calculations.
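A rough picture of such a coefficient bundle is given below. It is a sketch under assumptions: the struct name WallCoeffs, the default values (commonly quoted values for Cmu, kappa and E) and the fixed-point iteration for the laminar sublayer limit are illustrative and not copied from wallFunctionCoefficients.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical bundle of wall-function coefficients, owned by each
// wall function instead of being fetched from a nut wall function
struct WallCoeffs
{
    double Cmu = 0.09;     // assumed default
    double kappa = 0.41;   // assumed default
    double E = 9.8;        // assumed default

    // y+ at the intersection of the viscous sublayer (u+ = y+) and the
    // log law (u+ = ln(E y+)/kappa), found by fixed-point iteration
    double yPlusLam() const
    {
        double ypl = 11.0;
        for (int iter = 0; iter < 10; ++iter)
        {
            ypl = std::log(std::max(E*ypl, 1.0))/kappa;
        }
        return ypl;
    }
};
```

Because every wall function holds its own copy, an epsilon wall function could, for instance, use a different E than the nut wall function on the same patch.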
Previously, a number of wall functions were not writing
their boundary-condition entries in the de facto order
(i.e. from type to value) while writing a field. For example:
<patchName>
{
lowReCorrection 1;
blending stepwise;
n 2;
type epsilonWallFunction; <!-- expected to be the first entry
value uniform 1; <!-- expected to be the last entry
}
Also, various wall functions have been writing out entries that
are not used by the wall function. For example:
<patchName>
{
type nutUSpaldingWallFunction;
...
blending stepwise; <!-- no blending treatment in nutUSpaldingWF
...
}
Additionally, various derived wall functions (e.g. atmOmegaWallFunction)
failed to write some of the inherited entries even though
these entries are used in carrying out wall-function calculations.
Taking these into consideration, wall functions have been reworked to obtain
a reliable and consistent way of writing their traits while writing out a field.
- writeLocalEntries uses writeIfDifferent if constructed with getOrDefault.
ENH: simple faMeshSubset (zero-sized meshes only)
ENH: additional access methods for faMesh, primitive geometry mode
- wrapped walking of boundary edgeLabels as list of list
(similar to edgeFaces).
- primitive finiteArea geometry mode with reduced communication:
primarily interesting for decomposition/redistribution (#2436)
ENH: extra vtk debug outputs for checkFaMesh
- report per-processor sizes in the mesh summary
- similar functionality as newMesh etc.
Relocated to finiteVolume since there are no dynamicMesh dependencies.
- use simpler procAddressing (with updated mapDistributeBase).
separated from redistributePar
- returns UPtrList view (read-only or read/write) of the objects
- shorter names for IOobject checks: hasHeaderClass(), isHeaderClass()
- remove unused IOobject::isHeaderClassName(const word&) method.
The typed versions are preferable/recommended, but can still check
directly if needed:
(io.headerClassName() == "foo")
- additional distribute/reverseDistribute with specified commsType.
Improves flexibility.
- distribute with nullValue
- support move construct mapDistribute from mapDistributeBase
- refactor handling of schedules (as whichSchedule method) to
simplify code.
- renumberMap helper for working with compact sub maps
and renumberVisit for handling walk-ordered compaction.
COMP: make mapDistributeBase data private
- accessor methods are available - direct access is unnecessary
- mapDistribute : inherit mapDistributeBase constructors
STYLE: use List<labelPair>::null() for schedule placeholders
- clearer that they are doing nothing
- for int64 compilations this disambiguates between '0' as int32 (size)
or as bool 'false' for local processor validity
Eg,
IOList list(io, 0); <- With label-size 64: is this bool or label?
IOList list(io, Zero); <- Size = 0 (int32/int64), not a bool
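The ambiguity can be reproduced outside of OpenFOAM with a toy class that offers both a size and a bool constructor. Everything here (MiniList, ZeroTag, the 64-bit label alias) is a hypothetical stand-in used only to show why a dedicated zero tag resolves the overload:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using label = std::int64_t;   // assumed 64-bit label build

struct ZeroTag {};            // hypothetical stand-in for a 'Zero' tag
constexpr ZeroTag Zero{};

// Toy list with the competing constructors
struct MiniList
{
    std::string how;

    explicit MiniList(label) : how("size") {}
    explicit MiniList(bool) : how("bool") {}
    explicit MiniList(ZeroTag) : how("zero-size") {}
};

// MiniList ambiguous(0);
// Does not compile with a 64-bit label: the literal 0 is an int, and
// int -> std::int64_t and int -> bool are equally ranked conversions.
```

Passing the tag selects the third constructor by exact match, independent of the label width.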
- for indirect lists we use element-wise output streaming and read
back as a regular list. This approach cannot however work with
non-blocking mode - the receive buffers will simply not be filled
before attempting to read from them.
For contiguous data, the lowest overhead solution is to locally
flatten the indirect list and use the regular gather routines
for non-blocking mode. For non-contiguous data, can continue to
use the element-wise output, but cannot use non-blocking for it.
STYLE: use non-blocking consistently as default for globalIndex gather(s)
- most of the front-facing code was already using non-blocking,
but there were a few low-level routines defaulting to scheduled
(but never relied upon in the code).
- previously filtered on the existence of area fields, but with
faMesh::TryNew this is not required anymore.
STYLE: enable -verbose for various parallel utilities (consistency)
- introduced UList<bool>::operator()(label) as part of bf0b3d8872
but with gcc-4.8.5 this participates in operator resolution even
for non-bool lists!!
Partial revert until this predicate handling is really required.
- use DynamicList instead of List in the cache, which reduces the
number of allocations occurring each time.
- since the cached times are stored in sorted order, first check if the
new time is greater than the last list entry. Can then simply append
without performing a binary search and can obviously also skip any
subsequent sorting.
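The append shortcut can be sketched with a plain sorted std::vector. The helper name insertSorted and its return value are hypothetical and only illustrate the control flow described above:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Insert 't' into an ascending list of times.
// Returns true if the cheap append path was taken.
inline bool insertSorted(std::vector<double>& times, const double t)
{
    // Fast path: new time lies beyond the last entry, list stays sorted
    if (times.empty() || t > times.back())
    {
        times.push_back(t);
        return true;
    }

    // Slow path: binary search for the insertion point, skip duplicates
    const auto pos = std::lower_bound(times.begin(), times.end(), t);
    if (pos == times.end() || *pos != t)
    {
        times.insert(pos, t);
    }
    return false;
}
```

Since times are normally added in increasing order during a run, the fast path is expected to cover nearly all calls.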
STYLE: add noexcept to Instant methods, declare in header (like Tuple2)
- as part of #2358 the writing was changed to be lazy, which means
that files are only created just before they are actually written.
This helps avoid flooding the filesystem if sample-only
is required and also handles cases such as "rho.*" where the sampled
fields are not known from the objectRegistry at startup.
- now create any new files using the startTime value, which means they
are easier to find but still retains the lazy construct.
Don't expect any file collisions with this, but there could be some
corner cases where the user has edited to remove fields (during
runtime) and then re-edits to add them back in. In this case the
file pointers would be closed but reopened later, overwriting
the old probed values. This could be considered a feature or a bug.
BUG: bad indexing for streamlines (fixes #2454)
- a cut-and-paste error
- only wrap compiler calls (not things like flex/bison)
- avoid single quoted '&&' (causes syntax errors)
STYLE: report WM_COMPILE_CONTROL value in top-level Allwmake
- relocate templating to factory method 'New'.
Adds provisions for more general re-use.
- expose processor topology in globalMesh as topology()
- wrap proc->patch lookup as processorTopology::procPatchLookup method
(failsafe). May consider using Map<label> for its storage in the
future.
- Uses a refPtr to reference external content.
Useful (for example) when writing data without copying.
Reading into external locations is not implemented
(no current requirement for that).
* IOFieldRef -> IOField
* IOListRef -> IOList
* IOmapDistributePolyMeshRef -> IOmapDistributePolyMesh
Eg,
labelList addressing = ...;
io.rename("cellProcAddressing");
IOListRef<label>(io, addressing).write();
Or,
primitivePatch patch = ...;
IOFieldRef<vector>(io, patch.localPoints()).write();
- the values from non-overlapping blocks were simply ignored,
which meant that ('111111111111' & '111111') would not mask out
the unset values at all.
- similar oddities in other operations (|=, ^= etc)
where the original implementation tried hard to avoid touching the
sizing at all, but now better resolved as follows:
- '|=' : Set may grow to accommodate new 'on' bits.
- '^=' : Set may grow to accommodate new 'on' bits.
- '-=' : Never changes the original set size.
- '&=' : Never changes the original set size.
Non-overlapping elements are considered 'off'.
These definitions are consistent with HashSet behaviour
and also ensures that (a & b) == (b & a)
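A minimal sketch of these sizing rules, using `std::vector<bool>` as a stand-in for bitSet. The helper names `andEq`, `orEq` and `countOn` are hypothetical, and growing only up to the last 'on' bit is one reading of "may grow to accommodate new 'on' bits".

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Bits = std::vector<bool>;

// a &= b : the size of 'a' never changes. Positions of 'a' beyond
// b.size() are non-overlapping, treated as 'off' in b, and thus cleared.
void andEq(Bits& a, const Bits& b)
{
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        a[i] = a[i] && (i < b.size() && b[i]);
    }
}

// a |= b : 'a' may grow, but only far enough to hold the 'on' bits of b
void orEq(Bits& a, const Bits& b)
{
    std::size_t lastOn = 0;  // One past the last 'on' bit of b
    for (std::size_t i = 0; i < b.size(); ++i)
    {
        if (b[i]) lastOn = i + 1;
    }
    if (lastOn > a.size())
    {
        a.resize(lastOn, false);
    }
    for (std::size_t i = 0; i < lastOn; ++i)
    {
        if (b[i]) a[i] = true;
    }
}

// Number of 'on' bits: a size-independent way of comparing results
std::size_t countOn(const Bits& a)
{
    return static_cast<std::size_t>(std::count(a.begin(), a.end(), true));
}
```

With twelve 'on' bits and-ed against six 'on' bits, both operand orders leave the same six bits set, only the container sizes differ.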
ENH: improve short-circuiting within bitSet ops
- in a few places can optimise by checking for none() instead of
empty() and avoid unnecessary block operations.
ENH: added bitSet::resize_last() method
- as the name says: resizes to the last bit set.
A friendlier way of writing `resize(find_last()+1)`
- uniq() : creates an IndirectList with duplicated entries
filtered out
- subset() : creates an IndirectList with positions that satisfy
a condition predicate.
- subset_if() : creates an IndirectList with values that satisfy a
given predicate.
An indirect subset will be cheaper than creating a subset copy
of the original data, and also allows modification.
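The idea of an indirect subset (addressing into the original data instead of a copy) can be sketched with standard containers. The names `uniqAddressing` and `subsetIfAddressing` are hypothetical, and keeping the first occurrence of each value is an assumption of this sketch, not a statement about the IndirectList implementation.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

// Addressing that keeps only the first occurrence of each value
std::vector<std::size_t> uniqAddressing(const std::vector<int>& values)
{
    std::vector<std::size_t> addr;
    std::set<int> seen;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        // insert() reports whether the value was new
        if (seen.insert(values[i]).second)
        {
            addr.push_back(i);
        }
    }
    return addr;
}

// Addressing of all entries whose *value* satisfies the predicate
std::vector<std::size_t> subsetIfAddressing
(
    const std::vector<int>& values,
    const std::function<bool(int)>& pred
)
{
    std::vector<std::size_t> addr;
    for (std::size_t i = 0; i < values.size(); ++i)
    {
        if (pred(values[i]))
        {
            addr.push_back(i);
        }
    }
    return addr;
}
```

Since only indices are stored, writing through the addressing (`values[addr[k]] = ...`) modifies the original data, which a subset copy would not allow.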
STYLE: combine UIndirectListI.H into UIndirectList.H (reduce file clutter)
- the sorted() method fills a UPtrList with sorted entries. In some
places this can provide a more convenient means of traversing a
HashTable in consistent order, without the extra step of creating
a sortedToc(). The sorted() method with a UPtrList will also have
a lower overhead than creating any sortedToc() or toc() since it is
a list of pointers and not full copies of the keys.
Instead of this:
    HashTable<someType> table = ...;
    for (const word& key : table.sortedToc())
    {
        Info<< key << " => " << table[key] << nl;
    }
can write this:
    for (const auto& iter : table.sorted())
    {
        Info<< iter.key() << " => " << iter.val() << nl;
    }
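Why a sorted list of pointers is cheaper than a sorted list of keys can be seen in a standard-library analogue. This is a sketch under assumptions: `std::unordered_map` stands in for HashTable and `sortedEntries` is a hypothetical helper.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Table = std::unordered_map<std::string, int>;
using Entry = Table::value_type;  // std::pair<const std::string, int>

// Pointers to the table entries, ordered by key.
// No key or value is copied, only one pointer per entry.
std::vector<const Entry*> sortedEntries(const Table& table)
{
    std::vector<const Entry*> list;
    list.reserve(table.size());

    for (const Entry& e : table)
    {
        list.push_back(&e);
    }

    std::sort
    (
        list.begin(),
        list.end(),
        [](const Entry* a, const Entry* b) { return a->first < b->first; }
    );

    return list;
}
```

Each pointer gives direct access to both key and value, so no second lookup by key is needed while traversing. The pointers are only valid as long as the table itself is not modified.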
STYLE:
- declare hash entry key 'const' since it is immutable
- local writeHeaderEntry helper was not marked as file-scope static.
- use do/while to simplify handling of padding spaces
ENH: IOobject - copy construct, resetting name and local component
- when copying with a new local component, this is simpler than
constructing from all of the components, which was previously the
only possibility for setting a new local component.
- commonly used, only depends on routines defined in UList
(don't need the rest of ListOps for it).
ENH: implement boolList::operator() const
- allows use as a predicate functor, as per bitSet and labelHashSet
GIT: combine SubList, UList into List directory (intertwined concepts)
STYLE: default initialize DynamicList instead of with size 0
- specifies the number of consecutive cells to assign to the same
randomly chosen processor. Can be used to have a less extremely
random distribution for testing possible breaking points.
Eg,
    method random;
    coeffs
    {
        agglom 4;
    }
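The effect of the agglom setting can be illustrated with a small stand-alone function. This is a sketch only: `randomDecompose`, its seed argument and the use of `std::mt19937` are assumptions for illustration, not the decomposition method's actual code.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Assign runs of 'agglom' consecutive cells to the same randomly
// chosen processor. With agglom = 1 every cell is chosen independently.
std::vector<int> randomDecompose
(
    const std::size_t nCells,
    const int nProcs,
    std::size_t agglom,
    const unsigned seed
)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, nProcs - 1);

    if (agglom == 0)
    {
        agglom = 1;  // Guard against a degenerate setting
    }

    std::vector<int> procId(nCells);
    int current = 0;

    for (std::size_t celli = 0; celli < nCells; ++celli)
    {
        // Draw a new processor at the start of each run
        if (celli % agglom == 0)
        {
            current = pick(rng);
        }
        procId[celli] = current;
    }

    return procId;
}
```

Larger agglom values give longer contiguous runs per processor and thus a less fragmented (but still random) distribution.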
- Add finiteArea cellID (actually face ids) / faceLabel and procID
for foamToVTK with -write-ids. Useful when this type of information
is needed.
- Arbitrary number of outlets can be connected to a single inlet
- Each inlet can be connected to different and arbitrary
combination of outlets
- Each outlet-inlet connection has:
- Optional filtration fraction as a Function1 type
- Optional offset as a Function1 type (i.e. adding/subtracting a substance)
- Optional time delay (from outlet to inlet) as a Function1 type
- Each inlet has an optional base inlet-field as a PatchFunction1 type
The blendingFactor function object overwrites the DEShybrid:Factor
field internally when blendedSchemeBase debug flag is active.
However, users are allowed to write out the original DEShybrid:Factor
field by executing the writeObjects function object before
any blendingFactor function object execution.
- direct construct and reset method for creating a zero-sized (dummy)
subMesh. Has no exposed faces and no parallel synchronization
required.
- core mapping (interpolate) functionality with direct handling
of subsetting in fvMeshSubset (src/finiteVolume).
Does not use dynamicMesh topology changes
- two-step subsetting as fvMeshSubsetter (src/dynamicMesh).
Does use dynamicMesh topology changes.
This is apparently only needed by the subsetMesh application itself.
DEFEATURE: remove deprecated setLargeCellSubset() method
- was deprecated JUL-2018, now removed (see issue #951)
- allows restricted evaluation to specific coupled patch types.
Code relocated/refactored from redistributePar.
STYLE: ensure use of waitRequests() also corresponds to nonBlocking
ENH: additional copy/move construct GeometricField from DimensionedField
STYLE: processorPointPatch owner()/neighbour() as per processorPolyPatch
STYLE: orientedType with bool cast operator and noexcept
- move construct from components. Construct with optional IO control
- separate init() method (as per polyMesh) to delay evaluation of
globalData and base geometry.
- faMesh removeFiles method
ENH: faBoundaryMeshEntries for reading faBoundary files without a mesh
ENH: adjust debug output for {fa,fae,fv,fvs}patchField::New
- add alternative constraint type selection for faePatchField.
- unify handling of "patchType" reading.
Make less noisy when reporting dictionary defaults.
- allows reuse by finiteArea, for example.
- simplify edge looping with face thisLabel/nextLabel method
ENH: additional storage checks for mesh weights (faMesh + fvMesh)
- allow finite-area field decomposition without edge weights.
STYLE: use tmp New in various places. Simpler updateGeom check
STYLE: remove spurious (no-op) processor boundary evaluations
- boundary fields for faceAreaCentres and edgeCentres had no-op
initEvaluate/evaluate pair on processor boundaries.
Now consistent with each other and with how finiteVolume is defined.
STYLE: add comments about which private methods trigger communication
- reduce the amount of communication when checking zones and patches
by performing the synchronization check on the gathered strings
(master only) and reduce or broadcast the result.
STYLE: simplify coupled() checks depending only on parRun
* lessEqOp -> lessEqualOp
* greaterEqOp -> greaterEqualOp
to avoid ambiguity with other forms such as 'plusEqOp' where the
'Eq' implies an assignment. The name change also aligns better with
C++ <functional> names such as std::less_equal, std::greater_equal
ENH: simple labelRange predicates gt0/ge0/lt0/le0
- mirrors scalarRange tests.
Lower overhead than using labelMinMax::ge(0) etc since it does not
create an intermediate (is stateless) and can be used as a constexpr
- was in fvMotionSolver, but only requires PatchFunction1 capabilities
(from within meshTools).
GIT: relocate IOmapDistributePolyMesh (from dynamicMesh to OpenFOAM)
- adds handling of negative start times for masterUncollatedFileOperation
as well (#1112).
- handle failures *after* restoring non-parRun mode.
This ensures exit(FatalError) will exit MPI properly as well.
STYLE: replace "polyMesh" with polyMesh::meshSubDir
STYLE: adjust IOobject read/write enumerated values
- provision for possible bitwise handling
- additional Pstream::broadcasts() method to serialize/deserialize
multiple items.
- revoke the broadcast specialisations for std::string and List(s) and
use a generic broadcasting template. In most cases, the previous
specialisations would have required two broadcasts:
(1) for the size
(2) for the contiguous content.
Now favour reduced communication over potential local (intermediate)
storage that would have only benefited a few select cases.
ENH: refine PstreamBuffers access methods
- replace 'bool hasRecvData(label)' with 'label recvDataCount(label)'
to recover the number of unconsumed receive bytes from specified
processor. Can use 'labelList recvDataCounts()' to recover the
number of unconsumed receive bytes from all processors.
- additional peekRecvData() method (for transcribing contiguous data)
ENH: globalIndex whichProcID - check for isLocal first
- reasonable to assume that local items are searched for more
frequently, so do preliminary check for isLocal before performing
a more costly binary search of globalIndex offsets
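The local-first lookup can be sketched as follows. The function `whichProc` and the plain offsets vector are hypothetical stand-ins. It is assumed that the offsets list has nProcs+1 entries and that processor p owns the half-open range [offsets[p], offsets[p+1]).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Return the processor owning 'globalIndex', or -1 if out of range.
int whichProc
(
    const std::vector<long>& offsets,
    const int myProc,
    const long globalIndex
)
{
    // Cheap preliminary check: is the index local?
    if
    (
        globalIndex >= offsets[myProc]
     && globalIndex < offsets[myProc + 1]
    )
    {
        return myProc;
    }

    // Outside of the addressed range
    if (globalIndex < offsets.front() || globalIndex >= offsets.back())
    {
        return -1;
    }

    // More costly: binary search for the first offset beyond the index.
    // The owner is the entry before it (this also skips empty ranges).
    auto iter = std::upper_bound(offsets.begin(), offsets.end(), globalIndex);
    return static_cast<int>(iter - offsets.begin()) - 1;
}
```

With offsets (0 3 3 7), index 3 belongs to processor 2 since processor 1 owns an empty range.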
ENH: masterUncollatedFileOperation - bundled scatter of status
Eg,
export WM_COMPILER=Clang130
export WM_COMPILE_CONTROL="version=13.0 +lld"
- also support the mold linker (+mold) for clang
STYLE: report as 'link' stage instead of 'ld' in short messages
- use vector::removeCollinear a few places
COMP: incorrect initialization order in edgeFaceCirculator
COMP: Silence boost bind deprecation warnings (before CGAL-5.2.1)
- for most field types this is a no-op, but for a field of floatVector
or doubleVector (eg, vector and solveVector) it will normalise each
element with divide-by-zero protection.
More reliable and efficient than dividing a field by the mag of itself
(even with VSMALL protection).
Applied to FieldField and GeometricField as well.
Eg,
fld.normalise();
vs.
fld /= mag(fld) + VSMALL;
ENH: support optional tolerance for vector::normalise
- for cases where tolerances larger than ROOTVSMALL are preferable.
Not currently available for the field method (a templating question).
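A stand-alone sketch of normalisation with divide-by-zero protection and an optional tolerance. `Vec3`, the helper names and the default tolerance 1e-150 (an assumed stand-in for ROOTVSMALL in double precision) are illustrative only. Setting short vectors to zero is an assumption of this sketch.

```cpp
#include <cmath>
#include <vector>

struct Vec3 { double x, y, z; };

inline double mag(const Vec3& v)
{
    return std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
}

// Normalise in place. Vectors shorter than 'tol' are set to zero
// instead of being divided by a (near) zero magnitude.
inline Vec3& normalise(Vec3& v, const double tol = 1e-150)
{
    const double s = mag(v);

    if (s < tol)
    {
        v = Vec3{0, 0, 0};
    }
    else
    {
        v.x /= s;
        v.y /= s;
        v.z /= s;
    }
    return v;
}

// Field version: normalise each element individually
inline void normaliseField(std::vector<Vec3>& fld, const double tol = 1e-150)
{
    for (Vec3& v : fld)
    {
        normalise(v, tol);
    }
}
```

Unlike `fld /= mag(fld) + VSMALL`, no intermediate magnitude field is created and non-zero vectors are divided by their exact magnitude.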
ENH: vector::removeCollinear method
- when working with geometries it is frequently necessary to have a
normal vector without any collinear components. The removeCollinear
method provides for clearer, more compact code.
Eg,
vector edgeNorm = ...;
const vector edgeDirn = e.unitVec(points());
edgeNorm.removeCollinear(edgeDirn);
edgeNorm.normalise();
vs.
vector edgeNorm = ...;
const vector edgeDirn = e.unitVec(points());
edgeNorm -= edgeDirn*(edgeDirn & edgeNorm);
edgeNorm /= mag(edgeNorm);
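The operation behind removeCollinear is a plain projection removal, v -= d (d . v) for a unit direction d. A stand-alone sketch, with the hypothetical `Vector3d` type and free functions standing in for the vector class:

```cpp
struct Vector3d { double x, y, z; };

inline double dot(const Vector3d& a, const Vector3d& b)
{
    return a.x*b.x + a.y*b.y + a.z*b.z;
}

// Remove the component of 'v' along 'unitDir' (assumed to be unit length).
// Afterwards 'v' is orthogonal to 'unitDir', up to round-off.
inline Vector3d& removeCollinear(Vector3d& v, const Vector3d& unitDir)
{
    const double d = dot(v, unitDir);

    v.x -= d*unitDir.x;
    v.y -= d*unitDir.y;
    v.z -= d*unitDir.z;

    return v;
}
```

If the direction is not of unit length the result is not orthogonal, which is why the example above obtains the direction from `e.unitVec(points())`.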
- for obtaining set entries from a boolList
- BitOps::select to mirror bitSet constructor but returning a boolList
- BitOps::set/unset for boolList
ENH: construct bitSet from a labelRange
- useful, for example, when marking up patch slices
ENH: ListOps methods
- ListOps::count_if to mirror std::count_if but with list indexing.
- ListOps::find_if to mirror std::find_if but with list indexing.
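The difference to the std:: algorithms is that positions are reported as list indices rather than iterators. A sketch with hypothetical names `findIf` and `countIf` (the optional start position and the -1 return for "not found" are assumptions of this sketch):

```cpp
#include <cstddef>
#include <vector>

// Index of the first element (at or after 'start') satisfying the
// predicate, or -1 if there is none
template<class T, class Pred>
long findIf
(
    const std::vector<T>& list,
    const Pred& pred,
    const std::size_t start = 0
)
{
    for (std::size_t i = start; i < list.size(); ++i)
    {
        if (pred(list[i]))
        {
            return static_cast<long>(i);
        }
    }
    return -1;
}

// Number of elements (at or after 'start') satisfying the predicate
template<class T, class Pred>
long countIf
(
    const std::vector<T>& list,
    const Pred& pred,
    const std::size_t start = 0
)
{
    long n = 0;
    for (std::size_t i = start; i < list.size(); ++i)
    {
        if (pred(list[i]))
        {
            ++n;
        }
    }
    return n;
}
```

Returning an index makes the result directly usable for addressing into other lists of the same length.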
ENH: UPtrList::test() method.
- includes bounds checks, which means it can be used in more places
(eg, even if the storage is empty).
Previous commit solved: "mixture rho to volume-based in rhoThermo."
This proved to work correctly for rho=constant EoS but not for
idealGas. Fixes #2304. The previous gitlab issue was #1812.
- `functions<scalar>` and `functions<vector>` were erroneously
documented in header as `lookup<scalar>` etc.
INT: handle fluent square brackets (fixes #2429)
- patch applied from openfoam.org
- support direct processing of CompactListList instead of requiring
a conversion to labelListList for bandCompression and renumbering
methods.
- manage FIFO with CircularBuffer instead of SLList (avoids
allocations in inner loops). Invert logic to use a bitSet of
unvisited cells, which improves looping as the matrix becomes more
sparse.
- fix missed weighting in bandCompression (same as #1376).
In polyTopoChange, handle removed cells immediately to simplify
the logic and align more closely with bandCompression.
STYLE: enclose bandCompression within meshTools namespace
ENH: PrimitivePatch pointFaces with DynamicList instead of SLList
- MPI_Gatherv requires contiguous data, but a byte-wise transfer can
quickly exceed the 'int' limits used for MPI sizes/offsets. Thus
gather label/scalar components when possible to increase the
effective size limit.
For non-contiguous types (or large contiguous data) now also
reverts to manual handling
ENH: handle contiguous data in GAMGAgglomeration gather values
- delegate to globalIndex::gatherValues static method (new)
- bundles frequently used 'gather/scatter' patterns more consistently.
- combineAllGather -> combineGather + broadcast
- listCombineAllGather -> listCombineGather + broadcast
- mapCombineAllGather -> mapCombineGather + broadcast
- allGatherList -> gatherList + scatterList
- reduce -> gather + broadcast (ie, allreduce)
- The allGatherList currently wraps gatherList/scatterList, but may be
replaced with a different algorithm in the future.
STYLE: PstreamCombineReduceOps.H is mostly unneeded now
STYLE: LduInterfaceFieldPtrsList as alias instead of a class
STYLE: define patch lists typedefs when defining the base patch
- eg, polyPatchList typedef within polyPatch.H
INT: relocate GeometricField::Boundary -> GeometricBoundaryField
- was internal to GeometricField but moving it outside simplifies
forward declarations etc. Code adapted from openfoam.org
Two problems:
- flipping inside snappyHexMesh is not done in a parallel
consistent way. So e.g. the octree-cached inside/outside information
has already been calculated. For now flipping of
distributedTriSurfaceMesh is disabled.
- octree-cached inside/outside information was using already
cached information and would only work for outwards pointing
volumes
- percent of cells is taken relative to selection size.
- percent of faces is taken relative to the number of boundary faces
that do not fix velocity themselves.
ENH: avoid correctBoundaryConditions() if values were not limited
- when writing surface formats (eg, vtk, ensight etc) the sampled
surfaces merge the faces/points originating from different
processors into a single surface (ie, patch gatherAndMerge).
Previous versions of mergePoints simply merged all points possible,
which proves to be rather slow for larger meshes. This has now been
modified to only consider boundary points, which reduces the number
of points to consider. As part of this change, the reference point
is now always equivalent to the min of the bounding box, which
reduces the number of search loops. The merged points retain their
original order.
- inplaceMergePoints version to simplify use and improve code
robustness and efficiency.
ENH: make PrimitivePatch::boundaryPoints() less costly
- if edge addressing does not already exist, it will now simply walk
the local face edges directly to define the boundary points.
This avoids a rather large overhead of the full faceFaces,
edgeFaces, faceEdges addressing.
This operation is now more important since it is used in the revised
patch gatherAndMerge.
ENH: topological merge for mesh-based surfaces in surfaceFieldValue
- lower memory overhead, simpler code and eliminates need for
ListListOps::combineOffset()
- optional handling of local faces/points for re-using in different
contexts
STYLE: labelUList instead of labelList for globalMesh mergePoints
STYLE: adjust verbose information from mergePoints
- also report the current new-point location
- also disables PointData if manifold cells are detected.
This is a partial workaround for volPointInterpolation problems
with handling manifold cells.
- additional verbosity option for conversions
- ignore old `-finite-area` option and always convert available
finiteArea mesh/fields unless `-no-finite-area` is specified (#2374)
ENH: simplify point offset handling for ensight output
- extend writing to include compact face/cell lists
- a try/catch approach is not really robust enough (or even possible)
since read failures likely do not occur on all ranks simultaneously.
This leads to situations where the master has thrown an exception
(and thus exiting the current routine) while other ranks are still
waiting to receive data and the program blocks completely.
Since this primarily affects data conversion routines such as
foamToEnsight etc, treat similarly to lagrangian: check for the
existence of essential files before proceeding or not. This is
wrapped into a TryNew factory method:
autoPtr<faMesh> faMeshPtr(faMesh::TryNew(mesh));
if (faMeshPtr) ...
- gather/scatter types of operations can avoid AllToAll communication
and use simple MPI gather (or scatter) to establish the receive sizes.
New methods: finishedGathers() / finishedScatters()
BUG: masterUncollatedFileOperation checking of file-size
- used Foam::fileSize check to decide on scheduled/nonBlocking but this
was being done on all ranks and subsequently broadcast.
Now avoid unnecessary filesystem access on non-master ranks.
- both schemes and solutions data are treated as MUST_READ_IF_MODIFIED
even if the requested readOption is nominally MUST_READ or
READ_IF_PRESENT, but now delay this change.
- do not need construct or move assign from SortableList.
Rarely (never) used and can simply treat like a normal list
by applying shrink beforehand.
- make append() methods return void instead of returning self, which
makes it easier to derive from. Having them return self was a bit of
an original design mistake.
Chained appends do not actually occur anywhere. Even if they were
to be used, would not want to rely on them (fear of slicing on any
derived classes).
BUG: IndirectList iterator comparison loses constness
- eliminate redundant size_ accounting
- drop extra 'Container' template parameter and replace functionality
with more flexible pack/unpack methods.
There is also a pack() method that handles indirect lists of lists
that can be used, for example, to pack a patch slice of faces.
Drop the 'operator()' method in favour of unpack to expose and properly
document the conversion. Should revisit the corresponding code in
some places for optimization potential.
- align some method names with globalIndex:
totalSize(), maxSize() etc
- less communication than gatherList/scatterList
ENH: refine send granularity in Pstream::exchange
STYLE: ensure PstreamBuffers and defaultCommsType agree
- simpler loops for lduSchedule
- can restrict calculation of D32 and other spray properties to a
subset of parcels. Uses a predicate selection mechanism similar to
vtkCloud etc.
ENH: code cleanup in scalar predicates
- pass by value not reference in predicates
- additional assign() method to refactor common code
- with the special setFormat "probes", all of the sampled sets are
treated more similarly to probes, with an ensemble output to raw
probed format.
This is of course less useful when the number of sampled points
becomes very large.
- can now specify sampled sets as dictionary entries instead of a list
entry.
can now use: sets { ... }
instead of: sets ( ... );
This is similar to sampled surfaces and makes it easier to
manage with dictionary manipulation tools.
TUT: update to use writeTime instead of outputTime
- in v2112 the functionObject results were only delivering values from
the last set listed (ie, overwritten).
Now that the values are properly scoped by the name of the set itself
(eg, `average(lines,p)` for the average of the 'lines' set), existing
workflows will break.
It thus makes reasonable sense to also handle results without a
qualifier as ensemble values.
average(p) // Ensemble average of all listed sets
- the very old 'writer' class was fully stateless and always templated
on a particular output type.
This is now replaced with a 'coordSetWriter' with similar concepts
as previously introduced for surface writers (#1206).
- writers change from being a generic state-less set of routines to
more properly conforming to the normal notion of a writer.
- Parallel data is done *outside* of the writers, since they are used
in a wide variety of contexts and the caller is currently still in
a better position for deciding how to combine parallel data.
ENH: update sampleSets to sample on per-field basis (#2347)
- sample/write a field in a single step.
- support for 'sampleOnExecute' to obtain values at execution
intervals without writing.
- support 'sets' input as a dictionary entry (as well as a list),
which is similar to the changes for sampled-surface and permits use
of changeDictionary to modify content.
- globalIndex for gather to reduce parallel communication, less code
- qualify the sampleSet results (properties) with the name of the set.
The sample results were previously without a qualifier, which meant
that only the last property value was actually saved (previous ones
overwritten).
For example,
```
sample1
{
    scalar
    {
        average(line,T) 349.96521;
        min(line,T)     349.9544281;
        max(line,T)     350;
        average(cells,T) 349.9854619;
        min(cells,T)    349.6589286;
        max(cells,T)    350.4967271;
        average(line,epsilon) 0.04947733869;
        min(line,epsilon) 0.04449639927;
        max(line,epsilon) 0.06452856475;
    }
    label
    {
        size(line,T)    79;
        size(cells,T)   1720;
        size(line,epsilon) 79;
    }
}
```
ENH: update particleTracks application
- use globalIndex to manage original parcel addressing and
for gathering. Simplify code by introducing a helper class,
storing intermediate fields in hash tables instead of
separate lists.
ADDITIONAL NOTES:
- the regionSizeDistribution largely retains separate writers since
the utility of placing sum/dev/count for all fields into a single file
is questionable.
- the streamline writing remains a "soft" upgrade, which means that
scalar and vector fields are still collected a priori and not
on-the-fly. This is due to how the streamline infrastructure is
currently handled (should be upgraded in the future).
Automatic hole closure:
- introduces 'holeToFace' topoSet source
- used when detecting a 'leak-path'
- creates additional baffles to close the leak
Multi-stage layer addition:
- Can add layers in multiple passes
See issues: #2403, #2404
- for metis-like graphs there is no guarantee that a zero-sized graph
has an offsets list with size 1 or size 0, so always use
numCells = max(0, xadj.size()-1)
this was already done in most places, but missed in the
decomposeGeneral method
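The guard can be written as a small helper. This is a sketch: `numCellsFromOffsets` is a hypothetical name, and a plain `std::vector<int>` stands in for the CSR offsets (xadj) list.

```cpp
#include <algorithm>
#include <vector>

// Number of cells described by a CSR-style offsets list (xadj).
// A zero-sized graph may come with either 0 or 1 offsets, so the
// "size - 1" rule must be clamped at zero.
long numCellsFromOffsets(const std::vector<int>& xadj)
{
    // Convert to a signed type *before* subtracting, otherwise an
    // empty list would wrap around with unsigned arithmetic.
    return std::max<long>(0, static_cast<long>(xadj.size()) - 1);
}
```

Both an empty offsets list and the single-entry list (0) yield zero cells.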
STYLE: use sumOp<label>() instead of plusOp<label>()
- the internal data are contiguous so can broadcast size and internals
directly without an intermediate stream.
ENH: split out broadcast time for profilingPstream information
STYLE: minor Pstream cleanup
- UPstream::commsType_ from protected to private, since it already has
inlined noexcept getters/setters that should be used.
- don't pass unused/unneeded tag into low-level MPI reduction templates.
Document where tags are not needed
- had Pstream::broadcast instead of UPstream::broadcast in internals
- used Pstream::maxCommsSize (bytes) for the lower limit when sending.
This would have sent more data on each iteration than expected based
on maxCommsSize and finished with a number of useless iterations.
Was generally not a serious bug since maxCommsSize (if used) was
likely still far away from the MPI limits and exchange() is primarily
harnessed by PstreamBuffers, which is sending character data
(ie, number of elements and number of bytes is identical).
- For v2112 and earlier: pre-assembled lists of particles
to be transferred and target patch on a per processor basis.
Apart from memory overhead of assembling the lists this adds
allocations/de-allocation when building linked-lists.
- Now stream particle transfer tuples directly into PstreamBuffers.
Use a local cache of UOPstream wrappers for the formatters
(since there are potentially many particles being shifted about).
On the receiving side, read out tuple-wise.
- Communication on transfers now restricted to the immediate
neighbours instead of using an all-to-all to exchange sizes.
Applied to Cloud::move and RecycleInteraction
- now largely encapsulated using PstreamBuffers methods,
which makes it simpler to centralize and maintain
- avoid building intermediate structures when sending data,
remove unused methods/data
TUT: parallel version of depthCharge2D
STYLE: minor update in ProcessorTopology
- PstreamBuffers nProcs() and allProcs() methods to recover the rank
information consistent with the communicator used for construction
- allowClearRecv() methods for more control over buffer reuse
For example,
    pBufs.allowClearRecv(false);
    forAll(particles, particlei)
    {
        pBufs.clear();
        fill...
        read via IPstream(..., pBufs);
    }
This preserves the receive buffers memory allocation between calls.
- finishedNeighbourSends() method as compact wrapper for
finishedSends() when send/recv ranks are identical
(eg, neighbours)
- hasSendData()/hasRecvData() methods for PstreamBuffers.
Can be useful for some situations to skip reading entirely.
For example,
    pBufs.finishedNeighbourSends(neighProcs);
    if (!returnReduce(pBufs.hasRecvData(), orOp<bool>()))
    {
        // Nothing to do
        continue;
    }
    ...
On an individual basis:
    for (const int proci : pBufs.allProcs())
    {
        if (pBufs.hasRecvData(proci))
        {
            ...
        }
    }
Also conceivable to do the following instead (nonBlocking only):
    if (!returnReduce(pBufs.hasSendData(), orOp<bool>()))
    {
        // Nothing to do
        pBufs.clear();
        continue;
    }
    pBufs.finishedNeighbourSends(neighProcs);
    ...
- a somewhat specialized use case, but can be useful when there are
many ranks with sparse communication but for which the access
pattern is established during inner loops.
    PstreamBuffers pBufs(Pstream::commsTypes::nonBlocking);
    pBufs.allowClearRecv(false);
    PtrList<OPstream> output(Pstream::nProcs());
    while (condition)
    {
        // Rewind existing streams
        forAll(output, proci)
        {
            auto* osptr = output.get(proci);
            if (osptr)
            {
                (*osptr).rewind();
            }
        }
        for (Particle& p : myCloud)
        {
            label toProci = ...;
            // Get or create output stream
            auto* osptr = output.get(toProci);
            if (!osptr)
            {
                osptr = new OPstream(toProci, pBufs);
                output.set(toProci, osptr);
            }
            // Append more data...
            (*osptr) << p;
        }
        pBufs.finishedSends();
        ... reads
    }
- split off a Pstream::genericBroadcast() which uses UOPBstream during
serialization and UIPBstream during de-serialization.
This function will not normally be used directly by callers, but
provides a base layer for higher-level broadcast calls.
- low-level UPstream broadcast of string content.
Since std::string has length and contiguous content, it is possible
to handle directly by the following:
1. broadcast size
2. resize
3. broadcast content when size != 0
Although this is a similar amount of communication as the generic
streaming version (min 1, max 2 broadcasts) it is more efficient
by avoiding serialization/de-serialization overhead.
- handle broadcast of List content distinctly.
Allows an optimized path for contiguous data, similar to how
std::string is handled (broadcast size, resize container, broadcast
content when size != 0), but can revert to genericBroadcast (streamed)
for non-contiguous data.
- make various scatter variants simple aliases for broadcast, since
that is what they are doing behind the scenes anyhow:
* scatter()
* combineScatter()
* listCombineScatter()
* mapCombineScatter()
Except scatterList() which remains somewhat different.
Beyond the additional (size == nProcs) check, the only difference to
using broadcast(List<T>&) or a regular scatter(List<T>&) is that
processor-local data is skipped. So leave this variant as-is.
STYLE: rename/prefix implementation code with 'Pstream'
- better association with its purpose and provides a unique name
- reduces later surprises and simplifies effort for the caller
- more flexible globalIndex scatter with auto-sized return field.
- Avoid communication for scattering into zero-sized fields.
- the data front for isoAdvection can be particularly sparse and at
higher processor counts there is an advantage to avoiding all-to-all
communication for the PstreamBuffers exchange
Based on code changes from T.Aoyagi(RIST), A.Azami(RIST)
- use MPI_Bcast intrinsic instead of manual tree to reduce the overall
number of messages.
Old behaviour can be re-enabled with
`#define Foam_Pstream_scatter_nobroadcast`
- The idea of broadcast streams is to replace multiple master to
subProcs communications with a single MPI_Bcast.
    if (Pstream::master())
    {
        OPBstream toAll(Pstream::masterNo());
        toAll << data;
    }
    else
    {
        IPBstream fromMaster(Pstream::masterNo());
        fromMaster >> data;
    }
    // vs.
    if (Pstream::master())
    {
        for (const int proci : Pstream::subProcs())
        {
            OPstream os(Pstream::commsTypes::scheduled, proci);
            os << data;
        }
    }
    else
    {
        IPstream is(Pstream::commsTypes::scheduled, Pstream::masterNo());
        is >> data;
    }
Can simply use UPstream::broadcast() directly for contiguous data
with known lengths.
Based on ideas from T.Aoyagi(RIST), A.Azami(RIST)
- native MPI min/max/sum reductions for float/double
irrespective of WM_PRECISION_OPTION
- native MPI min/max/sum reductions for (u)int32_t/(u)int64_t types,
irrespective of WM_LABEL_SIZE
- replace rarely used vector2D sum reduction with FixedList as an
indicator of its intent, which also generalizes to different lengths.
OLD:
vector2D values; values.x() = ...; values.y() = ...;
reduce(values, sumOp<vector2D>());
NEW:
FixedList<scalar,2> values; values[0] = ...; values[1] = ...;
reduce(values, sumOp<scalar>());
- allow returnReduce() to use native reductions. Previous code (with
linear/tree selector) would have bypassed them inadvertently.
ENH: added support for MPI broadcast (for a memory span)
ENH: select communication schedule as a static method
- UPstream::whichCommunication(comm) to select linear/tree
communication instead of ternary or
if (Pstream::nProcs() < Pstream::nProcsSimpleSum) ...
STYLE: align nProcsSimpleSum static value with etc/controlDict override
- refactor as an MPI-independent base class.
Add bufferIPC{send,recv} private methods for construct/destruct.
Eliminates code duplication from two constructor forms and reduces
additional constructor definitions in dummy library.
- add PstreamBuffers access methods, refactor common finish sends
code, tweak member packing
ENH: resize_nocopy for processorLduInterface buffers
- content is immediately overwritten
STYLE: cull unneeded includes in processorFa*
- handled by processorLduInterface
- this can be used to apply a uniform field level to remove from
a sampled field. For example,
    fieldLevel
    {
        "p.*"   1e5;        // Absolute -> gauge [Pa]
        T       273.15;     // [K] -> [C]
        U       #eval{ 10/sqrt(3) };  // Uniform mag(U)=10
    }
After the fieldLevel has been removed, any fieldScale is applied.
For example
    fieldScale
    {
        "p.*"   0.01;       // [Pa] -> [mbar]
    }
The fieldLevel for vector and tensor fields may still need some
further refinement.
The runTimeControl function object can activate further function objects using
triggers. Previously the trigger index could only advance; this change set
allows users to set smaller values to enable function object recycling, e.g.
Repeat for N cycles:
1. average the pressure at a point in space
2. when the average stabilises, run for a further 100 iterations
3. set a new patch inlet velocity
- back to (1)
- Removes old default behaviour that only permitted an increase in the
trigger level. This type of 'ratcheting' mechanism (if required) is
now the responsibility of the derived function object.
- notably affects writing continuous data in binary. If generating a
compound token (eg, List<label>), need to add in the size prefix
otherwise it cannot actually be parsed properly as a List.
BUG: bad fallthrough for compound reading (FixedList)
- the branch was likely never reached, but would have attempted to
read twice due to a bad fall-through condition.
GIT: relocate globalIndex (is independent of mesh)
STYLE: include label/scalar Fwd in contiguous.H
STYLE: remove unneeded commSchedule include in GeometricField
- as a side-effect of changes to probes, the file pointers are not
automatically created when reading the dictionary; their creation is
delayed until prepare(WRITE_ACTION) is called.
This nuance was missed in thermoCoupleProbes.
- when used for example with wallShearStress, the stress field is
initially created as incompressible but later updated with the
correct compressible/incompressible dimensions.
If this field is sampled as a surface and stored on the registry
the dimensions should be reset() and not '=' assigned, since that
causes a dimension check which will obviously fail.
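The difference between the two operations can be sketched with a toy type. toyDimensions and its assign()/reset() members are assumptions made for illustration and do not reproduce the real dimension class; assign() stands in for the checked '=' described above.

```cpp
#include <array>
#include <stdexcept>

// Toy dimension set: seven integer exponents (mass, length, time, ...).
struct toyDimensions
{
    std::array<int, 7> exponents{};

    // Checked assignment: refuses to change the dimensions.
    void assign(const toyDimensions& rhs)
    {
        if (exponents != rhs.exponents)
        {
            throw std::runtime_error("dimension mismatch");
        }
        exponents = rhs.exponents;
    }

    // Unchecked overwrite: always adopts the new dimensions.
    void reset(const toyDimensions& rhs) { exponents = rhs.exponents; }
};
```

A stored field whose dimensions legitimately change between updates needs the unchecked form.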
- occurs with newer gcc on ubuntu impish (gcc-11.2.0), but may perhaps
actually be related to `-flto=auto` or to the destruction order of
the static variables (race condition?).
Leaving the compat table around for automatic cleanup does not
affect other lookups (which are nullptr checked anyhow).
- previously used the size of distributed roots to transmit whether the
case was running in distributed mode, but this behaves rather poorly
with bad input. Specifically, the following questionable setup:
distributed true;
roots ( /*none*/ );
Now transmit the ParRunControl distributed() value instead,
and also emit a gentle warning for the user:
WARNING: running distributed but did not specify roots!
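Why the inferred value goes wrong can be sketched as below. The struct and the three helper functions are hypothetical simplifications, not the ParRunControl or argList code, and treating "size of roots" as a non-empty test is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical view of the two relevant inputs.
struct runSetup
{
    bool distributed;
    std::vector<std::string> roots;
};

// Old approach (simplified): infer the mode from the number of roots.
inline bool inferredFromRoots(const runSetup& s) { return !s.roots.empty(); }

// New approach: pass on the flag itself.
inline bool transmittedFlag(const runSetup& s) { return s.distributed; }

// Condition under which the warning above would be emitted.
inline bool needsRootsWarning(const runSetup& s)
{
    return s.distributed && s.roots.empty();
}
```

For the questionable setup above (distributed true, empty roots) the inferred value is false while the transmitted flag stays true, which is the case the warning covers.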
2021-09-08 09:29:27 +02:00
11578 changed files with 149049 additions and 66018 deletions