- can use traits to distinguish label vs scalar types and
setComponents to properly index into single or multi-component
types without needing template specialisations for the task.
This avoids the need for a concrete translation unit and the
reported problem of multiply-defined specialisations when the header
is included in different places.
- the default (uninitialised) value for edge connections of -1
could be confused with a tagged finiteArea patch, which used
(-patchid-1) encoding. This would lead to messages about erroneous
processor-processor addressing, but is in fact a mismatched edge
connection.
Now tag the finiteArea patch as (-patchid-2) to avoid this ambiguity
and correctly generate an "Undefined connection:" message instead.
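
A minimal sketch of the tagging scheme described above, assuming patch ids
are non-negative. The names undefinedConnection, encodePatch(), isPatchTag()
and decodePatch() are hypothetical and are not taken from the faMesh code:

```cpp
#include <cassert>

// Hypothetical names, for illustration only (not the faMesh implementation)

// Default (uninitialised) value of an edge connection
const int undefinedConnection = -1;

// Tag a patch id (>= 0) as (-patchid-2), so that it can never equal -1
inline int encodePatch(int patchId)
{
    assert(patchId >= 0);
    return -patchId - 2;
}

// True for a tagged patch value, false for -1 and for regular (>= 0) values
inline bool isPatchTag(int value)
{
    return value <= -2;
}

// Recover the patch id from a tagged value
inline int decodePatch(int tag)
{
    assert(isPatchTag(tag));
    return -tag - 2;
}
```

With the old (-patchid-1) encoding, patch 0 maps to -1, which cannot be told
apart from the uninitialised value.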
Properly flush the VTP writers before raising a FatalError
to ensure that they are not prematurely truncated.
Open Point:
The base problem of "Undefined connection:" is largely related to
multiply-connected face edges (ie, from the underlying volume mesh).
Not easily remedied in the finiteArea generation.
TUT: basic finiteArea setup on motorBike
- have read(nullptr, count) and readRaw(nullptr, count) act like a
forward seek instead of failing.
This lets it be used to advance through a file without needing to
allocate (and discard) storage space etc.
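
The idea can be sketched with a plain std::istream. The helper name
readOrSkip() is hypothetical and this is not the OpenFOAM Istream code, only
an illustration of treating a null buffer as a forward seek:

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Hypothetical helper using std::istream (not the OpenFOAM Istream classes)
inline std::istream& readOrSkip
(
    std::istream& is,
    char* buf,
    std::streamsize count
)
{
    assert(count >= 0);

    if (buf)
    {
        is.read(buf, count);    // normal read into caller-provided storage
    }
    else
    {
        is.ignore(count);       // no storage given: advance the stream instead
    }

    return is;
}
```

Calling readOrSkip(is, nullptr, n) moves past n characters without any
temporary buffer.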
- construct from components, or use word::null, to avoid
inconsistent naming between the IOobject and the dimensioned type.
- support construct with parameter ordering as per DimensionedField
ENH: instantiate a uniformDimensionedLabelField
- eg, for registering standalone integer counters
- directory discovery originally designed for a sub-dir location
(eg, etc/openfoam) but failed if called from within the sub-dir
itself.
Now simply assume it is located in the project directory or the etc/
sub-dir, so that it can also be relocated into the project directory
in the future (pending changes to RPM and debian packaging)
- for querying all outstanding requests:
if (UPstream::finishedRequests(startRequest)) ...
if (UPstream::finishedRequests(startRequest, -1)) ...
- for querying slice of outstanding requests:
if (UPstream::finishedRequests(startRequest, 10)) ...
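
The (start, count) semantics above can be illustrated with a stand-in that
uses a list of completion flags instead of MPI requests. The function
allFinished() is hypothetical, and treating an empty range as "finished" is
an assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: done[i] is true once request i has completed.
// A count of -1 means "all requests from start onwards".
inline bool allFinished
(
    const std::vector<bool>& done,
    std::size_t start,
    int count = -1
)
{
    assert(count >= -1);

    if (start >= done.size())
    {
        return true;    // assumption: an empty range counts as finished
    }

    std::size_t end = done.size();
    if (count >= 0 && start + static_cast<std::size_t>(count) < end)
    {
        end = start + static_cast<std::size_t>(count);  // slice of requests
    }

    for (std::size_t i = start; i < end; ++i)
    {
        if (!done[i]) return false;
    }
    return true;
}
```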
- simplifies communication structuring with intra-host communication.
Can be used for IO only, or for specialised communication.
Demand-driven construction. Gathers the SHA1 of host names when
determining the connectivity. Internally uses an MPI_Gather of the
digests and a MPI_Bcast of the unique host indices.
NOTE:
does not use MPI_Comm_split or MPI_Comm_split_type since these
return MPI_COMM_NULL on non-participating processes, which does not
easily fit into the OpenFOAM framework.
Additionally, if using the caching version of
UPstream::commInterHost() and UPstream::commIntraHost()
the topology is determined simultaneously
(ie, equivalent or potentially lower communication).
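
A serial sketch of how unique host indices can be derived once the per-rank
host identifiers are available on one process. It assumes the identifiers
have already been gathered and uses plain strings where the description
above uses SHA1 digests. The function hostIndices() is hypothetical:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: hostIds[rank] identifies the host of each rank
// (plain strings here, SHA1 digests in the description above).
// Returns a compact host index per rank, numbered by first appearance.
inline std::vector<int> hostIndices(const std::vector<std::string>& hostIds)
{
    std::unordered_map<std::string, int> known;
    std::vector<int> indices;
    indices.reserve(hostIds.size());

    for (const std::string& id : hostIds)
    {
        // Inserts a new index for an unseen host, else finds the existing one
        auto iter = known.emplace(id, static_cast<int>(known.size())).first;
        indices.push_back(iter->second);
    }

    return indices;
}
```

Ranks that share a host index would then form one intra-host group, and the
first rank of each group would be a candidate for the inter-host group.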
- make sizing of commsStruct List demand-driven as well
for more robustness, fewer unneeded allocations.
- fix potential latent bug with allBelow/allNotBelow for proc 0
(linear communication).
ENH: remove unused/unusable UPstream::communicator optional parameter
- had constructor option to avoid constructing the MPI backend,
but this is not useful and inconsistent with what the reset or
destructor expect.
STYLE: local use of UPstream::communicator
- automatically frees communicator when it leaves scope
- these are primarily useful when encountering sparse (eg, inter-host)
communicators. Additional UPstream convenience methods:
is_rank(comm)
=> True if process corresponds to a rank in the communicator.
Can be a master rank or a sub-rank.
is_parallel(comm)
=> True if parallel algorithm or exchange is used on the process.
same as
(parRun() && (nProcs(comm) > 1) && is_rank(comm))
- for robustness with small edges (which can occur with snappy meshes),
the Le() and magLe() are limited to SMALL (commit a0f1e98d24).
Now use factor sqrt(1/3) in the components to maintain magnitude of 1.
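
One reading of the note above, as a sketch only: when an edge vector is
shorter than the threshold, each component is set to threshold*sqrt(1/3), so
that the limited vector has magnitude equal to the threshold and the ratio
Le/magLe has magnitude 1. The names Vec3, smallValue and limitedEdge() are
hypothetical, and the threshold value is an assumption:

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

const double smallValue = 1e-15;    // assumed threshold, stands in for SMALL

inline double mag(const Vec3& v)
{
    return std::sqrt(v[0]*v[0] + v[1]*v[1] + v[2]*v[2]);
}

// Hypothetical limiter: replace a (near) zero-length edge vector by one
// with equal components and a magnitude of smallValue
inline Vec3 limitedEdge(const Vec3& le)
{
    if (mag(le) < smallValue)
    {
        const double c = smallValue*std::sqrt(1.0/3.0);
        return Vec3{{c, c, c}};
    }
    return le;
}
```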
ENH: add fvMesh::unitSf() and faMesh::unitLe() methods
- simple wrappers around Sf()/magSf() and Le()/magLe() but with
the potential for additional/alternative corrections.
STYLE: thisDb() in faMesh code to simplify future changes in storage
ENH: do not register finite-area geometric fields
- consistent with finite-volume treatment
- replace the "one-size-fits-all" approach of tensor field inv()
with individual 'failsafe' inverts.
The inv() field function historically just checked the first entry
to detect 2D cases and adjusted/readjusted *all* tensors accordingly
(to avoid singular tensors and/or noisy inversions).
This seems to have worked reasonably well with 3D volume meshes, but
breaks down for 2D area meshes, which can be axis-aligned
differently on different sections of the mesh.
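
A rough sketch of a per-tensor 'failsafe' invert, to illustrate the idea
rather than the actual implementation. It assumes that a degenerate (2D)
tensor has an empty row and column in the missing direction, pads that
diagonal entry with 1 before inverting and removes the padding afterwards.
The names Tensor, tiny, det(), invert() and invFailsafe() are hypothetical,
and the tolerance is not scaled to the tensor magnitude:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Row-major 3x3 tensor: xx xy xz / yx yy yz / zx zy zz
using Tensor = std::array<double, 9>;

const double tiny = 1e-12;      // assumed tolerance, for illustration only

inline double det(const Tensor& t)
{
    return t[0]*(t[4]*t[8] - t[5]*t[7])
         - t[1]*(t[3]*t[8] - t[5]*t[6])
         + t[2]*(t[3]*t[7] - t[4]*t[6]);
}

// Plain inverse via the adjugate; no protection against a zero determinant
inline Tensor invert(const Tensor& t)
{
    const double d = det(t);
    return Tensor{{
        (t[4]*t[8] - t[5]*t[7])/d,
        (t[2]*t[7] - t[1]*t[8])/d,
        (t[1]*t[5] - t[2]*t[4])/d,
        (t[5]*t[6] - t[3]*t[8])/d,
        (t[0]*t[8] - t[2]*t[6])/d,
        (t[2]*t[3] - t[0]*t[5])/d,
        (t[3]*t[7] - t[4]*t[6])/d,
        (t[1]*t[6] - t[0]*t[7])/d,
        (t[0]*t[4] - t[1]*t[3])/d
    }};
}

// Decide per tensor (not per field) whether padding is needed
inline Tensor invFailsafe(Tensor t)
{
    if (std::abs(det(t)) > tiny)
    {
        return invert(t);
    }

    bool padded[3];
    for (int d = 0; d < 3; ++d)
    {
        double sum = 0;
        for (int k = 0; k < 3; ++k)
        {
            sum += std::abs(t[3*d + k]) + std::abs(t[3*k + d]);
        }
        padded[d] = (sum < tiny);       // empty row and column
        if (padded[d]) t[4*d] += 1;     // pad the diagonal entry
    }

    // Still divides by zero if the tensor is singular for other reasons
    Tensor r = invert(t);
    for (int d = 0; d < 3; ++d)
    {
        if (padded[d]) r[4*d] -= 1;     // remove the padding again
    }
    return r;
}
```

Since the decision is taken for each tensor separately, sections of an area
mesh that are aligned with different axes are padded in different directions.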
- with (nPollProcInterfaces < 0) it does the following:
- loop, waiting for some requests to finish
- for each out-of-date interface, check if its associated
requests have now finished (ie, the ready() check).
- if ready() -> call updateInterfaceMatrix()
In contrast to (nPollProcInterfaces > 0) which loops a specified
number of times with several calls to MPI_Test each time, the
(nPollProcInterfaces < 0) variant relies on internal MPI looping
within MPI_Waitsome to progress communication.
The actual dispatch still remains non-deterministic (ie, waiting for
some requests to finish does not mean that any particular interface
is eligible for update, or in any particular order). However, using
Waitsome places the tight looping into the MPI layer, which results
in fewer calls and eliminates behaviour dependent on the value of
nPollProcInterfaces.
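
The dispatch structure for (nPollProcInterfaces < 0) can be sketched with a
mock in place of MPI. Everything here is hypothetical: MockInterface,
mockWaitSome() and dispatchAll() are stand-ins, one request per interface is
assumed, and the mock completes exactly one request per call whereas
MPI_Waitsome may complete several:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a processor interface with one receive request
struct MockInterface
{
    std::size_t request;    // index into the list of requests
    bool updated;           // true once the interface has been updated
};

// Stand-in for waiting on some requests: completes the next pending
// request and returns false when there are no active requests left
inline bool mockWaitSome(std::vector<bool>& finished)
{
    for (std::size_t i = 0; i < finished.size(); ++i)
    {
        if (!finished[i])
        {
            finished[i] = true;
            return true;
        }
    }
    return false;
}

// Loop while requests complete, updating each out-of-date interface
// as soon as its own request is found to be finished
inline int dispatchAll
(
    std::vector<MockInterface>& interfaces,
    std::vector<bool>& finished
)
{
    int nUpdated = 0;

    while (mockWaitSome(finished))
    {
        for (MockInterface& intf : interfaces)
        {
            assert(intf.request < finished.size());

            // The ready() check
            if (!intf.updated && finished[intf.request])
            {
                intf.updated = true;    // stands in for updateInterfaceMatrix()
                ++nUpdated;
            }
        }
    }

    return nUpdated;
}
```

The order in which interfaces are updated follows the order in which requests
complete, not the order of the interface list.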
TUT: add polling to windAroundBuildings case (for testing purposes)
- fewer calls, potentially more consistent
ENH: update sendRequest state after recvRequest wait
- previously had this type of code:
// Treat send as finished when recv is done
UPstream::waitRequest(recvRequest_);
recvRequest_ = -1;
sendRequest_ = -1;
Now refined as follows:
// Require receive data. Update the send request state.
UPstream::waitRequest(recvRequest_);
recvRequest_ = -1;
if (UPstream::finishedRequest(sendRequest_)) sendRequest_ = -1;
Requiring both could potentially be investigated,
but this may be over-constrained.
Example,
// Require receive data, but also wait for sends too
UPstream::waitRequestPair(recvRequest_, sendRequest_);
- checks requests for completion, returning true when some requests
have completed and false when there are no active requests.
This allows it to be used in a polling loop to progress MPI
and then respond as requests become satisfied.
When using as part of a dispatch loop, waitSomeRequests() is
probably more efficient than calling waitAnyRequest() and can help
avoid biasing which client requests are serviced.
Takes an optional return parameter, to retrieve the indices,
but more importantly to avoid inner-loop reallocations.
Example,
DynamicList<int> indices;
while (UPstream::waitSomeRequests(startRequest, &indices))
{
// Dispatch something ....
}
// Reset list of outstanding requests with 'Waitall' for safety
UPstream::waitRequests(startRequest);
---
If only dealing with single items and an index is required for
dispatching, it can be better to use a list of UPstream::Request
instead.
Example,
List<UPstream::Request> requests = ...;
label index = -1;
while ((index = UPstream::waitAnyRequest(requests)) >= 0)
{
// Do something at index
}
ENH: pair-wise wrappers for MPI_Test or MPI_Wait
- for send/recv pairs of requests, can bundle both together and use a
single MPI_Testsome or MPI_Waitall instead of two individual
calls.
- previously had an additional stack for freedRequests_,
which were used to 'remember' locations into the list of
outstandingRequests_ that were handled by 'waitRequest()'.
This was principally done for sanity checks on shutdown,
but we now just test for any outstanding requests that
are *not* MPI_REQUEST_NULL instead (much simpler).
The framework with freedRequests_ also had a provision to 'recycle'
them by popping from that stack, but this is rather fragile since it
would only be triggered by some collectives
(MPI_Iallreduce, MPI_Ialltoall, MPI_Igather, MPI_Iscatter)
with no guarantee that these will all be properly removed again.
There was also no pruning of extraneous indices.
ENH: consolidate internal reset/push of requests
- replace duplicate code with inline functions
reset_request(), push_request()
ENH: null out trailing requests
- extra safety (paranoia) for the UPstream::Request versions
of finishedRequests(), waitAnyRequest()
CONFIG: document nPollProcInterfaces in etc/controlDict
- still experimental, but at least make the keyword known
- mechanism has been unused for at least a decade
(or was never used). Message tags are assigned on an ad hoc basis
locally when collision avoidance is necessary.
- not currently used, but it is possible that communicator allocation
modifies the list of sub-ranks. Ensure that the correct size is used
when (re)initialising the linear/tree structures.
STYLE: adjust MPI test applications
- remove some clutter and unneeded grouping.
Some ideas for host-only communicators
- allow reporting even when profiling is suspended
- consolidate reporting into profilingPstream itself
(avoids code scatter).
Example of possible advanced use for timing only one section of
code:
====
// Profile local operations
profilingPstream::enable();
... do something
// Don't profile elsewhere
profilingPstream::suspend();
====
- separate broadcast times from reduce/gather/scatter time
- separate wait times from all-to-all time
- support invocation counts, split off requests time/count
from others to avoid flooding the counts
- support 'detail' switch to increase the output information.
Format may change in the future
- attributes such as assignable(), coupled() etc
- common patchField types: calculatedType(), zeroGradientType() etc.
This simplifies reference to these types without actually needing a
typed patchField version.
ENH: add some basic patchField types to fieldTypes namespace
- allows more general use of the names
ENH: set extrapolated/calculated from patchInternalField directly
- avoids intermediate tmp
- with the current handling of small edges (finite-area), the LSQ
vectors can result in singular/2D tensors. However, the regular
2D handling in field inv() only detects based on the first element.
Provide a 'failsafe' inv() method for symmTensor and tensor that
follows a similar logic for avoiding zero determinants, but it is
applied on a per-element basis, instead of deciding based on the
first field element.
The symmTensor::inv(bool) and tensor::inv(bool) methods have a
fairly modest additional overhead.
- unroll the field inv() function to avoid creating an intermediate
field. Reduce the number of operations when adjusting/re-adjusting
the diagonal.