- the general OpenFOAM way of collecting data often looks like this:
List<scalar> allValues(numProcs);
allValues[myProc] = localValue;
Pstream::gatherList(allValues);
// Now possible like this
List<scalar> allValues(numProcs);
allValues[myProc] = localValue;
UPstream::mpiGather(nullptr, allValues.data_bytes(), sizeof(scalar));
- adjusted Pstream::gatherList accordingly.
Split out the manual implementations, give them new names
(..._tree_algorithm) and make them private.
STYLE: rename UPstream::gather -> UPstream::mpiGatherv
- easier to identify and establishes the connection to the MPI call
- soft renames (ie, old names still available via typedefs) for more
reasonable names and more coverage with std stream variants.
The old names could be a bit cryptic.
For example, uiliststream (== an unallocated/external list storage),
which is written as std::ispanstream for C++23.
Could similarly argue that IListStream is better named as
ICharStream, since it is an input stream of characters and the
internal storage mechanism (List or something else) is mostly
irrelevant.
Extend the coverage to include all std stream variants, and
simply rewrap them for OpenFOAM IOstream types. This simplifies the
inheritance patterns and allows reuse of icharstream/ocharstream as
a drop-in replacement for istringstream/ostringstream in other wrappers.
Classes:
* icharstream / ICharStream [old: none / IListStream]
* ocharstream / OCharStream [old: none / OListStream]
* ispanstream / ISpanStream [old: uiliststream / UIListStream]
* ospanstream / OSpanStream [old: none / UOListStream]
Possible new uses: read file contents into a buffer, broadcast
buffer contents to other ranks and then transfer into an icharstream
to be read from. This avoids the multiple intermediate copies that
would be needed when using an istringstream.
- Use size doubling instead of block-wise incremental for ocharstream
(OCharStream). This corresponds to the sizing behaviour as per
std::stringstream (according to gcc-11 includes)
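The size-doubling policy can be illustrated with a minimal standalone
sketch. The helper name grownCapacity and the 512-byte starting size are
assumptions made for illustration only; they are not taken from the
OCharStream implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: return a capacity of at least 'required' bytes,
// obtained by repeatedly doubling the current capacity.
// The starting size of 512 bytes is an illustrative assumption.
inline std::size_t grownCapacity(std::size_t current, std::size_t required)
{
    std::size_t cap = (current > 0 ? current : 512);

    while (cap < required)
    {
        cap *= 2;   // doubling, not a fixed block-wise increment
    }

    return cap;
}
```

With doubling, appending n bytes one at a time needs a number of
reallocations that grows with log(n), whereas a fixed block increment
needs a number that grows linearly with n.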
STYLE: drop Foam_IOstream_extras constructors for memory streams
- transitional/legacy constructors but not used in any code
- handle existence/non-existence of a FoamFile header automatically
- support an upper limit when getting the number of blocks and
use that for a hasBlock(...) method, which will stop reading sooner.
- additional Pstream::broadcasts() method to serialize/deserialize
multiple items.
- revoke the broadcast specialisations for std::string and List(s) and
use a generic broadcasting template. In most cases, the previous
specialisations would have required two broadcasts:
(1) for the size
(2) for the contiguous content.
Now favour reduced communication over potential local (intermediate)
storage that would have only benefited a few select cases.
ENH: refine PstreamBuffers access methods
- replace 'bool hasRecvData(label)' with 'label recvDataCount(label)'
to recover the number of unconsumed receive bytes from specified
processor. Can use 'labelList recvDataCounts()' to recover the
number of unconsumed receive bytes from all processors.
- additional peekRecvData() method (for transcribing contiguous data)
ENH: globalIndex whichProcID - check for isLocal first
- reasonable to assume that local items are searched for more
frequently, so do preliminary check for isLocal before performing
a more costly binary search of globalIndex offsets
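A minimal standalone sketch of the lookup order described above: test
the local range first, and fall back to a binary search of the offsets
otherwise. The function names and the use of std::vector<long> are
assumptions for illustration; this is not the globalIndex code itself.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// offsets has nProcs+1 entries; rank p owns the global indices
// in the half-open range [offsets[p], offsets[p+1])

// Cheap check: does global index i belong to rank myProc?
inline bool isLocal(const std::vector<long>& offsets, int myProc, long i)
{
    return offsets[myProc] <= i && i < offsets[myProc + 1];
}

// Return the rank owning global index i
inline int whichProc(const std::vector<long>& offsets, int myProc, long i)
{
    assert(offsets.size() >= 2);
    assert(i >= offsets.front() && i < offsets.back());

    // Preliminary check: two comparisons, no search
    if (isLocal(offsets, myProc, i))
    {
        return myProc;
    }

    // Fallback: binary search for the first offset greater than i
    const auto iter = std::upper_bound(offsets.begin(), offsets.end(), i);
    return static_cast<int>(iter - offsets.begin()) - 1;
}
```

The early return only pays off under the stated assumption that most
queries are for locally owned indices.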
ENH: masterUncollatedFileOperation - bundled scatter of status
- gather/scatter types of operations can avoid AllToAll communication
and use simple MPI gather (or scatter) to establish the receive sizes.
New methods: finishedGathers() / finishedScatters()
- PstreamBuffers nProcs() and allProcs() methods to recover the rank
information consistent with the communicator used for construction
- allowClearRecv() methods for more control over buffer reuse
For example,
pBufs.allowClearRecv(false);
forAll(particles, particlei)
{
pBufs.clear();
fill...
read via IPstream(..., pBufs);
}
This preserves the receive buffers memory allocation between calls.
- finishedNeighbourSends() method as compact wrapper for
finishedSends() when send/recv ranks are identical
(eg, neighbours)
- hasSendData()/hasRecvData() methods for PstreamBuffers.
Can be useful for some situations to skip reading entirely.
For example,
pBufs.finishedNeighbourSends(neighProcs);
if (!returnReduce(pBufs.hasRecvData(), orOp<bool>()))
{
// Nothing to do
continue;
}
...
On an individual basis:
for (const int proci : pBufs.allProcs())
{
if (pBufs.hasRecvData(proci))
{
...
}
}
Also conceivable to do the following instead (nonBlocking only):
if (!returnReduce(pBufs.hasSendData(), orOp<bool>()))
{
// Nothing to do
pBufs.clear();
continue;
}
pBufs.finishedNeighbourSends(neighProcs);
...
- a Pstream::master with a Pstream::parRun guard in case Pstream has
not yet been initialised, as will be the case for low-level messages
during startup.
- propagate relativeName handling into IOstreams
- simply adds in the reinterpret_cast, which simplifies coding for
binary data movement.
Name complements the size_bytes() method for contiguous data
STYLE: container IO.C files into main headers for better visibility
STYLE: include CompactListList.H in polyTopoChange
- avoids future mismatches if the CompactListList template signature
changes
GIT: relocate CompactListList into CompactLists/ directory
- currently add to mesh zones to provide a table of contents
of the zone names that allows downstream consumers quick access to
the information without needing to parse the entire file.
- support selective enable/disable of the file banner.
ENH: improve code isolation for decomposedBlockData
- use readBlockEntry/writeBlockEntry to encapsulate the IO handling,
which ensures more consistency
- new decomposedBlockData::readHeader for chaining into the
block header information.
- remove unused constructors for decomposedBlockData
ENH: minor cleanup of collated fileOperations
- improves interface and data consistency.
Older signatures are still active (via the Foam_IOstream_extras
define).
- refine internals for IOstreamOption streamFormat, versionNumber
ENH: improve data alignment for IOstream and IOobject
- fit sizeof label/scalar into unsigned char
STYLE: remove dead code
- for use when the is_contiguous check has already been done outside
the loop. Naming as per std::span.
STYLE: use data/cdata instead of begin
ENH: replace random_shuffle with shuffle, fix OSX int64 ambiguity
- direct check of punctuation.
For example,
while (!tok.isPunctuation(token::BEGIN_LIST)) ..
instead of
while (!(tok.isPunctuation() && tok.pToken() == token::BEGIN_LIST)) ..
Using direct comparison (tok != token::BEGIN_LIST) can be fragile
when comparing int values:
int c = readChar(is);
while (tok != c) .. // Danger, uses LABEL comparison!
- direct check of word.
For example,
if (tok.isWord("uniform")) ..
instead of
if (tok.isWord() && tok.wordToken() == "uniform") ..
- make token lineNumber() a setter method
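The direct punctuation and word checks described above can be sketched
with a heavily simplified stand-in class. MiniToken and all of its
members are hypothetical and only mimic the shape of the interface; the
real token class has many more types and methods.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a stream token
class MiniToken
{
public:
    enum class Type { PUNCTUATION, WORD, LABEL };

    static MiniToken punctuation(char c)
    {
        MiniToken t(Type::PUNCTUATION);
        t.punct_ = c;
        return t;
    }

    static MiniToken word(const std::string& w)
    {
        MiniToken t(Type::WORD);
        t.word_ = w;
        return t;
    }

    static MiniToken label(long val)
    {
        MiniToken t(Type::LABEL);
        t.label_ = val;
        return t;
    }

    // Type-only checks
    bool isPunctuation() const { return type_ == Type::PUNCTUATION; }
    bool isWord() const { return type_ == Type::WORD; }
    bool isLabel() const { return type_ == Type::LABEL; }

    // Combined type and value checks: the type is always tested first,
    // so an integer label is never mistaken for a punctuation character
    bool isPunctuation(char c) const { return isPunctuation() && punct_ == c; }
    bool isWord(const std::string& w) const { return isWord() && word_ == w; }
    bool isLabel(long val) const { return isLabel() && label_ == val; }

private:
    explicit MiniToken(Type t) : type_(t) {}

    Type type_;
    char punct_ = '\0';
    long label_ = 0;
    std::string word_;
};
```

Because the value comparison is guarded by the type check, a label
token holding 40 (the ASCII code of '(') does not match
isPunctuation('('), which is the kind of confusion the direct
comparison with an int value can cause.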
ENH: adjust internal compound method empty() -> moved()
- support named compound tokens
STYLE: setter method for stream indentation
- this was previously suppressed for ASCII format as being 'clutter',
but without it there is no context for interpreting the type of data
contained in ASCII files: potentially leading to integer overflows
when reading in ParaView etc.
- returns a range of `int` values that can be iterated across.
For example,
for (const int proci : Pstream::subProcs()) { ... }
instead of
for
(
int proci = Pstream::firstSlave();
proci <= Pstream::lastSlave();
++proci
)
{
...
}
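A minimal sketch of an iterable range of int values, written without
any OpenFOAM dependency. The class name IntRange and the free function
subProcs(int) are assumptions for illustration (the real method takes
its information from the communicator); only the range-for usage
pattern corresponds to the text above.

```cpp
#include <cassert>

// Hypothetical minimal integer range usable in a range-based for loop
class IntRange
{
public:
    class iterator
    {
    public:
        explicit iterator(int val) : val_(val) {}

        int operator*() const { return val_; }
        iterator& operator++() { ++val_; return *this; }
        bool operator!=(const iterator& rhs) const { return val_ != rhs.val_; }

    private:
        int val_;
    };

    // A negative size is treated as an empty range
    IntRange(int first, int size)
    :
        first_(first),
        size_(size < 0 ? 0 : size)
    {}

    iterator begin() const { return iterator(first_); }
    iterator end() const { return iterator(first_ + size_); }

private:
    int first_;
    int size_;
};

// Hypothetical counterpart of subProcs(): all ranks except rank 0
inline IntRange subProcs(int nProcs)
{
    return IntRange(1, nProcs - 1);
}
```

The range is empty for a serial run (nProcs of 1), so a loop over it
needs no separate guard.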
- clearer than passing a reference to a dummy variable,
or relying on a move occurring within the copy constructor
(historical, but should be deprecated)
STYLE: consistent autoPtr syntax for uncollated file operations
* Support default values for format/compress enum lookups.
- Handles situations where the preferred default format is not ASCII.
For example, with dictionary input:
format binar;
The typing mistake would previously have caused formatEnum to
default to ASCII. We can now properly control its behaviour.
IOstream::formatEnum
(
dict.get<word>("format"), IOstream::BINARY
);
Allowing us to switch ascii/binary, using BINARY by default even in
the case of spelling mistakes. The mistakes are flagged, but the
return value can be non-ASCII.
* The format/compression lookup behaves as pass-through if the lookup
string is empty.
- Allows the following to work without complaint
IOstream::formatEnum
(
dict.getOrDefault("format", word::null), IOstream::BINARY
);
- Or use constructor-like failsafe method
IOstream::formatEnum("format", dict, IOstream::BINARY);
- Apply the same behaviour with setting stream format/compression
from a word.
is.format("binar");
will emit a warning, but leave the stream format UNCHANGED
* Rationalize versionNumber construction
- constexpr constructors where possible.
Default construct is the "currentVersion"
- Construct from token to shift the burden to versionNumber.
Support token as argument to version().
Now:
is.version(headerDict.get<token>("version"));
or failsafe constructor method
is.version
(
IOstreamOption::versionNumber("version", headerDict)
);
Before (controlled input):
is.version
(
IOstreamOption::versionNumber
(
headerDict.get<float>("version")
)
);
Old, uncontrolled input - has been removed:
is.version(headerDict.lookup("version"));
* improve consistency, default behaviour for IOstreamOption construct
- constexpr constructors where possible
- add copy construct with change of format.
- construct IOstreamOption from streamFormat is now non-explicit.
This is a commonly expected result with no ill-effects
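The format lookup with an explicit default, as described earlier in
this message, can be sketched in standalone form: unknown names emit a
warning and return the supplied default, empty names pass through
silently. The enum, the function signature and the warning text are
assumptions for illustration and do not reproduce the actual
IOstream::formatEnum code.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical stand-in for the stream format enumeration
enum class StreamFormat { ASCII, BINARY };

// Look up a format by name, returning 'deflt' when the name is
// empty (silently) or unknown (with a warning)
inline StreamFormat formatEnum(const std::string& name, StreamFormat deflt)
{
    if (name.empty())
    {
        return deflt;   // pass-through: nothing to complain about
    }
    if (name == "ascii")
    {
        return StreamFormat::ASCII;
    }
    if (name == "binary")
    {
        return StreamFormat::BINARY;
    }

    // The mistake is flagged, but the caller keeps control of the result
    std::cerr
        << "Unknown format specifier '" << name
        << "', using default" << std::endl;

    return deflt;
}
```

With this shape, the misspelt input "binar" combined with a BINARY
default still yields BINARY instead of silently degrading to ASCII.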
The collated container ('decomposedBlockData') is always binary
but the 'payload' might be ascii so use that header information
instead of the decomposedBlockData header.
- makes the intent clearer and avoids the need for additional
constructor casting. Eg,
labelList(10, Zero) vs. labelList(10, 0)
scalarField(10, Zero) vs. scalarField(10, scalar(0))
vectorField(10, Zero) vs. vectorField(10, vector::zero)
- For compatibility, access to the old global names is provided via
macros
#define FOAMversion foamVersion::version
#define FOAMbuild foamVersion::build
#define FOAMbuildArch foamVersion::buildArch
- this isolation makes it easier to provide additional scoped methods
for dealing with version related information. Eg, printBuildInfo()
twoPhaseMixtureThermo writes the temperatures during construction only
for them to be read again immediately after by construction of the
individual phases' thermo models. When running with collated file
handling this behaviour is not thread safe. This change deactivates
threading for the duration of this behaviour.
Patch contributed by Mattijs Janssens
- avoids compiler ambiguity when virtual methods such as
IOdictionary::read() exist.
- the method was introduced in 1806, and was thus not yet widely used
Improvements to existing functionality
--------------------------------------
- MPI is initialised without thread support if it is not needed e.g. uncollated
- Use native c++11 threading; avoids problem with static destruction order.
- etc/cellModels now only read if needed.
- etc/controlDict can now be read from the environment variable FOAM_CONTROLDICT
- Uniform files (e.g. '0/uniform/time') are now read only once on the master only
(with the masterUncollated or collated file handlers)
- collated format writes to 'processorsNNN' instead of 'processors'. The file
format is unchanged.
- Thread buffer and file buffer size are no longer limited to 2Gb.
The global controlDict file contains parameters for file handling. Under some
circumstances, e.g. running in parallel on a system without NFS, the user may
need to set some parameters, e.g. fileHandler, before the global controlDict
file is read from file. To support this, OpenFOAM now allows the global
controlDict to be read as a string set to the FOAM_CONTROLDICT environment
variable.
The FOAM_CONTROLDICT environment variable can be set to the content of the global
controlDict file, e.g. from a sh/bash shell:
export FOAM_CONTROLDICT=$(foamDictionary $FOAM_ETC/controlDict)
FOAM_CONTROLDICT can then be passed to mpirun using the -x option, e.g.:
mpirun -np 2 -x FOAM_CONTROLDICT simpleFoam -parallel
Note that while this avoids the need for NFS to read the OpenFOAM configuration
the executable still needs to load shared libraries which must either be copied
locally or available via NFS or equivalent.
New: Multiple IO ranks
----------------------
The masterUncollated and collated fileHandlers can now use multiple ranks for
writing e.g.:
mpirun -np 6 simpleFoam -parallel -ioRanks '(0 3)'
In this example ranks 0 ('processor0') and 3 ('processor3') now handle all the
I/O. Rank 0 handles 0,1,2 and rank 3 handles 3,4,5. The set of IO ranks should always
include 0 as first element and be sorted in increasing order.
The collated fileHandler uses the directory naming processorsNNN_XXX-YYY where
NNN is the total number of processors and XXX and YYY are first and last
processor in the rank, e.g. in above example the directories would be
processors6_0-2
processors6_3-5
and each of the collated files in these contains data of the local ranks
only. The same naming also applies when e.g. running decomposePar:
decomposePar -fileHandler collated -ioRanks '(0 3)'
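The directory naming can be reproduced with a small standalone sketch.
The function name collatedDirNames is hypothetical, and the handling of
a single IO rank (plain 'processorsNNN' without a range suffix) is an
assumption based on the default naming mentioned elsewhere in these
notes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: directory names used by the collated handler
// for nProcs processors and a sorted list of IO ranks starting at 0
inline std::vector<std::string> collatedDirNames
(
    int nProcs,
    const std::vector<int>& ioRanks
)
{
    assert(!ioRanks.empty() && ioRanks.front() == 0);

    const std::string base = "processors" + std::to_string(nProcs);

    std::vector<std::string> names;

    // Assumption: a single IO rank uses the name without a range suffix
    if (ioRanks.size() == 1)
    {
        names.push_back(base);
        return names;
    }

    for (std::size_t i = 0; i < ioRanks.size(); ++i)
    {
        // Each IO rank handles the ranks up to (not including) the next IO rank
        const int first = ioRanks[i];
        const int last =
            (i + 1 < ioRanks.size() ? ioRanks[i + 1] : nProcs) - 1;

        names.push_back
        (
            base + "_" + std::to_string(first) + "-" + std::to_string(last)
        );
    }

    return names;
}
```

For 6 processors and IO ranks (0 3) this gives processors6_0-2 and
processors6_3-5, matching the example above.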
New: Distributed data
---------------------
The individual root directories can be placed on different hosts with different
paths if necessary. In the current framework it is necessary to specify the
root per slave process but this has been simplified with the option of specifying
the root per host with the -hostRoots command line option:
mpirun -np 6 simpleFoam -parallel -ioRanks '(0 3)' \
-hostRoots '("machineA" "/tmp/" "machineB" "/tmp")'
The hostRoots option is followed by a list of machine name + root directory
pairs; the machine name can contain regular expressions.
New: hostCollated
-----------------
The new hostCollated fileHandler automatically sets the 'ioRanks' according to
the host name with the lowest rank e.g. to run simpleFoam on 6 processors with
ranks 0-2 on machineA and ranks 3-5 on machineB with the machines specified in
the hostfile:
mpirun -np 6 --hostfile hostfile simpleFoam -parallel -fileHandler hostCollated
This is equivalent to
mpirun -np 6 --hostfile hostfile simpleFoam -parallel -fileHandler collated -ioRanks '(0 3)'
This example will write directories:
processors6_0-2/
processors6_3-5/
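How the IO ranks follow from the host names can be sketched as below:
the lowest rank on each host becomes an IO rank. The function name and
the plain vector of host names are assumptions for illustration; the
actual handler obtains the host names through its own communication.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: hostOfRank[r] is the host name of rank r.
// Returns the lowest rank found on each host, in increasing order.
inline std::vector<int> ioRanksFromHosts
(
    const std::vector<std::string>& hostOfRank
)
{
    std::set<std::string> seen;
    std::vector<int> ioRanks;

    for (std::size_t r = 0; r < hostOfRank.size(); ++r)
    {
        // insert() reports whether the host name is new;
        // the first rank on a new host becomes an IO rank
        if (seen.insert(hostOfRank[r]).second)
        {
            ioRanks.push_back(static_cast<int>(r));
        }
    }

    return ioRanks;
}
```

Ranks 0-2 on machineA and ranks 3-5 on machineB yield the IO ranks
(0 3). Since ranks are visited in increasing order, the result is
sorted and always starts with rank 0.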
A typical example would use distributed data e.g. on two nodes, machineA and
machineB, each with three processes:
decomposePar -fileHandler collated -case cavity
# Copy case (constant/*, system/*, processors6/) to master:
rsync -a cavity machineA:/tmp/
# Create root on slave:
ssh machineB mkdir -p /tmp/cavity
# Run
mpirun --hostfile hostfile icoFoam \
-case /tmp/cavity -parallel -fileHandler hostCollated \
-hostRoots '("machineA" "/tmp" "machineB" "/tmp")'
Contributed by Mattijs Janssens
- IOstreamOption class to encapsulate format, compression, version.
This is ordered to avoid internal padding in the structure, which
reduces several bytes of memory overhead for stream objects
and other things using this combination of data.
Byte-sizes:
old IOstream:48 PstreamBuffers:88 Time:928
new IOstream:24 PstreamBuffers:72 Time:904
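The effect of member ordering on padding can be seen in a small
standalone sketch. The two structs and their members are illustrative
assumptions only and do not reproduce the actual IOstreamOption layout.

```cpp
#include <cassert>
#include <cstdint>

// Small members placed on either side of a wider one: the compiler
// usually pads after each small member to align what follows
struct Interleaved
{
    std::uint8_t format;
    double       version;
    std::uint8_t compression;
};

// Widest member first, small members grouped together:
// only trailing padding remains
struct Ordered
{
    double       version;
    std::uint8_t format;
    std::uint8_t compression;
};

static_assert
(
    sizeof(Ordered) <= sizeof(Interleaved),
    "ordering members by alignment should not increase the size"
);
```

On common 64-bit ABIs where double is 8-byte aligned, the first layout
occupies 24 bytes and the second 16, although the exact numbers are
platform dependent.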
====
STYLE: remove support for deprecated uncompressed/compressed selectors
In older versions, the system/controlDict used these types of
specifications:
writeCompression uncompressed;
writeCompression compressed;
As of DEC-2009, these were deprecated in favour of using normal switch
names:
writeCompression true;
writeCompression false;
writeCompression on;
writeCompression off;
These deprecated names have now been removed and are treated like any
other unknown input, which issues a warning. Eg,
Unknown compression specifier 'compressed', assuming no compression
====
STYLE: provide Enum of stream format names (ascii, binary)
====
COMP: fixed incorrect IFstream construct in FIREMeshReader
- spurious bool argument (presumably meant as uncompressed) was being
implicitly converted to a versionNumber. Now caught by making
IOstreamOption::versionNumber constructor explicit.
- bad version specifier in changeDictionary
This class is largely a pre-C++11 holdover. It is now possible to
simply use move construct/assignment directly.
In a few rare cases (eg, polyMesh::resetPrimitives) it has been
replaced by an autoPtr.
Improve alignment of its behaviour with std::unique_ptr
- element_type typedef
- release() method - identical to ptr() method
- get() method to get the pointer without checking and without releasing it.
- operator*() for dereferencing
Method name changes
- renamed rawPtr() to get()
- renamed rawRef() to ref(), removed unused const version.
Removed methods/operators
- assignment from a raw pointer was deleted (was rarely used).
Can be convenient, but uncontrolled and potentially unsafe.
Do allow assignment from a literal nullptr though, since this
can never leak (and also corresponds to the unique_ptr API).
Additional methods
- clone() method: forwards to the clone() method of the underlying
data object with argument forwarding.
- reset(autoPtr&&) as an alternative to operator=(autoPtr&&)
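For reference, the std::unique_ptr behaviour that these methods mirror
can be shown with the standard library alone. The Mesh struct and the
two helper functions are hypothetical; the sketch uses std::unique_ptr,
not autoPtr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type
struct Mesh
{
    int nCells = 8;
};

// get(): observe the pointer without checking and without
// giving up ownership
inline int cellsViaGet(const std::unique_ptr<Mesh>& ptr)
{
    const Mesh* raw = ptr.get();
    return (raw ? raw->nCells : 0);
}

// release(): ownership passes to the caller, who must delete the
// object; the smart pointer is left empty
inline Mesh* takeOwnership(std::unique_ptr<Mesh>& ptr)
{
    return ptr.release();
}
```

Assignment from a literal nullptr is allowed by std::unique_ptr and
simply destroys any managed object, which is why it cannot leak.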
STYLE: avoid implicit conversion from autoPtr to object type in many places
- existing implementation has the following:
operator const T&() const { return operator*(); }
which means that the following code works:
autoPtr<mapPolyMesh> map = ...;
updateMesh(*map); // OK: explicit dereferencing
updateMesh(map()); // OK: explicit dereferencing
updateMesh(map); // OK: implicit dereferencing
for clarity it may be preferable to avoid the implicit dereferencing
- prefer operator* to operator() when dereferencing a return value
so it is clearer that a pointer is involved and not a function call
etc. Eg, return *meshPtr_; vs. return meshPtr_();
so the write thread does not have to do any parallel communication. This avoids
the bugs in the threading support in OpenMPI.
Patch contributed by Mattijs Janssens
Resolves bug-report https://bugs.openfoam.org/view.php?id=2669
Original commit message:
------------------------
Parallel IO: New collated file format
When an OpenFOAM simulation runs in parallel, the data for decomposed fields and
mesh(es) has historically been stored in multiple files within separate
directories for each processor. Processor directories are named 'processorN',
where N is the processor number.
This commit introduces an alternative "collated" file format where the data for
each decomposed field (and mesh) is collated into a single file, which is
written and read on the master processor. The files are stored in a single
directory named 'processors'.
The new format produces significantly fewer files - one per field, instead of N
per field. For large parallel cases, this avoids the restriction on the number
of open files imposed by the operating system limits.
The file writing can be threaded allowing the simulation to continue running
while the data is being written to file. NFS (Network File System) is not
needed when using the collated format and additionally, there is an option
to run without NFS with the original uncollated approach, known as
"masterUncollated".
The controls for the file handling are in the OptimisationSwitches of
etc/controlDict:
OptimisationSwitches
{
...
//- Parallel IO file handler
// uncollated (default), collated or masterUncollated
fileHandler uncollated;
//- collated: thread buffer size for queued file writes.
// If set to 0 or not sufficient for the file size threading is not used.
// Default: 2e9
maxThreadFileBufferSize 2e9;
//- masterUncollated: non-blocking buffer size.
// If the file exceeds this buffer size scheduled transfer is used.
// Default: 2e9
maxMasterFileBufferSize 2e9;
}
When using the collated file handling, memory is allocated for the data in the
thread. maxThreadFileBufferSize sets the maximum size of memory in bytes that
is allocated. If the data exceeds this size, the write does not use threading.
When using the masterUncollated file handling, non-blocking MPI communication
requires a sufficiently large memory buffer on the master node.
maxMasterFileBufferSize sets the maximum size in bytes of the buffer. If the
data exceeds this size, the system uses scheduled communication.
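The two size checks described above reduce to simple predicates. The
function names below are hypothetical and the sketch only restates the
documented rules; it is not the fileHandler implementation.

```cpp
#include <cassert>
#include <cstdint>

// collated: threading is used only if the buffer size is non-zero
// and the data fits within it
inline bool useThreadedWrite
(
    std::uint64_t dataBytes,
    std::uint64_t maxThreadFileBufferSize
)
{
    return maxThreadFileBufferSize > 0
        && dataBytes <= maxThreadFileBufferSize;
}

// masterUncollated: non-blocking transfer is used while the data fits
// in the buffer, otherwise scheduled communication is used
inline bool useNonBlockingTransfer
(
    std::uint64_t dataBytes,
    std::uint64_t maxMasterFileBufferSize
)
{
    return dataBytes <= maxMasterFileBufferSize;
}
```

With the default of 2e9 bytes, a 3e9 byte write falls back to the
non-threaded path (collated) or to scheduled communication
(masterUncollated).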
The installation defaults for the fileHandler choice, maxThreadFileBufferSize
and maxMasterFileBufferSize (set in etc/controlDict) can be overridden within
the case controlDict file, like other parameters. Additionally the fileHandler
can be set by:
- the "-fileHandler" command line argument;
- a FOAM_FILEHANDLER environment variable.
A foamFormatConvert utility allows users to convert files between the collated
and uncollated formats, e.g.
mpirun -np 2 foamFormatConvert -parallel -fileHandler uncollated
An example case demonstrating the file handling methods is provided in:
$FOAM_TUTORIALS/IO/fileHandling
The work was undertaken by Mattijs Janssens, in collaboration with Henry Weller.