Original commit message:
------------------------
Parallel IO: New collated file format
When an OpenFOAM simulation runs in parallel, the data for decomposed fields and
mesh(es) has historically been stored in multiple files within separate
directories for each processor. Processor directories are named 'processorN',
where N is the processor number.
This commit introduces an alternative "collated" file format where the data for
each decomposed field (and mesh) is collated into a single file, which is
written and read on the master processor. The files are stored in a single
directory named 'processors'.
The new format produces significantly fewer files - one per field, instead of N
per field. For large parallel cases, this avoids the restriction on the number
of open files imposed by the operating system limits.
The file writing can be threaded allowing the simulation to continue running
while the data is being written to file. NFS (Network File System) is not
needed when using the the collated format and additionally, there is an option
to run without NFS with the original uncollated approach, known as
"masterUncollated".
The controls for the file handling are in the OptimisationSwitches of
etc/controlDict:
OptimisationSwitches
{
...
//- Parallel IO file handler
// uncollated (default), collated or masterUncollated
fileHandler uncollated;
//- collated: thread buffer size for queued file writes.
// If set to 0 or not sufficient for the file size threading is not used.
// Default: 2e9
maxThreadFileBufferSize 2e9;
//- masterUncollated: non-blocking buffer size.
// If the file exceeds this buffer size scheduled transfer is used.
// Default: 2e9
maxMasterFileBufferSize 2e9;
}
When using the collated file handling, memory is allocated for the data in the
thread. maxThreadFileBufferSize sets the maximum size of memory in bytes that
is allocated. If the data exceeds this size, the write does not use threading.
When using the masterUncollated file handling, non-blocking MPI communication
requires a sufficiently large memory buffer on the master node.
maxMasterFileBufferSize sets the maximum size in bytes of the buffer. If the
data exceeds this size, the system uses scheduled communication.
The installation defaults for the fileHandler choice, maxThreadFileBufferSize
and maxMasterFileBufferSize (set in etc/controlDict) can be over-ridden within
the case controlDict file, like other parameters. Additionally the fileHandler
can be set by:
- the "-fileHandler" command line argument;
- a FOAM_FILEHANDLER environment variable.
A foamFormatConvert utility allows users to convert files between the collated
and uncollated formats, e.g.
mpirun -np 2 foamFormatConvert -parallel -fileHandler uncollated
An example case demonstrating the file handling methods is provided in:
$FOAM_TUTORIALS/IO/fileHandling
The work was undertaken by Mattijs Janssens, in collaboration with Henry Weller.
There are a few issues:
- error would only throw exceptions if not parallel
- if we change this we also need to make sure the functionObjectList
construction is synchronised
- bounding box overlap was not returning the correct status so the code
to avoid the issue of 'badly formed bounding box' was not triggered.
- consolidate word::validated() into word::validate() and also allow
as short form for string::validate<word>(). Also less confusing than
having similarly named methods that essentially do the same thing.
- more consistent const access when iterating over strings
- add valid(char) for keyType and wordRe
- error::throwExceptions(bool) returning the previous state makes it
easier to set and restore states.
- throwing() method to query the current handling (if required).
- the normal error::throwExceptions() and error::dontThrowExceptions()
also return the previous state, to make it easier to restore later.
Now the postProcess utility '-region' option works correctly, e.g. for
the chtMultiRegionSimpleFoam/heatExchanger case
postProcess -region air -func "mag(U)"
calculates 'mag(U)' for all the time steps in region 'air'.
To re-use existing 'sampleDict' files simply add the following entries:
type sets;
libs ("libsampling.so");
and run
postProcess -func sampleDict
It is probably better to also rename 'sampleDict' -> 'sample' and then run
postProcess -func sampleDict
Replaced the 'postProcess' argument to the 'write' and 'execute'
functions with the single static member 'postProcess' in the
functionObject base-class.
e.g.
functions
{
#includeFunc mag(U)
}
executes 'mag' on the field 'U' writing the field 'mag(U)'.
The equivalent post-processing command is
postProcess -func 'mag(U)'
with the more general and flexible 'postProcess' utility and '-postProcess' solver option
Rationale
---------
Both the 'postProcess' utility and '-postProcess' solver option use the
same extensive set of functionObjects available for data-processing
during the run avoiding the substantial code duplication necessary for
the 'foamCalc' and 'postCalc' utilities and simplifying maintenance.
Additionally consistency is guaranteed between solver data processing
and post-processing.
The functionObjects have been substantially re-written and generalized
to simplify development and encourage contribution.
Configuration
-------------
An extensive set of simple functionObject configuration files are
provided in
OpenFOAM-dev/etc/caseDicts/postProcessing
and more will be added in the future. These can either be copied into
'<case>/system' directory and included into the 'controlDict.functions'
sub-dictionary or included directly from 'etc/caseDicts/postProcessing'
using the '#includeEtc' directive or the new and more convenient
'#includeFunc' directive which searches the
'<etc>/caseDicts/postProcessing' directories for the selected
functionObject, e.g.
functions
{
#includeFunc Q
#includeFunc Lambda2
}
'#includeFunc' first searches the '<case>/system' directory in case
there is a local configuration.
Description of #includeFunc
---------------------------
Specify a functionObject dictionary file to include, expects the
functionObject name to follow (without quotes).
Search for functionObject dictionary file in
user/group/shipped directories.
The search scheme allows for version-specific and
version-independent files using the following hierarchy:
- \b user settings:
- ~/.OpenFOAM/\<VERSION\>/caseDicts/postProcessing
- ~/.OpenFOAM/caseDicts/postProcessing
- \b group (site) settings (when $WM_PROJECT_SITE is set):
- $WM_PROJECT_SITE/\<VERSION\>/caseDicts/postProcessing
- $WM_PROJECT_SITE/caseDicts/postProcessing
- \b group (site) settings (when $WM_PROJECT_SITE is not set):
- $WM_PROJECT_INST_DIR/site/\<VERSION\>/caseDicts/postProcessing
- $WM_PROJECT_INST_DIR/site/caseDicts/postProcessing
- \b other (shipped) settings:
- $WM_PROJECT_DIR/etc/caseDicts/postProcessing
An example of the \c \#includeFunc directive:
\verbatim
#includeFunc <funcName>
\endverbatim
postProcess
-----------
The 'postProcess' utility and '-postProcess' solver option provide the
same set of controls to execute functionObjects after the run either by
reading a specified set of fields to process in the case of
'postProcess' or by reading all fields and models required to start the
run in the case of '-postProcess' for each selected time:
postProcess -help
Usage: postProcess [OPTIONS]
options:
-case <dir> specify alternate case directory, default is the cwd
-constant include the 'constant/' dir in the times list
-dict <file> read control dictionary from specified location
-field <name> specify the name of the field to be processed, e.g. U
-fields <list> specify a list of fields to be processed, e.g. '(U T p)' -
regular expressions not currently supported
-func <name> specify the name of the functionObject to execute, e.g. Q
-funcs <list> specify the names of the functionObjects to execute, e.g.
'(Q div(U))'
-latestTime select the latest time
-newTimes select the new times
-noFunctionObjects
do not execute functionObjects
-noZero exclude the '0/' dir from the times list, has precedence
over the -withZero option
-parallel run in parallel
-region <name> specify alternative mesh region
-roots <(dir1 .. dirN)>
slave root directories for distributed running
-time <ranges> comma-separated time ranges - eg, ':10,20,40:70,1000:'
-srcDoc display source code in browser
-doc display application documentation in browser
-help print the usage
pimpleFoam -postProcess -help
Usage: pimpleFoam [OPTIONS]
options:
-case <dir> specify alternate case directory, default is the cwd
-constant include the 'constant/' dir in the times list
-dict <file> read control dictionary from specified location
-field <name> specify the name of the field to be processed, e.g. U
-fields <list> specify a list of fields to be processed, e.g. '(U T p)' -
regular expressions not currently supported
-func <name> specify the name of the functionObject to execute, e.g. Q
-funcs <list> specify the names of the functionObjects to execute, e.g.
'(Q div(U))'
-latestTime select the latest time
-newTimes select the new times
-noFunctionObjects
do not execute functionObjects
-noZero exclude the '0/' dir from the times list, has precedence
over the -withZero option
-parallel run in parallel
-postProcess Execute functionObjects only
-region <name> specify alternative mesh region
-roots <(dir1 .. dirN)>
slave root directories for distributed running
-time <ranges> comma-separated time ranges - eg, ':10,20,40:70,1000:'
-srcDoc display source code in browser
-doc display application documentation in browser
-help print the usage
The functionObjects to execute may be specified on the command-line
using the '-func' option for a single functionObject or '-funcs' for a
list, e.g.
postProcess -func Q
postProcess -funcs '(div(U) div(phi))'
In the case of 'Q' the default field to process is 'U' which is
specified in and read from the configuration file but this may be
overridden thus:
postProcess -func 'Q(Ua)'
as is done in the example above to calculate the two forms of the divergence of
the velocity field. Additional fields which the functionObjects may depend on
can be specified using the '-field' or '-fields' options.
The 'postProcess' utility can only be used to execute functionObjects which
process fields present in the time directories. However, functionObjects which
depend on fields obtained from models, e.g. properties derived from turbulence
models can be executed using the '-postProcess' of the appropriate solver, e.g.
pisoFoam -postProcess -func PecletNo
or
sonicFoam -postProcess -func MachNo
In this case all required fields will have already been read so the '-field' or
'-fields' options are not be needed.
Henry G. Weller
CFD Direct Ltd.
the equivalent functionality is provided by the writeRegisteredObject
functionObject in a MUCH simpler, easier and extensible manner.
functionObject: Removed the now redundant 'timeSet' function.
- Avoids the need for the 'OutputFilterFunctionObject' wrapper
- Time-control for execution and writing is now provided by the
'timeControlFunctionObject' which instantiates the processing
'functionObject' and controls its operation.
- Alternative time-control functionObjects can now be written and
selected at run-time without the need to compile wrapped version of
EVERY existing functionObject which would have been required in the
old structure.
- The separation of 'execute' and 'write' functions is now formalized in the
'functionObject' base-class and all derived classes implement the
two functions.
- Unnecessary implementations of functions with appropriate defaults
in the 'functionObject' base-class have been removed reducing
clutter and simplifying implementation of new functionObjects.
- The 'coded' 'functionObject' has also been updated, simplified and tested.
- Further simplification is now possible by creating some general
intermediate classes derived from 'functionObject'.
Simplified and generalized the handling of functionObjects which fail to
construct by removing them from the list rather than maintaining an
"enabled" switch in each functionObject.
Executes application functionObjects to post-process existing results.
If the "dict" argument is specified the functionObjectList is constructed
from that dictionary otherwise the functionObjectList is constructed from
the "functions" sub-dictionary of "system/controlDict"
Multiple time-steps may be processed and the standard utility time
controls are provided.
This functionality is equivalent to execFlowFunctionObjects but in a
more efficient and general manner and will be included in all the
OpenFOAM solvers if it proves effective and maintainable.
The command-line options available with the "-postProcess" option may be
obtained by
simpleFoam -help -postProcess
Usage: simpleFoam [OPTIONS]
options:
-case <dir> specify alternate case directory, default is the cwd
-constant include the 'constant/' dir in the times list
-dict <file> read control dictionary from specified location
-latestTime select the latest time
-newTimes select the new times
-noFunctionObjects
do not execute functionObjects
-noZero exclude the '0/' dir from the times list, has precedence
over the -withZero option
-parallel run in parallel
-postProcess Execute functionObjects only
-region <name> specify alternative mesh region
-roots <(dir1 .. dirN)>
slave root directories for distributed running
-time <ranges> comma-separated time ranges - eg, ':10,20,40:70,1000:'
-srcDoc display source code in browser
-doc display application documentation in browser
-help print the usage
Henry G. Weller
CFD Direct Ltd.
Time::adjustDeltaT(). It allows function objects to manipulate the time step to
dump at adjustable times. The following options are available for output in
function objects: timeStep, outputTime, adjustableTime, runTime, clockTime
and cpuTime.
end of Time::operator++. This allows to know if the next timeIndex will be
a dumping time. The function object "partialWrite" modifyes the write option
of the those fields which will be written down at given intervals of the overall
outout times.