- a -valgrind option for logging with valgrind
- determine the number of processors from system/decomposeParDict
  (or the file specified with -decomposeParDict) if -np was not specified
- this should normally not be triggered, provided that setRefCell was
  used. However, if getRefCellValue() was called without any previous
  checking of refCelli, it is possible to provoke errors (see the sketch
  below).
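  A minimal sketch of the intended usage pattern (not from the release notes;
  the field name 'p' and the 'simple' solution control are illustrative only):

      label pRefCell = 0;
      scalar pRefValue = 0.0;

      // Establish the reference cell before any value lookup
      setRefCell(p, simple.dict(), pRefCell, pRefValue);

      // Safe, since setRefCell() has validated/assigned pRefCell
      const scalar pRef = getRefCellValue(p, pRefCell);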
Improvements to existing functionality
--------------------------------------
- MPI is initialised without thread support if it is not needed, e.g. when
  running uncollated.
- Use native C++11 threading; avoids problems with static destruction order.
- etc/cellModels is now only read if needed.
- etc/controlDict can now be read from the environment variable FOAM_CONTROLDICT
- Uniform files (e.g. '0/uniform/time') are now read only once on the master only
(with the masterUncollated or collated file handlers)
- collated format writes to 'processorsNNN' instead of 'processors'. The file
format is unchanged.
- Thread buffer and file buffer sizes are no longer limited to 2 GB.
The global controlDict file contains parameters for file handling. Under some
circumstances, e.g. running in parallel on a system without NFS, the user may
need to set some parameters, e.g. fileHandler, before the global controlDict
file can be read from disk. To support this, OpenFOAM now allows the global
controlDict to be supplied as a string in the FOAM_CONTROLDICT environment
variable.
The FOAM_CONTROLDICT environment variable can be set to the content of the
global controlDict file, e.g. from a sh/bash shell:
export FOAM_CONTROLDICT=$(foamDictionary $FOAM_ETC/controlDict)
FOAM_CONTROLDICT can then be passed to mpirun using the -x option, e.g.:
mpirun -np 2 -x FOAM_CONTROLDICT simpleFoam -parallel
Note that while this avoids the need for NFS to read the OpenFOAM configuration,
the executable still needs to load shared libraries, which must either be copied
locally or be available via NFS or equivalent.
New: Multiple IO ranks
----------------------
The masterUncollated and collated fileHandlers can now use multiple ranks for
writing e.g.:
mpirun -np 6 simpleFoam -parallel -ioRanks '(0 3)'
In this example ranks 0 ('processor0') and 3 ('processor3') now handle all the
I/O. Rank 0 handles 0,1,2 and rank 3 handles 3,4,5. The set of IO ranks
should always include 0 as the first element and be sorted in increasing order.
The collated fileHandler uses the directory naming processorsNNN_XXX-YYY, where
NNN is the total number of processors and XXX and YYY are the first and last
processor in the group, e.g. in the above example the directories would be
processors6_0-2
processors6_3-5
and each of the collated files in these directories contains data of the local
ranks only. The same naming also applies when e.g. running decomposePar:
decomposePar -fileHandler collated -ioRanks '(0 3)'
New: Distributed data
---------------------
The individual root directories can be placed on different hosts with different
paths if necessary. Previously it was necessary to specify the root per slave
process; this has been simplified with the option of specifying the root per
host using the -hostRoots command line option:
mpirun -np 6 simpleFoam -parallel -ioRanks '(0 3)' \
    -hostRoots '("machineA" "/tmp/" "machineB" "/tmp")'
The hostRoots option is followed by a list of machine name + root directory
pairs; the machine name may contain regular expressions.
New: hostCollated
-----------------
The new hostCollated fileHandler automatically sets 'ioRanks' to the lowest
rank on each host, e.g. to run simpleFoam on 6 processors with ranks 0-2 on
machineA and ranks 3-5 on machineB, with the machines specified in the
hostfile:
mpirun -np 6 --hostfile hostfile simpleFoam -parallel -fileHandler hostCollated
This is equivalent to
mpirun -np 6 --hostfile hostfile simpleFoam -parallel -fileHandler collated -ioRanks '(0 3)'
This example will write directories:
processors6_0-2/
processors6_3-5/
A typical example would use distributed data, e.g. on two nodes, machineA and
machineB, each with three processes:
decomposePar -fileHandler collated -case cavity
# Copy case (constant/*, system/*, processors6/) to master:
rsync -a cavity machineA:/tmp/
# Create root on slave:
ssh machineB mkdir -p /tmp/cavity
# Run
mpirun --hostfile hostfile icoFoam \
    -case /tmp/cavity -parallel -fileHandler hostCollated \
    -hostRoots '("machineA" "/tmp" "machineB" "/tmp")'
Contributed by Mattijs Janssens
Specialized variants of the power law porosity and k-epsilon turbulence models
developed to simulate atmospheric flow over forested and non-forested complex
terrain.
Class
    Foam::powerLawLopesdaCosta

Description
    Variant of the power law porosity model with spatially varying
    drag coefficient given by:

    \f[
        S = -\rho C_d \Sigma |U|^{(C_1 - 1)} U
    \f]

    where
    \vartable
        \Sigma | Porosity surface area per unit volume
        C_d    | Model linear coefficient
        C_1    | Model exponent coefficient
    \endvartable

    Reference:
    \verbatim
        Costa, J. C. P. L. D. (2007).
        Atmospheric flow over forested and non-forested complex terrain.
    \endverbatim
Class
    Foam::RASModels::kEpsilonLopesdaCosta

Description
    Variant of the standard k-epsilon turbulence model with additional source
    terms to handle the changes in turbulence in porous regions represented by
    the powerLawLopesdaCosta porosity model.

    Reference:
    \verbatim
        Costa, J. C. P. L. D. (2007).
        Atmospheric flow over forested and non-forested complex terrain.
    \endverbatim

    The default model coefficients are
    \verbatim
        kEpsilonLopesdaCostaCoeffs
        {
            Cmu         0.09;
            C1          1.44;
            C2          1.92;
            sigmak      1.0;
            sigmaEps    1.3;
        }
    \endverbatim
Tutorial case to follow.
- The iterator for a HashSet dereferences directly to its key.
- Eg,
      for (const label patchi : patchSet)
      {
          ...
      }
  vs.
      forAllConstIter(labelHashSet, patchSet, iter)
      {
          const label patchi = iter.key();
          ...
      }
- the algorithm was last used in OpenFOAM-2.4, after which it was
replaced with a FaceCellWave version.
Whereas the original (2.4.x) version exhibited performance
degradation on very large meshes (with explicit constraints), the
FaceCellWave version exhibited performance issues with large numbers
of blocked faces.
With large numbers of blocked faces, the FaceCellWave regionSplit
could take between 10 and 100 times longer due to the slow
propagation speed through blocked faces.
The 2.4 regionSplit has been revamped to avoid local memory
allocations, which appears to have been the source of the original
performance issues on large meshes.
For additional performance, intermediate renumbering is also avoided
during the consolidation of regions over processor domains.
- controlled by the 'printExecutionFormat' InfoSwitch in
  etc/controlDict
// Style for "ExecutionTime = " output
// - 0 = seconds (with trailing 's')
// - 1 = day-hh:mm:ss
ExecutionTime = 112135.2 s ClockTime = 113017 s
ExecutionTime = 1-07:08:55.20 ClockTime = 1-07:23:37
- Callable via the new Time::printExecutionTime() method,
which also helps to reduce clutter in the applications.
Eg,
    runTime.printExecutionTime(Info);
vs
    Info<< "ExecutionTime = " << runTime.elapsedCpuTime() << " s"
        << " ClockTime = " << runTime.elapsedClockTime() << " s"
        << nl << endl;
--
ENH: return elapsedClockTime() and clockTimeIncrement() as double
- previously returned as time_t, which is less portable.
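  A minimal sketch (assuming a Time object named runTime):

      // Both now return floating-point seconds rather than time_t
      const double wallClock = runTime.elapsedClockTime();
      const double sinceLast = runTime.clockTimeIncrement();

--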
For example, with some HashTable or Map container of models
    { model0 => 1, model1 => 4, model2 => 5, model3 => 12, model4 => 15, }
specifying the remapping
    Map<label> mapper({{1, 3}, {2, 6}, {3, 12}, {5, 8}});
and calling inplaceMapValue(mapper, models) then yields
    { model0 => 3, model1 => 4, model2 => 8, model3 => 12, model4 => 15, }
--
ENH: extend bitSet::count() to optionally count unset bits instead.
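  For example, a minimal sketch (assuming the unset bits are selected with a
  bool argument to count()):

      bitSet bits(8);
      bits.set(3);

      const label nSet   = bits.count();       // set bits -> 1
      const label nUnset = bits.count(false);  // unset bits -> 7 (assumed overload)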
--
ENH: BitOps compatibility methods for boolList.
- These ease coding that uses a boolList instead of bitSet and use
  short-circuit logic when possible.
  Eg, when 'bitset' and 'bools' contain the same information:
      bitset.count()  <->  BitOps::count(bools)
      bitset.all()    <->  BitOps::all(bools)
      bitset.any()    <->  BitOps::any(bools)
      bitset.none()   <->  BitOps::none(bools)
  These methods can then be used directly in parameters or in logic.
  Eg,
      returnReduce(bitset.any(), orOp<bool>());
      returnReduce(BitOps::any(bools), orOp<bool>());
      if (BitOps::any(bools)) ...
- since PackedBoolList is now a compatibility typedef for bitSet,
it is useful to have an additional means of distinction.
STYLE: simplify internal version tests and compiler defines.
- the API version is now conveyed via the OPENFOAM define directly.
The older OPENFOAM_PLUS define is provided for existing code.
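  For example, a minimal sketch of a version-dependent compile guard (assuming
  the OPENFOAM define carries the API version as an integer, e.g. 1806):

      #if defined(OPENFOAM) && (OPENFOAM >= 1806)
          // code relying on the newer API
      #else
          // fallback for older versions
      #endif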
- parsing error state only arises from a missing final newline
in the file (which the dnl macro does not capture).
Report with a warning instead of modifying the dnl macro since
we generally wish to know about this anyhow.
- add missing newline to YEqn.H file.
- flags the following types of problems:
  * mismatches:
        keyword mismatch ( set of { brackets ) in the } entry;
  * underflow (too many closing brackets):
        keyword too many ( set of ) brackets ) in ) entry;
- a missing semi-colon
      dict
      {
          keyword entry with missing semi-colon
      }
will be flagged as 'underflow', since it parses through the '}' but
did not open with it.
Max monitoring depth is 60 levels of nesting, to avoid incurring any
memory allocation.
- handling of dead links (find -L -delete unsupported)
- remove the ignore-case flag on 's/../../i' used in the have_scotch script.
  It is unneeded and not tolerated by Darwin's sed.
- avoid embedded comments in EXE_INC (Make/options files), which do
not work well with the OSX LLVM cpp.
It strips out the comments but also removes the continuation char.
STYLE: adjust notes about paraview library locations
- generalize some of the library extensions (.so vs .dylib).
Provide as wmake 'sysFunctions'
- added note about unsupported/incomplete system support
- centralize detection of ThirdParty packages into wmake/ subdirectory
by providing a series of scripts in the spirit of GNU autoconfig.
For example,
have_boost, have_readline, have_scotch, ...
Each of the `have_<package>` scripts will generally provide the
following types of functions:

    have_<package>          # detection
    no_<package>            # reset
    echo_<package>          # echoing

and the following types of variables:

    HAVE_<package>          # unset or 'true'
    <package>_ARCH_PATH     # root for <package>
    <package>_INC_DIR       # include directory for <package>
    <package>_LIB_DIR       # library directory for <package>
This simplifies the calling scripts:
    if have_metis
    then
        wmake metisDecomp
    fi
As well as reducing clutter in the corresponding Make/options:
    EXE_INC = \
        -I$(METIS_INC_DIR) \
        -I../decompositionMethods/lnInclude

    LIB_LIBS = \
        -L$(METIS_LIB_DIR) -lmetis
Any additional modifications (platform-specific or for an external build
system) can now be made centrally.
- also simplify parsing by accepting any case for keywords.
  This implies that something like "sOlId" or "SoLiD" will also
  be accepted. Although nobody should really count on this rather
  generous behaviour, it does simplify the state machine even further.
- in 2.4.x the general default for polyMesh::findCell was FACE_DIAG_TRIS,
but this was changed to CELL_TETS for better handling of concave
cells.
- in snappyHexMesh meshRefinement, findCell is used to turn the specified
  locations in the mesh into the cells selected for closer refinement. Using
  CELL_TETS causes an octree rebuild whenever the mesh has changed, which adds
  considerable overhead. For this operation, the faster FACE_DIAG_TRIS mode
  can be used instead (see the sketch below).
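  A minimal sketch of selecting the cheaper mode explicitly (assuming a
  findCell overload that accepts the cellDecomposition enum; 'mesh' and
  'location' are illustrative names):

      // Locate the cell containing 'location' using the cheaper face-diagonal
      // decomposition rather than the CELL_TETS default
      const label celli = mesh.findCell(location, polyMesh::FACE_DIAG_TRIS);

      if (celli == -1)
      {
          // 'location' was not found in the (local) mesh
      }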
- the previous grammar used
'/*' { fgoto comment; }
to start processing multi-line comments and
comment := any* :>> '*/' @{ fgoto main; };
as a finishing action to return to normal lexing, but seemed not to
have been triggered properly.
Now simply trap in a single rule:
'/*' any* :>> '*/'; # Multi-line comment
STYLE: use more compact dnl (delete to newline)
OLD: [^\n]* '\n'
NEW: (any* -- '\n') '\n'
eliminates the intermediate state
- the API-versioned calls (eg, tecini142, teczne142, tecpoly142, tecend142),
the limited availability of the SDK and lack of adequate testing make
proper maintenance very difficult.