- additional dummy template parameter to assist with supporting
derived classes. Currently just used for string types, but can be
extended.
- provide hash specialization for various integer types.
Removes the need for any forwarding.
- change default hasher for HashSet/HashTable from 'string::hash'
to `Hash<Key>`. This avoids questionable hashing calls and/or
avoids compiler resolution problems.
For example,
HashSet<label>::hasher and labelHashSet::hasher now both properly
map to Hash<label> whereas previously HashSet<label> would have
persistently mapped to string::hash, which was incorrect.
- standardize internal hashing functors.
Functor name is 'hasher', as per STL set/map and the OpenFOAM
HashSet/HashTable definitions.
Older code had a local templated name, which added unnecessary
clutter and the template parameter was always defaulted.
For example,
Old: `FixedList<label, 3>::Hash<>()`
New: `FixedList<label, 3>::hasher()`
Unchanged: `labelHashSet::hasher()`
Existing `Hash<>` functor namings are still supported,
but deprecated.
- define hasher and Hash specialization for bitSet and PackedList
- add symmetric hasher for 'face'.
Starts with lowest vertex value and walks in the direction
of the next lowest value. This ensures that the hash code is
independent of face orientation and face rotation.
NB:
- some of the keys for multiphase handling (eg, phasePairKey)
still use yet another function naming: `hash` and `symmHash`.
This will be targeted for alignment in the future.
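A minimal sketch of the symmetric face hashing idea described above, written against std::vector<int> rather than the OpenFOAM face class. The function name symmFaceHash and the combine step are illustrative assumptions, not the library implementation; vertex labels within a face are assumed to be distinct.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: hash a face (list of vertex labels) so that the
// result does not depend on the starting vertex or the winding direction.
std::size_t symmFaceHash(const std::vector<int>& f)
{
    const std::size_t n = f.size();
    if (n == 0) return 0;

    // Locate the lowest vertex value
    std::size_t start = 0;
    for (std::size_t i = 1; i < n; ++i)
    {
        if (f[i] < f[start]) start = i;
    }

    // Walk towards whichever neighbour has the lower value
    const std::size_t next = (start + 1) % n;
    const std::size_t prev = (start + n - 1) % n;
    const bool forward = (f[next] <= f[prev]);

    std::size_t seed = 0;
    std::size_t pos = start;
    for (std::size_t i = 0; i < n; ++i)
    {
        // Illustrative combine step (assumption, not OpenFOAM's)
        seed ^= std::hash<int>()(f[pos]) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        pos = forward ? (pos + 1) % n : (pos + n - 1) % n;
    }
    return seed;
}
```

With this sketch, the faces (3 1 4 2), (4 2 3 1) and (2 4 1 3) are all visited in the order 1 3 2 4 and therefore yield the same hash code.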
- simplify compile/uncompile, reading, assignment
- implicit construct wordRe from keyType (was explicit) to simplify
future API changes.
- make Foam::isspace consistent with std::isspace (C-locale)
by including vertical tab and form feed
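For reference, a small sketch of a locale-independent whitespace test covering the six characters that std::isspace accepts in the default "C" locale. The name isSpaceC is hypothetical and only illustrates the character set, not the actual Foam::isspace code.

```cpp
#include <cassert>

// Hypothetical stand-in: space, tab, newline, vertical tab, form feed
// and carriage return are the C-locale whitespace characters.
inline bool isSpaceC(char c)
{
    return
    (
        c == ' '  || c == '\t' || c == '\n'
     || c == '\v' || c == '\f' || c == '\r'
    );
}
```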
ENH: improve #ifeq float/label comparisons
- originally had tests for regex meta characters strewn across
regExp classes as well as wordRe, keyType, string.
It also had a special-purpose quotemeta static function within string
that relied on a special naming convention for testing the meta
characters.
The regex meta character testing/handling is now relegated entirely
to the regExp class(es).
Relocate quotemeta to stringOps, with a predicate.
- avoid code duplication. Reuse some regExpCxx methods in regExpPosix
- silently deprecate 'startsWith', 'endsWith' methods
(added in 2016: 2b14360662), in favour of
'starts_with', 'ends_with' methods, corresponding to C++20 and
allowing us to cull them in a few years.
- handle single character versions of starts_with, ends_with.
- add single character version of removeEnd and silently deprecate
removeTrailing which did the same thing.
- drop the const versions of removeRepeated, removeTrailing.
Unused and with potential confusion.
STYLE: use shrink_to_fit(), erase()
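A sketch of how the string and single-character forms of starts_with/ends_with can sit side by side as overloads. These are hypothetical free functions on std::string that mirror the C++20 member names; they are not the Foam::string methods themselves.

```cpp
#include <cassert>
#include <string>

// String-prefix form
inline bool starts_with(const std::string& s, const std::string& prefix)
{
    return s.size() >= prefix.size()
        && s.compare(0, prefix.size(), prefix) == 0;
}

// Single-character form: no temporary string is needed
inline bool starts_with(const std::string& s, char c)
{
    return !s.empty() && s.front() == c;
}

// String-suffix form
inline bool ends_with(const std::string& s, const std::string& suffix)
{
    return s.size() >= suffix.size()
        && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Single-character form
inline bool ends_with(const std::string& s, char c)
{
    return !s.empty() && s.back() == c;
}
```

The character overloads check for an empty string first, so testing an empty name for a trailing '/' simply returns false.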
- simplifies their use when reordering lists etc.
(word, fileName, keyType, wordRe)
- "unfriend" IO operators for string types. They require no internal access
- add compile/uncompile methods to keyType for symmetry with wordRe
- when outputting keyType/wordRe, be more explicit about them using
writeQuoted()
- In addition to the traditional Flex-based parser, added a Ragel-based
parser and a handwritten one.
Some representative timings for reading 5874387 points (1958129 tris):
    Flex    Ragel   Manual
    5.2s    4.8s    6.7s    total reading time
    3.8s    3.4s    5.3s    without point merging
- more consistent with STL practices for function classes.
- string::hash function class now operates on std::string rather
than Foam::string since we have now avoided inadvertent use of
string conversion from int in more places.
Original commit message:
------------------------
Parallel IO: New collated file format
When an OpenFOAM simulation runs in parallel, the data for decomposed fields and
mesh(es) has historically been stored in multiple files within separate
directories for each processor. Processor directories are named 'processorN',
where N is the processor number.
This commit introduces an alternative "collated" file format where the data for
each decomposed field (and mesh) is collated into a single file, which is
written and read on the master processor. The files are stored in a single
directory named 'processors'.
The new format produces significantly fewer files - one per field, instead of N
per field. For large parallel cases, this avoids the restriction on the number
of open files imposed by the operating system limits.
The file writing can be threaded allowing the simulation to continue running
while the data is being written to file. NFS (Network File System) is not
needed when using the collated format and additionally, there is an option
to run without NFS with the original uncollated approach, known as
"masterUncollated".
The controls for the file handling are in the OptimisationSwitches of
etc/controlDict:
OptimisationSwitches
{
    ...

    //- Parallel IO file handler
    // uncollated (default), collated or masterUncollated
    fileHandler uncollated;

    //- collated: thread buffer size for queued file writes.
    // If set to 0 or not sufficient for the file size, threading is not used.
    // Default: 2e9
    maxThreadFileBufferSize 2e9;

    //- masterUncollated: non-blocking buffer size.
    // If the file exceeds this buffer size, scheduled transfer is used.
    // Default: 2e9
    maxMasterFileBufferSize 2e9;
}
When using the collated file handling, memory is allocated for the data in the
thread. maxThreadFileBufferSize sets the maximum size of memory in bytes that
is allocated. If the data exceeds this size, the write does not use threading.
When using the masterUncollated file handling, non-blocking MPI communication
requires a sufficiently large memory buffer on the master node.
maxMasterFileBufferSize sets the maximum size in bytes of the buffer. If the
data exceeds this size, the system uses scheduled communication.
The installation defaults for the fileHandler choice, maxThreadFileBufferSize
and maxMasterFileBufferSize (set in etc/controlDict) can be overridden within
the case controlDict file, like other parameters. Additionally the fileHandler
can be set by:
- the "-fileHandler" command line argument;
- a FOAM_FILEHANDLER environment variable.
A foamFormatConvert utility allows users to convert files between the collated
and uncollated formats, e.g.
mpirun -np 2 foamFormatConvert -parallel -fileHandler uncollated
An example case demonstrating the file handling methods is provided in:
$FOAM_TUTORIALS/IO/fileHandling
The work was undertaken by Mattijs Janssens, in collaboration with Henry Weller.
- consolidate word::validated() into word::validate() and also allow it
as a short form for string::validate<word>(). Also less confusing than
having similarly named methods that essentially do the same thing.
- more consistent const access when iterating over strings
- add valid(char) for keyType and wordRe
- ensure that the string-related classes have consistently similar
matching methods. Use operator()(const std::string) as an entry
point for the match() method, which makes it easier to use for
filters and predicates. In some cases this will also permit using
a HashSet as a match predicate.
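The predicate idea above can be sketched with standard library types. RegexMatcher, SetMatcher and subset are hypothetical names used only for illustration; the point is that any object providing operator()(const std::string&) can be handed to the same filter.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical matcher: operator() forwards to match(), so the object
// can be passed wherever a unary predicate is expected.
class RegexMatcher
{
    std::regex re_;

public:

    explicit RegexMatcher(const std::string& pattern) : re_(pattern) {}

    bool match(const std::string& text) const
    {
        return std::regex_match(text, re_);
    }

    bool operator()(const std::string& text) const { return match(text); }
};

// Hypothetical set-based matcher with the same call signature
struct SetMatcher
{
    std::unordered_set<std::string> names;

    bool operator()(const std::string& text) const
    {
        return names.count(text) != 0;
    }
};

// Generic filter: works with either matcher (or a lambda)
template<class UnaryPredicate>
std::vector<std::string> subset
(
    const std::vector<std::string>& input,
    const UnaryPredicate& pred
)
{
    std::vector<std::string> out;
    std::copy_if(input.begin(), input.end(), std::back_inserter(out), pred);
    return out;
}
```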
regExp
====
- the set method now returns a bool to signal that the requested
pattern was compiled.
wordRe
====
- have separate constructors with the compilation option (was previously
a default parameter). This leaves the single parameter constructor
explicit, but the two parameter version is now non-explicit, which
makes it easier to use when building lists.
- renamed compile-option from REGEX to REGEXP for consistency with
the <regex.h>, <regex> header names etc.
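To illustrate why a non-explicit two-parameter constructor helps when building lists, here is a sketch using a hypothetical Pattern class. It is loosely modelled on the description above and is not the actual wordRe interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pattern class (illustration only)
class Pattern
{
public:

    enum compOption { LITERAL, REGEXP };

    // Single-parameter form stays explicit: no silent string conversion
    explicit Pattern(const std::string& text)
    :
        text_(text),
        opt_(LITERAL)
    {}

    // Two-parameter form is non-explicit, so {text, option} pairs can
    // be used directly inside a brace-initialized list
    Pattern(const std::string& text, compOption opt)
    :
        text_(text),
        opt_(opt)
    {}

    bool isPattern() const { return opt_ == REGEXP; }
    const std::string& str() const { return text_; }

private:

    std::string text_;
    compOption opt_;
};

// Each element is built from a {text, option} pair without naming the type
inline std::vector<Pattern> examplePatterns()
{
    return
    {
        {"inlet", Pattern::LITERAL},
        {"wall.*", Pattern::REGEXP}
    };
}
```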
wordRes
====
- renamed from wordReListMatcher -> wordRes. For reduced typing and
since it behaves as an entity only slightly related to its underlying
list nature.
- Provide old name as typedef and include for code transition.
- pass through some list methods into wordRes
hashedWordList
====
- hashedWordList[const word& name] now returns -1 if the name is
not found in the list of indices. That has been a pending change
ever since hashedWordList was generalized out of speciesTable
(Oct-2010).
- add operator()(const word& name) for easy use as a predicate
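A sketch of the lookup behaviour described for hashedWordList, using standard containers. HashedNames is a hypothetical class; only the "-1 when not found" and "operator() as predicate" semantics are taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical word list with hashed name -> index lookup
class HashedNames
{
    std::vector<std::string> names_;
    std::unordered_map<std::string, int> index_;

public:

    explicit HashedNames(const std::vector<std::string>& names)
    :
        names_(names)
    {
        for (std::size_t i = 0; i < names_.size(); ++i)
        {
            // emplace keeps the first index if a name is repeated
            index_.emplace(names_[i], static_cast<int>(i));
        }
    }

    // Index of the name, or -1 if it is not in the list
    int operator[](const std::string& name) const
    {
        const auto iter = index_.find(name);
        return (iter != index_.end()) ? iter->second : -1;
    }

    // Predicate form: true if the name is in the list
    bool operator()(const std::string& name) const
    {
        return index_.count(name) != 0;
    }
};
```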
STYLE: adjust parameter names in stringListOps
- reflect if the parameter is being used as a primary matcher, or the
matcher will be derived from the parameter.
For example,
(const char* re), which first creates a regExp
versus (const regExp& matcher) which is used directly.
- If the underlying type is contiguous, FixedList hashes its storage directly.
- Drop labelPairHash (non-commutative) from fvMeshDistribute since
FixedList::Hash does the right thing anyhow.
- Hash<edge> specialization is commutative, without multiplication.
- Hash<triFace> specialization kept multiplication (but now uLabel).
There's not much point optimizing it, since it's not used much anyhow.
Misc. changes
- added StaticAssert to NamedEnum.H
- label.H / uLabel.H : define FOAM_LABEL_MAX, FOAM_ULABEL_MAX with the
values finally used for the storage. These can be useful for pre-processor
checks elsewhere (although I stopped needing them in the meantime).
- make table power-of-two, but since it seems to give only a 1-2% performance
improvement, maybe forget it too.
- remove two-argument form of hashing classes and do the modulus directly
within HashTable instead. This simplifies things a fair bit.
- migrate Hash<void*> from db/dlLibrary to primitives/hashes/Hash
- it was possible to create a PackedList::iterator from a
PackedList::const_iterator and violate const-ness
- added HashTable::printInfo for emitting some information
- changed default table sizes from 100 -> 128 in preparation for future
2^n table sizes
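The relationship between doing the modulus inside the table and using 2^n table sizes can be sketched as follows. The helper names are hypothetical and do not correspond to the HashTable interface; nextPowerOfTwo assumes the request is small enough that the result fits in std::size_t.

```cpp
#include <cassert>
#include <cstddef>

// Smallest power of two that is >= n (returns 1 for n <= 1)
inline std::size_t nextPowerOfTwo(std::size_t n)
{
    std::size_t p = 1;
    while (p < n)
    {
        p <<= 1;
    }
    return p;
}

// Bucket index for any capacity: the table reduces the hash value
// itself, so the hasher only needs a one-argument form
inline std::size_t bucketIndex(std::size_t hashValue, std::size_t capacity)
{
    return hashValue % capacity;
}

// Same result when capacity is a power of two, using a bit mask
inline std::size_t bucketIndexPow2(std::size_t hashValue, std::size_t capacity)
{
    return hashValue & (capacity - 1);
}
```

A requested size of 100 rounds up to 128 here, which matches the new default table size mentioned above.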
- a possible future replacement for keyType, but the immediate use is the
wordReList for grepping through other lists.
- note that the argList treatment of '(' ... ')' yields quoted strings,
which we can use for building a wordReList
minor cleanup of regExp class
- constructor from std::string, match std::string and
operator=(std::string&)
rely on automatic conversion to Foam::string
- ditch partialMatch with sub-groups, it doesn't make much sense
- Istream and Ostream now retain backslashes when reading/writing strings.
The previous implementation simply discarded them, except when used to
escape a double-quote or a newline. It is now vitally important to retain
them, eg for quoting regular expression meta-characters.
The backslash continues to be used as an escape character for double-quote
and newline, but is otherwise passed through "as-is" without any other
special meaning (ie, they are *NOT* C-style strings). This helps avoid
'backslash hell'!
For example,
string: "match real dots \.+, question mark \? or any char .*"
C-style: "match real dots \\.+, question mark \\? or any char .*"
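The escaping rule can be sketched as a small reader for double-quoted strings. readQuoted is a hypothetical function, not the Istream code. Keeping the escaped newline in the result, and treating each backslash independently (a backslash never escapes another backslash), are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reader: extracts the contents of a double-quoted string.
// A backslash escapes only a double-quote or a newline; any other
// backslash is kept verbatim (no C-style escape processing).
// Returns an empty string if the input does not start with a quote;
// an unterminated string yields everything read so far.
inline std::string readQuoted(const std::string& input)
{
    std::string out;
    if (input.empty() || input[0] != '"') return out;

    for (std::size_t i = 1; i < input.size(); ++i)
    {
        const char c = input[i];

        if (c == '"') break;  // unescaped closing quote

        if (c == '\\' && i + 1 < input.size())
        {
            const char next = input[i + 1];
            if (next == '"' || next == '\n')
            {
                out += next;  // escaped quote/newline: drop the backslash
                ++i;
                continue;
            }
        }

        out += c;  // everything else, including other backslashes
    }

    return out;
}
```

In C++ source the test inputs need C-style escaping themselves, which is exactly the doubling of backslashes that the stream format avoids.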
- combined subfiles in db/IOstreams, some had more copyright info than code
- OPstreamI.H contained only private methods, moved into OPstream.C
Are these really correct?
IOstreams/Istream.H:# include "HashTable.C"
token/token.H:#define NoHashTableC