distributions: Generalised statistical distributions

This new class hierarchy replaces the distributions previously provided
by the Lagrangian library.

All distributions (except fixedValue) now require a "size exponent", Q,
to be specified along with their other coefficients. If a distribution's
CDF(x) (cumulative distribution function) represents what proportion of
the distribution takes a value below x, then Q determines what is meant
by "proportion":

- If Q=0, then "proportion" means the number of sampled values expected
  to be below x divided by the total number of sampled values.

- If Q=3, then "proportion" means the expected sum of sampled values
  cubed for values below x divided by the total sum of values cubed. If
  x is a length, then this can be interpreted as a proportion of the
  total volume of sampled objects.

- If Q=2, and x is a length, then the distribution might represent the
  proportion of surface area, and so on...
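
As an illustration of the meaning of Q, the following is a minimal
sketch in standalone C++. It does not use the distributions library,
and the helper name proportionBelow is hypothetical. It computes the
empirical "proportion" below x for a set of sampled values, weighting
each value by itself raised to the power Q.

```cpp
#include <cmath>
#include <vector>

// Fraction of the total "size" held by the values below x, where the size
// of a value v is measured as v^Q. Q = 0 counts values, Q = 2 weights by
// area and Q = 3 weights by volume (if v is a length).
// Hypothetical helper for illustration; not part of the distributions.
double proportionBelow(const std::vector<double>& values, double x, int Q)
{
    double below = 0, total = 0;

    for (const double v : values)
    {
        const double weight = std::pow(v, Q);

        total += weight;

        if (v < x) below += weight;
    }

    // An empty (or zero-size) sample has no meaningful proportion
    return total > 0 ? below/total : 0;
}
```

For the diameters {1, 1, 1, 3} and x = 2, this gives 0.75 with Q=0
(three of the four values are below x) but only 0.1 with Q=3 (the three
small values hold 3 of the 30 units of cubed diameter).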

In addition to the user-specified Q, which defines what size the given
distribution relates to, an implementation that uses a distribution can
also programmatically define a sampleQ to determine what sort of sample
is being constructed: whether the samples should have an equal number
(sampleQ=0), volume (sampleQ=3), area (sampleQ=2), etc.
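
The relationship between Q and sampleQ can be sketched numerically. The
description of Q above implies that a PDF with size exponent Q is
proportional to x^Q times the number (Q=0) PDF, so a PDF with exponent
sampleQ should be proportional to x^(sampleQ - Q) times the PDF with
exponent Q. The standalone C++ below applies this to a tabulated PDF
and renormalises with the trapezium rule. The names integrate and
reweight are hypothetical, x is assumed positive and increasing, and
this is not necessarily how the library itself performs the conversion.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Trapezium-rule integral of y over the increasing abscissae x
double integrate(const std::vector<double>& x, const std::vector<double>& y)
{
    double sum = 0;

    for (std::size_t i = 1; i < x.size(); ++ i)
    {
        sum += 0.5*(y[i] + y[i - 1])*(x[i] - x[i - 1]);
    }

    return sum;
}

// Convert a PDF tabulated with size exponent Q into the PDF with size
// exponent sampleQ, by multiplying by x^(sampleQ - Q) and renormalising.
// Hypothetical helper for illustration; assumes all x are positive.
std::vector<double> reweight
(
    const std::vector<double>& x,
    const std::vector<double>& pdfQ,
    int Q,
    int sampleQ
)
{
    std::vector<double> pdf(x.size());

    for (std::size_t i = 0; i < x.size(); ++ i)
    {
        pdf[i] = pdfQ[i]*std::pow(x[i], sampleQ - Q);
    }

    const double norm = integrate(x, pdf);

    for (double& p : pdf) p /= norm;

    return pdf;
}
```

For example, a uniform number distribution (Q=0) on the interval [1, 2]
reweighted with sampleQ=3 becomes approximately 4 x^3/15, so samples of
equal volume are concentrated towards the larger sizes.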

A number of fixes to the distributions have been made. These include
fixing some fundamental bugs in the returned distribution of samples,
correcting the calculation of the distribution means, renaming
misleadingly named parameters, and correcting some inconsistencies in
the way in which tabulated PDF and CDF data was processed. Distributions
no longer require their parameters to be defined in a sub-dictionary,
but a sub-dictionary is still supported for backwards compatibility.

The distributions can now generate their PDFs as well as samples, and a
test application has been added (replacing two previous applications),
which thoroughly checks consistency between the PDF and the samples for
a variety of combinations of values of Q and sampleQ.
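
The kind of consistency check described above can be sketched as
follows. This is standalone C++ and not the test application itself; it
covers only a single illustrative distribution on [0, 1] with PDF 2x,
and the names linearPdf, sampleLinear and maxBinError are hypothetical.
Samples are binned, and the fraction in each bin is compared with the
PDF integrated over that bin.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Illustrative distribution on [0, 1] with PDF 2x and CDF x^2
double linearPdf(double x)
{
    return 2*x;
}

// Draw a sample by inverting the CDF of a uniform deviate in (0, 1)
double sampleLinear(std::mt19937& rng)
{
    const double u = (rng() + 0.5)/4294967296.0;

    return std::sqrt(u);
}

// Largest difference, over all bins, between the fraction of samples that
// fall in a bin and the PDF integrated over that bin
double maxBinError(std::size_t nSamples, std::size_t nBins, unsigned seed)
{
    std::mt19937 rng(seed);
    std::vector<std::size_t> counts(nBins, 0);

    for (std::size_t i = 0; i < nSamples; ++ i)
    {
        const std::size_t bin =
            static_cast<std::size_t>(sampleLinear(rng)*nBins);

        ++ counts[std::min(bin, nBins - 1)];
    }

    const double width = 1.0/nBins;
    double maxError = 0;

    for (std::size_t b = 0; b < nBins; ++ b)
    {
        // Midpoint rule, which is exact for this linear PDF
        const double expected = linearPdf((b + 0.5)*width)*width;
        const double observed = double(counts[b])/nSamples;

        maxError = std::max(maxError, std::fabs(observed - expected));
    }

    return maxError;
}
```

With 100000 samples and 10 bins the statistical error in each bin
fraction is of the order of 0.001, so a returned value well above that
would indicate that the samples and the PDF disagree.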

Backwards incompatible changes are as follows:

- The standard deviation keyword for the normal (and multi-normal)
  distribution is now called 'sigma'. Previously this was 'variance',
  which was misleading, as the value is a standard deviation.

- The 'massRosinRammler' distribution has been removed. This
  functionality is now provided by the standard 'RosinRammler'
  distribution with a Q equal to 0, and a sampleQ of 3.

- The 'general' distribution has been split into separate distributions
  based on whether PDF or CDF data is provided. These distributions are
  called 'tabulatedDensity' and 'tabulatedCumulative', respectively.
Will Bainbridge
2023-03-29 11:25:23 +01:00
parent db83ae3e8a
commit cae41959dd
46 changed files with 4340 additions and 550 deletions

@@ -1,11 +1,9 @@
 EXE_INC = \
-    -I$(LIB_SRC)/lagrangian/distributionModels/lnInclude \
     -I$(LIB_SRC)/sampling/lnInclude \
     -I$(LIB_SRC)/finiteVolume/lnInclude \
     -I$(LIB_SRC)/meshTools/lnInclude
 EXE_LIBS = \
-    -ldistributionModels \
     -lsampling \
     -lfiniteVolume \
     -lmeshTools

@@ -26,17 +26,18 @@
 Random rndGen(label(0));
-autoPtr<distributionModel> p
+autoPtr<distribution> p
 (
-    distributionModel::New
+    distribution::New
     (
         pdfDictionary,
-        rndGen
+        rndGen,
+        0
     )
 );
-const scalar xMin = p->minValue();
-const scalar xMax = p->maxValue();
+const scalar xMin = p->min();
+const scalar xMax = p->max();
 autoPtr<OFstream> filePtr(nullptr);

@@ -31,7 +31,7 @@ Description
 #include "argList.H"
 #include "Time.H"
-#include "distributionModel.H"
+#include "distribution.H"
 #include "setWriter.H"
 #include "writeFile.H"
 #include "OFstream.H"