decompositionMethods::parMetis: New interface to the ParMETIS distributor for load-balancing

ParMETIS is a parallel version of METIS and can be used as an alternative to
ptScotch or Zoltan, supporting multi-constraints and redistribution:

Description
    ParMetis redistribution in parallel

    Note: parMetis methods do not support serial operation.

    Parameters
    - Method of decomposition
      - kWay: multilevel k-way
      - geomKway: combined coordinate-based and multi-level k-way
      - adaptiveRepart: balances the work load of a graph

    - Options
      - options[0]: The specified options are used if options[0] = 1

      - options[1]: Specifies the level of information to be returned during
        the execution of the algorithm. Timing information can be obtained by
        setting this to 1. Additional options for this parameter can be obtained
        by looking at parmetis.h. Default: 0.

      - options[2]: Random number seed for the routine

      - options[3]: Specifies whether the sub-domains and processors are coupled
        or un-coupled.  If the number of sub-domains desired (i.e., nparts) and
        the number of processors that are being used is not the same, then these
        must be un-coupled. However, if nparts equals the number of processors,
        these can either be coupled or de-coupled. If sub-domains and processors
        are coupled, then the initial partitioning will be obtained implicitly
        from the graph distribution. However, if sub-domains are un-coupled from
        processors, then the initial partitioning needs to be obtained from the
        initial values assigned to the part array.

    - itr: Parameter which describes the ratio of inter-processor communication
      time compared to data redistribution time.  Should be set between 0.000001
      and 1000000.0.  If set high, a repartitioning with a low edge-cut will be
      computed. If it is set low, a repartitioning that requires little data
      redistribution will be computed.  Good values for this parameter can be
      obtained by dividing inter-processor communication time by data
      redistribution time. Otherwise, a value of 1000.0 is recommended.
      Default: 1000.

The ParMETIS sources can be downloaded and compiled in ThirdParty-dev using the
link in the README file and the compilation commands in Allwmake.

Note the specific license under which ParMETIS is released:

Copyright & License Notice
--------------------------

The ParMETIS package is copyrighted by the Regents of the
University of Minnesota. It can be freely used for educational and
research purposes by non-profit institutions and US government
agencies only. Other organizations are allowed to use ParMETIS
only for evaluation purposes, and any further uses will require prior
approval. The software may not be sold or redistributed without prior
approval. One may make copies of the software for their use provided
that the copies, are not sold or distributed, are used under the same
terms and conditions.

As unestablished research software, this code is provided on an
``as is'' basis without warranty of any kind, either expressed or
implied. The downloading, or executing any part of this software
constitutes an implicit agreement to these terms. These terms and
conditions are subject to change at any time without prior notice.
Henry Weller
2024-05-22 15:30:46 +01:00
parent 32b7ba09b3
commit 40bcabf79f
9 changed files with 737 additions and 8 deletions

etc/config.sh/parMetis

@@ -0,0 +1,40 @@
#----------------------------------*-sh-*--------------------------------------
# =========                 |
# \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
#  \\    /   O peration     | Website:  https://openfoam.org
#   \\  /    A nd           | Copyright (C) 2024 OpenFOAM Foundation
#    \\/     M anipulation  |
#------------------------------------------------------------------------------
# License
# This file is part of OpenFOAM.
#
# OpenFOAM is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# OpenFOAM is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License
# along with OpenFOAM. If not, see <http://www.gnu.org/licenses/>.
#
# File
# etc/config.sh/parMetis
#
# Description
# Setup file for parMetis include/libraries.
# Sourced during wmake process only.
#
# Note
# A csh version is not needed, since the values here are only sourced
# during the wmake process
#
#------------------------------------------------------------------------------
export PARMETIS_VERSION=parmetis-4.0.3
export PARMETIS_ARCH_PATH=$WM_THIRD_PARTY_DIR/platforms/$WM_ARCH$WM_COMPILER$WM_PRECISION_OPTION$WM_LABEL_OPTION/$PARMETIS_VERSION
#------------------------------------------------------------------------------

@@ -0,0 +1,18 @@
#!/bin/sh
cd ${0%/*} || exit 1 # Run from this directory
. $WM_PROJECT_DIR/wmake/scripts/AllwmakeMpiLib
# Get PARMETIS_VERSION, PARMETIS_ARCH_PATH
if settings=`$WM_PROJECT_DIR/bin/foamEtcFile config.sh/parMetis`
then
. $settings
echo " using PARMETIS_ARCH_PATH=$PARMETIS_ARCH_PATH"
wcleanMpiLib $PARMETIS_VERSION parMetisDecomp
else
echo
echo " Error: no config.sh/parMetis settings"
echo
fi
#------------------------------------------------------------------------------

@@ -0,0 +1,24 @@
#!/bin/sh
cd ${0%/*} || exit 1 # Run from this directory
# Parse arguments for library compilation
. $WM_PROJECT_DIR/wmake/scripts/AllwmakeParseArguments
. $WM_PROJECT_DIR/wmake/scripts/AllwmakeMpiLib
# get PARMETIS_VERSION, PARMETIS_ARCH_PATH
if settings=`$WM_PROJECT_DIR/bin/foamEtcFile config.sh/parMetis`
then
. $settings
echo " using PARMETIS_ARCH_PATH=$PARMETIS_ARCH_PATH"
if [ -r $PARMETIS_ARCH_PATH/lib/libparmetis.so ]
then
wmakeMpiLib $PARMETIS_VERSION
fi
else
echo
echo " Error: no config.sh/parMetis settings"
echo
fi
#------------------------------------------------------------------------------

@@ -0,0 +1,3 @@
parMetis.C
LIB = $(FOAM_LIBBIN)/$(FOAM_MPI)/libparMetisDecomp

@@ -0,0 +1,10 @@
-include $(GENERAL_RULES)/mplibType
EXE_INC = \
$(PFLAGS) $(PINC) \
-I$(FOAM_SRC)/Pstream/mpi/lnInclude \
-I$(PARMETIS_ARCH_PATH)/include \
-I../decompositionMethods/lnInclude
LIB_LIBS = \
-L$(PARMETIS_ARCH_PATH)/lib -lparmetis

@@ -0,0 +1,433 @@
/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Copyright (C) 2024 OpenFOAM Foundation
     \\/     M anipulation  |
-------------------------------------------------------------------------------
License
This file is part of OpenFOAM.
OpenFOAM is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
OpenFOAM is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with OpenFOAM. If not, see <http://www.gnu.org/licenses/>.
\*---------------------------------------------------------------------------*/
#include "parMetis.H"
#include "Time.H"
#include "globalIndex.H"
#include "labelIOField.H"
#include "addToRunTimeSelectionTable.H"
extern "C"
{
#include "parmetis.h"
}
// * * * * * * * * * * * * * * Static Data Members * * * * * * * * * * * * * //
namespace Foam
{
namespace decompositionMethods
{
defineTypeNameAndDebug(parMetis, 0);
addToRunTimeSelectionTable
(
decompositionMethod,
parMetis,
distributor
);
}
}
// * * * * * * * * * * * * * Private Member Functions * * * * * * * * * * * //
Foam::label Foam::decompositionMethods::parMetis::decompose
(
const labelList& xadj,
const labelList& adjncy,
const pointField& cellCentres,
const labelList& cellWeights,
const labelList& faceWeights,
labelList& decomp
)
{
// C style numbering
label numFlag = 0;
// Number of weights or balance constraints
label nWeights = cellWeights.size()/cellCentres.size();
scalarList processorWeights;
if (processorWeights_.size())
{
if (processorWeights_.size() != nWeights*nProcessors_)
{
FatalErrorInFunction
<< "Number of processor weights specified in parMetisCoeffs "
<< processorWeights_.size()
<< " does not equal number of constraints * number of domains "
<< nWeights*nProcessors_
<< exit(FatalError);
}
processorWeights = processorWeights_;
}
else
{
processorWeights.setSize(nWeights*nProcessors_, 1.0/nProcessors_);
}
// Imbalance tolerance
Field<scalar> ubvec(nWeights, 1.02);
// If only one processor there is no imbalance
if (nProcessors_ == 1)
{
ubvec[0] = 1;
}
// Distribute to all processors the number of cells on each processor
globalIndex globalMap(cellCentres.size());
// Get the processor-cell offset table
labelList cellOffsets(globalMap.offsets());
// Weight info
label wgtFlag = 0;
const label* vwgtPtr = nullptr;
const label* adjwgtPtr = nullptr;
// Weights on vertices of graph (cells)
if (cellWeights.size())
{
vwgtPtr = cellWeights.begin();
wgtFlag += 2;
}
// Weights on edges of graph (faces)
if (faceWeights.size())
{
adjwgtPtr = faceWeights.begin();
wgtFlag += 1;
}
MPI_Comm comm = MPI_COMM_WORLD;
// Output: cell -> processor addressing
decomp.setSize(cellCentres.size());
// Output: the number of edges that are cut by the partitioning
label edgeCut = 0;
if (method_ == "kWay")
{
ParMETIS_V3_PartKway
(
cellOffsets.begin(),
const_cast<label*>(xadj.begin()),
const_cast<label*>(adjncy.begin()),
const_cast<label*>(vwgtPtr),
const_cast<label*>(adjwgtPtr),
&wgtFlag,
&numFlag,
&nWeights,
&nProcessors_,
processorWeights.begin(),
ubvec.begin(),
const_cast<labelList&>(options_).begin(),
&edgeCut,
decomp.begin(),
&comm
);
}
else if (method_ == "geomKway")
{
// Number of dimensions
label nDims = 3;
// Convert pointField into float
Field<scalar> xyz(nDims*cellCentres.size());
label i = 0;
forAll(cellCentres, celli)
{
const point& cc = cellCentres[celli];
xyz[i++] = cc.x();
xyz[i++] = cc.y();
xyz[i++] = cc.z();
}
ParMETIS_V3_PartGeomKway
(
cellOffsets.begin(),
const_cast<label*>(xadj.begin()),
const_cast<label*>(adjncy.begin()),
const_cast<label*>(vwgtPtr),
const_cast<label*>(adjwgtPtr),
&wgtFlag,
&numFlag,
&nDims,
xyz.begin(),
&nWeights,
&nProcessors_,
processorWeights.begin(),
ubvec.begin(),
const_cast<labelList&>(options_).begin(),
&edgeCut,
decomp.begin(),
&comm
);
}
else if (method_ == "adaptiveRepart")
{
// Size of the vertices with respect to redistribution cost
labelList vsize(cellCentres.size(), 1);
ParMETIS_V3_AdaptiveRepart
(
cellOffsets.begin(),
const_cast<label*>(xadj.begin()),
const_cast<label*>(adjncy.begin()),
const_cast<label*>(vwgtPtr),
const_cast<label*>(vsize.begin()),
const_cast<label*>(adjwgtPtr),
&wgtFlag,
&numFlag,
&nWeights,
&nProcessors_,
processorWeights.begin(),
ubvec.begin(),
&itr_,
const_cast<labelList&>(options_).begin(),
&edgeCut,
decomp.begin(),
&comm
);
}
return edgeCut;
}
// * * * * * * * * * * * * * * * * Constructors * * * * * * * * * * * * * * //
Foam::decompositionMethods::parMetis::parMetis
(
const dictionary& decompositionDict
)
:
decompositionMethod(decompositionDict),
method_("geomKway"),
options_(4, 0),
itr_(1000)
{
// Check for user supplied weights and decomp options
if (decompositionDict.found("parMetisCoeffs"))
{
const dictionary& parMetisCoeffs =
decompositionDict.subDict("parMetisCoeffs");
Info<< type() << ": reading coefficients:" << endl;
if (parMetisCoeffs.readIfPresent("method", method_))
{
if
(
method_ != "kWay"
&& method_ != "geomKway"
&& method_ != "adaptiveRepart"
)
{
FatalIOErrorInFunction(parMetisCoeffs)
<< "Method " << method_
<< " in parMetisCoeffs in dictionary : "
<< decompositionDict.name()
<< " should be kWay, geomKway or adaptiveRepart"
<< exit(FatalIOError);
}
Info<< " method: " << method_ << endl;
}
if
(
method_ == "adaptiveRepart"
&& parMetisCoeffs.readIfPresent("itr", itr_)
)
{
Info<< " itr: " << itr_ << endl;
}
if (parMetisCoeffs.readIfPresent("options", options_))
{
if (options_.size() != 4)
{
FatalIOErrorInFunction(parMetisCoeffs)
<< "Number of options in parMetisCoeffs dictionary : "
<< decompositionDict.name()
<< " should be 4, found " << options_
<< exit(FatalIOError);
}
Info<< " options: " << options_ << endl;
}
parMetisCoeffs.readIfPresent("processorWeights", processorWeights_);
Info<< endl;
}
}
// * * * * * * * * * * * * * * * Member Functions * * * * * * * * * * * * * //
Foam::labelList Foam::decompositionMethods::parMetis::decompose
(
const polyMesh& mesh,
const pointField& points,
const scalarField& pointWeights
)
{
if (points.size() != mesh.nCells())
{
FatalErrorInFunction
<< "Can use this decomposition method only for the whole mesh"
<< endl
<< "and supply one coordinate (cellCentre) for every cell." << endl
<< "The number of coordinates " << points.size() << endl
<< "The number of cells in the mesh " << mesh.nCells()
<< exit(FatalError);
}
// Make Metis CSR (Compressed Storage Format) storage
// adjncy : contains neighbours (= edges in graph)
// xadj(celli) : start of information in adjncy for celli
CompactListList<label> cellCells;
calcCellCells
(
mesh,
identityMap(mesh.nCells()),
mesh.nCells(),
true,
cellCells
);
labelList decomp;
decompose
(
cellCells.offsets(),
cellCells.m(),
points,
scaleWeights(pointWeights, pointWeights.size()/points.size()),
labelList(),
decomp
);
return decomp;
}
Foam::labelList Foam::decompositionMethods::parMetis::decompose
(
const polyMesh& mesh,
const labelList& cellToRegion,
const pointField& regionPoints,
const scalarField& pointWeights
)
{
if (cellToRegion.size() != mesh.nCells())
{
FatalErrorInFunction
<< "Size of cell-to-coarse map " << cellToRegion.size()
<< " differs from number of cells in mesh " << mesh.nCells()
<< exit(FatalError);
}
// Make Metis CSR (Compressed Storage Format) storage
// adjncy : contains neighbours (= edges in graph)
// xadj(celli) : start of information in adjncy for celli
CompactListList<label> cellCells;
calcCellCells
(
mesh,
cellToRegion,
regionPoints.size(),
true,
cellCells
);
// Decompose using weights
labelList decomp;
decompose
(
cellCells.offsets(),
cellCells.m(),
regionPoints,
scaleWeights(pointWeights, pointWeights.size()/regionPoints.size()),
labelList(),
decomp
);
// Rework back into decomposition for original mesh
labelList fineDistribution(cellToRegion.size());
forAll(fineDistribution, i)
{
fineDistribution[i] = decomp[cellToRegion[i]];
}
return fineDistribution;
}
Foam::labelList Foam::decompositionMethods::parMetis::decompose
(
const labelListList& globalCellCells,
const pointField& cellCentres,
const scalarField& cellWeights
)
{
if (cellCentres.size() != globalCellCells.size())
{
FatalErrorInFunction
<< "Inconsistent number of cells (" << globalCellCells.size()
<< ") and number of cell centres (" << cellCentres.size()
<< ")." << exit(FatalError);
}
// Make Metis Distributed CSR (Compressed Storage Format) storage
// adjncy : contains neighbours (= edges in graph)
// xadj(celli) : start of information in adjncy for celli
CompactListList<label> cellCells(globalCellCells);
labelList decomp;
decompose
(
cellCells.offsets(),
cellCells.m(),
cellCentres,
scaleWeights(cellWeights, cellWeights.size()/cellCentres.size()),
labelList(),
decomp
);
return decomp;
}
// ************************************************************************* //

@@ -0,0 +1,201 @@
/*---------------------------------------------------------------------------*\
  =========                 |
  \\      /  F ield         | OpenFOAM: The Open Source CFD Toolbox
   \\    /   O peration     | Website:  https://openfoam.org
    \\  /    A nd           | Copyright (C) 2024 OpenFOAM Foundation
     \\/     M anipulation  |
-------------------------------------------------------------------------------
License
This file is part of OpenFOAM.
OpenFOAM is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
OpenFOAM is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.
You should have received a copy of the GNU General Public License
along with OpenFOAM. If not, see <http://www.gnu.org/licenses/>.
Class
    Foam::decompositionMethods::parMetis

Description
    ParMetis redistribution in parallel

    Note: parMetis methods do not support serial operation.

    Parameters
    - Method of decomposition
      - kWay: multilevel k-way
      - geomKway: combined coordinate-based and multi-level k-way
      - adaptiveRepart: balances the work load of a graph

    - Options
      - options[0]: The specified options are used if options[0] = 1

      - options[1]: Specifies the level of information to be returned during
        the execution of the algorithm. Timing information can be obtained by
        setting this to 1. Additional options for this parameter can be obtained
        by looking at parmetis.h. Default: 0.

      - options[2]: Random number seed for the routine

      - options[3]: Specifies whether the sub-domains and processors are coupled
        or un-coupled.  If the number of sub-domains desired (i.e., nparts) and
        the number of processors that are being used is not the same, then these
        must be un-coupled. However, if nparts equals the number of processors,
        these can either be coupled or de-coupled. If sub-domains and processors
        are coupled, then the initial partitioning will be obtained implicitly
        from the graph distribution. However, if sub-domains are un-coupled from
        processors, then the initial partitioning needs to be obtained from the
        initial values assigned to the part array.

    - itr: Parameter which describes the ratio of inter-processor communication
      time compared to data redistribution time.  Should be set between 0.000001
      and 1000000.0.  If set high, a repartitioning with a low edge-cut will be
      computed. If it is set low, a repartitioning that requires little data
      redistribution will be computed.  Good values for this parameter can be
      obtained by dividing inter-processor communication time by data
      redistribution time. Otherwise, a value of 1000.0 is recommended.
      Default: 1000.

SourceFiles
    parMetis.C
\*---------------------------------------------------------------------------*/
#ifndef parMetis_H
#define parMetis_H
#include "decompositionMethod.H"
namespace Foam
{
namespace decompositionMethods
{
/*---------------------------------------------------------------------------*\
Class parMetis Declaration
\*---------------------------------------------------------------------------*/
class parMetis
:
public decompositionMethod
{
// Private Member Data
//- Method of decomposition
word method_;
//- Options to control the operation of the decomposer
labelList options_;
//- Processor weights for each constraint
scalarList processorWeights_;
//- Parameter which describes the ratio of inter-processor
// communication time compared to data redistribution time.
scalar itr_;
// Private Member Functions
label decompose
(
const labelList& xadj,
const labelList& adjncy,
const pointField& cellCentres,
const labelList& cellWeights,
const labelList& faceWeights,
labelList& finalDecomp
);
public:
//- Runtime type information
TypeName("parMetis");
// Constructors
//- Construct given the decomposition dictionary
parMetis(const dictionary& decompositionDict);
//- Disallow default bitwise copy construction
parMetis(const parMetis&) = delete;
//- Destructor
virtual ~parMetis()
{}
// Member Functions
//- Inherit decompose from decompositionMethod
using decompositionMethod::decompose;
//- Return for every coordinate the wanted processor number. Use the
// mesh connectivity (if needed)
// Weights get normalised so the minimum value is 1 before truncation
// to an integer so the weights should be multiples of the minimum
// value. The overall sum of weights might otherwise overflow.
virtual labelList decompose
(
const polyMesh& mesh,
const pointField& points,
const scalarField& pointWeights
);
//- Return for every coordinate the wanted processor number. Gets
// passed agglomeration map (from fine to coarse cells) and coarse cell
// location. Can be overridden by decomposers that provide this
// functionality natively.
// See note on weights above.
virtual labelList decompose
(
const polyMesh& mesh,
const labelList& cellToRegion,
const pointField& regionPoints,
const scalarField& regionWeights
);
//- Return for every coordinate the wanted processor number. Explicitly
// provided mesh connectivity.
// The connectivity is equal to mesh.cellCells() except for
// - in parallel the cell numbers are global cell numbers (starting
// from 0 at processor0 and then incrementing all through the
// processors)
// - the connections are across coupled patches
// See note on weights above.
virtual labelList decompose
(
const labelListList& globalCellCells,
const pointField& cellCentres,
const scalarField& cellWeights
);
// Member Operators
//- Disallow default bitwise assignment
void operator=(const parMetis&) = delete;
};
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
} // End namespace decompositionMethods
} // End namespace Foam
// * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * //
#endif
// ************************************************************************* //

@@ -17,24 +17,21 @@ FoamFile
numberOfSubdomains 12;
decomposer simple;
distributor zoltan;
libs ("libzoltanDecomp.so");
// distributor parMetis;
// libs ("libparMetisDecomp.so");
simpleCoeffs
{
n (2 2 3);
}
hierarchicalCoeffs
{
n (2 2 3);
order xyz;
}
zoltanCoeffs
{
lb_method rcb;
}
// ************************************************************************* //

@@ -17,9 +17,13 @@ FoamFile
numberOfSubdomains 12;
decomposer hierarchical;
distributor zoltan;
libs ("libzoltanDecomp.so");
// distributor parMetis;
// libs ("libparMetisDecomp.so");
hierarchicalCoeffs
{
n (6 2 1);
@@ -31,5 +35,4 @@ zoltanCoeffs
lb_method rcb;
}
// ************************************************************************* //