From ce44faddcf053de225b5f983b876a15e1bb0ba96 Mon Sep 17 00:00:00 2001
From: Axel Kohlmeyer
Date: Tue, 27 May 2025 23:40:52 -0400
Subject: [PATCH] correct discussion of FFT benchmark timing for PPPM

---
 doc/src/Speed_measure.rst | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/doc/src/Speed_measure.rst b/doc/src/Speed_measure.rst
index 4bb94df4dd..2fe838cb22 100644
--- a/doc/src/Speed_measure.rst
+++ b/doc/src/Speed_measure.rst
@@ -230,12 +230,15 @@ breakdown and relative percentages. For example, trying different
 options for speeding up the long-range solvers will have little impact
 if they only consume 10% of the run time. If the pairwise time is
 dominating, you may want to look at GPU or OMP versions of the pair
-style, as discussed below. Comparing how the percentages change as
-you increase the processor count gives you a sense of how different
-operations within the timestep are scaling. Note that if you are
-running with a Kspace solver, there is additional output on the
-breakdown of the Kspace time. For PPPM, this includes the fraction
-spent on FFTs, which can be communication intensive.
+style, as discussed below. Comparing how the percentages change as you
+increase the processor count gives you a sense of how different
+operations within the timestep are scaling. If you are using PPPM as
+the Kspace solver, you can turn on additional output with
+:doc:`kspace_modify fftbench yes <kspace_modify>`, which measures the
+time spent during PPPM on the 3d FFTs; these can be communication
+intensive for larger processor counts. This indicates whether it is
+worth trying out alternatives to the default FFT settings for
+additional performance.
 
 Another important detail in the timing info are the histograms of
 atoms counts and neighbor counts. If these vary widely across
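
For reference, the doc text added above is about the kspace_modify fftbench yes setting. A minimal sketch of a LAMMPS input fragment that turns it on could look as follows; the pppm accuracy value and the run length are illustrative placeholders, and a complete input would also need a suitable long-range pair style, charges, and system setup:

    kspace_style    pppm 1.0e-4    # PPPM long-range solver; accuracy value is a placeholder
    kspace_modify   fftbench yes   # time the 3d FFTs performed inside PPPM
    run             1000           # FFT timing is reported with the post-run timing breakdown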