git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@13961 f3b2605a-c512-4ea7-a41b-209d697bcdaa
@ -1753,9 +1753,10 @@ thermodynamic state and a total run time for the simulation. It then
|
|||||||
appends statistics about the CPU time and storage requirements for the
|
appends statistics about the CPU time and storage requirements for the
|
||||||
simulation. An example set of statistics is shown here:
|
simulation. An example set of statistics is shown here:
|
||||||
</P>
|
</P>
|
||||||
<PRE>Loop time of 2.81192 on 4 procs for 300 steps with 2004 atoms
|
<P>Loop time of 2.81192 on 4 procs for 300 steps with 2004 atoms
|
||||||
|
</P>
|
||||||
|
<PRE>Performance: 18.436 ns/day 1.302 hours/ns 106.689 timesteps/s
|
||||||
97.0% CPU use with 4 MPI tasks x no OpenMP threads
|
97.0% CPU use with 4 MPI tasks x no OpenMP threads
|
||||||
Performance: 18.436 ns/day 1.302 hours/ns 106.689 timesteps/s
|
|
||||||
</PRE>
|
</PRE>
|
||||||
<PRE>MPI task timings breakdown:
|
<PRE>MPI task timings breakdown:
|
||||||
Section | min time | avg time | max time |%varavg| %total
|
Section | min time | avg time | max time |%varavg| %total
|
||||||
@ -1783,16 +1784,16 @@ Neighbor list builds = 26
|
|||||||
Dangerous builds = 0
|
Dangerous builds = 0
|
||||||
</PRE>
|
</PRE>
|
||||||
<P>The first section provides a global loop timing summary. The loop time
|
<P>The first section provides a global loop timing summary. The loop time
|
||||||
is the total wall time for the section. The second line provides the
|
is the total wall time for the section. The <I>Performance</I> line is
|
||||||
CPU utilzation per MPI task; it should be close to 100% times the number
|
|
||||||
of OpenMP threads (or 1). Lower numbers correspond to delays due to
|
|
||||||
file i/o or unsufficient thread utilization. The <I>Performance</I> line is
|
|
||||||
provided for convenience to help predicting the number of loop
|
provided for convenience to help predicting the number of loop
|
||||||
continuations required and for comparing performance with other similar
|
continuations required and for comparing performance with other
|
||||||
MD codes.
|
similar MD codes. The CPU use line provides the CPU utilzation per
|
||||||
|
MPI task; it should be close to 100% times the number of OpenMP
|
||||||
|
threads (or 1). Lower numbers correspond to delays due to file I/O or
|
||||||
|
insufficient thread utilization.
|
||||||
</P>
|
</P>
|
||||||
<P>The second section gives the breakdown of the CPU run time (in seconds)
|
<P>The MPI task section gives the breakdown of the CPU run time (in
|
||||||
into major categories:
|
seconds) into major categories:
|
||||||
</P>
|
</P>
|
||||||
<UL><LI><I>Pair</I> stands for all non-bonded force computation
|
<UL><LI><I>Pair</I> stands for all non-bonded force computation
|
||||||
<LI><I>Bond</I> stands for bonded interactions: bonds, angles, dihedrals, impropers
|
<LI><I>Bond</I> stands for bonded interactions: bonds, angles, dihedrals, impropers
|
||||||
@ -1811,17 +1812,17 @@ the difference between minimum, maximum and average is small and thus
|
|||||||
the variation from the average close to zero. The final column shows
|
the variation from the average close to zero. The final column shows
|
||||||
the percentage of the total loop time is spent in this section.
|
the percentage of the total loop time is spent in this section.
|
||||||
</P>
|
</P>
|
||||||
<P>When using the <A HREF = "timers.html">timers full</A> setting, and additional column
|
<P>When using the <A HREF = "timers.html">timers full</A> setting, an additional column
|
||||||
is present that also prints the CPU utilization in percent. In addition,
|
is present that also prints the CPU utilization in percent. In
|
||||||
when using <I>timers full</I> and the <A HREF = "package.html">package omp</A> command are
|
addition, when using <I>timers full</I> and the <A HREF = "package.html">package omp</A>
|
||||||
active, a similar timing summary of time spent in threaded regions to
|
command are active, a similar timing summary of time spent in threaded
|
||||||
monitor thread utilization and load balance is provided. A new enrty is
|
regions to monitor thread utilization and load balance is provided. A
|
||||||
the <I>Reduce</I> section, which lists the time spend in reducing the per-thread
|
new entry is the <I>Reduce</I> section, which lists the time spend in
|
||||||
data elements to the storage for non-threaded computation. These thread
|
reducing the per-thread data elements to the storage for non-threaded
|
||||||
timings are taking from the first MPI rank only and and thus, as the
|
computation. These thread timings are taking from the first MPI rank
|
||||||
breakdown for MPI tasks can change from MPI rank to MPI rank, this
|
only and and thus, as the breakdown for MPI tasks can change from MPI
|
||||||
breakdown can be very different for individual ranks. Here is an example
|
rank to MPI rank, this breakdown can be very different for individual
|
||||||
output for this optional output section:
|
ranks. Here is an example output for this section:
|
||||||
</P>
|
</P>
|
||||||
<P>Thread timings breakdown (MPI rank 0):
|
<P>Thread timings breakdown (MPI rank 0):
|
||||||
Total threaded time 0.6846 / 90.6%
|
Total threaded time 0.6846 / 90.6%
|
||||||
|
|||||||