From 4d13f3d33da97368a2983fff79a1e33c4a75ef45 Mon Sep 17 00:00:00 2001
From: sjplimp <sjplimp@f3b2605a-c512-4ea7-a41b-209d697bcdaa>
Date: Sat, 29 Aug 2015 00:13:36 +0000
Subject: [PATCH] ''

git-svn-id: svn://svn.icms.temple.edu/lammps-ro/trunk@13961 f3b2605a-c512-4ea7-a41b-209d697bcdaa
---
 doc/doc2/Section_start.html | 45 +++++++++++++++++++------------------
 1 file changed, 23 insertions(+), 22 deletions(-)
diff --git a/doc/doc2/Section_start.html b/doc/doc2/Section_start.html
index de3e1e86ed..bf35e857a5 100644
--- a/doc/doc2/Section_start.html
+++ b/doc/doc2/Section_start.html
@@ -1753,9 +1753,10 @@ thermodynamic state and a total run time for the simulation.  It then
 appends statistics about the CPU time and storage requirements for the
 simulation.  An example set of statistics is shown here:
 </P>
-<PRE>Loop time of 2.81192 on 4 procs for 300 steps with 2004 atoms
-97.0% CPU use with 4 MPI tasks x no OpenMP threads
-Performance: 18.436 ns/day  1.302 hours/ns  106.689 timesteps/s 
+<P>Loop time of 2.81192 on 4 procs for 300 steps with 2004 atoms
+</P>
+<PRE>Performance: 18.436 ns/day  1.302 hours/ns  106.689 timesteps/s
+97.0% CPU use with 4 MPI tasks x no OpenMP threads 
 </PRE>
 <PRE>MPI task timings breakdown:
 Section |  min time  |  avg time  |  max time  |%varavg| %total
@@ -1783,16 +1784,16 @@ Neighbor list builds = 26
 Dangerous builds = 0 
 </PRE>
 <P>The first section provides a global loop timing summary. The loop time
-is the total wall time for the section. The second line provides the
-CPU utilzation per MPI task; it should be close to 100% times the number
-of OpenMP threads (or 1). Lower numbers correspond to delays due to
-file i/o or unsufficient thread utilization. The <I>Performance</I> line is
+is the total wall time for the section.  The <I>Performance</I> line is
 provided for convenience to help predicting the number of loop
-continuations required and for comparing performance with other similar
-MD codes.
+continuations required and for comparing performance with other
+similar MD codes.  The CPU use line provides the CPU utilization per
+MPI task; it should be close to 100% times the number of OpenMP
+threads (or 1). Lower numbers correspond to delays due to file I/O or
+insufficient thread utilization.
 </P>
-<P>The second section gives the breakdown of the CPU run time (in seconds)
-into major categories:
+<P>The MPI task section gives the breakdown of the CPU run time (in
+seconds) into major categories:
 </P>
 <UL><LI><I>Pair</I> stands for all non-bonded force computation
 <LI><I>Bond</I> stands for bonded interactions: bonds, angles, dihedrals, impropers
@@ -1811,17 +1812,17 @@ the difference between minimum, maximum and average is small and thus
 the variation from the average close to zero. The final column shows
 the percentage of the total loop time is spent in this section.
 </P>
-<P>When using the <A HREF = "timers.html">timers full</A> setting, and additional column
-is present that also prints the CPU utilization in percent. In addition,
-when using <I>timers full</I> and the <A HREF = "package.html">package omp</A> command are
-active, a similar timing summary of time spent in threaded regions to 
-monitor thread utilization and load balance is provided. A new enrty is
-the <I>Reduce</I> section, which lists the time spend in reducing the per-thread
-data elements to the storage for non-threaded computation. These thread
-timings are taking from the first MPI rank only and and thus, as the
-breakdown for MPI tasks can change from MPI rank to MPI rank, this
-breakdown can be very different for individual ranks. Here is an example
-output for this optional output section:
+<P>When using the <A HREF = "timers.html">timers full</A> setting, an additional column
+is present that also prints the CPU utilization in percent. In
+addition, when <I>timers full</I> and the <A HREF = "package.html">package omp</A>
+command are both active, a similar timing summary of time spent in
+threaded regions is provided to monitor thread utilization and load
+balance. A new entry is the <I>Reduce</I> section, which lists the time
+spent in reducing the per-thread data elements to the storage for
+non-threaded computation. These thread timings are taken from the
+first MPI rank only and thus, as the breakdown for MPI tasks can
+change from MPI rank to MPI rank, this breakdown can be very different
+for individual ranks. Here is an example output for this section:
 </P>
 <P>Thread timings breakdown (MPI rank 0):
 Total threaded time 0.6846 / 90.6%