Updated README and cleaned up

This commit is contained in:
Trung Nguyen
2024-07-21 10:06:37 -05:00
parent e8f09bfb02
commit f23835932c
2 changed files with 345 additions and 377 deletions

View File

@ -1,20 +1,36 @@
The script `run_tests.py` in this folder is used to perform regression tests
using in-place example scripts.
This single script launches the selected LAMMPS binary
using a testing configuration defined in a `.yaml` file (e.g., `config.yaml`)
for the set of input scripts inside the given `examples/` subfolders,
and compares the thermo output with that in the existing log file produced with the same number of procs.
If there are multiple log files for the same input script (e.g., `log.melt.*.g++.1` and `log.melt.*.g++.4`),
the one with the highest number of procs is chosen.
The output includes the number of passed and failed tests and
an `output.xml` file in the JUnit XML format for downstream reporting.
The output and error of any crashed runs are logged.
A test with an input script is considered passed when the given LAMMPS binary produces
thermo output quantities consistent with those in the reference log file
within the tolerances specified in the test configuration `.yaml` file.
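As a sketch of what "consistent within tolerances" can mean (the helper name and the exact rule below are illustrative, not the script's actual code; the script reads its `epsilon` and `nugget` values from the config file):

```python
def within_tolerance(value, ref, abs_tol=1e-10, rel_tol=1e-6, nugget=1.0):
    # Illustrative pass/fail rule: a thermo quantity passes if it agrees
    # with the reference within an absolute OR a relative tolerance.
    # 'nugget' stands in for near-zero references to avoid division by zero.
    diff = abs(value - ref)
    if diff <= abs_tol:
        return True
    denom = abs(ref) if ref != 0 else nugget
    return diff / denom <= rel_tol
```

A test then passes only if every compared thermo column in every run satisfies such a check.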
With the current features, users can:
+ specify which LAMMPS binary version to test (e.g., the version from a commit, or those from `lammps-testing`)
+ specify the examples subfolders (thus the reference log files) separately (e.g., from other LAMMPS versions or commits)
+ specify tolerances for individual quantities for any input script to override the global values
+ launch tests with `mpirun` with all supported command line features (multiple procs, multiple partitions, and suffixes)
+ skip certain input files if not interested, or no reference log file exists
+ simplify the main LAMMPS builds, as long as a LAMMPS binary is available
Limitations:
- input scripts that use thermo style multi (e.g., examples/peptide) do not produce the expected thermo output format
- input scripts that require partition runs (e.g., examples/neb) need a separate config file, e.g., "args: --partition 2x1"
- testing accelerator packages (GPU, INTEL, KOKKOS, OPENMP) needs separate config files, e.g., "args: -sf omp -pk omp 4"
TODO:
+ keep track of the testing progress to resume the testing from the last checkpoint
@ -22,6 +38,14 @@ TODO:
split the list of input scripts into separate runs (there are 800+ input scripts under the top-level examples)
+ be able to be invoked from run_tests in the lammps-testing infrastructure
The following Python packages need to be installed into an activated environment:
python3 -m venv testing-env
source testing-env/bin/activate
pip install numpy pyyaml junit_xml
Example uses:
1. Simple use with the provided `tools/regression-tests/config.yaml` and the `examples/` folder at the top level:
@ -38,6 +62,10 @@ Example uses:
--example-folders="/path/to/examples/folder1;/path/to/examples/folder2" \
--config-file=/path/to/config/file/config.yaml
4) Test a LAMMPS binary with the whole top-level /examples folder in a LAMMPS source tree
python3 run_tests.py --lmp-bin=/path/to/lmp_binary --example-top-level=/path/to/lammps/examples
An example of the test configuration `config.yaml` is given below.
---

View File

@ -1,21 +1,35 @@
#!/usr/bin/env python3
'''
UPDATE: July 21, 2024:
pip install numpy pyyaml junit_xml
UPDATE: July 5, 2024:
Launching the LAMMPS binary under testing using a configuration defined in a yaml file (e.g. config.yaml).
Comparing the output thermo with that in the existing log file (with the same nprocs)
+ data in the log files are extracted and converted into yaml data structure
+ using the in-place input scripts, no need to add REG markers to the input scripts
With the current features, users can:
+ specify which LAMMPS binary version to test (e.g., the version from a commit, or those from `lammps-testing`)
+ specify the examples subfolders (thus the reference log files) separately (e.g., from other LAMMPS versions or commits)
+ specify tolerances for individual quantities for any input script to override the global values
+ launch tests with `mpirun` with all supported command line features (multiple procs, multiple partitions, and suffixes)
+ skip certain input files if not interested, or no reference log file exists
+ simplify the main LAMMPS builds, as long as a LAMMPS binary is available
Limitations:
- input scripts that use thermo style multi (e.g., examples/peptide) do not produce the expected thermo output format
- input scripts that require partition runs (e.g., examples/neb) need a separate config file, e.g., "args: --partition 2x1"
- testing accelerator packages (GPU, INTEL, KOKKOS, OPENMP) needs separate config files, e.g., "args: -sf omp -pk omp 4"
TODO:
+ keep track of the testing progress to resume the testing from the last checkpoint
+ distribute the input list across multiple processes via multiprocessing, or
split the list of input scripts into separate runs (there are 800+ input scripts under the top-level examples)
+ be able to be invoked from run_tests in the lammps-testing infrastructure
The following Python packages need to be installed into an activated environment:
python3 -m venv testing-env
source testing-env/bin/activate
pip install numpy pyyaml junit_xml
Example usage:
1) Simple use (using the provided tools/regression-tests/config.yaml and the examples/ folder at the top level)
@ -26,6 +40,8 @@ UPDATE: July 5, 2024:
python3 run_tests.py --lmp-bin=/path/to/lmp_binary \
--example-folders="/path/to/examples/folder1;/path/to/examples/folder2" \
--config-file=/path/to/config/file/config.yaml
4) Test a LAMMPS binary with the whole top-level /examples folder in a LAMMPS source tree
python3 run_tests.py --lmp-bin=/path/to/lmp_binary --example-top-level=/path/to/lammps/examples
'''
import os
@ -39,8 +55,8 @@ from multiprocessing import Pool
import logging
# need "pip install numpy pyyaml"
import numpy as np
import yaml
# need "pip install junit_xml"
from junit_xml import TestSuite, TestCase
@ -60,11 +76,13 @@ class TestResult:
'''
get the thermo output from a log file with thermo style yaml
yamlFileName: input YAML file with thermo structured
as described in https://docs.lammps.org/Howto_structured_data.html
return: thermo, which is a list containing a dictionary for each run
where the tag "keywords" maps to the list of thermo header strings
and the tag data has a list of lists where the outer list represents the lines
of output and the inner list the values of the columns matching the header keywords for that step.
'''
def extract_thermo(yamlFileName):
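Since a thermo-style-yaml log is multi-document YAML (one document per run, per the Howto page linked above), the extraction can be sketched as follows; `extract_thermo_sketch` is an illustrative stand-in, not the script's exact body:

```python
import yaml

def extract_thermo_sketch(yamlFileName):
    # Each run is one YAML document with a "keywords" list (thermo headers)
    # and a "data" list of rows, one row per thermo output line.
    with open(yamlFileName) as f:
        return list(yaml.safe_load_all(f))
```

`thermo[i]['keywords']` then gives the header strings for run `i`, and `thermo[i]['data']` the per-step rows.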
@ -78,7 +96,7 @@ def extract_thermo(yamlFileName):
'''
Convert an existing log file into a thermo yaml style log
inputFileName = a provided log file in an examples folder (e.g. examples/melt/log.8Apr21.melt.g++.4)
return a YAML data structure as if loaded from a thermo yaml file
'''
@ -200,8 +218,9 @@ def divide_into_N(original_list, N):
b.append(l)
return b
'''
process the #REG markers in an input script, add/replace with what follows each marker
inputFileName: LAMMPS input file with comments #REG:ADD and #REG:SUB as markers
outputFileName: modified input file ready for testing
'''
@ -271,11 +290,13 @@ def has_markers(inputFileName):
Iterate over a list of input files using the given lmp_binary, the testing configuration
return test results, as a list of TestResult instances
To map a function to individual workers:
def func(input1, input2, output):
    # do something
    return result
# args is a list of num_workers tuples, each tuple contains the arguments passed to the function executed by a worker
args = []
for i in range(num_workers):
    args.append((input1, input2, output))
@ -284,7 +305,7 @@ def has_markers(inputFileName):
results = pool.starmap(func, args)
'''
def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False, output=None):
EPSILON = np.float64(config['epsilon'])
nugget = float(config['nugget'])
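The worker-mapping pattern outlined in the docstring above, as a runnable sketch (the worker body here is a placeholder, not the script's actual per-chunk test runner):

```python
from multiprocessing import Pool

def func(input1, input2):
    # placeholder for a worker that would process one chunk of input scripts
    return input1 + input2

if __name__ == "__main__":
    num_workers = 4
    # one tuple of arguments per worker
    args = [(i, 10 * i) for i in range(num_workers)]
    with Pool(num_workers) as pool:
        # starmap unpacks each tuple into func's arguments;
        # results come back in the same order as args
        results = pool.starmap(func, args)
    print(results)
```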
@ -292,7 +313,7 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
num_passed = 0
test_id = 0
# using REG-commented input scripts, now turned off (False)
using_markers = False
# iterate over the input scripts
@ -330,6 +351,7 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
str_t = "\nRunning " + input_test + f" ({test_id+1}/{num_tests})"
else:
input_test = input
print(str_t)
print(f"-"*len(str_t))
logger.info(str_t)
@ -368,7 +390,7 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
thermo_ref = extract_data_to_yaml(thermo_ref_file)
num_runs_ref = len(thermo_ref)
else:
logger.info(f"Cannot find a reference log file {thermo_ref_file} for {input_test}.")
# try to read in the thermo yaml output from the working directory
thermo_ref_file = 'thermo.' + input + '.yaml'
file_exist = os.path.isfile(thermo_ref_file)
@ -382,10 +404,6 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
test_id = test_id + 1
continue
# or more customizable with config.yaml
cmd_str, output, error, returncode = execute(lmp_binary, config, input_test)
@ -400,12 +418,11 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
test_id = test_id + 1
continue
# process thermo output from the run
thermo = extract_data_to_yaml("log.lammps")
num_runs = len(thermo)
if num_runs == 0:
logger.info(f"The run terminated with {input_test} gives the following output:\n") logger.info(f"The run terminated with {input_test} gives the following output:\n")
logger.info(f"\n{output}")
if "Unrecognized" in output:
@ -534,7 +551,8 @@ def iterate(lmp_binary, input_list, config, results, removeAnnotatedInput=False)
'''
TODO:
- automate annotating the example input scripts if thermo style is multi (e.g. examples/peptide)
'''
if __name__ == "__main__":
@ -627,24 +645,15 @@ if __name__ == "__main__":
p = subprocess.run(cmd_str, shell=True, text=True, capture_output=True)
input_list = p.stdout.split('\n')
input_list.remove("")
# find out which folder to cd into to run the input script
for input in input_list:
folder = input.rsplit('/', 1)[0]
folder_list.append(folder)
print(f"There are {len(input_list)} input scripts in total under the {example_toplevel} folder.")
# divide the list of input scripts into num_workers chunks
sublists = divide_into_N(input_list, num_workers)
# if only statistics, not running anything
if dry_run == True:
@ -655,10 +664,12 @@ if __name__ == "__main__":
test_cases = []
# if the example folders are not specified from the command-line argument --example-folders
# then use the path from --example-top-folder
if len(example_subfolders) == 0:
# get the input file list, for now the first in the sublist
# TODO: generate a list of tuples, each tuple contains a folder list for a worker,
# then use multiprocessing.Pool starmap()
folder_list = []
for input in sublists[0]:
folder = input.rsplit('/', 1)[0]
@ -668,82 +679,9 @@ if __name__ == "__main__":
example_subfolders = folder_list
all_results = []
# default setting is to use inplace_input
if inplace_input == True:
# save current working dir
p = subprocess.run("pwd", shell=True, text=True, capture_output=True)
@ -754,7 +692,6 @@ if __name__ == "__main__":
# change dir to a folder under examples/, need to use os.chdir()
# TODO: loop through the subfolders under examples/, depending on the installed packages
'''
args = []
for i in range(num_workers):
@ -766,7 +703,6 @@ if __name__ == "__main__":
total_tests = 0
passed_tests = 0
for directory in example_subfolders:
p = subprocess.run("pwd", shell=True, text=True, capture_output=True)
@ -786,6 +722,7 @@ if __name__ == "__main__":
num_passed = iterate(lmp_binary, input_list, config, results)
passed_tests += num_passed
# append the results to the all_results list
all_results.extend(results)
# get back to the working dir
@ -798,11 +735,14 @@ if __name__ == "__main__":
results = []
passed_tests = iterate(lmp_binary, input_list, config, results)
all_results.extend(results)
# print out summary
print("Summary:")
print(f" - {passed_tests} numerical tests passed / {total_tests} tests")
print(f" - Details are given in {output_file}.")
# optional: need to check if the junit_xml package is already installed in the env
# generate a JUnit XML file
with open(output_file, 'w') as f:
test_cases = []