Compare commits


29 Commits

Author SHA1 Message Date
b29d526632 Merge pull request #1323 from akohlmey/next-patch-release
Patch release 8 February 2019
2019-02-08 21:43:39 +01:00
77c24f1b79 Merge branch 'master' into next-patch-release
# Conflicts:
#	.github/PULL_REQUEST_TEMPLATE.md
2019-02-08 15:34:40 -05:00
b4765683da restore generic pull request template (for now) 2019-02-08 15:27:16 -05:00
f286155dd3 Merge pull request #1306 from stanmoore1/kk_angle_cosine
Add Kokkos version of angle/cosine
2019-02-08 21:07:36 +01:00
f04d97e66e Merge pull request #1312 from stanmoore1/kk_update
Update Kokkos library to v2.8.00
2019-02-08 21:07:23 +01:00
b871b4e13d Merge pull request #1321 from akohlmey/collected-small-fixes
Collected small fixes for the next patch release
2019-02-08 17:19:01 +01:00
8b449e569b Merge pull request #1318 from akohlmey/github-template-update
New GitHub PR templates
2019-02-08 17:18:30 +01:00
be6c2d781b Merge pull request #1322 from akohlmey/add-code-of-conduct
Add a code of conduct to lammps github project
2019-02-08 17:18:05 +01:00
a35f407dfc Patch release 8 February 2019 2019-02-08 17:15:22 +01:00
82355342bf improve wording about avoiding namespace imports in headers 2019-02-08 17:08:15 +01:00
773476634b Update Kokkos library in LAMMPS to v2.8.00 2019-02-08 08:50:26 -07:00
1b4ed9cb8d revert region optimization to create_atoms 2019-02-08 16:24:48 +01:00
997727d26c add a code of conduct to github project
complete the "github open source best practices" setup by adapting a
code of conduct for any public communications about LAMMPS on github.
2019-02-08 14:21:37 +01:00
a71159578e fix typo 2019-02-08 02:29:03 -05:00
ae85ca18fc provide multiple and more specific pull request templates 2019-02-07 15:53:46 +01:00
4d192e541e update github contribution guidelines text 2019-02-07 14:58:52 +01:00
8da5289638 add readme file adapted from https://github.com/pmla/polyhedral-template-matching/blob/master/README.md to USER-PTM package 2019-02-07 14:40:52 +01:00
ad8aeb8b00 using POSIX compatible version of strdup() (i.e. _strdup()) on windows 2019-02-07 14:11:02 +01:00
f80c577b3c Merge pull request #1316 from akohlmey/python-portable-header
Rename python.h to lmppython.h
2019-02-07 14:05:49 +01:00
510641c065 Merge pull request #1319 from lammps/akohlmey-new-issue-templates
Update issue templates to have a different template for bug reports and feature requests.
2019-02-07 14:02:10 +01:00
019e534f80 some small tweaks, fixes for typos, and url corrections for the manual 2019-02-07 01:16:48 -05:00
ebcbc5bdbd fix some formatting and spelling/grammar errors in bug report template 2019-02-06 17:32:44 -05:00
c9a7513dc6 remove outdated generic issue template 2019-02-06 17:26:59 -05:00
17cd92a3f2 Update issue templates
Take advantage of the fact, that GitHub now allows to have separate issue templates for bug reports and feature requests
2019-02-06 17:15:52 -05:00
0c4e76ce84 bugfix for fix qeq/reax to make it usable without pair reax/c 2019-02-05 11:55:02 +01:00
7e1f760b72 rename python.h to lmppython.h and correspondingly python.cpp to lmppython.cpp 2019-02-04 14:24:53 -05:00
b417cfda9b removed dead code and reduce compiler warnings about unused parameters 2019-02-03 11:36:41 -05:00
64834e4a3d Update Kokkos library 2019-02-01 12:45:54 -07:00
54b9a1335c Add Kokkos version of angle_cosine 2019-01-28 11:29:28 -07:00
115 changed files with 1780 additions and 457 deletions

.github/CODE_OF_CONDUCT.md (new file)

@@ -0,0 +1,67 @@
# Code of Conduct for the LAMMPS Project on GitHub
## Our Pledge
In the interest of fostering an open and welcoming environment, we as LAMMPS
developers, contributors, and maintainers pledge to making participation in
our project a harassment-free experience for everyone.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of explicit language or imagery
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, issues, and other contributions that are not
aligned to this Code of Conduct, or to ban temporarily or permanently any
developer, maintainer, or contributor for this or other behaviors that they
deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies to all public exchanges in the LAMMPS project
on GitHub and in submitted code.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at developer@lammps.org. All
complaints will be reviewed and investigated and will result in a response
that is deemed necessary and appropriate to the circumstances. The project
team is obligated to maintain confidentiality with regard to the reporter
of an incident.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

@@ -2,10 +2,10 @@
Thank you for considering contributing to the LAMMPS software project.
The following is a set of guidelines as well as explanations of policies and workflows for contributing to the LAMMPS molecular dynamics software project. These guidelines focus on submitting issues or pull requests on the LAMMPS GitHub project.
The following is a set of guidelines as well as explanations of policies and work flows for contributing to the LAMMPS molecular dynamics software project. These guidelines focus on submitting issues or pull requests on the LAMMPS GitHub project.
Thus please also have a look at:
* [The Section on submitting new features for inclusion in LAMMPS of the Manual](http://lammps.sandia.gov/doc/Section_modify.html#mod-15)
* [The Section on submitting new features for inclusion in LAMMPS of the Manual](https://lammps.sandia.gov/doc/Modify_contribute.html)
* [The LAMMPS GitHub Tutorial in the Manual](http://lammps.sandia.gov/doc/Howto_github.html)
## Table of Contents
@@ -18,7 +18,7 @@ Thus please also have a look at:
* [Suggesting Enhancements](#suggesting-enhancements)
* [Contributing Code](#contributing-code)
[GitHub Workflows](#github-workflows)
[GitHub Work flows](#github-workflows)
* [Issues](#issues)
* [Pull Requests](#pull-requests)
@@ -26,17 +26,17 @@ __
## I don't want to read this whole thing I just have a question!
> **Note:** Please do not file an issue to ask a general question about LAMMPS, its features, how to use specific commands, or how to perform simulations or analysis in LAMMPS. Instead post your question to the ['lammps-users' mailing list](http://lammps.sandia.gov/mail.html). You do not need to be subscribed to post to the list (but a mailing list subscription avoids having your post delayed until it is approved by a mailing list moderator). Most posts to the mailing list receive a response in less than 24 hours. Before posting to the mailing list, please read the [mailing list guidelines](http://lammps.sandia.gov/guidelines.html). Following those guidelines will greatly improve your chances of getting a helpful response. Always mention which LAMMPS version you are using.
> **Note:** Please do not file an issue to ask a general question about LAMMPS, its features, how to use specific commands, or how to perform simulations or analysis in LAMMPS. Instead post your question to the ['lammps-users' mailing list](https://lammps.sandia.gov/mail.html). You do not need to be subscribed to post to the list (but a mailing list subscription avoids having your post delayed until it is approved by a mailing list moderator). Most posts to the mailing list receive a response in less than 24 hours. Before posting to the mailing list, please read the [mailing list guidelines](https://lammps.sandia.gov/guidelines.html). Following those guidelines will greatly improve your chances of getting a helpful response. Always mention which LAMMPS version you are using.
## How Can I Contribute?
There are several ways in which you can actively contribute to the LAMMPS project: you can discuss compiling and using LAMMPS, and solving LAMMPS-related problems, with other LAMMPS users on the lammps-users mailing list; you can report bugs or suggest enhancements by creating issues on GitHub (or posting them to the lammps-users mailing list); and you can contribute by submitting pull requests on GitHub or e-mailing your code
to one of the [LAMMPS core developers](http://lammps.sandia.gov/authors.html). As you may see from the aforementioned developer page, the LAMMPS software package includes the efforts of a very large number of contributors beyond the principal authors and maintainers.
to one of the [LAMMPS core developers](https://lammps.sandia.gov/authors.html). As you may see from the aforementioned developer page, the LAMMPS software package includes the efforts of a very large number of contributors beyond the principal authors and maintainers.
### Discussing How To Use LAMMPS
The LAMMPS mailing list is hosted at SourceForge. The mailing list began in 2005, and now includes tens of thousands of messages in thousands of threads. LAMMPS developers try to respond to posted questions in a timely manner, but there are no guarantees. Please consider that people live in different time zones and may not have time to answer e-mails outside of their work hours.
You can post to the list by sending your email to lammps-users at lists.sourceforge.net (no subscription required), but before posting, please read the [mailing list guidelines](http://lammps.sandia.gov/guidelines.html) to maximize your chances of receiving a helpful response.
You can post to the list by sending your email to lammps-users at lists.sourceforge.net (no subscription required), but before posting, please read the [mailing list guidelines](https://lammps.sandia.gov/guidelines.html) to maximize your chances of receiving a helpful response.
Anyone can browse/search previous questions/answers in the archives. You do not have to subscribe to the list to post questions, receive answers (to your questions), or browse/search the archives. You **do** need to subscribe to the list if you want emails for **all** the posts (as individual messages or in digest form), or to answer questions yourself. Feel free to sign up and help us out! Answering questions from fellow LAMMPS users is a great way to pay back the community for providing you a useful tool for free, and to pass on the advice you have received yourself to others. It improves your karma and helps you understand your own research better.
@@ -44,7 +44,7 @@ If you post a message and you are a subscriber, your message will appear immedia
### Reporting Bugs
While developers writing code for LAMMPS are careful to test their code, LAMMPS is such a large and complex piece of software that it is impossible to test all combinations of features under all normal and not-so-normal circumstances. Thus bugs do happen, and if you suspect that you have encountered one, please try to document it and report it as an [Issue](https://github.com/lammps/lammps/issues) on the LAMMPS GitHub project web page. However, before reporting a bug, you need to check whether it is something that may have already been corrected. The [Latest Features and Bug Fixes in LAMMPS](http://lammps.sandia.gov/bug.html) web page lists all significant changes to LAMMPS over the years. It also tells you what the current latest development version of LAMMPS is, and you should test whether your issue still applies to that version.
While developers writing code for LAMMPS are careful to test their code, LAMMPS is such a large and complex piece of software that it is impossible to test all combinations of features under all normal and not-so-normal circumstances. Thus bugs do happen, and if you suspect that you have encountered one, please try to document it and report it as an [Issue](https://github.com/lammps/lammps/issues) on the LAMMPS GitHub project web page. However, before reporting a bug, you need to check whether it is something that may have already been corrected. The [Latest Features and Bug Fixes in LAMMPS](https://lammps.sandia.gov/bug.html) web page lists all significant changes to LAMMPS over the years. It also tells you what the current latest development version of LAMMPS is, and you should test whether your issue still applies to that version.
When you click on the green "New Issue" button, you will be provided with a text field where you can enter your message. That text field will contain a template with several headlines and some descriptions. Keep the headlines that are relevant to your reported potential bug and replace the descriptions with the information they ask for.
You can also attach small text files (please add the file name extension `.txt` or it will be rejected), images, or small compressed text files (using gzip; do not use RAR, 7-ZIP, or similar tools that are uncommon outside of Windows machines). In many cases, bugs are best illustrated by providing a small input deck (do **not** attach your entire production input, but remove everything that is not required to reproduce the issue, and scale down your system size so that the resulting calculation runs quickly on a small desktop machine).
@@ -62,13 +62,13 @@ To be able to submit an issue on GitHub, you have to register for an account (fo
We encourage users to submit new features or modifications for LAMMPS to the core developers so they can be added to the LAMMPS distribution. The preferred way to manage and coordinate this is by submitting a pull request at the LAMMPS project on GitHub. For any larger modifications or programming project, you are encouraged to contact the LAMMPS developers ahead of time in order to discuss implementation strategies and coding guidelines that will make it easier to integrate your contribution and result in less work for everybody involved. You are also encouraged to search through the list of open issues on GitHub and submit a new issue for a planned feature, so you do not duplicate the work of others (and possibly get scooped by them) or have your work duplicated by others.
How quickly your contribution will be integrated depends largely on how much effort it takes to integrate and test it, how many changes it requires to the core code base, and how much interest it holds for the larger LAMMPS community. Please see below for a checklist of typical requirements. Once you have prepared everything, see [this tutorial](http://lammps.sandia.gov/doc/Howto_github.html)
How quickly your contribution will be integrated depends largely on how much effort it takes to integrate and test it, how many changes it requires to the core code base, and how much interest it holds for the larger LAMMPS community. Please see below for a checklist of typical requirements. Once you have prepared everything, see [this tutorial](https://lammps.sandia.gov/doc/Howto_github.html)
for instructions on how to submit your changes or new files through a GitHub pull request.
Here is a checklist of steps you need to follow to submit a single file or user package for our consideration. Following these steps will save both you and us time. See existing files in packages in the source directory for examples. If you are uncertain, please ask on the lammps-users mailing list.
* All source files you provide must compile with the most current version of LAMMPS with multiple configurations. In particular you need to test compiling LAMMPS from scratch with `-DLAMMPS_BIGBIG` set in addition to the default `-DLAMMPS_SMALLBIG` setting. Your code will need to work correctly in serial and in parallel using MPI.
* For consistency with the rest of LAMMPS and especially, if you want your contribution(s) to be added to main LAMMPS code or one of its standard packages, it needs to be written in a style compatible with other LAMMPS source files. This means: 2-character indentation per level, no tabs, no lines over 80 characters. I/O is done via the C-style stdio library, class header files should not import any system headers outside <stdio.h>, STL containers should be avoided in headers, and forward declarations used where possible or needed. All added code should be placed into the LAMMPS_NS namespace or a sub-namespace; global or static variables should be avoided, as they conflict with the modular nature of LAMMPS and the C++ class structure. Header files must not import namespaces with using. This all is so the developers can more easily understand, integrate, and maintain your contribution and reduce conflicts with other parts of LAMMPS. This basically means that the code accesses data structures, performs its operations, and is formatted similar to other LAMMPS source files, including the use of the error class for error and warning messages.
* For consistency with the rest of LAMMPS and especially, if you want your contribution(s) to be added to main LAMMPS code or one of its standard packages, it needs to be written in a style compatible with other LAMMPS source files. This means: 2-character indentation per level, no tabs, no lines over 80 characters. I/O is done via the C-style stdio library, style class header files should not import any system headers outside of <cstdio>, STL containers should be avoided in headers, and forward declarations used where possible or needed. All added code should be placed into the LAMMPS_NS namespace or a sub-namespace; global or static variables should be avoided, as they conflict with the modular nature of LAMMPS and the C++ class structure. There MUST NOT be any "using namespace XXX;" statements in headers. In the implementation file (<name>.cpp) system includes should be placed in angular brackets (<>) and for c-library functions the C++ style header files should be included (<cstdio> instead of <stdio.h>, or <cstring> instead of <string.h>). This all is so the developers can more easily understand, integrate, and maintain your contribution and reduce conflicts with other parts of LAMMPS. This basically means that the code accesses data structures, performs its operations, and is formatted similar to other LAMMPS source files, including the use of the error class for error and warning messages.
* If you want your contribution to be added as a user-contributed feature, and it is a single file (actually a `<name>.cpp` and `<name>.h` file) it can be rapidly added to the USER-MISC directory. Include the one-line entry to add to the USER-MISC/README file in that directory, along with the 2 source files. You can do this multiple times if you wish to contribute several individual features.
* If you want your contribution to be added as a user-contribution and it is several related features, it is probably best to make it a user package directory with a name like USER-FOO. In addition to your new files, the directory should contain a README text file. The README should contain your name and contact information and a brief description of what your new package does. If your files depend on other LAMMPS style files also being installed (e.g. because your file is a derived class from the other LAMMPS class), then an Install.sh file is also needed to check for those dependencies. See other README and Install.sh files in other USER directories as examples. Send us a tarball of this USER-FOO directory.
* Your new source files need to have the LAMMPS copyright, GPL notice, and your name and email address at the top, like other user-contributed LAMMPS source files. They need to create a class that is inside the LAMMPS namespace. If the file is for one of the USER packages, including USER-MISC, then we are not as picky about the coding style (see above). I.e. the files do not need to be in the same stylistic format and syntax as other LAMMPS files, though that would be nice for developers as well as users who try to read your code.
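The `-DLAMMPS_BIGBIG` compilation test mentioned in the checklist above can be sketched as follows for the conventional make-based build. This is a hedged sketch, not part of the contribution guidelines: the `mpi` machine target is an assumption about your local setup, and whether `LMP_INC` can be overridden on the make command line depends on your machine makefile (editing `src/MAKE/Makefile.mpi` directly also works).

```shell
# Hypothetical sketch: rebuild LAMMPS from scratch with the -DLAMMPS_BIGBIG
# integer-size setting instead of the default -DLAMMPS_SMALLBIG.
cd src
make clean-all                        # compile from scratch, as requested above
make mpi LMP_INC="-DLAMMPS_BIGBIG"    # conventional build, "mpi" machine target
mpirun -np 4 ./lmp_mpi -in in.test    # also verify correct behavior in parallel
```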
@@ -102,10 +102,10 @@ For bug reports, the next step is that one of the core LAMMPS developers will se
### Pull Requests
For submitting pull requests, there is a [detailed tutorial](http://lammps.sandia.gov/doc/Howto_github.html) in the LAMMPS manual. Thus only a brief breakdown of the steps is presented here. Please note that the LAMMPS developers are still reviewing and trying to improve the process. If you are unsure about something, do not hesitate to post a question on the lammps-users mailing list or contact one of the core LAMMPS developers.
For submitting pull requests, there is a [detailed tutorial](https://lammps.sandia.gov/doc/Howto_github.html) in the LAMMPS manual. Thus only a brief breakdown of the steps is presented here. Please note that the LAMMPS developers are still reviewing and trying to improve the process. If you are unsure about something, do not hesitate to post a question on the lammps-users mailing list or contact one of the core LAMMPS developers.
Immediately after the submission, the LAMMPS continuous integration server at ci.lammps.org will download your submitted branch and perform a simple compilation test, i.e. it will test whether your submitted code can be compiled under various conditions. It will also check whether your included documentation translates cleanly. Whether these tests succeed or fail will be recorded. If a test fails, please inspect the corresponding output on the CI server and take the necessary steps, if needed, so that the code can compile cleanly again. The tests will be re-run each time the pull request is updated with a push to the remote branch on GitHub.
Next a LAMMPS core developer will self-assign and do an overall technical assessment of the submission. If you are not yet registered as a LAMMPS collaborator, you will receive an invitation for that.
You may also receive comments and suggestions on the overall submission or specific details. If permitted, additional changes may be pushed into your pull request branch or a pull request may be filed in your LAMMPS fork on GitHub to include those changes.
Next a LAMMPS core developer will self-assign and do an overall technical assessment of the submission. If you are not yet registered as a LAMMPS collaborator, you will receive an invitation for that. As part of the assessment, the pull request will be categorized with labels. There are two special labels: `needs_work` (indicates that work from the submitter of the pull request is needed) and `work_in_progress` (indicates that the assigned LAMMPS developer will make changes, if they are not done by the contributor who made the submission).
You may also receive comments and suggestions on the overall submission or specific details, and on occasion specific requests for changes as part of the review. If permitted, additional changes may also be pushed into your pull request branch, or a pull request may be filed in your LAMMPS fork on GitHub to include those changes.
The LAMMPS developer may then decide to assign the pull request to another developer (e.g. when that developer is more knowledgeable about the submitted feature or enhancement or has written the modified code). It may also happen that additional developers are requested to provide a review and approve the changes. For submissions that may change the general behavior of LAMMPS, or where a possibility of unwanted side effects exists, additional tests may be requested by the assigned developer.
If the assigned developer is satisfied and considers the submission ready for inclusion into LAMMPS, the pull request will receive approvals and be merged into the master branch by one of the core LAMMPS developers. After the pull request is merged, you may delete the feature branch used for the pull request in your personal LAMMPS fork.
Since the learning curve for git is quite steep when it comes to efficiently managing remote repositories, local and remote branches, pull requests, and more, do not hesitate to ask questions if you are not sure how to do certain steps that are asked of you. Even if the changes asked of you do not make sense to you, they may be important for the LAMMPS developers. Please also note that all of these are guidelines and nothing is set in stone, so depending on the nature of the contribution, the workflow may be adjusted.
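The branch management referred to above can be sketched with a few git commands. This is a hedged illustration, not an official LAMMPS procedure: the fork URL placeholder and the branch name `my-feature-branch` are invented for the example.

```shell
# Hypothetical sketch of the fork-and-branch workflow for a pull request.
git clone git@github.com:<your-username>/lammps.git    # your personal fork
cd lammps
git remote add upstream https://github.com/lammps/lammps.git
git checkout -b my-feature-branch master    # one branch per contribution
# ... commit your changes on this branch ...
git push -u origin my-feature-branch        # then open the pull request on GitHub
# after the pull request is merged, the feature branch can be deleted:
git push origin --delete my-feature-branch
git branch -d my-feature-branch
```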

@@ -1,31 +0,0 @@
## Summary
_Please provide a brief description of the issue_
## Type of Issue
_Is this a 'Bug Report' or a 'Suggestion for an Enhancement'?_
## Detailed Description (Enhancement Suggestion)
_Explain how you would like to see LAMMPS enhanced, what feature(s) you are looking for, provide references to relevant background information, and whether you are willing to implement the enhancement yourself or would like to participate in the implementation_
## LAMMPS Version (Bug Report)
_Please specify which LAMMPS version this issue was detected with. If this is not the latest development version, please stop and test that version, too, and report it here if the bug persists_
## Expected Behavior (Bug Report)
_Describe the expected behavior. Quote from the LAMMPS manual where needed or explain why the expected behavior is meaningful, especially when it differs from the manual_
## Actual Behavior (Bug Report)
_Describe the actual behavior, how it differs from the expected behavior, and how this can be observed. Try to be specific and do **not** use vague terms like "doesn't work" or "wrong result". Do not assume that the person reading this has any experience with or knowledge of your specific research._
## Steps to Reproduce (Bug Report)
_Describe the steps required to quickly reproduce the issue. You can attach (small) files to the section below or add URLs where to download an archive with all necessary files. Please try to create inputs that are as small as possible and run as fast as possible. NOTE: the less effort and time it takes to reproduce your issue, the more likely it is that somebody will look into it._
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

.github/ISSUE_TEMPLATE/bug_report.md (new file)

@@ -0,0 +1,32 @@
---
name: Bug report
about: Create a bug report to help us eliminate issues and improve LAMMPS
title: "[BUG] _Replace With Suitable Title_"
labels: bug
assignees: ''
---
**Summary**
_Please provide a clear and concise description of what the bug is._
**LAMMPS Version and Platform**
_Please specify precisely which LAMMPS version this issue was detected with (the first line of the output) and what platform (operating system and its version, hardware) you are running on. If possible, test with the most recent LAMMPS patch version_
**Expected Behavior**
_Describe the expected behavior. Quote from the LAMMPS manual where needed, or explain why the expected behavior is meaningful, especially when it differs from the manual_
**Actual Behavior**
_Describe the actual behavior, how it differs from the expected behavior, and how this can be observed. Try to be specific and do **not** use vague terms like "doesn't work" or "wrong result". Do not assume that the person reading this has any experience with or knowledge of your specific area of research._
**Steps to Reproduce**
_Describe the steps required to (quickly) reproduce the issue. You can attach (small) files to the section below or add URLs where to download an archive with all necessary files. Please try to create an input set that is as minimal and small as possible and reproduces the bug as quickly as possible. **NOTE:** the less effort and time it takes to reproduce your reported bug, the more likely it becomes that somebody will look into it and fix the problem._
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

@@ -0,0 +1,20 @@
---
name: Feature request
about: Make a suggestion for a new feature or a change to LAMMPS
title: "[Feature Request] _Replace with Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Please provide a brief and concise description of the suggested feature or change_
**Detailed Description**
_Please explain how you would like to see LAMMPS enhanced, what feature(s) you are looking for, what specific problems this will solve. If possible, provide references to relevant background information like publications or web pages, and whether you are planning to implement the enhancement yourself or would like to participate in the implementation. If applicable add a reference to an existing bug report or issue that this will address._
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

@@ -1,28 +1,46 @@
## Purpose
**Summary**
_Briefly describe the new feature(s), enhancement(s), or bugfix(es) included in this pull request. If this addresses an open GitHub Issue, mention the issue number, e.g. with `fixes #221` or `closes #135`, so that issue will be automatically closed when the pull request is merged_
_Briefly describe the new feature(s), enhancement(s), or bugfix(es) included in this pull request._
## Author(s)
**Related Issues**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
_If this addresses an open GitHub Issue, mention the issue number here. Use the phrases `fixes #221` or `closes #135` when you want those issues to be automatically closed when the pull request is merged_
## Backward Compatibility
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request. If this pull request adds new files to the distribution, please also provide a suitable "long-lived" e-mail address (e.g. from gmail, yahoo, outlook, etc.) for the *corresponding* author, i.e. the person the LAMMPS developers can contact directly with questions and requests related to maintenance and support of this code, now and in the future_
**Licensing**
By submitting this pull request, I agree that my contribution will be included in LAMMPS and redistributed under the GNU General Public License version 2.
_Please complete the following statement by adding "yes" or "no":_
My contribution may be re-licensed as LGPL (for use of LAMMPS as a library linked to proprietary software):
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
## Implementation Notes
**Implementation Notes**
_Provide any relevant details about how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
**Post Submission Checklist**
_Please check the fields below as they are completed **after** the pull request has been submitted_
- [ ] The feature or features in this pull request are complete
- [ ] Licensing information is complete
- [ ] Corresponding author information is complete
- [ ] The source code follows the LAMMPS formatting guidelines
- [ ] Suitable new documentation files and/or updates to the existing docs are included
- [ ] The added/updated documentation is integrated and tested with the documentation build system
- [ ] The feature has been verified to work with the conventional build system
- [ ] The feature has been verified to work with the CMake based build system
- [ ] A package specific README file has been included or updated
- [ ] One or more example input decks are included
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

View File

@ -0,0 +1,42 @@
---
name: Bug fix
about: Submit a pull request that fixes one or more bugs
title: "[BUGFIX] _Replace With Suitable Title_"
labels: bugfix
assignees: ''
---
**Summary**
_Briefly describe the bug or bugs that are eliminated by this pull request._
**Related Issue(s)**
_If this request addresses or is related to an existing (open) GitHub issue, e.g. a bug report, mention the issue number here following a pound sign (aka hash mark), e.g. `#222`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the files that are modified.
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Detailed Description**
_Provide any relevant details about how the fixed bug can be reproduced, how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request is submitted_
- [ ] The code in this pull request is complete
- [ ] The source code follows the LAMMPS formatting guidelines
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. to download input decks for testing)_

View File

@ -0,0 +1,35 @@
---
name: Maintenance or Refactoring
about: Submit a pull request that does code refactoring or other maintenance changes
title: "[MAINTENANCE] _Replace With Suitable Title_"
labels: maintenance
assignees: ''
---
**Summary**
_Briefly describe the included changes._
**Related Issue(s)**
_If this request addresses or is related to an existing (open) GitHub issue, e.g. a bug report, mention the issue number here following a pound sign (aka hash mark), e.g. `#222`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the files that are modified.
**Detailed Description**
_Provide any relevant details about the included changes._
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request is submitted_
- [ ] The pull request is complete
- [ ] The source code follows the LAMMPS formatting guidelines

View File

@ -0,0 +1,56 @@
---
name: New Feature
about: Submit a pull request that adds new features (complete files) to LAMMPS
title: "[New Feature] _Replace With Suitable Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Briefly describe the new feature(s) included in this pull request._
**Related Issues**
_If this addresses an existing (open) GitHub issue, e.g. a feature request, mention the issue number here following a pound sign (aka hashmark), e.g. `#331`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the features added in this pull request. Please provide a suitable "long-lived" e-mail address (e.g. from gmail, yahoo, outlook, etc.) for the *corresponding* author, i.e. the person the LAMMPS developers can contact directly with questions and requests related to maintenance and support of this code, now and in the future_
**Licensing**
_Please add *yes* or *no* to the following two statements (please contact @lammps/core if you have questions about this)_
My contribution may be licensed as GPL v2 (default LAMMPS license):
My contribution may be licensed as LGPL (for use as a library with proprietary software):
**Backward Compatibility**
_Please state if any of the changes in this pull request will affect backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Implementation Notes**
_Provide any relevant details about how the new features are implemented, how correctness was verified, what platforms (OS, compiler, MPI, hardware, number of processors, accelerator(s)) it was tested on_
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request has been submitted_
- [ ] The feature or features in this pull request are complete
- [ ] Licensing information is complete
- [ ] Corresponding author information is complete
- [ ] The source code follows the LAMMPS formatting guidelines
- [ ] Suitable new documentation files and/or updates to the existing docs are included
- [ ] The added/updated documentation is integrated and tested with the documentation build system
- [ ] The feature has been verified to work with the conventional build system
- [ ] The feature has been verified to work with the CMake based build system
- [ ] A package specific README file has been included or updated
- [ ] One or more example input decks are included
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

View File

@ -0,0 +1,42 @@
---
name: Update or Enhancement
about: Submit a pull request that provides updates or enhancements for a package or feature in LAMMPS
title: "[UPDATE] _Replace With Suitable Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Briefly describe what kind of updates or enhancements for a package or feature are included. If you are not the original author of the package or feature, please mention whether your contribution was created independently or in collaboration/cooperation with the original author._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the original package or feature(s) that are updated or amended by this pull request.
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Implementation Notes**
_Provide any relevant details about how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
**Post Submission Checklist**
_Please check the fields below as they are completed_
- [ ] The feature or features in this pull request are complete
- [ ] Suitable updates to the existing docs are included
- [ ] One or more example input decks are included
- [ ] The source code follows the LAMMPS formatting guidelines
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

1
doc/.gitignore vendored
View File

@ -1,4 +1,5 @@
/html
/latex
/spelling
/LAMMPS.epub
/LAMMPS.mobi

View File

@ -123,7 +123,7 @@ Here are some items to check:
* float.h -> cfloat
* limits.h -> climits
* math.h -> cmath
* omplex.h -> complex
* complex.h -> complex
* setjmp.h -> csetjmp
* signal.h -> csignal
* stddef.h -> cstddef

View File

@ -61,7 +61,7 @@ OPT.
"charmm (iko)"_angle_charmm.html,
"class2 (ko)"_angle_class2.html,
"class2/p6"_angle_class2.html,
"cosine (o)"_angle_cosine.html,
"cosine (ko)"_angle_cosine.html,
"cosine/buck6d"_angle_cosine_buck6d.html,
"cosine/delta (o)"_angle_cosine_delta.html,
"cosine/periodic (o)"_angle_cosine_periodic.html,

View File

@ -12,7 +12,7 @@ Download an executable for Windows :h3
Pre-compiled Windows installers which install LAMMPS executables on a
Windows system can be downloaded from this site:
"http://rpm.lammps.org/windows.html"_http://rpm.lammps.org/windows.html
"http://packages.lammps.org/windows.html"_http://packages.lammps.org/windows.html
Note that each installer package has a date in its name, which
corresponds to the LAMMPS version of the same date. Installers for

View File

@ -1,7 +1,7 @@
<!-- HTML_ONLY -->
<HEAD>
<TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="1 Feb 2019 version">
<META NAME="docnumber" CONTENT="8 Feb 2019 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
</HEAD>
@ -21,7 +21,7 @@
:line
LAMMPS Documentation :c,h1
1 Feb 2019 version :c,h2
8 Feb 2019 version :c,h2
"What is a LAMMPS version?"_Manual_version.html
@ -37,27 +37,21 @@ LAMMPS is an open-source code, distributed freely under the terms of
the GNU Public License (GPL).
The "LAMMPS website"_lws has a variety of information about the code.
It includes links to an on-line version of this manual, a "mail
It includes links to an on-line version of this manual, a "mailing
list"_http://lammps.sandia.gov/mail.html where users can post
questions, and a "GitHub site"https://github.com/lammps/lammps where
questions, and a "GitHub site"_https://github.com/lammps/lammps where
all LAMMPS development is coordinated.
:line
"PDF file"_Manual.pdf of the entire manual, generated by
"htmldoc"_http://freecode.com/projects/htmldoc
The content for this manual is part of the LAMMPS distribution. You
can build a local copy of the Manual as HTML pages or a PDF file, by
following the steps on the "Manual build"_Manual_build.html doc page.
There is also a "Developer.pdf"_Developer.pdf document which gives
a brief description of the basic code structure of LAMMPS.
:line
This manual is organized into the following sections.
Once you are familiar with LAMMPS, you may want to bookmark "this
page"_Commands.html since it gives quick access to a doc page for
every LAMMPS command.

View File

@ -8,6 +8,7 @@
angle_style cosine command :h3
angle_style cosine/omp command :h3
angle_style cosine/kk command :h3
[Syntax:]

View File

@ -219,7 +219,7 @@ latex_elements = {
# author, documentclass [howto, manual, or own class]).
latex_documents = [
('Manual', 'LAMMPS.tex', 'LAMMPS Documentation',
'Steve Plimpton', 'manual'),
'The LAMMPS Developers', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of

View File

@ -1,5 +1,26 @@
# Change Log
## [2.8.00](https://github.com/kokkos/kokkos/tree/2.8.00) (2019-02-05)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.7.24...2.8.00)
**Implemented enhancements:**
- Capability, Tests: C++14 support and testing [\#1914](https://github.com/kokkos/kokkos/issues/1914)
- Capability: Add environment variables for all command line arguments [\#1798](https://github.com/kokkos/kokkos/issues/1798)
- Capability: --kokkos-ndevices not working for Slurm [\#1920](https://github.com/kokkos/kokkos/issues/1920)
- View: Undefined behavior when deep copying from and to an empty unmanaged view [\#1967](https://github.com/kokkos/kokkos/issues/1967)
- BuildSystem: nvcc\_wrapper should stop immediately if nvcc is not in PATH [\#1861](https://github.com/kokkos/kokkos/issues/1861)
**Fixed bugs:**
- Cuda: Fix Volta Issues 1 Non-deterministic behavior on Volta, runs fine on Pascal [\#1949](https://github.com/kokkos/kokkos/issues/1949)
- Cuda: Fix Volta Issues 2 CUDA Team Scan gives wrong values on Volta with -G compile flag [\#1942](https://github.com/kokkos/kokkos/issues/1942)
- Cuda: illegal warp sync in parallel\_reduce by functor on Turing 75 [\#1958](https://github.com/kokkos/kokkos/issues/1958)
- Threads: Pthreads backend does not handle RangePolicy with offset correctly [\#1976](https://github.com/kokkos/kokkos/issues/1976)
- Atomics: atomic\_fetch\_oper has no case for Kokkos::complex\<double\> or other 16-byte types [\#1951](https://github.com/kokkos/kokkos/issues/1951)
- MDRangePolicy: Fix zero-length range [\#1948](https://github.com/kokkos/kokkos/issues/1948)
- TeamThreadRange: TeamThreadRange MaxLoc reduce doesnt compile [\#1909](https://github.com/kokkos/kokkos/issues/1909)
## [2.7.24](https://github.com/kokkos/kokkos/tree/2.7.24) (2018-11-04)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.7.00...2.7.24)

View File

@ -6,16 +6,16 @@ ifndef KOKKOS_PATH
endif
CXXFLAGS=$(CCFLAGS)
# Options: Cuda,ROCm,OpenMP,Pthread,Qthreads,Serial
# Options: Cuda,ROCm,OpenMP,Pthreads,Qthreads,Serial
KOKKOS_DEVICES ?= "OpenMP"
#KOKKOS_DEVICES ?= "Pthread"
#KOKKOS_DEVICES ?= "Pthreads"
# Options:
# Intel: KNC,KNL,SNB,HSW,BDW,SKX
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72,Turing75
# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2
# IBM: BGQ,Power7,Power8,Power9
# AMD-GPUS: Kaveri,Carrizo,Fiji,Vega
# AMD-CPUS: AMDAVX,Ryzen,Epyc
# AMD-CPUS: AMDAVX,Ryzen,EPYC
KOKKOS_ARCH ?= ""
# Options: yes,no
KOKKOS_DEBUG ?= "no"
@ -224,7 +224,7 @@ ifeq ($(KOKKOS_INTERNAL_COMPILER_PGI), 1)
else
ifeq ($(KOKKOS_INTERNAL_COMPILER_XL), 1)
KOKKOS_INTERNAL_CXX11_FLAG := -std=c++11
#KOKKOS_INTERNAL_CXX14_FLAG := -std=c++14
KOKKOS_INTERNAL_CXX14_FLAG := -std=c++14
KOKKOS_INTERNAL_CXX1Y_FLAG := -std=c++1y
#KOKKOS_INTERNAL_CXX17_FLAG := -std=c++17
#KOKKOS_INTERNAL_CXX1Z_FLAG := -std=c++1Z
@ -276,6 +276,7 @@ KOKKOS_INTERNAL_USE_ARCH_PASCAL61 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pas
KOKKOS_INTERNAL_USE_ARCH_PASCAL60 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal60)
KOKKOS_INTERNAL_USE_ARCH_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
KOKKOS_INTERNAL_USE_ARCH_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)
KOKKOS_INTERNAL_USE_ARCH_TURING75 := $(call kokkos_has_string,$(KOKKOS_ARCH),Turing75)
KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLER30) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER32) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER35) \
@ -284,6 +285,7 @@ KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLE
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_TURING75) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -300,6 +302,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_TURING75) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -331,7 +334,7 @@ KOKKOS_INTERNAL_USE_ARCH_IBM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_
# AMD based.
KOKKOS_INTERNAL_USE_ARCH_AMDAVX := $(call kokkos_has_string,$(KOKKOS_ARCH),AMDAVX)
KOKKOS_INTERNAL_USE_ARCH_RYZEN := $(call kokkos_has_string,$(KOKKOS_ARCH),Ryzen)
KOKKOS_INTERNAL_USE_ARCH_EPYC := $(call kokkos_has_string,$(KOKKOS_ARCH),Epyc)
KOKKOS_INTERNAL_USE_ARCH_EPYC := $(call kokkos_has_string,$(KOKKOS_ARCH),EPYC)
KOKKOS_INTERNAL_USE_ARCH_KAVERI := $(call kokkos_has_string,$(KOKKOS_ARCH),Kaveri)
KOKKOS_INTERNAL_USE_ARCH_CARRIZO := $(call kokkos_has_string,$(KOKKOS_ARCH),Carrizo)
KOKKOS_INTERNAL_USE_ARCH_FIJI := $(call kokkos_has_string,$(KOKKOS_ARCH),Fiji)
@ -341,12 +344,12 @@ KOKKOS_INTERNAL_USE_ARCH_GFX901 := $(call kokkos_has_string,$(KOKKOS_ARCH),gfx90
# Any AVX?
KOKKOS_INTERNAL_USE_ARCH_SSE42 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM))
KOKKOS_INTERNAL_USE_ARCH_AVX := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_AMDAVX))
KOKKOS_INTERNAL_USE_ARCH_AVX2 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW))
KOKKOS_INTERNAL_USE_ARCH_AVX2 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_EPYC))
KOKKOS_INTERNAL_USE_ARCH_AVX512MIC := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KNL))
KOKKOS_INTERNAL_USE_ARCH_AVX512XEON := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SKX))
# Decide what ISA level we are able to support.
KOKKOS_INTERNAL_USE_ISA_X86_64 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM) + $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_KNL) + $(KOKKOS_INTERNAL_USE_ARCH_SKX))
KOKKOS_INTERNAL_USE_ISA_X86_64 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM) + $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_KNL) + $(KOKKOS_INTERNAL_USE_ARCH_SKX) + $(KOKKOS_INTERNAL_USE_ARCH_EPYC))
KOKKOS_INTERNAL_USE_ISA_KNC := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KNC))
KOKKOS_INTERNAL_USE_ISA_POWERPCLE := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_POWER8) + $(KOKKOS_INTERNAL_USE_ARCH_POWER9))
KOKKOS_INTERNAL_USE_ISA_POWERPCBE := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_POWER7))
@ -658,6 +661,19 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV81), 1)
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_EPYC), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_AMD_EPYC")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_AMD_AVX2")
ifeq ($(KOKKOS_INTERNAL_COMPILER_INTEL), 1)
KOKKOS_CXXFLAGS += -mavx2
KOKKOS_LDFLAGS += -mavx2
else
KOKKOS_CXXFLAGS += -march=znver1 -mtune=znver1
KOKKOS_LDFLAGS += -march=znver1 -mtune=znver1
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV80")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV8_THUNDERX")
@ -950,6 +966,11 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA72")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_72
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_TURING75), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_TURING")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_TURING75")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_75
endif
ifneq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)

View File

@ -73,6 +73,8 @@ For specifics see the LICENSE file contained in the repository or distribution.
* NVCC 7.5 for CUDA (with gcc 4.8.4)
* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
* NVCC 9.1 for CUDA (with gcc 6.1.0)
* NVCC 9.2 for CUDA (with gcc 7.2.0)
* NVCC 10.0 for CUDA (with gcc 7.4.0)
### Primary tested compilers on Power 8 are:
* GCC 6.4.0 (OpenMP,Serial)
@ -109,7 +111,7 @@ GCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits
-Wignored-qualifiers -Wempty-body -Wclobbered -Wuninitialized
Intel: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
Clang: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
Other compilers are tested occasionally, in particular when pushing from develop to
master branch, without -Werror and only for a select set of backends.

View File

@ -308,6 +308,16 @@ do
shift
done
#Check if nvcc exists
if [ $host_only -ne 1 ]; then
var=$(which nvcc )
if [ $? -gt 0 ]; then
echo "Could not find nvcc in PATH"
exit $?
fi
fi
# Only print host compiler version
if [ $get_host_version -eq 1 ]; then
$host_compiler --version

View File

@ -104,6 +104,7 @@ list(APPEND KOKKOS_ARCH_LIST
Pascal61 # (GPU) NVIDIA Pascal generation CC 6.1
Volta70 # (GPU) NVIDIA Volta generation CC 7.0
Volta72 # (GPU) NVIDIA Volta generation CC 7.2
Turing75 # (GPU) NVIDIA Turing generation CC 7.5
)
# List of possible device architectures.

View File

@ -832,16 +832,14 @@ void
deep_copy (DualView<DT,DL,DD,DM> dst, // trust me, this must not be a reference
const DualView<ST,SL,SD,SM>& src )
{
if(src.modified_flags.data()==NULL || dst.modified_flags.data()==NULL) {
return deep_copy(dst.d_view, src.d_view);
}
if (src.modified_flags(1) >= src.modified_flags(0)) {
deep_copy (dst.d_view, src.d_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::device_type> ();
} else {
if ( src.need_sync_device() ) {
deep_copy (dst.h_view, src.h_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::host_mirror_space> ();
dst.modify_host();
}
else {
deep_copy (dst.d_view, src.d_view);
dst.modify_device();
}
}
template< class ExecutionSpace ,
@ -852,15 +850,12 @@ deep_copy (const ExecutionSpace& exec ,
DualView<DT,DL,DD,DM> dst, // trust me, this must not be a reference
const DualView<ST,SL,SD,SM>& src )
{
if(src.modified_flags.data()==NULL || dst.modified_flags.data()==NULL) {
return deep_copy(exec, dst.d_view, src.d_view);
}
if (src.modified_flags(1) >= src.modified_flags(0)) {
deep_copy (exec, dst.d_view, src.d_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::device_type> ();
} else {
if ( src.need_sync_device() ) {
deep_copy (exec, dst.h_view, src.h_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::host_mirror_space> ();
dst.modify_host();
} else {
deep_copy (exec, dst.d_view, src.d_view);
dst.modify_device();
}
}

View File

@ -368,8 +368,8 @@ public:
enum { is_assignable = is_assignable_value_type &&
is_assignable_layout };
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
typedef ViewMapping< DstTraits , typename DstTraits::specialize > DstType ;
typedef ViewMapping< SrcTraits , typename SrcTraits::specialize > SrcType ;
template < typename DT , typename ... DP , typename ST , typename ... SP >
KOKKOS_INLINE_FUNCTION
@ -432,7 +432,7 @@ public:
private:
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
typedef Kokkos::Impl::ViewMapping< traits , typename traits::specialize > map_type ;
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
track_type m_track ;
@ -567,11 +567,11 @@ public:
// Allow specializations to query their specialized map
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
implementation_map() const { return m_map ; }
#endif
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
impl_map() const { return m_map ; }
//----------------------------------------
@ -952,7 +952,7 @@ public:
, m_rank(rhs.m_rank)
{
typedef typename DynRankView<RT,RP...> ::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
}
@ -962,7 +962,7 @@ public:
DynRankView & operator = (const DynRankView<RT,RP...> & rhs )
{
typedef typename DynRankView<RT,RP...> ::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
m_track.assign( rhs.m_track , traits::is_managed );
@ -980,7 +980,7 @@ public:
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
static_assert( Mapping::is_assignable , "Incompatible View to DynRankView copy construction" );
Mapping::assign( *this , rhs );
}
@ -1432,7 +1432,7 @@ public:
, Args ... args )
{
typedef ViewMapping< traits_type, void > DstType ;
typedef ViewMapping< traits_type, typename traits_type::specialize > DstType ;
typedef typename std::conditional< (rank==0) , ViewDimension<>
, typename std::conditional< (rank==1) , ViewDimension<0>

View File

@ -101,13 +101,98 @@ namespace Impl {
result = run_me< Kokkos::DualView<Scalar**,Kokkos::LayoutLeft,Device> >(size,3);
}
};
};
template < typename Scalar, class ViewType >
struct SumViewEntriesFunctor {
typedef Scalar value_type;
ViewType fv;
SumViewEntriesFunctor ( const ViewType & fv_ ) : fv(fv_) {}
KOKKOS_INLINE_FUNCTION
void operator() ( const int i , value_type & total ) const {
for ( size_t j = 0; j < fv.extent(1); ++j ) {
total += fv(i,j);
}
}
};
template <typename Scalar, class Device>
struct test_dual_view_deep_copy
{
typedef Scalar scalar_type;
typedef Device execution_space;
template <typename ViewType>
void run_me() {
const unsigned int n = 10;
const unsigned int m = 5;
const unsigned int sum_total = n * m;
ViewType a("A",n,m);
ViewType b("B",n,m);
Kokkos::deep_copy( a.d_view , 1 );
a.template modify<typename ViewType::execution_space>();
a.template sync<typename ViewType::host_mirror_space>();
// Check device view is initialized as expected
scalar_type a_d_sum = 0;
// Execute on the execution_space associated with t_dev's memory space
typedef typename ViewType::t_dev::memory_space::execution_space t_dev_exec_space;
Kokkos::parallel_reduce( Kokkos::RangePolicy<t_dev_exec_space>(0,n), SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(a.d_view), a_d_sum );
ASSERT_EQ(a_d_sum, sum_total);
// Check host view is synced as expected
scalar_type a_h_sum = 0;
for ( size_t i = 0; i < a.h_view.extent(0); ++i )
for ( size_t j = 0; j < a.h_view.extent(1); ++j ) {
a_h_sum += a.h_view(i,j);
}
ASSERT_EQ(a_h_sum, sum_total);
// Test deep_copy
Kokkos::deep_copy( b, a );
b.template sync<typename ViewType::host_mirror_space>();
// Perform same checks on b as done on a
// Check device view is initialized as expected
scalar_type b_d_sum = 0;
// Execute on the execution_space associated with t_dev's memory space
Kokkos::parallel_reduce( Kokkos::RangePolicy<t_dev_exec_space>(0,n), SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(b.d_view), b_d_sum );
ASSERT_EQ(b_d_sum, sum_total);
// Check host view is synced as expected
scalar_type b_h_sum = 0;
for ( size_t i = 0; i < b.h_view.extent(0); ++i )
for ( size_t j = 0; j < b.h_view.extent(1); ++j ) {
b_h_sum += b.h_view(i,j);
}
ASSERT_EQ(b_h_sum, sum_total);
} // end run_me
test_dual_view_deep_copy()
{
run_me< Kokkos::DualView<Scalar**,Kokkos::LayoutLeft,Device> >();
}
};
} // namespace Impl
template <typename Scalar, typename Device>
void test_dualview_combinations(unsigned int size)
{
@ -116,10 +201,21 @@ void test_dualview_combinations(unsigned int size)
}
template <typename Scalar, typename Device>
void test_dualview_deep_copy()
{
Impl::test_dual_view_deep_copy<Scalar,Device> ();
}
TEST_F( TEST_CATEGORY, dualview_combination) {
test_dualview_combinations<int,TEST_EXECSPACE>(10);
}
TEST_F( TEST_CATEGORY, dualview_deep_copy) {
test_dualview_deep_copy<int,TEST_EXECSPACE>();
test_dualview_deep_copy<double,TEST_EXECSPACE>();
}
} // namespace Test

View File

@ -829,7 +829,8 @@ void* cuda_resize_scratch_space(std::int64_t bytes, bool force_shrink) {
}
if(bytes > current_size) {
current_size = bytes;
ptr = Kokkos::kokkos_realloc<Kokkos::CudaSpace>(ptr,current_size);
Kokkos::kokkos_free<Kokkos::CudaSpace>(ptr);
ptr = Kokkos::kokkos_malloc<Kokkos::CudaSpace>("CudaSpace::ScratchMemory",current_size);
}
if((bytes < current_size) && (force_shrink)) {
current_size = bytes;

View File

@ -561,7 +561,11 @@ void CudaInternal::initialize( int cuda_device_id , int stream_count )
}
#endif
#ifdef KOKKOS_ENABLE_PRE_CUDA_10_DEPRECATION_API
cudaThreadSetCacheConfig(cudaFuncCachePreferShared);
#else
cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
#endif
// Init the array for used for arbitrarily sized atomics
Impl::initialize_host_cuda_lock_arrays();

View File

@ -525,6 +525,7 @@ public:
inline
void execute() const
{
if(m_rp.m_num_tiles==0) return;
const array_index_type maxblocks = static_cast<array_index_type>(Kokkos::Impl::CudaTraits::UpperBoundGridCount);
if ( RP::rank == 2 )
{
@ -685,7 +686,7 @@ public:
typename Policy::member_type( kokkos_impl_cuda_shared_memory<void>()
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size ) );
@ -1336,7 +1337,7 @@ public:
( Member( kokkos_impl_cuda_shared_memory<char>() + m_team_begin
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size )
@ -1378,7 +1379,7 @@ public:
( Member( kokkos_impl_cuda_shared_memory<char>() + m_team_begin
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size )
@ -2064,7 +2065,7 @@ private:
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(MASK);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
if ( CudaTraits::WarpSize < word_count.value ) { __syncthreads(); } // Protect against large scan values.
@ -2291,7 +2292,7 @@ private:
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(MASK);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
if ( CudaTraits::WarpSize < word_count.value ) { __syncthreads(); } // Protect against large scan values.

View File

@ -321,7 +321,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 2) {
value_type tmp = Kokkos::shfl_down(value, 2,32);
@ -331,7 +331,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 4) {
value_type tmp = Kokkos::shfl_down(value, 4,32);
@ -341,7 +341,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 8) {
value_type tmp = Kokkos::shfl_down(value, 8,32);
@@ -351,7 +351,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 16) {
value_type tmp = Kokkos::shfl_down(value, 16,32);
@@ -361,7 +361,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
}
}
@@ -506,7 +506,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 2) {
value_type tmp = Kokkos::shfl_down(value, 2,32);
@@ -516,7 +516,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 4) {
value_type tmp = Kokkos::shfl_down(value, 4,32);
@@ -526,7 +526,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 8) {
value_type tmp = Kokkos::shfl_down(value, 8,32);
@@ -536,7 +536,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 16) {
value_type tmp = Kokkos::shfl_down(value, 16,32);
@@ -546,7 +546,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
}
}
@@ -578,7 +578,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, true> {
const int width, // How much of the warp participates
Scalar& result)
{
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)%(32/width))*width;
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)/width)*width;
for(int delta=skip_vector?blockDim.x:1; delta<width; delta*=2) {
Scalar tmp;
cuda_shfl_down(tmp,value,delta,width,mask);
@@ -683,7 +683,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
const int width) // How much of the warp participates
{
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)%(32/width))*width;
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)/width)*width;
#endif
const int lane_id = (threadIdx.y*blockDim.x+threadIdx.x)%32;
for(int delta=skip_vector?blockDim.x:1; delta<width; delta*=2) {
@@ -693,7 +693,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(mask);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
}
*value=*(value-lane_id);
@@ -779,7 +779,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
/*
* Algorithmic constraints:
* (a) blockDim.y is a power of two
* (b) blockDim.y <= 512
* (b) blockDim.y <= 1024
* (c) blockDim.x == blockDim.z == 1
*/
@@ -828,14 +828,26 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
{ // Inter-warp reduce-scan by a single warp to avoid extra synchronizations
const unsigned rtid_inter = ( threadIdx.y ^ BlockSizeMask ) << CudaTraits::WarpIndexShift ;
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
unsigned inner_mask = KOKKOS_IMPL_CUDA_BALLOT_MASK(0xffffffff,(rtid_inter<blockDim.y));
#endif
if ( rtid_inter < blockDim.y ) {
const pointer_type tdata_inter = base_data + value_count * ( rtid_inter ^ BlockSizeMask );
if ( (1<<5) < BlockSizeMask ) { BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
if ( (1<<5) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
if ( (1<<9) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,9) }
#else
if ( (1<<5) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
if ( (1<<9) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,9) }
#endif
if ( DoScan ) {
@@ -846,10 +858,17 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
if ( ! ( rtid_inter + n < blockDim.y ) ) n = 0 ;
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,8)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,7)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,6)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,5)
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,8)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,7)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,6)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,5)
#else
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,8)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,7)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,6)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,5)
#endif
}
}
}
@@ -864,19 +883,17 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
( rtid_intra & 16 ) ? 16 : 0 ))));
if ( ! ( rtid_intra + n < blockDim.y ) ) n = 0 ;
#ifdef KOKKOS_IMPL_CUDA_CLANG_WORKAROUND
BLOCK_SCAN_STEP(tdata_intra,n,4) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,3) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,2) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,1) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,0) __syncthreads();
#else
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,4) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,3) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,2) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,1) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,0) __threadfence_block();
#endif
KOKKOS_IMPL_CUDA_SYNCWARP;
}
#undef BLOCK_SCAN_STEP
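Editorial note on the mask fix in the hunks above: the sub-warp shuffle mask computation is changed from `(threadIdx.y*blockDim.x+threadIdx.x) % (32/width)` to `(threadIdx.y*blockDim.x+threadIdx.x) / width`. A minimal host-side model (function names here are illustrative, not Kokkos API) shows why the division is the correct group index:

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of the corrected mask computation: lanes are grouped
// into contiguous sub-warps of `width` lanes, so a lane's group index is
// lane_id / width, and the mask covers [group*width, group*width + width).
inline uint32_t subwarp_mask(unsigned lane_id, unsigned width) {
  return width == 32u ? 0xffffffffu
                      : ((1u << width) - 1u) << ((lane_id / width) * width);
}

// The pre-fix expression used lane_id % (32/width), which picks the wrong
// group whenever lane_id / width differs from lane_id % (32/width).
inline uint32_t subwarp_mask_buggy(unsigned lane_id, unsigned width) {
  return width == 32u
             ? 0xffffffffu
             : ((1u << width) - 1u) << ((lane_id % (32u / width)) * width);
}
```

For `width == 8` and `lane_id == 10` the lane belongs to group 1 (lanes 8-15), so the mask must be `0xFF00`; the old expression selects group `10 % 4 == 2`, i.e. `0xFF0000`, which excludes the calling lane itself.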


@@ -290,7 +290,7 @@ public:
// Intra vector lane shuffle reduction:
typename ReducerType::value_type tmp ( reducer.reference() );
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x;
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<((threadIdx.y%(32/blockDim.x))*blockDim.x);
for ( int i = blockDim.x ; ( i >>= 1 ) ; ) {
cuda_shfl_down( reducer.reference() , tmp , i , blockDim.x , mask );
@@ -742,7 +742,7 @@ void parallel_for
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -915,7 +915,7 @@ void single(const Impl::VectorSingleStruct<Impl::CudaTeamMember>& , const Functo
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -928,7 +928,7 @@ void single(const Impl::ThreadSingleStruct<Impl::CudaTeamMember>& , const Functo
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -938,7 +938,7 @@ KOKKOS_INLINE_FUNCTION
void single(const Impl::VectorSingleStruct<Impl::CudaTeamMember>& , const FunctorType& lambda, ValueType& val) {
#ifdef __CUDA_ARCH__
if(threadIdx.x == 0) lambda(val);
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x;
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<((threadIdx.y%(32/blockDim.x))*blockDim.x);
Impl::cuda_shfl(val,val,0,blockDim.x,mask);
#endif
}


@@ -4,9 +4,9 @@
#if ( CUDA_VERSION < 9000 )
#define KOKKOS_IMPL_CUDA_ACTIVEMASK 0
#define KOKKOS_IMPL_CUDA_SYNCWARP __threadfence_block()
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK __threadfence_block()
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) if(m)__threadfence_block()
#define KOKKOS_IMPL_CUDA_BALLOT(x) __ballot(x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(x) __ballot(x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) __ballot(x)
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) __shfl(x,y,z)
#define KOKKOS_IMPL_CUDA_SHFL_MASK(m,x,y,z) __shfl(x,y,z)
#define KOKKOS_IMPL_CUDA_SHFL_UP(x,y,z) __shfl_up(x,y,z)
@@ -16,7 +16,7 @@
#else
#define KOKKOS_IMPL_CUDA_ACTIVEMASK __activemask()
#define KOKKOS_IMPL_CUDA_SYNCWARP __syncwarp(0xffffffff)
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) __syncwarp(m);
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) __syncwarp(m)
#define KOKKOS_IMPL_CUDA_BALLOT(x) __ballot_sync(__activemask(),x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) __ballot_sync(m,x)
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) __shfl_sync(0xffffffff,x,y,z)
@@ -29,9 +29,9 @@
#else
#define KOKKOS_IMPL_CUDA_ACTIVEMASK 0
#define KOKKOS_IMPL_CUDA_SYNCWARP
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) (void)m
#define KOKKOS_IMPL_CUDA_BALLOT(x) 0
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(x) 0
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) 0
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) 0
#define KOKKOS_IMPL_CUDA_SHFL_MASK(m,x,y,z) 0
#define KOKKOS_IMPL_CUDA_SHFL_UP(x,y,z) 0


@@ -1401,7 +1401,33 @@ void deep_copy
typedef typename src_type::memory_space src_memory_space ;
typedef typename dst_type::value_type dst_value_type ;
typedef typename src_type::value_type src_value_type ;
if(dst.data() == NULL && src.data() == NULL) {
if(dst.data() == NULL || src.data() == NULL) {
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
// do nothing
#else
// throw if dimension mismatch
if ( (src.extent(0) != dst.extent(0)) ||
(src.extent(1) != dst.extent(1)) ||
(src.extent(2) != dst.extent(2)) ||
(src.extent(3) != dst.extent(3)) ||
(src.extent(4) != dst.extent(4)) ||
(src.extent(5) != dst.extent(5)) ||
(src.extent(6) != dst.extent(6)) ||
(src.extent(7) != dst.extent(7))
) {
std::string message("Deprecation Error: Kokkos::deep_copy extents of views don't match: ");
message += dst.label(); message += "(";
for(int r = 0; r<dst_type::Rank-1; r++)
{ message+= std::to_string(dst.extent(r)); message += ","; }
message+= std::to_string(dst.extent(dst_type::Rank-1)); message += ") ";
message += src.label(); message += "(";
for(int r = 0; r<src_type::Rank-1; r++)
{ message+= std::to_string(src.extent(r)); message += ","; }
message+= std::to_string(src.extent(src_type::Rank-1)); message += ") ";
Kokkos::Impl::throw_runtime_exception(message);
}
#endif
Kokkos::fence();
return;
}
@@ -1646,7 +1672,33 @@ void deep_copy
typedef typename dst_type::value_type dst_value_type ;
typedef typename src_type::value_type src_value_type ;
if(dst.data() == NULL && src.data() == NULL) {
if(dst.data() == NULL || src.data() == NULL) {
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
// do nothing
#else
// throw if dimension mismatch
if ( (src.extent(0) != dst.extent(0)) ||
(src.extent(1) != dst.extent(1)) ||
(src.extent(2) != dst.extent(2)) ||
(src.extent(3) != dst.extent(3)) ||
(src.extent(4) != dst.extent(4)) ||
(src.extent(5) != dst.extent(5)) ||
(src.extent(6) != dst.extent(6)) ||
(src.extent(7) != dst.extent(7))
) {
std::string message("Deprecation Error: Kokkos::deep_copy extents of views don't match: ");
message += dst.label(); message += "(";
for(int r = 0; r<dst_type::Rank-1; r++)
{ message+= std::to_string(dst.extent(r)); message += ","; }
message+= std::to_string(dst.extent(dst_type::Rank-1)); message += ") ";
message += src.label(); message += "(";
for(int r = 0; r<src_type::Rank-1; r++)
{ message+= std::to_string(src.extent(r)); message += ","; }
message+= std::to_string(src.extent(src_type::Rank-1)); message += ") ";
Kokkos::Impl::throw_runtime_exception(message);
}
#endif
exec_space.fence();
return;
}
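Editorial note on the `deep_copy` hunks above: when either view has null data and deprecated code is disabled, the new branch throws if the extents differ, building an error string of the form `label(e0,...,eN) `. The concatenation loop can be sketched on the host (the helper name is illustrative, not Kokkos API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds "label(e0,e1,...,eN) " the same way the diff's loop does: all
// extents but the last joined with commas, then the last extent and ") ".
std::string format_extents(const std::string& label,
                           const std::vector<size_t>& extents) {
  std::string out = label + "(";
  for (size_t r = 0; r + 1 < extents.size(); ++r) {
    out += std::to_string(extents[r]);
    out += ",";
  }
  out += std::to_string(extents.back());
  out += ") ";
  return out;
}
```

The full exception message concatenates one such fragment for the destination and one for the source, so a rank-2 mismatch reads like `... dst(2,3) src(2,4) `.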


@@ -100,32 +100,27 @@ public:
row_map_type row_map;
entries_type entries;
//! Construct an empty view.
Crs() : row_map(), entries() {}
//! Copy constructor (shallow copy).
Crs(const Crs& rhs) : row_map(rhs.row_map), entries(rhs.entries)
{}
/*
* Default Constructors, operators and destructor
*/
KOKKOS_FUNCTION Crs() = default;
KOKKOS_FUNCTION Crs(Crs const &) = default;
KOKKOS_FUNCTION Crs(Crs &&) = default;
KOKKOS_FUNCTION Crs& operator=(Crs const &) = default;
KOKKOS_FUNCTION Crs& operator=(Crs &&) = default;
KOKKOS_FUNCTION ~Crs() = default;
/** \brief Assign to a view of the rhs array.
* If the old view is the last view
* then allocated memory is deallocated.
*/
template<class EntriesType, class RowMapType>
Crs(const RowMapType& row_map_, const EntriesType& entries_) : row_map(row_map_), entries(entries_)
{}
/** \brief Assign to a view of the rhs array.
* If the old view is the last view
* then allocated memory is deallocated.
*/
Crs& operator= (const Crs& rhs) {
row_map = rhs.row_map;
entries = rhs.entries;
return *this;
KOKKOS_INLINE_FUNCTION
Crs(const RowMapType& row_map_, const EntriesType& entries_)
: row_map(row_map_), entries(entries_)
{
}
/** \brief Destroy this view of the array.
* If the last view then allocated memory is deallocated.
*/
~Crs() {}
/** \brief Return number of rows in the graph
*/
KOKKOS_INLINE_FUNCTION


@@ -170,6 +170,10 @@
// see https://github.com/kokkos/kokkos/issues/1470
#define KOKKOS_CUDA_9_DEFAULTED_BUG_WORKAROUND
#endif
#if ( 10000 > CUDA_VERSION )
#define KOKKOS_ENABLE_PRE_CUDA_10_DEPRECATION_API
#endif
#endif // #if defined( KOKKOS_ENABLE_CUDA ) && defined( __CUDACC__ )
//----------------------------------------------------------------------------


@@ -505,7 +505,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -559,7 +559,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -637,7 +637,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -727,7 +727,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}


@@ -198,6 +198,7 @@ struct ViewTraits< void >
typedef void HostMirrorSpace ;
typedef void array_layout ;
typedef void memory_traits ;
typedef void specialize ;
};
template< class ... Prop >
@@ -209,6 +210,7 @@ struct ViewTraits< void , void , Prop ... >
typedef typename ViewTraits<void,Prop...>::HostMirrorSpace HostMirrorSpace ;
typedef typename ViewTraits<void,Prop...>::array_layout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class ArrayLayout , class ... Prop >
@@ -221,6 +223,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_array_layout<ArrayL
typedef typename ViewTraits<void,Prop...>::HostMirrorSpace HostMirrorSpace ;
typedef ArrayLayout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class Space , class ... Prop >
@@ -239,6 +242,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_space<Space>::value
typedef typename Kokkos::Impl::HostMirror< Space >::Space HostMirrorSpace ;
typedef typename execution_space::array_layout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class MemoryTraits , class ... Prop >
@@ -257,6 +261,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_memory_traits<Memor
typedef void HostMirrorSpace ;
typedef void array_layout ;
typedef MemoryTraits memory_traits ;
typedef void specialize ;
};
@@ -335,7 +340,12 @@ public:
typedef ArrayLayout array_layout ;
typedef typename data_analysis::dimension dimension ;
typedef typename data_analysis::specialize specialize /* mapping specialization tag */ ;
typedef typename std::conditional<
std::is_same<typename data_analysis::specialize,void>::value
,typename prop::specialize
,typename data_analysis::specialize>::type
specialize ; /* mapping specialization tag */
enum { rank = dimension::rank };
enum { rank_dynamic = dimension::rank_dynamic };
@@ -542,7 +552,7 @@ public:
private:
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
typedef Kokkos::Impl::ViewMapping< traits , typename traits::specialize > map_type ;
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
track_type m_track ;
@@ -608,13 +618,18 @@ public:
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , size_t >::type
extent( const iType & r ) const
extent( const iType & r ) const noexcept
{ return m_map.extent(r); }
static KOKKOS_INLINE_FUNCTION constexpr
size_t
static_extent( const unsigned r ) noexcept
{ return map_type::static_extent(r); }
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , int >::type
extent_int( const iType & r ) const
extent_int( const iType & r ) const noexcept
{ return static_cast<int>(m_map.extent(r)); }
KOKKOS_INLINE_FUNCTION constexpr
@@ -709,11 +724,11 @@ public:
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
implementation_map() const { return m_map ; }
#endif
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
impl_map() const { return m_map ; }
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::SharedAllocationTracker &
@@ -1955,7 +1970,7 @@ public:
, m_map()
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
}
@@ -1965,7 +1980,7 @@ public:
View & operator = ( const View<RT,RP...> & rhs )
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy assignment" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
m_track.assign( rhs.m_track , traits::is_managed );
@@ -1992,7 +2007,7 @@ public:
typedef typename Mapping::type DstType ;
static_assert( Kokkos::Impl::ViewMapping< traits , typename DstType::traits , void >::is_assignable
static_assert( Kokkos::Impl::ViewMapping< traits , typename DstType::traits , typename traits::specialize >::is_assignable
, "Subview construction requires compatible view and subview arguments" );
Mapping::assign( m_map, src_view.m_map, arg0 , args... );
@@ -2266,10 +2281,10 @@ public:
}
template <class Traits>
KOKKOS_INLINE_FUNCTION
View( const track_type & track, const Kokkos::Impl::ViewMapping< Traits , void > &map ) :
View( const track_type & track, const Kokkos::Impl::ViewMapping< Traits , typename Traits::specialize > &map ) :
m_track(track), m_map()
{
typedef Kokkos::Impl::ViewMapping< traits , Traits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , Traits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy construction" );
Mapping::assign( m_map , map , track );
}


@@ -142,16 +142,15 @@ private:
WorkRange range( self.m_policy , exec.pool_rank() , exec.pool_size() );
exec.set_work_range(range.begin(),range.end(),self.m_policy.chunk_size());
exec.set_work_range(range.begin()-self.m_policy.begin(),range.end()-self.m_policy.begin(),self.m_policy.chunk_size());
exec.reset_steal_target();
exec.barrier();
long work_index = exec.get_work_index();
while(work_index != -1) {
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size();
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size()+self.m_policy.begin();
const Member end = begin + self.m_policy.chunk_size() < self.m_policy.end()?begin+self.m_policy.chunk_size():self.m_policy.end();
ParallelFor::template exec_range< WorkTag >
( self.m_functor , begin , end );
work_index = exec.get_work_index();
@@ -470,14 +469,14 @@ private:
const ParallelReduce & self = * ((const ParallelReduce *) arg );
const WorkRange range( self.m_policy, exec.pool_rank(), exec.pool_size() );
exec.set_work_range(range.begin(),range.end(),self.m_policy.chunk_size());
exec.set_work_range(range.begin()-self.m_policy.begin(),range.end()-self.m_policy.begin(),self.m_policy.chunk_size());
exec.reset_steal_target();
exec.barrier();
long work_index = exec.get_work_index();
reference_type update = ValueInit::init( ReducerConditional::select(self.m_functor , self.m_reducer) , exec.reduce_memory() );
while(work_index != -1) {
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size();
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size() + self.m_policy.begin();
const Member end = begin + self.m_policy.chunk_size() < self.m_policy.end()?begin+self.m_policy.chunk_size():self.m_policy.end();
ParallelReduce::template exec_range< WorkTag >
( self.m_functor , begin , end


@@ -111,7 +111,7 @@ T atomic_compare_exchange( volatile T * const dest , const T & compare ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -127,7 +127,7 @@ T atomic_compare_exchange( volatile T * const dest , const T & compare ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -308,6 +308,16 @@ T atomic_compare_exchange( volatile T * const dest_v, const T compare, const T v
#endif
#endif // !defined ROCM_ATOMICS
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template <typename T>
__inline__ __device__
T atomic_compare_exchange(volatile T * const, const Kokkos::Impl::identity_t<T>, const Kokkos::Impl::identity_t<T>)
{
return T();
}
#endif
template <typename T>
KOKKOS_INLINE_FUNCTION
bool atomic_compare_exchange_strong(volatile T* const dest, const T compare, const T val)


@@ -134,7 +134,7 @@ T atomic_exchange( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -149,7 +149,7 @@ T atomic_exchange( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -418,6 +418,23 @@ void atomic_assign( volatile T * const dest_v , const T val )
#endif
#endif
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template <typename T>
__inline__ __device__
T atomic_exchange(volatile T * const, const Kokkos::Impl::identity_t<T>)
{
return T();
}
template < typename T >
__inline__ __device__
void atomic_assign(volatile T * const, const Kokkos::Impl::identity_t<T>)
{
}
#endif
} // namespace Kokkos
#endif


@@ -147,7 +147,7 @@ T atomic_fetch_add( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -164,7 +164,7 @@ T atomic_fetch_add( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -384,6 +384,15 @@ T atomic_fetch_add( volatile T * const dest_v , typename std::add_const<T>::type
#endif // !defined ROCM_ATOMICS
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_add(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_add without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -149,6 +149,15 @@ T atomic_fetch_and( volatile T * const dest_v , const T val )
#endif
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_and(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_and without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -149,6 +149,15 @@ T atomic_fetch_or( volatile T * const dest_v , const T val )
#endif
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_or(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_or without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -139,7 +139,7 @@ T atomic_fetch_sub( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -154,7 +154,7 @@ T atomic_fetch_sub( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -304,6 +304,15 @@ T atomic_fetch_sub( volatile T * const dest_v , const T val )
#endif
#endif // !defined ROCM_ATOMICS
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_sub(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_sub without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -230,9 +230,6 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
typename Kokkos::Impl::enable_if<
( sizeof(T) != 4 )
&& ( sizeof(T) != 8 )
#if defined(KOKKOS_ENABLE_ASM) && defined(KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_HOST)
&& ( sizeof(T) != 16 )
#endif
, const T >::type val )
{
@@ -250,7 +247,7 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -265,7 +262,7 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -298,7 +295,7 @@ T atomic_oper_fetch( const Oper& op, volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -313,7 +310,7 @@ T atomic_oper_fetch( const Oper& op, volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;


@@ -49,6 +49,7 @@
#include <sstream>
#include <cstdlib>
#include <stack>
#include <cerrno>
//----------------------------------------------------------------------------
@@ -70,7 +71,6 @@ bool is_unsigned_int(const char* str)
}
return true;
}
void initialize_internal(const InitArguments& args)
{
// This is an experimental setting
@@ -99,6 +99,7 @@ setenv("MEMKIND_HBW_NODES", "1", 0);
if (use_gpu < 0 && ndevices >= 0) {
auto local_rank_str = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK"); //OpenMPI
if (!local_rank_str) local_rank_str = std::getenv("MV2_COMM_WORLD_LOCAL_RANK"); //MVAPICH2
if (!local_rank_str) local_rank_str = std::getenv("SLURM_LOCALID"); //SLURM
if (local_rank_str) {
auto local_rank = std::atoi(local_rank_str);
use_gpu = local_rank % ndevices;
@@ -532,6 +533,85 @@ void initialize(int& narg, char* arg[])
iarg++;
}
//Read environment variables
char * endptr;
auto env_num_threads_str = std::getenv("KOKKOS_NUM_THREADS");
if (env_num_threads_str!=nullptr) {
errno = 0;
auto env_num_threads = std::strtol(env_num_threads_str,&endptr,10);
if (endptr== env_num_threads_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUM_THREADS to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUM_THREADS out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((num_threads != -1)&&(env_num_threads!=num_threads))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-threads and KOKKOS_NUM_THREADS if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
num_threads = env_num_threads;
}
auto env_numa_str = std::getenv("KOKKOS_NUMA");
if (env_numa_str!=nullptr) {
errno = 0;
auto env_numa = std::strtol(env_numa_str,&endptr,10);
if (endptr== env_numa_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUMA to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUMA out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((numa != -1)&&(env_numa!=numa))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-numa and KOKKOS_NUMA if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
numa = env_numa;
}
auto env_device_str = std::getenv("KOKKOS_DEVICE_ID");
if (env_device_str!=nullptr) {
errno = 0;
auto env_device = std::strtol(env_device_str,&endptr,10);
if (endptr== env_device_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_DEVICE_ID to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_DEVICE_ID out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((device != -1)&&(env_device!=device))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-device and KOKKOS_DEVICE_ID if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
device = env_device;
}
auto env_ndevices_str = std::getenv("KOKKOS_NUM_DEVICES");
if (env_ndevices_str!=nullptr) {
errno = 0;
auto env_ndevices = std::strtol(env_ndevices_str,&endptr,10);
if (endptr== env_ndevices_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUM_DEVICES to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUM_DEVICES out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((ndevices != -1)&&(env_ndevices!=ndevices))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-ndevices and KOKKOS_NUM_DEVICES if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
ndevices = env_ndevices;
//Skip device
auto env_skip_device_str = std::getenv("KOKKOS_SKIP_DEVICE");
if (env_skip_device_str!=nullptr) {
errno = 0;
auto env_skip_device = std::strtol(env_skip_device_str,&endptr,10);
if (endptr== env_skip_device_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_SKIP_DEVICE to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_SKIP_DEVICE out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((skip_device != 9999)&&(env_skip_device!=skip_device))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-ndevices and KOKKOS_SKIP_DEVICE if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
skip_device = env_skip_device;
}
}
char * env_disablewarnings_str = std::getenv("KOKKOS_DISABLE_WARNINGS");
if (env_disablewarnings_str!=nullptr) {
std::string env_str (env_disablewarnings_str); // deep-copies string
for (char& c : env_str) { c = toupper (c); }
if ((env_str == "TRUE") || (env_str == "ON") || (env_str == "1"))
disable_warnings = true;
else
if (disable_warnings)
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-disable-warnings and KOKKOS_DISABLE_WARNINGS if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
}
InitArguments arguments;
arguments.num_threads = num_threads;
arguments.num_numa = numa;
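Each environment variable above is validated the same way: `strtol` with an `endptr` check to reject non-numeric input, and an `errno`/`ERANGE` check to reject out-of-range values. A self-contained sketch of that pattern (the helper name is illustrative, not a Kokkos API):

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Parse a decimal integer from an environment-variable string.
// Returns true only on a clean parse; mirrors the checks used in
// Kokkos::initialize() above.
bool parse_env_long(const char* s, long& out) {
    if (s == nullptr) return false;     // variable not set
    char* endptr = nullptr;
    errno = 0;
    long v = std::strtol(s, &endptr, 10);
    if (endptr == s) return false;      // no digits consumed
    if (errno == ERANGE) return false;  // overflow or underflow
    out = v;
    return true;
}
```

On success the parsed value is then cross-checked against the matching `--kokkos-*` command-line flag, and a runtime exception is thrown if both are set and disagree.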

View File

@ -409,6 +409,9 @@ struct inclusive_scan_integer_sequence
static constexpr value_type value = helper::value ;
};
template <typename T>
using identity_t = T;
}} // namespace Kokkos::Impl

View File

@ -103,13 +103,7 @@ namespace Impl {
/** \brief View mapping for non-specialized data type and standard layout */
template< class Traits >
class ViewMapping< Traits ,
typename std::enable_if<(
std::is_same< typename Traits::specialize , Kokkos::Array<> >::value &&
( std::is_same< typename Traits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename Traits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename Traits::array_layout , Kokkos::LayoutStride >::value )
)>::type >
class ViewMapping< Traits , Kokkos::Array<> >
{
private:
@ -345,64 +339,6 @@ public:
}
};
//----------------------------------------------------------------------------
//----------------------------------------------------------------------------
/** \brief Assign compatible default mappings */
template< class DstTraits , class SrcTraits >
class ViewMapping< DstTraits , SrcTraits ,
typename std::enable_if<(
std::is_same< typename DstTraits::memory_space , typename SrcTraits::memory_space >::value
&&
std::is_same< typename DstTraits::specialize , Kokkos::Array<> >::value
&&
(
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutStride >::value
)
&&
std::is_same< typename SrcTraits::specialize , Kokkos::Array<> >::value
&&
(
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutStride >::value
)
)>::type >
{
public:
enum { is_assignable = true };
typedef Kokkos::Impl::SharedAllocationTracker TrackType ;
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
KOKKOS_INLINE_FUNCTION
static void assign( DstType & dst , const SrcType & src , const TrackType & src_track )
{
static_assert( std::is_same< typename DstTraits::value_type , typename SrcTraits::value_type >::value ||
std::is_same< typename DstTraits::value_type , typename SrcTraits::const_value_type >::value
, "View assignment must have same value type or const = non-const" );
static_assert( ViewDimensionAssignable< typename DstTraits::dimension , typename SrcTraits::dimension >::value
, "View assignment must have compatible dimensions" );
static_assert( std::is_same< typename DstTraits::array_layout , typename SrcTraits::array_layout >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutStride >::value ||
( DstTraits::dimension::rank == 0 ) ||
( DstTraits::dimension::rank == 1 && DstTraits::dimension::rank_dynamic == 1 )
, "View assignment must have compatible layout or have rank <= 1" );
typedef typename DstType::offset_type dst_offset_type ;
dst.m_impl_offset = dst_offset_type( src.m_impl_offset );
dst.m_impl_handle = src.m_impl_handle ;
dst.m_stride = src.m_stride ;
}
};
/** \brief Assign Array to non-Array */
template< class DstTraits , class SrcTraits >
@ -436,7 +372,7 @@ public:
typedef Kokkos::Impl::SharedAllocationTracker TrackType ;
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
typedef ViewMapping< SrcTraits , Kokkos::Array<> > SrcType ;
KOKKOS_INLINE_FUNCTION
static void assign( DstType & dst , const SrcType & src , const TrackType & src_track )
@ -480,6 +416,7 @@ public:
}
};
//----------------------------------------------------------------------------
//----------------------------------------------------------------------------

View File

@ -195,7 +195,7 @@ struct ViewDimension
{}
KOKKOS_INLINE_FUNCTION
constexpr size_t extent( const unsigned r ) const
constexpr size_t extent( const unsigned r ) const noexcept
{
return r == 0 ? N0 : (
r == 1 ? N1 : (
@ -207,6 +207,19 @@ struct ViewDimension
r == 7 ? N7 : 0 )))))));
}
static KOKKOS_INLINE_FUNCTION
constexpr size_t static_extent( const unsigned r ) noexcept
{
return r == 0 ? ArgN0 : (
r == 1 ? ArgN1 : (
r == 2 ? ArgN2 : (
r == 3 ? ArgN3 : (
r == 4 ? ArgN4 : (
r == 5 ? ArgN5 : (
r == 6 ? ArgN6 : (
r == 7 ? ArgN7 : 0 )))))));
}
template< size_t N >
struct prepend { typedef ViewDimension< N , Vals... > type ; };
@ -2640,6 +2653,12 @@ public:
KOKKOS_INLINE_FUNCTION constexpr size_t extent( const iType & r ) const
{ return m_impl_offset.m_dim.extent(r); }
static KOKKOS_INLINE_FUNCTION constexpr size_t static_extent( const unsigned r ) noexcept
{
using dim_type = typename offset_type::dimension_type;
return dim_type::static_extent(r);
}
KOKKOS_INLINE_FUNCTION constexpr
typename Traits::array_layout layout() const
{ return m_impl_offset.layout(); }

View File

@ -63,6 +63,86 @@ struct CountFillFunctor {
}
};
/* RunUpdateCrsTest
* 4 test cases:
* 1. use the member object directly, constructed via the copy constructor
* 2. explicitly copy construct into a local variable
* 3. default construct, then assign from the input object
* 4. construct the object from its views
*/
template< class CrsType, class ExecSpace, class scalarType >
struct RunUpdateCrsTest {
struct TestOne {};
struct TestTwo {};
struct TestThree {};
struct TestFour {};
CrsType graph;
RunUpdateCrsTest( CrsType g_in ) : graph(g_in)
{
}
void run_test(int nTest) {
switch (nTest)
{
case 1:
parallel_for ("TestCrs1", Kokkos::RangePolicy<ExecSpace, TestOne>(0,graph.numRows()),*this);
break;
case 2:
parallel_for ("TestCrs2", Kokkos::RangePolicy<ExecSpace, TestTwo>(0,graph.numRows()),*this);
break;
case 3:
parallel_for ("TestCrs3", Kokkos::RangePolicy<ExecSpace, TestThree>(0,graph.numRows()),*this);
break;
case 4:
parallel_for ("TestCrs4", Kokkos::RangePolicy<ExecSpace, TestFour>(0,graph.numRows()),*this);
break;
default:
break;
}
}
KOKKOS_INLINE_FUNCTION
void updateGraph(const CrsType & g_in, const scalarType row) const {
auto row_map = g_in.row_map;
auto entries = g_in.entries;
auto j_start = row_map(row);
auto j_end = row_map(row+1)-j_start;
for (scalarType j = 0; j < j_end; ++j) {
entries(j_start+j) = (j+1)*(j+1);
}
}
// Test Crs class from class member
KOKKOS_INLINE_FUNCTION
void operator()(const TestOne &, const scalarType row) const {
updateGraph(graph, row);
}
// Test Crs class from copy constructor (local_graph(graph))
KOKKOS_INLINE_FUNCTION
void operator()(const TestTwo &, const scalarType row) const {
CrsType local_graph(graph);
updateGraph(local_graph, row);
}
// Test Crs class from default construction followed by assignment from the input parameter
KOKKOS_INLINE_FUNCTION
void operator()(const TestThree &, const scalarType row) const {
CrsType local_graph;
local_graph = graph;
updateGraph(local_graph, row);
}
// Test Crs class from a local graph constructed from the row_map and entries of the input parameter
KOKKOS_INLINE_FUNCTION
void operator()(const TestFour &, const scalarType row) const {
CrsType local_graph(graph.row_map, graph.entries);
updateGraph(local_graph, row);
}
};
template< class ExecSpace >
void test_count_fill(std::int32_t nrows) {
Kokkos::Crs<std::int32_t, ExecSpace, void, std::int32_t> graph;
@ -81,6 +161,38 @@ void test_count_fill(std::int32_t nrows) {
}
}
// Test Crs constructor / assignment operations:
// use count and fill to create and populate the initial graph,
// then use parallel_for with the Crs object directly to update its content,
// then verify the results
template< class ExecSpace >
void test_constructor(std::int32_t nrows) {
for (int nTest = 1; nTest < 5; nTest++)
{
typedef Kokkos::Crs<std::int32_t, ExecSpace, void, std::int32_t> crs_int32;
crs_int32 graph;
Kokkos::count_and_fill_crs(graph, nrows, CountFillFunctor<ExecSpace>());
ASSERT_EQ(graph.numRows(), nrows);
RunUpdateCrsTest<crs_int32, ExecSpace, std::int32_t> crstest(graph);
crstest.run_test(nTest);
auto row_map = Kokkos::create_mirror_view(graph.row_map);
Kokkos::deep_copy(row_map, graph.row_map);
auto entries = Kokkos::create_mirror_view(graph.entries);
Kokkos::deep_copy(entries, graph.entries);
for (std::int32_t row = 0; row < nrows; ++row) {
auto n = (row % 4) + 1;
ASSERT_EQ(row_map(row + 1) - row_map(row), n);
for (std::int32_t j = 0; j < n; ++j) {
ASSERT_EQ(entries(row_map(row) + j), (j + 1)*(j+1));
}
}
}
}
} // anonymous namespace
TEST_F( TEST_CATEGORY, crs_count_fill )
@ -95,4 +207,17 @@ TEST_F( TEST_CATEGORY, crs_count_fill )
test_count_fill<TEST_EXECSPACE>(10000);
}
TEST_F( TEST_CATEGORY, crs_copy_constructor )
{
test_constructor<TEST_EXECSPACE>(0);
test_constructor<TEST_EXECSPACE>(1);
test_constructor<TEST_EXECSPACE>(2);
test_constructor<TEST_EXECSPACE>(3);
test_constructor<TEST_EXECSPACE>(13);
test_constructor<TEST_EXECSPACE>(100);
test_constructor<TEST_EXECSPACE>(1000);
test_constructor<TEST_EXECSPACE>(10000);
}
} // namespace Test

View File

@ -956,7 +956,12 @@ struct TestMDRange_3D {
}
, Kokkos::Min<double>(min) );
ASSERT_EQ( min, 8.0 );
if((N0-1)*(N1-1)*(N2-1)>0)
ASSERT_EQ( min, 8.0 );
else {
double min_identity = Kokkos::reduction_identity<double>::min();
ASSERT_EQ( min, min_identity );
}
}
#endif
#endif

View File

@ -46,8 +46,10 @@
namespace Test {
TEST_F( TEST_CATEGORY , mdrange_3d) {
TestMDRange_3D< TEST_EXECSPACE >::test_for3( 1, 10, 100 );
TestMDRange_3D< TEST_EXECSPACE >::test_for3( 100, 10, 100 );
#if !defined( KOKKOS_ENABLE_ROCM ) // MDRange Reduced explicitly handled in its own cpp file
TestMDRange_3D< TEST_EXECSPACE >::test_reduce3( 1, 10, 100 );
TestMDRange_3D< TEST_EXECSPACE >::test_reduce3( 100, 10, 100 );
#endif
}

View File

@ -60,8 +60,11 @@ struct TestRange {
struct VerifyInitTag {};
struct ResetTag {};
struct VerifyResetTag {};
struct OffsetTag {};
struct VerifyOffsetTag {};
int N;
int N;
static const int offset = 13;
TestRange( const size_t N_ )
: m_flags( Kokkos::ViewAllocateWithoutInitializing( "flags" ), N_ ), N(N_)
{}
@ -117,6 +120,18 @@ struct TestRange {
if ( int( 2 * i ) != host_flags( i ) ) ++error_count;
}
ASSERT_EQ( error_count, int( 0 ) );
Kokkos::parallel_for( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag >( offset, N + offset ), *this );
Kokkos::parallel_for( std::string("TestKernelFor"), Kokkos::RangePolicy<ExecSpace, ScheduleType, VerifyOffsetTag>( 0, N ), *this);
Kokkos::deep_copy(host_flags, m_flags);
error_count = 0;
for (int i = 0; i < N; ++i) {
if (i + offset != host_flags(i))
++error_count;
}
ASSERT_EQ(error_count, int(0));
}
KOKKOS_INLINE_FUNCTION
@ -144,9 +159,19 @@ struct TestRange {
}
}
//----------------------------------------
KOKKOS_INLINE_FUNCTION
void operator()(const OffsetTag &, const int i) const {
m_flags(i - offset) = i;
}
struct OffsetTag {};
KOKKOS_INLINE_FUNCTION
void operator()(const VerifyOffsetTag &, const int i) const {
if (i + offset != m_flags(i)) {
printf("TestRange::test_for error at %d != %d\n", i + offset, m_flags(i));
}
}
//----------------------------------------
void test_reduce( )
{
@ -158,7 +183,7 @@ struct TestRange {
// sum( 0 .. N-1 )
ASSERT_EQ( size_t( ( N - 1 ) * ( N ) / 2 ), size_t( total ) );
Kokkos::parallel_reduce( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag>( 0, N ), *this, total );
Kokkos::parallel_reduce( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag>( offset, N+offset ), *this, total );
// sum( 1 .. N )
ASSERT_EQ( size_t( ( N ) * ( N + 1 ) / 2 ), size_t( total ) );
}
@ -169,7 +194,7 @@ struct TestRange {
KOKKOS_INLINE_FUNCTION
void operator()( const OffsetTag &, const int i, value_type & update ) const
{ update += 1 + m_flags( i ); }
{ update += 1 + m_flags( i-offset ); }
//----------------------------------------

View File

@ -532,7 +532,11 @@ struct functor_vec_single {
typedef ExecutionSpace execution_space;
Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag;
functor_vec_single( Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag_ ) : flag( flag_ ) {}
int nStart;
int nEnd;
functor_vec_single( Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag_, const int start_, const int end_ ) :
flag( flag_ ), nStart(start_), nEnd(end_) {}
KOKKOS_INLINE_FUNCTION
void operator()( typename policy_type::member_type team ) const {
@ -541,7 +545,7 @@ struct functor_vec_single {
// inside a parallel_for and write to it.
Scalar value = 0;
Kokkos::parallel_for( Kokkos::ThreadVectorRange( team, 0, 13 ), [&] ( int i )
Kokkos::parallel_for( Kokkos::ThreadVectorRange( team, nStart, nEnd ), [&] ( int i )
{
value = i; // This write is violating Kokkos semantics for nested parallelism.
});
@ -552,12 +556,12 @@ struct functor_vec_single {
}, value );
Scalar value2 = 0;
Kokkos::parallel_reduce( Kokkos::ThreadVectorRange( team, 0, 13 ), [&] ( int i, Scalar & val )
Kokkos::parallel_reduce( Kokkos::ThreadVectorRange( team, nStart, nEnd ), [&] ( int i, Scalar & val )
{
val += value;
}, value2 );
if ( value2 != ( value * 13 ) ) {
if ( value2 != ( value * (nEnd-nStart) ) ) {
printf( "FAILED vector_single broadcast %i %i %f %f\n",
team.league_rank(), team.team_rank(), (double) value2, (double) value );
@ -746,12 +750,6 @@ bool test_scalar( int nteams, int team_size, int test ) {
functor_vec_red< Scalar, ExecutionSpace >( d_flag ) );
}
else if ( test == 1 ) {
// WORKAROUND CUDA
#if defined(KOKKOS_ENABLE_CUDA)
#if defined(KOKKOS_IMPL_CUDA_CLANG_WORKAROUND) || defined(KOKKOS_ARCH_PASCAL)
if(!std::is_same<ExecutionSpace,Kokkos::Cuda>::value)
#endif
#endif
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_red_reducer< Scalar, ExecutionSpace >( d_flag ) );
}
@ -765,7 +763,7 @@ bool test_scalar( int nteams, int team_size, int test ) {
}
else if ( test == 4 ) {
Kokkos::parallel_for( "B", Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_single< Scalar, ExecutionSpace >( d_flag ) );
functor_vec_single< Scalar, ExecutionSpace >( d_flag, 0, 13 ) );
}
else if ( test == 5 ) {
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size ),
@ -791,6 +789,10 @@ bool test_scalar( int nteams, int team_size, int test ) {
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_team_vector_reduce_reducer< Scalar, ExecutionSpace >( d_flag ) );
}
else if ( test == 11 ) {
Kokkos::parallel_for( "B", Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_single< Scalar, ExecutionSpace >( d_flag, 4, 13 ) );
}
Kokkos::deep_copy( h_flag, d_flag );
@ -938,6 +940,7 @@ TEST_F( TEST_CATEGORY, team_vector )
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 8 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 9 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 10 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 11 ) ) );
}
#endif

View File

@ -56,17 +56,13 @@ struct TestViewCopy {
using InExecSpace = ExecSpace;
static void test_view_copy()
static void test_view_copy(const int dim0, const int dim1, const int dim2)
{
#if defined( KOKKOS_ENABLE_CUDA ) || defined( KOKKOS_ENABLE_ROCM )
// ExecSpace = CudaUVM, CudaHostPinned
// This test will fail at runtime with an illegal memory access if something goes wrong
// Test 1: deep_copy from host_mirror_space to ExecSpace and ExecSpace back to host_mirror_space
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
@ -88,19 +84,21 @@ struct TestViewCopy {
// Test 2: deep_copy from Cuda to ExecSpace and ExecSpace back to Cuda
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
#if defined( KOKKOS_ENABLE_CUDA )
typedef Kokkos::Cuda space_type;
typedef typename std::conditional<
Kokkos::Impl::MemorySpaceAccess<Kokkos::CudaSpace,typename InExecSpace::memory_space>::accessible,
Kokkos::CudaSpace,
InExecSpace>::type space_type;
#endif
#if defined( KOKKOS_ENABLE_ROCM )
typedef Kokkos::Experimental::ROCm space_type;
typedef typename std::conditional<
Kokkos::Impl::MemorySpaceAccess<Kokkos::ROCmSpace,typename InExecSpace::memory_space>::accessible,
Kokkos::ROCmSpace,
InExecSpace>::type space_type;
#endif
Kokkos::View<double**,Kokkos::LayoutLeft,space_type> srcView("srcView", dim2, dim2);
@ -118,10 +116,6 @@ struct TestViewCopy {
// Test 3: deep_copy from host_space to ExecSpace and ExecSpace back to host_space
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
@ -149,7 +143,41 @@ struct TestViewCopy {
TEST_F( TEST_CATEGORY , view_copy_tests ) {
//Only include this file to be compiled with CudaUVM and CudaHostPinned
TestViewCopy< TEST_EXECSPACE >::test_view_copy();
TestViewCopy< TEST_EXECSPACE >::test_view_copy(4,2,3);
TestViewCopy< TEST_EXECSPACE >::test_view_copy(4,2,0);
}
TEST_F( TEST_CATEGORY , view_copy_degenerated ) {
//Only include this file to be compiled with CudaUVM and CudaHostPinned
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_def_1;
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_1( reinterpret_cast<int*>(-1), 0 );
Kokkos::View<int*> v_m_def_1;
Kokkos::View<int*> v_m_1("v_m_1", 0);
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_def_2;
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_2( reinterpret_cast<int*>(-1), 0 );
Kokkos::View<int*> v_m_def_2;
Kokkos::View<int*> v_m_2("v_m_2", 0);
Kokkos::deep_copy(v_um_def_1, v_um_def_2);
Kokkos::deep_copy(v_um_def_1, v_um_2);
Kokkos::deep_copy(v_um_def_1, v_m_def_2);
Kokkos::deep_copy(v_um_def_1, v_m_2);
Kokkos::deep_copy(v_um_1, v_um_def_2);
Kokkos::deep_copy(v_um_1, v_um_2);
Kokkos::deep_copy(v_um_1, v_m_def_2);
Kokkos::deep_copy(v_um_1, v_m_2);
Kokkos::deep_copy(v_m_def_1, v_um_def_2);
Kokkos::deep_copy(v_m_def_1, v_um_2);
Kokkos::deep_copy(v_m_def_1, v_m_def_2);
Kokkos::deep_copy(v_m_def_1, v_m_2);
Kokkos::deep_copy(v_m_1, v_um_def_2);
Kokkos::deep_copy(v_m_1, v_um_2);
Kokkos::deep_copy(v_m_1, v_m_def_2);
Kokkos::deep_copy(v_m_1, v_m_2);
}
} // namespace Test

View File

@ -1245,5 +1245,12 @@ TEST_F( TEST_CATEGORY , view_mapping_operator )
test_view_mapping_operator< TEST_EXECSPACE >();
}
TEST_F( TEST_CATEGORY , static_extent )
{
using T = Kokkos::View<double*[2][3]>;
ASSERT_EQ( T::static_extent(1), 2 );
ASSERT_EQ( T::static_extent(2), 3 );
}
}

View File

@ -228,6 +228,10 @@ TEST_F( cuda, uvm )
}
}
/* Removing the UVM Allocs Test due to the time it adds to the overall unit test.
* The issue verified with this unit test appears to no longer be a
* problem. Refer to github issue 1880 for more details.
*
TEST_F( cuda, uvm_num_allocs )
{
// The max number of UVM allocations allowed is 65536.
@ -288,6 +292,7 @@ TEST_F( cuda, uvm_num_allocs )
#undef MAX_NUM_ALLOCS
}
*/
template< class MemSpace, class ExecSpace >
struct TestViewCudaAccessible {

View File

@ -43,3 +43,4 @@
#include <openmp/TestOpenMP_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -43,3 +43,5 @@
#include <serial/TestSerial_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -43,3 +43,4 @@
#include <threads/TestThreads_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -68,6 +68,9 @@ do
--cxxflags*)
CXXFLAGS="${key#*=}"
;;
--cxxstandard*)
KOKKOS_CXX_STANDARD="${key#*=}"
;;
--ldflags*)
LDFLAGS="${key#*=}"
;;
@ -127,6 +130,7 @@ do
echo "--arch=[OPT]: Set target architectures. Options are:"
echo " [AMD]"
echo " AMDAVX = AMD CPU"
echo " EPYC = AMD EPYC Zen-Core CPU"
echo " [ARM]"
echo " ARMv80 = ARMv8.0 Compatible CPU"
echo " ARMv81 = ARMv8.1 Compatible CPU"
@ -165,6 +169,8 @@ do
echo " build. This will still set certain required"
echo " flags via KOKKOS_CXXFLAGS (such as -fopenmp,"
echo " --std=c++11, etc.)."
echo "--cxxstandard=[FLAGS] Overwrite KOKKOS_CXX_STANDARD for library build and test"
echo " c++11 (default), c++14, c++17, c++1y, c++1z, c++2a"
echo "--ldflags=[FLAGS] Overwrite LDFLAGS for library build and test"
echo " build. This will still set certain required"
echo " flags via KOKKOS_LDFLAGS (such as -fopenmp,"
@ -243,6 +249,10 @@ if [ ${#CXXFLAGS} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} CXXFLAGS=\"${CXXFLAGS}\""
fi
if [ ${#KOKKOS_CXX_STANDARD} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} KOKKOS_CXX_STANDARD=\"${KOKKOS_CXX_STANDARD}\""
fi
if [ ${#LDFLAGS} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} LDFLAGS=\"${LDFLAGS}\""
fi
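The new `--cxxstandard` option is parsed with the same `${key#*=}` idiom as the other flags: the expansion strips the shortest prefix ending in `=`, leaving just the value. A minimal sketch (the argument value is illustrative):

```shell
# Parse a --cxxstandard=<value> argument the way generate_makefile.bash does.
key="--cxxstandard=c++14"           # example command-line argument
KOKKOS_CXX_STANDARD="${key#*=}"     # strip everything up to the first '='
echo "$KOKKOS_CXX_STANDARD"
```

The resulting value is later forwarded into `KOKKOS_SETTINGS` as `KOKKOS_CXX_STANDARD="..."`, overriding the library's default of c++11.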

View File

@ -15,3 +15,4 @@ tag: 2.5.00 date: 12:15:2017 master: dfe685f4 develop: ec7ad6d8
tag: 2.6.00 date: 03:07:2018 master: 62e760fa develop: d1ba7d71
tag: 2.7.00 date: 05:24:2018 master: e01945d0 develop: 2d13f608
tag: 2.7.24 date: 11:04:2018 master: d3a94192 develop: 7a06fc81
tag: 2.8.00 date: 02:05:2019 master: 34931a36 develop: d1659d1d

View File

@ -88,6 +88,8 @@ CXX_FLAGS_EXTRA=""
LD_FLAGS_EXTRA=""
KOKKOS_OPTIONS=""
CXX_STANDARD="c++11"
#
# Handle arguments.
#
@ -142,6 +144,9 @@ do
--cxxflags-extra*)
CXX_FLAGS_EXTRA="${key#*=}"
;;
--cxxstandard*)
CXX_STANDARD="${key#*=}"
;;
--ldflags-extra*)
LD_FLAGS_EXTRA="${key#*=}"
;;
@ -227,18 +232,30 @@ elif [ "$MACHINE" = "white" ]; then
export SLURM_TASKS_PER_NODE=32
BASE_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>"
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>"
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>,gcc/7.2.0"
CUDA_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/7.2.0,ibm/xl/16.1.0"
CUDA10_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/7.4.0,ibm/xl/16.1.0"
# Don't do pthread on white.
GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
if [ "$SPOT_CHECK" = "True" ]; then
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST "OpenMP_Serial" g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
else
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"ibm/16.1.1 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/10.0.130 $CUDA10_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
fi
if [ -z "$ARCH_FLAG" ]; then
ARCH_FLAG="--arch=Power8,Kepler37"
@ -323,6 +340,7 @@ elif [ "$MACHINE" = "apollo" ]; then
BASE_MODULE_LIST="sems-env,kokkos-env,sems-<COMPILER_NAME>/<COMPILER_VERSION>,kokkos-hwloc/1.10.1/base"
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
CUDA10_MODULE_LIST="sems-env,kokkos-env,<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.0.69"
CLANG7_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.1"
@ -344,6 +362,7 @@ elif [ "$MACHINE" = "apollo" ]; then
else
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("cuda/9.1 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/10.0 $CUDA10_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"clang/6.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
"clang/7.0 $CLANG7_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
"clang/3.9.0 $CLANG_MODULE_LIST $BUILD_LIST_CLANG clang++ $CLANG_WARNING_FLAGS"
@ -629,6 +648,8 @@ single_build_and_test() {
local cxxflags="${cxxflags} ${CXX_FLAGS_EXTRA}"
local ldflags="${ldflags} ${LD_FLAGS_EXTRA}"
local cxx_standard="${CXX_STANDARD}"
if [[ "$KOKKOS_CUDA_OPTIONS" != "" ]]; then
local extra_args="$extra_args $KOKKOS_CUDA_OPTIONS"
fi
@ -650,7 +671,7 @@ single_build_and_test() {
run_cmd ls fake_problem >& ${desc}.configure.log || { report_and_log_test_result 1 $desc configure && return 0; }
fi
else
run_cmd ${KOKKOS_PATH}/generate_makefile.bash --with-devices=$build $ARCH_FLAG --compiler=$(which $compiler_exe) --cxxflags=\"$cxxflags\" --ldflags=\"$ldflags\" $extra_args &>> ${desc}.configure.log || { report_and_log_test_result 1 ${desc} configure && return 0; }
run_cmd ${KOKKOS_PATH}/generate_makefile.bash --with-devices=$build $ARCH_FLAG --compiler=$(which $compiler_exe) --cxxflags=\"$cxxflags\" --cxxstandard=\"$cxx_standard\" --ldflags=\"$ldflags\" $extra_args &>> ${desc}.configure.log || { report_and_log_test_result 1 ${desc} configure && return 0; }
local -i build_start_time=$(date +%s)
run_cmd make -j 48 build-test >& ${desc}.build.log || { report_and_log_test_result 1 ${desc} build && return 0; }
local -i build_end_time=$(date +%s)

View File

@ -206,7 +206,7 @@ void FixWallBodyPolygon::setup(int vflag)
void FixWallBodyPolygon::post_force(int /*vflag*/)
{
double vwall[3],dx,dy,dz,del1,del2,delxy,delr,rsq,eradi,rradi,wall_pos;
double vwall[3],dx,dy,dz,del1,del2,delxy,delr,rsq,eradi,wall_pos;
int i,ni,npi,ifirst,nei,iefirst,side;
double facc[3];
@ -316,7 +316,6 @@ void FixWallBodyPolygon::post_force(int /*vflag*/)
nei = ednum[i];
iefirst = edfirst[i];
eradi = enclosing_radius[i];
rradi = rounded_radius[i];
// reset vertex and edge forces
@ -332,14 +331,14 @@ void FixWallBodyPolygon::post_force(int /*vflag*/)
edge[iefirst+ni][4] = 0;
}
int interact, num_contacts, done;
int num_contacts, done;
double delta_a, delta_ua, j_a;
Contact contact_list[MAX_CONTACTS];
num_contacts = 0;
facc[0] = facc[1] = facc[2] = 0;
interact = vertex_against_wall(i, wall_pos, x, f, torque, side,
contact_list, num_contacts, facc);
vertex_against_wall(i, wall_pos, x, f, torque, side,
contact_list, num_contacts, facc);
if (num_contacts >= 2) {
@ -475,12 +474,11 @@ int FixWallBodyPolygon::vertex_against_wall(int i, double wall_pos,
Contact* contact_list, int &num_contacts, double* /*facc*/)
{
int ni, npi, ifirst, interact;
double xpi[3], eradi, rradi;
double xpi[3], rradi;
double fx, fy, fz;
npi = dnum[i];
ifirst = dfirst[i];
eradi = enclosing_radius[i];
rradi = rounded_radius[i];
interact = 0;

View File

@@ -213,8 +213,8 @@ void FixWallBodyPolyhedron::setup(int vflag)
 void FixWallBodyPolyhedron::post_force(int /*vflag*/)
 {
-  double vwall[3],dx,dy,dz,del1,del2,rsq,eradi,rradi,wall_pos;
-  int i,ni,npi,ifirst,nei,iefirst,nfi,iffirst,side;
+  double vwall[3],dx,dy,dz,del1,del2,rsq,wall_pos;
+  int i,ni,npi,ifirst,nei,iefirst,side;
   double facc[3];
   // set position of wall to initial settings and velocity to 0.0
@@ -330,10 +330,6 @@ void FixWallBodyPolyhedron::post_force(int /*vflag*/)
   ifirst = dfirst[i];
   nei = ednum[i];
   iefirst = edfirst[i];
-  nfi = facnum[i];
-  iffirst = facfirst[i];
-  eradi = enclosing_radius[i];
-  rradi = rounded_radius[i];
   if (npi == 1) {
   sphere_against_wall(i, wall_pos, side, vwall, x, v, f, angmom, torque);
@@ -356,13 +352,13 @@ void FixWallBodyPolyhedron::post_force(int /*vflag*/)
   edge[iefirst+ni][5] = 0;
   }
-  int interact, num_contacts;
+  int num_contacts;
   Contact contact_list[MAX_CONTACTS];
   num_contacts = 0;
   facc[0] = facc[1] = facc[2] = 0;
-  interact = edge_against_wall(i, wall_pos, side, vwall, x, f, torque,
-                               contact_list, num_contacts, facc);
+  edge_against_wall(i, wall_pos, side, vwall, x, f, torque,
+                    contact_list, num_contacts, facc);
   } // group bit
 }
@@ -544,7 +540,7 @@ int FixWallBodyPolyhedron::edge_against_wall(int i, double wall_pos,
   int side, double* vwall, double** x, double** /*f*/, double** /*torque*/,
   Contact* /*contact_list*/, int &/*num_contacts*/, double* /*facc*/)
 {
-  int ni, nei, mode, contact;
+  int ni, nei, contact;
   double rradi;
   nei = ednum[i];
@@ -555,8 +551,7 @@ int FixWallBodyPolyhedron::edge_against_wall(int i, double wall_pos,
   // loop through body i's edges
   for (ni = 0; ni < nei; ni++)
-    mode = compute_distance_to_wall(i, ni, x[i], rradi, wall_pos, side, vwall,
-                                    contact);
+    compute_distance_to_wall(i, ni, x[i], rradi, wall_pos, side, vwall, contact);
   return contact;
 }

View File

@@ -212,8 +212,8 @@ double PairBornCoulLongCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBornCoulLongCSGPU::cpu_compute(int start, int inum, int eflag,
-                                        int vflag, int *ilist, int *numneigh,
-                                        int **firstneigh)
+                                        int /* vflag */, int *ilist,
+                                        int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -207,8 +207,8 @@ double PairBornCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBornCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                      int vflag, int *ilist, int *numneigh,
-                                      int **firstneigh)
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -193,9 +193,9 @@ double PairBornCoulWolfCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairBornCoulWolfCSGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                        int *ilist, int *numneigh,
-                                        int **firstneigh) {
+void PairBornCoulWolfCSGPU::cpu_compute(int start, int inum, int eflag,
+                                        int /* vflag */, int *ilist,
+                                        int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,qtmp,delx,dely,delz,evdwl,ecoul,fpair;
   double rsq,r2inv,r6inv,forcecoul,forceborn,factor_coul,factor_lj;

View File

@@ -201,8 +201,8 @@ double PairBuckCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBuckCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                      int vflag, int *ilist, int *numneigh,
-                                      int **firstneigh)
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -184,8 +184,9 @@ double PairColloidGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairColloidGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                 int *ilist, int *numneigh, int **firstneigh)
+void PairColloidGPU::cpu_compute(int start, int inum, int eflag,
+                                 int /* vflag */, int *ilist,
+                                 int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -153,7 +153,7 @@ void PairCoulLongCSGPU::init_style()
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
       if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0)) {
-        double cut = init_one(i,j);
+        init_one(i,j);
       }
     }
   }
@@ -208,8 +208,8 @@ double PairCoulLongCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairCoulLongCSGPU::cpu_compute(int start, int inum, int eflag,
-                                    int vflag, int *ilist, int *numneigh,
-                                    int **firstneigh)
+                                    int /* vflag */, int *ilist,
+                                    int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,ecoul,fpair;

View File

@@ -148,7 +148,7 @@ void PairCoulLongGPU::init_style()
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
      if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0)) {
-        double cut = init_one(i,j);
+        init_one(i,j);
      }
    }
  }
@@ -203,8 +203,8 @@ double PairCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                  int vflag, int *ilist, int *numneigh,
-                                  int **firstneigh)
+                                  int /* vflag */, int *ilist,
+                                  int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,ecoul,fpair;

View File

@@ -192,8 +192,8 @@ void PairEAMAlloyGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMAlloyGPU::single(int i, int j, int itype, int jtype,
-                               double rsq, double factor_coul, double factor_lj,
-                               double &fforce)
+                               double rsq, double /* factor_coul */,
+                               double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -235,7 +235,7 @@ double PairEAMAlloyGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMAlloyGPU::pack_forward_comm(int n, int *list, double *buf,
-                                       int pbc_flag,int *pbc)
+                                       int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -192,8 +192,8 @@ void PairEAMFSGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMFSGPU::single(int i, int j, int itype, int jtype,
-                            double rsq, double factor_coul, double factor_lj,
-                            double &fforce)
+                            double rsq, double /* factor_coul */,
+                            double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -235,7 +235,7 @@ double PairEAMFSGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMFSGPU::pack_forward_comm(int n, int *list, double *buf,
-                                    int pbc_flag,int *pbc)
+                                    int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -195,8 +195,8 @@ void PairEAMGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMGPU::single(int i, int j, int itype, int jtype,
-                          double rsq, double factor_coul, double factor_lj,
-                          double &fforce)
+                          double rsq, double /* factor_coul */,
+                          double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -238,7 +238,7 @@ double PairEAMGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMGPU::pack_forward_comm(int n, int *list, double *buf,
-                                  int pbc_flag,int *pbc)
+                                  int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -221,8 +221,9 @@ double PairGayBerneGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairGayBerneGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                  int *ilist, int *numneigh, int **firstneigh)
+void PairGayBerneGPU::cpu_compute(int start, int inum, int eflag,
+                                  int /* vflag */, int *ilist,
+                                  int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double evdwl,one_eng,rsq,r2inv,r6inv,forcelj,factor_lj;

View File

@@ -151,11 +151,10 @@ void PairLJCharmmCoulLongGPU::init_style()
   // Repeat cutsq calculation because done after call to init_style
-  double cut;
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
       if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0))
-        cut = init_one(i,j);
+        init_one(i,j);
     }
   }
@@ -219,7 +218,7 @@ double PairLJCharmmCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJCharmmCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist,
+                                          int /* vflag */, int *ilist,
                                           int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;

View File

@@ -197,8 +197,8 @@ double PairLJClass2CoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJClass2CoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist, int *numneigh,
-                                          int **firstneigh)
+                                          int /* vflag */, int *ilist,
+                                          int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -215,8 +215,8 @@ double PairLJCutCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJCutCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                       int vflag, int *ilist, int *numneigh,
-                                       int **firstneigh)
+                                       int /* vflag */, int *ilist,
+                                       int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -192,8 +192,9 @@ double PairLJCutCoulMSMGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairLJCutCoulMSMGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                      int *ilist, int *numneigh, int **firstneigh) {
+void PairLJCutCoulMSMGPU::cpu_compute(int start, int inum, int eflag,
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;
   double fraction,table;

View File

@@ -216,7 +216,7 @@ void PairLJCutDipoleLongGPU::cpu_compute(int start, int inum, int eflag, int vfl
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz;
   double rsq,r,rinv,r2inv,r6inv;
   double forcecoulx,forcecouly,forcecoulz,fforce;
-  double tixcoul,tiycoul,tizcoul,tjxcoul,tjycoul,tjzcoul;
+  double tixcoul,tiycoul,tizcoul;
   double fx,fy,fz,fdx,fdy,fdz,fax,fay,faz;
   double pdotp,pidotr,pjdotr,pre1,pre2,pre3;
   double grij,expm2,t,erfc;
@@ -378,14 +378,9 @@ void PairLJCutDipoleLongGPU::cpu_compute(int start, int inum, int eflag, int vfl
   tixcoul = mu[i][1]*(zdiz + zaiz) - mu[i][2]*(zdiy + zaiy);
   tiycoul = mu[i][2]*(zdix + zaix) - mu[i][0]*(zdiz + zaiz);
   tizcoul = mu[i][0]*(zdiy + zaiy) - mu[i][1]*(zdix + zaix);
-  tjxcoul = mu[j][1]*(zdjz + zajz) - mu[j][2]*(zdjy + zajy);
-  tjycoul = mu[j][2]*(zdjx + zajx) - mu[j][0]*(zdjz + zajz);
-  tjzcoul = mu[j][0]*(zdjy + zajy) - mu[j][1]*(zdjx + zajx);
   } else {
   forcecoulx = forcecouly = forcecoulz = 0.0;
   tixcoul = tiycoul = tizcoul = 0.0;
-  tjxcoul = tjycoul = tjzcoul = 0.0;
   }
   // LJ interaction

View File

@@ -215,8 +215,8 @@ double PairLJExpandCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJExpandCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist, int *numneigh,
-                                          int **firstneigh)
+                                          int /* vflag */, int *ilist,
+                                          int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -172,8 +172,8 @@ double PairMIECutGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairMIECutGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                int *ilist, int *numneigh, int **firstneigh) {
+void PairMIECutGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,r2inv,rgamR,rgamA,forcemie,factor_mie;

View File

@@ -168,7 +168,7 @@ double PairMorseGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairMorseGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairMorseGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                                int *ilist, int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;

View File

@@ -219,8 +219,9 @@ double PairRESquaredGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairRESquaredGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                   int *ilist, int *numneigh, int **firstneigh)
+void PairRESquaredGPU::cpu_compute(int start, int inum, int eflag,
+                                   int /* vflag */, int *ilist,
+                                   int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double evdwl,one_eng,rsq,r2inv,r6inv,forcelj,factor_lj;

View File

@@ -182,8 +182,8 @@ double PairSoftGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairSoftGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                              int *ilist, int *numneigh, int **firstneigh) {
+void PairSoftGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                              int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double r,rsq,arg,factor_lj;

View File

@@ -246,7 +246,7 @@ double PairTableGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairTableGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairTableGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -186,8 +186,8 @@ double PairUFMGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairUFMGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                             int *ilist, int *numneigh, int **firstneigh) {
+void PairUFMGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                             int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,expuf,factor_lj;

View File

@@ -151,7 +151,7 @@ void PairVashishtaGPU::init_style()
   if (force->newton_pair != 0)
     error->all(FLERR,"Pair style vashishta/gpu requires newton pair off");
-  double *cutsq, *r0, *r0eps, *gamma, *eta;
+  double *cutsq, *r0, *gamma, *eta;
   double *lam1inv, *lam4inv, *zizj, *mbigd;
   double *dvrc, *big6w, *heta, *bigh;
   double *bigw, *c0, *costheta, *bigb;

View File

@@ -179,8 +179,8 @@ double PairYukawaColloidGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairYukawaColloidGPU::cpu_compute(int start, int inum, int eflag,
-                                       int vflag, int *ilist, int *numneigh,
-                                       int **firstneigh) {
+                                       int /* vflag */, int *ilist,
+                                       int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair,radi,radj;
   double r,rsq,rinv,screening,forceyukawa,factor;

View File

@@ -170,8 +170,8 @@ double PairYukawaGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairYukawaGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                int *ilist, int *numneigh, int **firstneigh) {
+void PairYukawaGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,r2inv,r,rinv,screening,forceyukawa,factor;

View File

@@ -177,7 +177,7 @@ double PairZBLGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairZBLGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairZBLGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                              int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -49,6 +49,8 @@ action angle_charmm_kokkos.cpp angle_charmm.cpp
 action angle_charmm_kokkos.h angle_charmm.h
 action angle_class2_kokkos.cpp angle_class2.cpp
 action angle_class2_kokkos.h angle_class2.h
+action angle_cosine_kokkos.cpp angle_cosine.cpp
+action angle_cosine_kokkos.h angle_cosine.h
 action angle_harmonic_kokkos.cpp angle_harmonic.cpp
 action angle_harmonic_kokkos.h angle_harmonic.h
 action atom_kokkos.cpp

View File

@@ -0,0 +1,394 @@
/* ----------------------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
http://lammps.sandia.gov, Sandia National Laboratories
Steve Plimpton, sjplimp@sandia.gov
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
/* ----------------------------------------------------------------------
Contributing author: Stan Moore (SNL)
------------------------------------------------------------------------- */
#include <cmath>
#include <cstdlib>
#include "angle_cosine_kokkos.h"
#include "atom_kokkos.h"
#include "neighbor_kokkos.h"
#include "domain.h"
#include "comm.h"
#include "force.h"
#include "math_const.h"
#include "memory_kokkos.h"
#include "error.h"
#include "atom_masks.h"
using namespace LAMMPS_NS;
using namespace MathConst;
#define SMALL 0.001
/* ---------------------------------------------------------------------- */
template<class DeviceType>
AngleCosineKokkos<DeviceType>::AngleCosineKokkos(LAMMPS *lmp) : AngleCosine(lmp)
{
atomKK = (AtomKokkos *) atom;
neighborKK = (NeighborKokkos *) neighbor;
execution_space = ExecutionSpaceFromDevice<DeviceType>::space;
datamask_read = X_MASK | F_MASK | ENERGY_MASK | VIRIAL_MASK;
datamask_modify = F_MASK | ENERGY_MASK | VIRIAL_MASK;
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
AngleCosineKokkos<DeviceType>::~AngleCosineKokkos()
{
if (!copymode) {
memoryKK->destroy_kokkos(k_eatom,eatom);
memoryKK->destroy_kokkos(k_vatom,vatom);
}
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::compute(int eflag_in, int vflag_in)
{
eflag = eflag_in;
vflag = vflag_in;
if (eflag || vflag) ev_setup(eflag,vflag,0);
else evflag = 0;
// reallocate per-atom arrays if necessary
if (eflag_atom) {
memoryKK->destroy_kokkos(k_eatom,eatom);
memoryKK->create_kokkos(k_eatom,eatom,maxeatom,"angle:eatom");
d_eatom = k_eatom.template view<DeviceType>();
}
if (vflag_atom) {
memoryKK->destroy_kokkos(k_vatom,vatom);
memoryKK->create_kokkos(k_vatom,vatom,maxvatom,6,"angle:vatom");
d_vatom = k_vatom.template view<DeviceType>();
}
//atomKK->sync(execution_space,datamask_read);
k_k.template sync<DeviceType>();
// if (eflag || vflag) atomKK->modified(execution_space,datamask_modify);
// else atomKK->modified(execution_space,F_MASK);
x = atomKK->k_x.template view<DeviceType>();
f = atomKK->k_f.template view<DeviceType>();
neighborKK->k_anglelist.template sync<DeviceType>();
anglelist = neighborKK->k_anglelist.template view<DeviceType>();
int nanglelist = neighborKK->nanglelist;
nlocal = atom->nlocal;
newton_bond = force->newton_bond;
copymode = 1;
// loop over neighbors of my atoms
EV_FLOAT ev;
if (evflag) {
if (newton_bond) {
Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<1,1> >(0,nanglelist),*this,ev);
} else {
Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<0,1> >(0,nanglelist),*this,ev);
}
} else {
if (newton_bond) {
Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<1,0> >(0,nanglelist),*this);
} else {
Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<0,0> >(0,nanglelist),*this);
}
}
if (eflag_global) energy += ev.evdwl;
if (vflag_global) {
virial[0] += ev.v[0];
virial[1] += ev.v[1];
virial[2] += ev.v[2];
virial[3] += ev.v[3];
virial[4] += ev.v[4];
virial[5] += ev.v[5];
}
if (eflag_atom) {
k_eatom.template modify<DeviceType>();
k_eatom.template sync<LMPHostType>();
}
if (vflag_atom) {
k_vatom.template modify<DeviceType>();
k_vatom.template sync<LMPHostType>();
}
copymode = 0;
}
template<class DeviceType>
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int &n, EV_FLOAT& ev) const {
// The f array is atomic
Kokkos::View<F_FLOAT*[3], typename DAT::t_f_array::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > a_f = f;
const int i1 = anglelist(n,0);
const int i2 = anglelist(n,1);
const int i3 = anglelist(n,2);
const int type = anglelist(n,3);
// 1st bond
const F_FLOAT delx1 = x(i1,0) - x(i2,0);
const F_FLOAT dely1 = x(i1,1) - x(i2,1);
const F_FLOAT delz1 = x(i1,2) - x(i2,2);
const F_FLOAT rsq1 = delx1*delx1 + dely1*dely1 + delz1*delz1;
const F_FLOAT r1 = sqrt(rsq1);
// 2nd bond
const F_FLOAT delx2 = x(i3,0) - x(i2,0);
const F_FLOAT dely2 = x(i3,1) - x(i2,1);
const F_FLOAT delz2 = x(i3,2) - x(i2,2);
const F_FLOAT rsq2 = delx2*delx2 + dely2*dely2 + delz2*delz2;
const F_FLOAT r2 = sqrt(rsq2);
// c = cosine of angle
F_FLOAT c = delx1*delx2 + dely1*dely2 + delz1*delz2;
c /= r1*r2;
if (c > 1.0) c = 1.0;
if (c < -1.0) c = -1.0;
// force & energy
F_FLOAT eangle = 0.0;
if (eflag) eangle = d_k[type]*(1.0+c);
const F_FLOAT a = d_k[type];
const F_FLOAT a11 = a*c / rsq1;
const F_FLOAT a12 = -a / (r1*r2);
const F_FLOAT a22 = a*c / rsq2;
F_FLOAT f1[3],f3[3];
f1[0] = a11*delx1 + a12*delx2;
f1[1] = a11*dely1 + a12*dely2;
f1[2] = a11*delz1 + a12*delz2;
f3[0] = a22*delx2 + a12*delx1;
f3[1] = a22*dely2 + a12*dely1;
f3[2] = a22*delz2 + a12*delz1;
// apply force to each of 3 atoms
if (NEWTON_BOND || i1 < nlocal) {
a_f(i1,0) += f1[0];
a_f(i1,1) += f1[1];
a_f(i1,2) += f1[2];
}
if (NEWTON_BOND || i2 < nlocal) {
a_f(i2,0) -= f1[0] + f3[0];
a_f(i2,1) -= f1[1] + f3[1];
a_f(i2,2) -= f1[2] + f3[2];
}
if (NEWTON_BOND || i3 < nlocal) {
a_f(i3,0) += f3[0];
a_f(i3,1) += f3[1];
a_f(i3,2) += f3[2];
}
if (EVFLAG) ev_tally(ev,i1,i2,i3,eangle,f1,f3,
delx1,dely1,delz1,delx2,dely2,delz2);
}
template<class DeviceType>
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int &n) const {
EV_FLOAT ev;
this->template operator()<NEWTON_BOND,EVFLAG>(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>(), n, ev);
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::allocate()
{
AngleCosine::allocate();
int n = atom->nangletypes;
k_k = typename ArrayTypes<DeviceType>::tdual_ffloat_1d("AngleCosine::k",n+1);
d_k = k_k.template view<DeviceType>();
}
/* ----------------------------------------------------------------------
set coeffs for one or more types
------------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::coeff(int narg, char **arg)
{
AngleCosine::coeff(narg, arg);
int n = atom->nangletypes;
for (int i = 1; i <= n; i++)
k_k.h_view[i] = k[i];
k_k.template modify<LMPHostType>();
}
/* ----------------------------------------------------------------------
proc 0 reads coeffs from restart file, bcasts them
------------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::read_restart(FILE *fp)
{
AngleCosine::read_restart(fp);
int n = atom->nangletypes;
for (int i = 1; i <= n; i++)
k_k.h_view[i] = k[i];
k_k.template modify<LMPHostType>();
}
/* ----------------------------------------------------------------------
tally energy and virial into global and per-atom accumulators
virial = r1F1 + r2F2 + r3F3 = (r1-r2) F1 + (r3-r2) F3 = del1*f1 + del2*f3
------------------------------------------------------------------------- */
template<class DeviceType>
//template<int NEWTON_BOND>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::ev_tally(EV_FLOAT &ev, const int i, const int j, const int k,
F_FLOAT &eangle, F_FLOAT *f1, F_FLOAT *f3,
const F_FLOAT &delx1, const F_FLOAT &dely1, const F_FLOAT &delz1,
const F_FLOAT &delx2, const F_FLOAT &dely2, const F_FLOAT &delz2) const
{
E_FLOAT eanglethird;
F_FLOAT v[6];
// The eatom and vatom arrays are atomic
Kokkos::View<E_FLOAT*, typename DAT::t_efloat_1d::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > v_eatom = k_eatom.template view<DeviceType>();
Kokkos::View<F_FLOAT*[6], typename DAT::t_virial_array::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > v_vatom = k_vatom.template view<DeviceType>();
if (eflag_either) {
if (eflag_global) {
if (newton_bond) ev.evdwl += eangle;
else {
eanglethird = THIRD*eangle;
if (i < nlocal) ev.evdwl += eanglethird;
if (j < nlocal) ev.evdwl += eanglethird;
if (k < nlocal) ev.evdwl += eanglethird;
}
}
if (eflag_atom) {
eanglethird = THIRD*eangle;
if (newton_bond || i < nlocal) v_eatom[i] += eanglethird;
if (newton_bond || j < nlocal) v_eatom[j] += eanglethird;
if (newton_bond || k < nlocal) v_eatom[k] += eanglethird;
}
}
if (vflag_either) {
v[0] = delx1*f1[0] + delx2*f3[0];
v[1] = dely1*f1[1] + dely2*f3[1];
v[2] = delz1*f1[2] + delz2*f3[2];
v[3] = delx1*f1[1] + delx2*f3[1];
v[4] = delx1*f1[2] + delx2*f3[2];
v[5] = dely1*f1[2] + dely2*f3[2];
if (vflag_global) {
if (newton_bond) {
ev.v[0] += v[0];
ev.v[1] += v[1];
ev.v[2] += v[2];
ev.v[3] += v[3];
ev.v[4] += v[4];
ev.v[5] += v[5];
} else {
if (i < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
if (j < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
if (k < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
}
}
if (vflag_atom) {
if (newton_bond || i < nlocal) {
v_vatom(i,0) += THIRD*v[0];
v_vatom(i,1) += THIRD*v[1];
v_vatom(i,2) += THIRD*v[2];
v_vatom(i,3) += THIRD*v[3];
v_vatom(i,4) += THIRD*v[4];
v_vatom(i,5) += THIRD*v[5];
}
if (newton_bond || j < nlocal) {
v_vatom(j,0) += THIRD*v[0];
v_vatom(j,1) += THIRD*v[1];
v_vatom(j,2) += THIRD*v[2];
v_vatom(j,3) += THIRD*v[3];
v_vatom(j,4) += THIRD*v[4];
v_vatom(j,5) += THIRD*v[5];
}
if (newton_bond || k < nlocal) {
v_vatom(k,0) += THIRD*v[0];
v_vatom(k,1) += THIRD*v[1];
v_vatom(k,2) += THIRD*v[2];
v_vatom(k,3) += THIRD*v[3];
v_vatom(k,4) += THIRD*v[4];
v_vatom(k,5) += THIRD*v[5];
}
}
}
}
/* ---------------------------------------------------------------------- */
namespace LAMMPS_NS {
template class AngleCosineKokkos<LMPDeviceType>;
#ifdef KOKKOS_HAVE_CUDA
template class AngleCosineKokkos<LMPHostType>;
#endif
}

View File

@@ -0,0 +1,90 @@
/* -*- c++ -*- ----------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
http://lammps.sandia.gov, Sandia National Laboratories
Steve Plimpton, sjplimp@sandia.gov
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
#ifdef ANGLE_CLASS
AngleStyle(cosine/kk,AngleCosineKokkos<LMPDeviceType>)
AngleStyle(cosine/kk/device,AngleCosineKokkos<LMPDeviceType>)
AngleStyle(cosine/kk/host,AngleCosineKokkos<LMPHostType>)
#else
#ifndef LMP_ANGLE_COSINE_KOKKOS_H
#define LMP_ANGLE_COSINE_KOKKOS_H
#include "angle_cosine.h"
#include "kokkos_type.h"
namespace LAMMPS_NS {
template<int NEWTON_BOND, int EVFLAG>
struct TagAngleCosineCompute{};
template<class DeviceType>
class AngleCosineKokkos : public AngleCosine {
public:
typedef DeviceType device_type;
typedef EV_FLOAT value_type;
AngleCosineKokkos(class LAMMPS *);
virtual ~AngleCosineKokkos();
void compute(int, int);
void coeff(int, char **);
void read_restart(FILE *);
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int&, EV_FLOAT&) const;
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int&) const;
//template<int NEWTON_BOND>
KOKKOS_INLINE_FUNCTION
void ev_tally(EV_FLOAT &ev, const int i, const int j, const int k,
F_FLOAT &eangle, F_FLOAT *f1, F_FLOAT *f3,
const F_FLOAT &delx1, const F_FLOAT &dely1, const F_FLOAT &delz1,
const F_FLOAT &delx2, const F_FLOAT &dely2, const F_FLOAT &delz2) const;
protected:
class NeighborKokkos *neighborKK;
typename ArrayTypes<DeviceType>::t_x_array_randomread x;
typename ArrayTypes<DeviceType>::t_f_array f;
typename ArrayTypes<DeviceType>::t_int_2d anglelist;
typename ArrayTypes<DeviceType>::tdual_efloat_1d k_eatom;
typename ArrayTypes<DeviceType>::tdual_virial_array k_vatom;
typename ArrayTypes<DeviceType>::t_efloat_1d d_eatom;
typename ArrayTypes<DeviceType>::t_virial_array d_vatom;
int nlocal,newton_bond;
int eflag,vflag;
typename ArrayTypes<DeviceType>::tdual_ffloat_1d k_k;
typename ArrayTypes<DeviceType>::t_ffloat_1d d_k;
void allocate();
};
}
#endif
#endif
/* ERROR/WARNING messages:
*/

View File

@@ -4988,7 +4988,7 @@ void PairBOP::read_table(char *filename)
   fgets(s,MAXLINE,fp);
   nws=0;
   ws=1;
-  for(j=0;j<strlen(s);j++) {
+  for(j=0;j<(int)strlen(s);j++) {
     if(ws==1) {
       if(isspace(s[j])) {
         ws=1;
@@ -5015,7 +5015,7 @@ void PairBOP::read_table(char *filename)
   nws=0;
   ws=1;
   fgets(s,MAXLINE,fp);
-  for(j=0;j<strlen(s);j++) {
+  for(j=0;j<(int)strlen(s);j++) {
     if(ws==1) {
       if(isspace(s[j])) {
         ws=1;

View File

@@ -36,7 +36,7 @@ AngleCosine::AngleCosine(LAMMPS *lmp) : Angle(lmp) {}
 AngleCosine::~AngleCosine()
 {
-  if (allocated) {
+  if (allocated && !copymode) {
     memory->destroy(setflag);
     memory->destroy(k);
   }

View File

@@ -30,17 +30,17 @@ class AngleCosine : public Angle {
   AngleCosine(class LAMMPS *);
   virtual ~AngleCosine();
   virtual void compute(int, int);
-  void coeff(int, char **);
+  virtual void coeff(int, char **);
   double equilibrium_angle(int);
   void write_restart(FILE *);
-  void read_restart(FILE *);
+  virtual void read_restart(FILE *);
   void write_data(FILE *);
   double single(int, int, int, int);
 protected:
   double *k;
-  void allocate();
+  virtual void allocate();
 };
 }

View File

@@ -24,7 +24,7 @@
 #include "update.h"
 #include "respa.h"
 #include "error.h"
-#include "python.h"
+#include "lmppython.h"
 #include "python_compat.h"
 using namespace LAMMPS_NS;

View File

@@ -25,7 +25,7 @@
 #include "force.h"
 #include "memory.h"
 #include "neigh_list.h"
-#include "python.h"
+#include "lmppython.h"
 #include "error.h"
 #include "python_compat.h"

View File

@@ -26,7 +26,7 @@
 #include "memory.h"
 #include "update.h"
 #include "neigh_list.h"
-#include "python.h"
+#include "lmppython.h"
 #include "error.h"
 #include "python_compat.h"
@@ -401,9 +401,9 @@ double PairPython::init_one(int, int)
 /* ---------------------------------------------------------------------- */
-double PairPython::single(int i, int j, int itype, int jtype, double rsq,
-                          double factor_coul, double factor_lj,
-                          double &fforce)
+double PairPython::single(int /* i */, int /* j */, int itype, int jtype,
+                          double rsq, double /* factor_coul */,
+                          double factor_lj, double &fforce)
 {
   // with hybrid/overlay we might get called for skipped types
   if (skip_types[itype] || skip_types[jtype]) {

Some files were not shown because too many files have changed in this diff.