Compare commits


29 Commits

Author SHA1 Message Date
b29d526632 Merge pull request #1323 from akohlmey/next-patch-release
Patch release 8 February 2019
2019-02-08 21:43:39 +01:00
77c24f1b79 Merge branch 'master' into next-patch-release
# Conflicts:
#	.github/PULL_REQUEST_TEMPLATE.md
2019-02-08 15:34:40 -05:00
b4765683da restore generic pull request template (for now) 2019-02-08 15:27:16 -05:00
f286155dd3 Merge pull request #1306 from stanmoore1/kk_angle_cosine
Add Kokkos version of angle/cosine
2019-02-08 21:07:36 +01:00
f04d97e66e Merge pull request #1312 from stanmoore1/kk_update
Update Kokkos library to v2.8.00
2019-02-08 21:07:23 +01:00
b871b4e13d Merge pull request #1321 from akohlmey/collected-small-fixes
Collected small fixes for the next patch release
2019-02-08 17:19:01 +01:00
8b449e569b Merge pull request #1318 from akohlmey/github-template-update
New GitHub PR templates
2019-02-08 17:18:30 +01:00
be6c2d781b Merge pull request #1322 from akohlmey/add-code-of-conduct
Add a code of conduct to lammps github project
2019-02-08 17:18:05 +01:00
a35f407dfc Patch release 8 February 2019 2019-02-08 17:15:22 +01:00
82355342bf improve wording about avoiding namespace imports in headers 2019-02-08 17:08:15 +01:00
773476634b Update Kokkos library in LAMMPS to v2.8.00 2019-02-08 08:50:26 -07:00
1b4ed9cb8d revert region optimization to create_atoms 2019-02-08 16:24:48 +01:00
997727d26c add a code of conduct to github project
complete the "github open source best practices" setup by adapting a
code of conduct for any public communications about LAMMPS on github.
2019-02-08 14:21:37 +01:00
a71159578e fix typo 2019-02-08 02:29:03 -05:00
ae85ca18fc provide multiple and more specific pull request templates 2019-02-07 15:53:46 +01:00
4d192e541e update github contribution guidelines text 2019-02-07 14:58:52 +01:00
8da5289638 add readme file adapted from https://github.com/pmla/polyhedral-template-matching/blob/master/README.md to USER-PTM package 2019-02-07 14:40:52 +01:00
ad8aeb8b00 using POSIX compatible version of strdup() (i.e. _strdup()) on windows 2019-02-07 14:11:02 +01:00
f80c577b3c Merge pull request #1316 from akohlmey/python-portable-header
Rename python.h to lmppython.h
2019-02-07 14:05:49 +01:00
510641c065 Merge pull request #1319 from lammps/akohlmey-new-issue-templates
Update issue templates to have a different template for bug reports and feature requests.
2019-02-07 14:02:10 +01:00
019e534f80 some small tweaks, fixes for typos, and url corrections for the manual 2019-02-07 01:16:48 -05:00
ebcbc5bdbd fix some formatting and spelling/grammar errors in bug report template 2019-02-06 17:32:44 -05:00
c9a7513dc6 remove outdated generic issue template 2019-02-06 17:26:59 -05:00
17cd92a3f2 Update issue templates
Take advantage of the fact, that GitHub now allows to have separate issue templates for bug reports and feature requests
2019-02-06 17:15:52 -05:00
0c4e76ce84 bugfix for fix qeq/reax to make it usable without pair reax/c 2019-02-05 11:55:02 +01:00
7e1f760b72 rename python.h to lmppython.h and correspondingly python.cpp to lmppython.cpp 2019-02-04 14:24:53 -05:00
b417cfda9b removed dead code and reduce compiler warnings about unused parameters 2019-02-03 11:36:41 -05:00
64834e4a3d Update Kokkos library 2019-02-01 12:45:54 -07:00
54b9a1335c Add Kokkos version of angle_cosine 2019-01-28 11:29:28 -07:00
115 changed files with 1780 additions and 457 deletions

.github/CODE_OF_CONDUCT.md (new file)

@@ -0,0 +1,67 @@
# Code of Conduct for the LAMMPS Project on GitHub
## Our Pledge
In the interest of fostering an open and welcoming environment, we as LAMMPS
developers, contributors, and maintainers pledge to making participation in
our project a harassment-free experience for everyone.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of explicit language or imagery
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, issues, and other contributions that are not
aligned to this Code of Conduct, or to ban temporarily or permanently any
developer, maintainer, or contributor for this or other behaviors that they
deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies to all public exchanges in the LAMMPS project
on GitHub and in submitted code.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at developer@lammps.org. All
complaints will be reviewed and investigated and will result in a response
that is deemed necessary and appropriate to the circumstances. The project
team is obligated to maintain confidentiality with regard to the reporter
of an incident.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

@@ -2,10 +2,10 @@
Thank you for considering contributing to the LAMMPS software project.
The following is a set of guidelines as well as explanations of policies and workflows for contributing to the LAMMPS molecular dynamics software project. These guidelines focus on submitting issues or pull requests on the LAMMPS GitHub project.
The following is a set of guidelines as well as explanations of policies and work flows for contributing to the LAMMPS molecular dynamics software project. These guidelines focus on submitting issues or pull requests on the LAMMPS GitHub project.
Thus please also have a look at:
* [The Section on submitting new features for inclusion in LAMMPS of the Manual](http://lammps.sandia.gov/doc/Section_modify.html#mod-15)
* [The Section on submitting new features for inclusion in LAMMPS of the Manual](https://lammps.sandia.gov/doc/Modify_contribute.html)
* [The LAMMPS GitHub Tutorial in the Manual](http://lammps.sandia.gov/doc/Howto_github.html)
## Table of Contents
@@ -18,7 +18,7 @@ Thus please also have a look at:
* [Suggesting Enhancements](#suggesting-enhancements)
* [Contributing Code](#contributing-code)
[GitHub Workflows](#github-workflows)
[GitHub Work flows](#github-workflows)
* [Issues](#issues)
* [Pull Requests](#pull-requests)
@@ -26,17 +26,17 @@ __
## I don't want to read this whole thing I just have a question!
> **Note:** Please do not file an issue to ask a general question about LAMMPS, its features, how to use specific commands, or how to perform simulations or analysis in LAMMPS. Instead post your question to the ['lammps-users' mailing list](http://lammps.sandia.gov/mail.html). You do not need to be subscribed to post to the list (but a mailing list subscription avoids having your post delayed until it is approved by a mailing list moderator). Most posts to the mailing list receive a response in less than 24 hours. Before posting to the mailing list, please read the [mailing list guidelines](http://lammps.sandia.gov/guidelines.html). Following those guidelines will greatly improve your chances of getting a helpful response. Always mention which LAMMPS version you are using.
> **Note:** Please do not file an issue to ask a general question about LAMMPS, its features, how to use specific commands, or how to perform simulations or analysis in LAMMPS. Instead post your question to the ['lammps-users' mailing list](https://lammps.sandia.gov/mail.html). You do not need to be subscribed to post to the list (but a mailing list subscription avoids having your post delayed until it is approved by a mailing list moderator). Most posts to the mailing list receive a response in less than 24 hours. Before posting to the mailing list, please read the [mailing list guidelines](https://lammps.sandia.gov/guidelines.html). Following those guidelines will greatly improve your chances of getting a helpful response. Always mention which LAMMPS version you are using.
## How Can I Contribute?
There are several ways in which you can actively contribute to the LAMMPS project: you can discuss compiling and using LAMMPS, and solving LAMMPS-related problems, with other LAMMPS users on the lammps-users mailing list; you can report bugs or suggest enhancements by creating issues on GitHub (or posting them to the lammps-users mailing list); and you can contribute by submitting pull requests on GitHub or e-mailing your code
to one of the [LAMMPS core developers](http://lammps.sandia.gov/authors.html). As you may see from the aforementioned developer page, the LAMMPS software package includes the efforts of a very large number of contributors beyond the principal authors and maintainers.
to one of the [LAMMPS core developers](https://lammps.sandia.gov/authors.html). As you may see from the aforementioned developer page, the LAMMPS software package includes the efforts of a very large number of contributors beyond the principal authors and maintainers.
### Discussing How To Use LAMMPS
The LAMMPS mailing list is hosted at SourceForge. The mailing list began in 2005, and now includes tens of thousands of messages in thousands of threads. LAMMPS developers try to respond to posted questions in a timely manner, but there are no guarantees. Please consider that people live in different time zones and may not have time to answer e-mails outside of their work hours.
You can post to the list by sending your email to lammps-users at lists.sourceforge.net (no subscription required), but before posting, please read the [mailing list guidelines](http://lammps.sandia.gov/guidelines.html) to maximize your chances of receiving a helpful response.
You can post to the list by sending your email to lammps-users at lists.sourceforge.net (no subscription required), but before posting, please read the [mailing list guidelines](https://lammps.sandia.gov/guidelines.html) to maximize your chances of receiving a helpful response.
Anyone can browse/search previous questions/answers in the archives. You do not have to subscribe to the list to post questions, receive answers (to your questions), or browse/search the archives. You **do** need to subscribe to the list if you want emails for **all** the posts (as individual messages or in digest form), or to answer questions yourself. Feel free to sign up and help us out! Answering questions from fellow LAMMPS users is a great way to pay back the community for providing you a useful tool for free, and to pass on the advice you have received yourself to others. It improves your karma and helps you understand your own research better.
@@ -44,7 +44,7 @@ If you post a message and you are a subscriber, your message will appear immedia
### Reporting Bugs
While developers writing code for LAMMPS are careful to test their code, LAMMPS is such a large and complex piece of software that it is impossible to test all combinations of features under all normal and not-so-normal circumstances. Thus bugs do happen, and if you suspect that you have encountered one, please try to document it and report it as an [Issue](https://github.com/lammps/lammps/issues) on the LAMMPS GitHub project web page. However, before reporting a bug, you need to check whether it is something that may have already been corrected. The [Latest Features and Bug Fixes in LAMMPS](http://lammps.sandia.gov/bug.html) web page lists all significant changes to LAMMPS over the years. It also tells you what the current latest development version of LAMMPS is, and you should test whether your issue still applies to that version.
While developers writing code for LAMMPS are careful to test their code, LAMMPS is such a large and complex piece of software that it is impossible to test all combinations of features under all normal and not-so-normal circumstances. Thus bugs do happen, and if you suspect that you have encountered one, please try to document it and report it as an [Issue](https://github.com/lammps/lammps/issues) on the LAMMPS GitHub project web page. However, before reporting a bug, you need to check whether it is something that may have already been corrected. The [Latest Features and Bug Fixes in LAMMPS](https://lammps.sandia.gov/bug.html) web page lists all significant changes to LAMMPS over the years. It also tells you what the current latest development version of LAMMPS is, and you should test whether your issue still applies to that version.
When you click on the green "New Issue" button, you will be provided with a text field where you can enter your message. That text field will contain a template with several headlines and some descriptions. Keep the headlines that are relevant to your reported potential bug and replace the descriptions with the information they ask for.
You can also attach small text files (please add the file name extension `.txt` or it will be rejected), images, or small compressed text files (using gzip; do not use RAR, 7-ZIP, or similar tools that are uncommon outside of Windows machines). In many cases, bugs are best illustrated by providing a small input deck (do **not** attach your entire production input, but remove everything that is not required to reproduce the issue, and scale down your system size so that the resulting calculation runs quickly on a small desktop machine).
@@ -62,13 +62,13 @@ To be able to submit an issue on GitHub, you have to register for an account (fo
We encourage users to submit new features or modifications for LAMMPS to the core developers so they can be added to the LAMMPS distribution. The preferred way to manage and coordinate this is by submitting a pull request at the LAMMPS project on GitHub. For any larger modifications or programming project, you are encouraged to contact the LAMMPS developers ahead of time in order to discuss implementation strategies and coding guidelines that will make it easier to integrate your contribution and result in less work for everybody involved. You are also encouraged to search through the list of open issues on GitHub and submit a new issue for a planned feature, so you do not duplicate the work of others (and possibly get scooped by them) or have your work duplicated by others.
How quickly your contribution will be integrated depends largely on how much effort it takes to integrate and test it, how many changes it requires to the core code base, and how much interest it holds for the larger LAMMPS community. Please see below for a checklist of typical requirements. Once you have prepared everything, see [this tutorial](http://lammps.sandia.gov/doc/Howto_github.html)
How quickly your contribution will be integrated depends largely on how much effort it takes to integrate and test it, how many changes it requires to the core code base, and how much interest it holds for the larger LAMMPS community. Please see below for a checklist of typical requirements. Once you have prepared everything, see [this tutorial](https://lammps.sandia.gov/doc/Howto_github.html)
for instructions on how to submit your changes or new files through a GitHub pull request.
Here is a checklist of steps you need to follow to submit a single file or user package for our consideration. Following these steps will save both you and us time. See existing files in packages in the source directory for examples. If you are uncertain, please ask on the lammps-users mailing list.
* All source files you provide must compile with the most current version of LAMMPS with multiple configurations. In particular you need to test compiling LAMMPS from scratch with `-DLAMMPS_BIGBIG` set in addition to the default `-DLAMMPS_SMALLBIG` setting. Your code will need to work correctly in serial and in parallel using MPI.
* For consistency with the rest of LAMMPS and especially, if you want your contribution(s) to be added to main LAMMPS code or one of its standard packages, it needs to be written in a style compatible with other LAMMPS source files. This means: 2-character indentation per level, no tabs, no lines over 80 characters. I/O is done via the C-style stdio library, class header files should not import any system headers outside <stdio.h>, STL containers should be avoided in headers, and forward declarations used where possible or needed. All added code should be placed into the LAMMPS_NS namespace or a sub-namespace; global or static variables should be avoided, as they conflict with the modular nature of LAMMPS and the C++ class structure. Header files must not import namespaces with using. This all is so the developers can more easily understand, integrate, and maintain your contribution and reduce conflicts with other parts of LAMMPS. This basically means that the code accesses data structures, performs its operations, and is formatted similar to other LAMMPS source files, including the use of the error class for error and warning messages.
* For consistency with the rest of LAMMPS and especially, if you want your contribution(s) to be added to main LAMMPS code or one of its standard packages, it needs to be written in a style compatible with other LAMMPS source files. This means: 2-character indentation per level, no tabs, no lines over 80 characters. I/O is done via the C-style stdio library, style class header files should not import any system headers outside of <cstdio>, STL containers should be avoided in headers, and forward declarations used where possible or needed. All added code should be placed into the LAMMPS_NS namespace or a sub-namespace; global or static variables should be avoided, as they conflict with the modular nature of LAMMPS and the C++ class structure. There MUST NOT be any "using namespace XXX;" statements in headers. In the implementation file (<name>.cpp) system includes should be placed in angular brackets (<>) and for c-library functions the C++ style header files should be included (<cstdio> instead of <stdio.h>, or <cstring> instead of <string.h>). This all is so the developers can more easily understand, integrate, and maintain your contribution and reduce conflicts with other parts of LAMMPS. This basically means that the code accesses data structures, performs its operations, and is formatted similar to other LAMMPS source files, including the use of the error class for error and warning messages.
* If you want your contribution to be added as a user-contributed feature, and it is a single file (actually a `<name>.cpp` and `<name>.h` file) it can be rapidly added to the USER-MISC directory. Include the one-line entry to add to the USER-MISC/README file in that directory, along with the 2 source files. You can do this multiple times if you wish to contribute several individual features.
* If you want your contribution to be added as a user-contribution and it is several related features, it is probably best to make it a user package directory with a name like USER-FOO. In addition to your new files, the directory should contain a README text file. The README should contain your name and contact information and a brief description of what your new package does. If your files depend on other LAMMPS style files also being installed (e.g. because your file is a derived class from the other LAMMPS class), then an Install.sh file is also needed to check for those dependencies. See other README and Install.sh files in other USER directories as examples. Send us a tarball of this USER-FOO directory.
* Your new source files need to have the LAMMPS copyright, GPL notice, and your name and email address at the top, like other user-contributed LAMMPS source files. They need to create a class that is inside the LAMMPS namespace. If the file is for one of the USER packages, including USER-MISC, then we are not as picky about the coding style (see above). I.e. the files do not need to be in the same stylistic format and syntax as other LAMMPS files, though that would be nice for developers as well as users who try to read your code.
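The `-DLAMMPS_BIGBIG` compilation test mentioned in the checklist above can be sketched as follows for the conventional make-based build. This is a hedged sketch, not part of the contribution guidelines: the `mpi` machine target is an assumption about your local setup, and whether `LMP_INC` can be overridden on the make command line depends on your machine makefile (editing `src/MAKE/Makefile.mpi` directly also works).

```shell
# Hypothetical sketch: rebuild LAMMPS from scratch with the -DLAMMPS_BIGBIG
# integer-size setting instead of the default -DLAMMPS_SMALLBIG.
cd src
make clean-all                        # compile from scratch, as requested above
make mpi LMP_INC="-DLAMMPS_BIGBIG"    # conventional build, "mpi" machine target
mpirun -np 4 ./lmp_mpi -in in.test    # also verify correct behavior in parallel
```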
@@ -102,10 +102,10 @@ For bug reports, the next step is that one of the core LAMMPS developers will se
### Pull Requests
For submitting pull requests, there is a [detailed tutorial](http://lammps.sandia.gov/doc/Howto_github.html) in the LAMMPS manual. Thus only a brief breakdown of the steps is presented here. Please note that the LAMMPS developers are still reviewing and trying to improve the process. If you are unsure about something, do not hesitate to post a question on the lammps-users mailing list or contact one of the core LAMMPS developers.
For submitting pull requests, there is a [detailed tutorial](https://lammps.sandia.gov/doc/Howto_github.html) in the LAMMPS manual. Thus only a brief breakdown of the steps is presented here. Please note that the LAMMPS developers are still reviewing and trying to improve the process. If you are unsure about something, do not hesitate to post a question on the lammps-users mailing list or contact one of the core LAMMPS developers.
Immediately after the submission, the LAMMPS continuous integration server at ci.lammps.org will download your submitted branch and perform a simple compilation test, i.e. it will test whether your submitted code can be compiled under various conditions. It will also check whether your included documentation translates cleanly. Whether these tests succeed or fail will be recorded. If a test fails, please inspect the corresponding output on the CI server and take the necessary steps, if needed, so that the code can compile cleanly again. The tests will be re-run each time the pull request is updated with a push to the remote branch on GitHub.
Next a LAMMPS core developer will self-assign and do an overall technical assessment of the submission. If you are not yet registered as a LAMMPS collaborator, you will receive an invitation for that.
You may also receive comments and suggestions on the overall submission or specific details. If permitted, additional changes may be pushed into your pull request branch or a pull request may be filed in your LAMMPS fork on GitHub to include those changes.
Next a LAMMPS core developer will self-assign and do an overall technical assessment of the submission. If you are not yet registered as a LAMMPS collaborator, you will receive an invitation for that. As part of the assessment, the pull request will be categorized with labels. There are two special labels: `needs_work` (indicates that work from the submitter of the pull request is needed) and `work_in_progress` (indicates that the assigned LAMMPS developer will make changes, if they are not done by the contributor who made the submission).
You may also receive comments and suggestions on the overall submission or specific details, and on occasion specific requests for changes as part of the review. If permitted, additional changes may also be pushed into your pull request branch, or a pull request may be filed in your LAMMPS fork on GitHub to include those changes.
The LAMMPS developer may then decide to assign the pull request to another developer (e.g. when that developer is more knowledgeable about the submitted feature or enhancement or has written the modified code). It may also happen that additional developers are requested to provide a review and approve the changes. For submissions that may change the general behavior of LAMMPS, or where a possibility of unwanted side effects exists, additional tests may be requested by the assigned developer.
If the assigned developer is satisfied and considers the submission ready for inclusion into LAMMPS, the pull request will receive approvals and be merged into the master branch by one of the core LAMMPS developers. After the pull request is merged, you may delete the feature branch used for the pull request in your personal LAMMPS fork.
Since the learning curve for git is quite steep when it comes to efficiently managing remote repositories, local and remote branches, pull requests, and more, do not hesitate to ask questions if you are not sure how to do certain steps that are asked of you. Even if the changes asked of you do not make sense to you, they may be important for the LAMMPS developers. Please also note that all of these are guidelines and nothing is set in stone, so depending on the nature of the contribution, the workflow may be adjusted.
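The branch management referred to above can be sketched with a few git commands. This is a hedged illustration, not an official LAMMPS procedure: the fork URL placeholder and the branch name `my-feature-branch` are invented for the example.

```shell
# Hypothetical sketch of the fork-and-branch workflow for a pull request.
git clone git@github.com:<your-username>/lammps.git    # your personal fork
cd lammps
git remote add upstream https://github.com/lammps/lammps.git
git checkout -b my-feature-branch master    # one branch per contribution
# ... commit your changes on this branch ...
git push -u origin my-feature-branch        # then open the pull request on GitHub
# after the pull request is merged, the feature branch can be deleted:
git push origin --delete my-feature-branch
git branch -d my-feature-branch
```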

@@ -1,31 +0,0 @@
## Summary
_Please provide a brief description of the issue_
## Type of Issue
_Is this a 'Bug Report' or a 'Suggestion for an Enhancement'?_
## Detailed Description (Enhancement Suggestion)
_Explain how you would like to see LAMMPS enhanced, what feature(s) you are looking for, provide references to relevant background information, and whether you are willing to implement the enhancement yourself or would like to participate in the implementation_
## LAMMPS Version (Bug Report)
_Please specify which LAMMPS version this issue was detected with. If this is not the latest development version, please stop and test that version, too, and report it here if the bug persists_
## Expected Behavior (Bug Report)
_Describe the expected behavior. Quote from the LAMMPS manual where needed or explain why the expected behavior is meaningful, especially when it differs from the manual_
## Actual Behavior (Bug Report)
_Describe the actual behavior, how it differs from the expected behavior, and how this can be observed. Try to be specific and do **not** use vague terms like "doesn't work" or "wrong result". Do not assume that the person reading this has any experience with or knowledge of your specific research._
## Steps to Reproduce (Bug Report)
_Describe the steps required to quickly reproduce the issue. You can attach (small) files to the section below or add URLs where to download an archive with all necessary files. Please try to create inputs that are as small as possible and run as fast as possible. NOTE: the less effort and time it takes to reproduce your issue, the more likely it is that somebody will look into it._
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

.github/ISSUE_TEMPLATE/bug_report.md (new file)

@@ -0,0 +1,32 @@
---
name: Bug report
about: Create a bug report to help us eliminate issues and improve LAMMPS
title: "[BUG] _Replace With Suitable Title_"
labels: bug
assignees: ''
---
**Summary**
_Please provide a clear and concise description of what the bug is._
**LAMMPS Version and Platform**
_Please specify precisely which LAMMPS version this issue was detected with (the first line of the output) and what platform (operating system and its version, hardware) you are running on. If possible, test with the most recent LAMMPS patch version_
**Expected Behavior**
_Describe the expected behavior. Quote from the LAMMPS manual where needed, or explain why the expected behavior is meaningful, especially when it differs from the manual_
**Actual Behavior**
_Describe the actual behavior, how it differs from the expected behavior, and how this can be observed. Try to be specific and do **not** use vague terms like "doesn't work" or "wrong result". Do not assume that the person reading this has any experience with or knowledge of your specific area of research._
**Steps to Reproduce**
_Describe the steps required to (quickly) reproduce the issue. You can attach (small) files to the section below or add URLs where to download an archive with all necessary files. Please try to create an input set that is as minimal and small as possible and reproduces the bug as quickly as possible. **NOTE:** the less effort and time it takes to reproduce your reported bug, the more likely it becomes that somebody will look into it and fix the problem._
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

@@ -0,0 +1,20 @@
---
name: Feature request
about: Make a suggestion for a new feature or a change to LAMMPS
title: "[Feature Request] _Replace with Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Please provide a brief and concise description of the suggested feature or change_
**Detailed Description**
_Please explain how you would like to see LAMMPS enhanced, what feature(s) you are looking for, what specific problems this will solve. If possible, provide references to relevant background information like publications or web pages, and whether you are planning to implement the enhancement yourself or would like to participate in the implementation. If applicable add a reference to an existing bug report or issue that this will address._
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files and URLs to external sites, e.g. relevant publications_

@@ -1,28 +1,46 @@
## Purpose
**Summary**
_Briefly describe the new feature(s), enhancement(s), or bugfix(es) included in this pull request. If this addresses an open GitHub Issue, mention the issue number, e.g. with `fixes #221` or `closes #135`, so that issue will be automatically closed when the pull request is merged_
_Briefly describe the new feature(s), enhancement(s), or bugfix(es) included in this pull request._
## Author(s)
**Related Issues**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
_If this addresses an open GitHub Issue, mention the issue number here. Use the phrases `fixes #221` or `closes #135` when you want those issues to be automatically closed when the pull request is merged_
## Backward Compatibility
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request. If this pull request adds new files to the distribution, please also provide a suitable "long-lived" e-mail address (e.g. from gmail, yahoo, outlook, etc.) for the *corresponding* author, i.e. the person the LAMMPS developers can contact directly with questions and requests related to maintenance and support of this code, now and in the future_
**Licensing**
By submitting this pull request, I agree that my contribution will be included in LAMMPS and redistributed under the GNU General Public License version 2.
_Please complete the following statement by adding "yes" or "no":_
My contribution may be re-licensed as LGPL (for use of LAMMPS as a library linked to proprietary software):
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
## Implementation Notes
**Implementation Notes**
_Provide any relevant details about how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
**Post Submission Checklist**
_Please check the fields below as they are completed **after** the pull request has been submitted_
- [ ] The feature or features in this pull request are complete
- [ ] Licensing information is complete
- [ ] Corresponding author information is complete
- [ ] The source code follows the LAMMPS formatting guidelines
- [ ] Suitable new documentation files and/or updates to the existing docs are included
- [ ] The added/updated documentation is integrated and tested with the documentation build system
- [ ] The feature has been verified to work with the conventional build system
- [ ] The feature has been verified to work with the CMake based build system
- [ ] A package specific README file has been included or updated
- [ ] One or more example input decks are included
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

View File

@ -0,0 +1,42 @@
---
name: Bug fix
about: Submit a pull request that fixes one or more bugs
title: "[BUGFIX] _Replace With Suitable Title_"
labels: bugfix
assignees: ''
---
**Summary**
_Briefly describe the bug or bugs that are eliminated by this pull request._
**Related Issue(s)**
_If this request addresses or is related to an existing (open) GitHub issue, e.g. a bug report, mention the issue number here following a pound sign (aka hash mark), e.g. `#222`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the files that are modified.
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Detailed Description**
_Provide any relevant details about how the fixed bug can be reproduced, how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request is submitted_
- [ ] The code in this pull request is complete
- [ ] The source code follows the LAMMPS formatting guidelines
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. to download input decks for testing)_

View File

@ -0,0 +1,35 @@
---
name: Maintenance or Refactoring
about: Submit a pull request that does code refactoring or other maintenance changes
title: "[MAINTENANCE] _Replace With Suitable Title_"
labels: maintenance
assignees: ''
---
**Summary**
_Briefly describe the included changes._
**Related Issue(s)**
_If this request addresses or is related to an existing (open) GitHub issue, e.g. a bug report, mention the issue number here following a pound sign (aka hash mark), e.g. `#222`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the files that are modified.
**Detailed Description**
_Provide any relevant details about the included changes._
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request is submitted_
- [ ] The pull request is complete
- [ ] The source code follows the LAMMPS formatting guidelines

View File

@ -0,0 +1,56 @@
---
name: New Feature
about: Submit a pull request that adds new features (complete files) to LAMMPS
title: "[New Feature] _Replace With Suitable Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Briefly describe the new feature(s) included in this pull request._
**Related Issues**
_If this addresses an existing (open) GitHub issue, e.g. a feature request, mention the issue number here following a pound sign (aka hashmark), e.g. `#331`._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the features added in this pull request. Please provide a suitable "long-lived" e-mail address (e.g. from gmail, yahoo, outlook, etc.) for the *corresponding* author, i.e. the person the LAMMPS developers can contact directly with questions and requests related to maintenance and support of this code, now and in the future_
**Licensing**
_Please add *yes* or *no* to the following two statements (please contact @lammps/core if you have questions about this)_
My contribution may be licensed as GPL v2 (default LAMMPS license):
My contribution may be licensed as LGPL (for use as a library with proprietary software):
**Backward Compatibility**
_Please state if any of the changes in this pull request will affect backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Implementation Notes**
_Provide any relevant details about how the new features are implemented, how correctness was verified, what platforms (OS, compiler, MPI, hardware, number of processors, accelerator(s)) it was tested on_
## Post Submission Checklist
_Please check the fields below as they are completed *after* the pull request has been submitted_
- [ ] The feature or features in this pull request are complete
- [ ] Licensing information is complete
- [ ] Corresponding author information is complete
- [ ] The source code follows the LAMMPS formatting guidelines
- [ ] Suitable new documentation files and/or updates to the existing docs are included
- [ ] The added/updated documentation is integrated and tested with the documentation build system
- [ ] The feature has been verified to work with the conventional build system
- [ ] The feature has been verified to work with the CMake based build system
- [ ] A package specific README file has been included or updated
- [ ] One or more example input decks are included
## Further Information, Files, and Links
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

View File

@ -0,0 +1,42 @@
---
name: Update or Enhancement
about: Submit a pull request that provides updates or enhancements for a package or feature in LAMMPS
title: "[UPDATE] _Replace With Suitable Title_"
labels: enhancement
assignees: ''
---
**Summary**
_Briefly describe what kind of updates or enhancements for a package or feature are included. If you are not the original author of the package or feature, please mention whether your contribution was created independently or in collaboration/cooperation with the original author._
**Author(s)**
_Please state name and affiliation of the author or authors that should be credited with the changes in this pull request_
**Licensing**
By submitting this pull request I implicitly accept that my submission is subject to the same licensing terms as the original package or feature(s) that are updated or amended by this pull request.
**Backward Compatibility**
_Please state whether any changes in the pull request break backward compatibility for inputs, and - if yes - explain what has been changed and why_
**Implementation Notes**
_Provide any relevant details about how the changes are implemented, how correctness was verified, how other features - if any - in LAMMPS are affected_
**Post Submission Checklist**
_Please check the fields below as they are completed_
- [ ] The feature or features in this pull request are complete
- [ ] Suitable updates to the existing docs are included
- [ ] One or more example input decks are included
- [ ] The source code follows the LAMMPS formatting guidelines
**Further Information, Files, and Links**
_Put any additional information here, attach relevant text or image files, and URLs to external sites (e.g. DOIs or webpages)_

1
doc/.gitignore vendored
View File

@ -1,4 +1,5 @@
/html
/latex
/spelling
/LAMMPS.epub
/LAMMPS.mobi

View File

@ -123,7 +123,7 @@ Here are some items to check:
* float.h -> cfloat
* limits.h -> climits
* math.h -> cmath
* omplex.h -> complex
* complex.h -> complex
* setjmp.h -> csetjmp
* signal.h -> csignal
* stddef.h -> cstddef

View File

@ -61,7 +61,7 @@ OPT.
"charmm (iko)"_angle_charmm.html,
"class2 (ko)"_angle_class2.html,
"class2/p6"_angle_class2.html,
"cosine (o)"_angle_cosine.html,
"cosine (ko)"_angle_cosine.html,
"cosine/buck6d"_angle_cosine_buck6d.html,
"cosine/delta (o)"_angle_cosine_delta.html,
"cosine/periodic (o)"_angle_cosine_periodic.html,

View File

@ -12,7 +12,7 @@ Download an executable for Windows :h3
Pre-compiled Windows installers which install LAMMPS executables on a
Windows system can be downloaded from this site:
"http://rpm.lammps.org/windows.html"_http://rpm.lammps.org/windows.html
"http://packages.lammps.org/windows.html"_http://packages.lammps.org/windows.html
Note that each installer package has a date in its name, which
corresponds to the LAMMPS version of the same date. Installers for

View File

@ -1,7 +1,7 @@
<!-- HTML_ONLY -->
<HEAD>
<TITLE>LAMMPS Users Manual</TITLE>
<META NAME="docnumber" CONTENT="1 Feb 2019 version">
<META NAME="docnumber" CONTENT="8 Feb 2019 version">
<META NAME="author" CONTENT="http://lammps.sandia.gov - Sandia National Laboratories">
<META NAME="copyright" CONTENT="Copyright (2003) Sandia Corporation. This software and manual is distributed under the GNU General Public License.">
</HEAD>
@ -21,7 +21,7 @@
:line
LAMMPS Documentation :c,h1
1 Feb 2019 version :c,h2
8 Feb 2019 version :c,h2
"What is a LAMMPS version?"_Manual_version.html
@ -37,27 +37,21 @@ LAMMPS is an open-source code, distributed freely under the terms of
the GNU Public License (GPL).
The "LAMMPS website"_lws has a variety of information about the code.
It includes links to an on-line version of this manual, a "mail
It includes links to an on-line version of this manual, a "mailing
list"_http://lammps.sandia.gov/mail.html where users can post
questions, and a "GitHub site"https://github.com/lammps/lammps where
questions, and a "GitHub site"_https://github.com/lammps/lammps where
all LAMMPS development is coordinated.
:line
"PDF file"_Manual.pdf of the entire manual, generated by
"htmldoc"_http://freecode.com/projects/htmldoc
The content for this manual is part of the LAMMPS distribution. You
can build a local copy of the Manual as HTML pages or a PDF file, by
following the steps on the "Manual build"_Manual_build.html doc page.
There is also a "Developer.pdf"_Developer.pdf document which gives
a brief description of the basic code structure of LAMMPS.
:line
This manual is organized into the following sections.
Once you are familiar with LAMMPS, you may want to bookmark "this
page"_Commands.html since it gives quick access to a doc page for
every LAMMPS command.

View File

@ -8,6 +8,7 @@
angle_style cosine command :h3
angle_style cosine/omp command :h3
angle_style cosine/kk command :h3
[Syntax:]

View File

@ -219,7 +219,7 @@ latex_elements = {
# author, documentclass [howto, manual, or own class]).
latex_documents = [
('Manual', 'LAMMPS.tex', 'LAMMPS Documentation',
'Steve Plimpton', 'manual'),
'The LAMMPS Developers', 'manual'),
]
# The name of an image file (relative to this directory) to place at the top of

View File

@ -1,5 +1,26 @@
# Change Log
## [2.8.00](https://github.com/kokkos/kokkos/tree/2.8.00) (2019-02-05)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.7.24...2.8.00)
**Implemented enhancements:**
- Capability, Tests: C++14 support and testing [\#1914](https://github.com/kokkos/kokkos/issues/1914)
- Capability: Add environment variables for all command line arguments [\#1798](https://github.com/kokkos/kokkos/issues/1798)
- Capability: --kokkos-ndevices not working for Slurm [\#1920](https://github.com/kokkos/kokkos/issues/1920)
- View: Undefined behavior when deep copying from and to an empty unmanaged view [\#1967](https://github.com/kokkos/kokkos/issues/1967)
- BuildSystem: nvcc\_wrapper should stop immediately if nvcc is not in PATH [\#1861](https://github.com/kokkos/kokkos/issues/1861)
**Fixed bugs:**
- Cuda: Fix Volta Issues 1 Non-deterministic behavior on Volta, runs fine on Pascal [\#1949](https://github.com/kokkos/kokkos/issues/1949)
- Cuda: Fix Volta Issues 2 CUDA Team Scan gives wrong values on Volta with -G compile flag [\#1942](https://github.com/kokkos/kokkos/issues/1942)
- Cuda: illegal warp sync in parallel\_reduce by functor on Turing 75 [\#1958](https://github.com/kokkos/kokkos/issues/1958)
- Threads: Pthreads backend does not handle RangePolicy with offset correctly [\#1976](https://github.com/kokkos/kokkos/issues/1976)
- Atomics: atomic\_fetch\_oper has no case for Kokkos::complex\<double\> or other 16-byte types [\#1951](https://github.com/kokkos/kokkos/issues/1951)
- MDRangePolicy: Fix zero-length range [\#1948](https://github.com/kokkos/kokkos/issues/1948)
- TeamThreadRange: TeamThreadRange MaxLoc reduce doesnt compile [\#1909](https://github.com/kokkos/kokkos/issues/1909)
## [2.7.24](https://github.com/kokkos/kokkos/tree/2.7.24) (2018-11-04)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.7.00...2.7.24)

View File

@ -6,16 +6,16 @@ ifndef KOKKOS_PATH
endif
CXXFLAGS=$(CCFLAGS)
# Options: Cuda,ROCm,OpenMP,Pthread,Qthreads,Serial
# Options: Cuda,ROCm,OpenMP,Pthreads,Qthreads,Serial
KOKKOS_DEVICES ?= "OpenMP"
#KOKKOS_DEVICES ?= "Pthread"
#KOKKOS_DEVICES ?= "Pthreads"
# Options:
# Intel: KNC,KNL,SNB,HSW,BDW,SKX
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72,Turing75
# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2
# IBM: BGQ,Power7,Power8,Power9
# AMD-GPUS: Kaveri,Carrizo,Fiji,Vega
# AMD-CPUS: AMDAVX,Ryzen,Epyc
# AMD-CPUS: AMDAVX,Ryzen,EPYC
KOKKOS_ARCH ?= ""
# Options: yes,no
KOKKOS_DEBUG ?= "no"
@ -224,7 +224,7 @@ ifeq ($(KOKKOS_INTERNAL_COMPILER_PGI), 1)
else
ifeq ($(KOKKOS_INTERNAL_COMPILER_XL), 1)
KOKKOS_INTERNAL_CXX11_FLAG := -std=c++11
#KOKKOS_INTERNAL_CXX14_FLAG := -std=c++14
KOKKOS_INTERNAL_CXX14_FLAG := -std=c++14
KOKKOS_INTERNAL_CXX1Y_FLAG := -std=c++1y
#KOKKOS_INTERNAL_CXX17_FLAG := -std=c++17
#KOKKOS_INTERNAL_CXX1Z_FLAG := -std=c++1Z
@ -276,6 +276,7 @@ KOKKOS_INTERNAL_USE_ARCH_PASCAL61 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pas
KOKKOS_INTERNAL_USE_ARCH_PASCAL60 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal60)
KOKKOS_INTERNAL_USE_ARCH_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
KOKKOS_INTERNAL_USE_ARCH_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)
KOKKOS_INTERNAL_USE_ARCH_TURING75 := $(call kokkos_has_string,$(KOKKOS_ARCH),Turing75)
KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLER30) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER32) \
+ $(KOKKOS_INTERNAL_USE_ARCH_KEPLER35) \
@ -284,6 +285,7 @@ KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLE
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_TURING75) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -300,6 +302,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
+ $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+ $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
+ $(KOKKOS_INTERNAL_USE_ARCH_TURING75) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
+ $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@ -331,7 +334,7 @@ KOKKOS_INTERNAL_USE_ARCH_IBM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_
# AMD based.
KOKKOS_INTERNAL_USE_ARCH_AMDAVX := $(call kokkos_has_string,$(KOKKOS_ARCH),AMDAVX)
KOKKOS_INTERNAL_USE_ARCH_RYZEN := $(call kokkos_has_string,$(KOKKOS_ARCH),Ryzen)
KOKKOS_INTERNAL_USE_ARCH_EPYC := $(call kokkos_has_string,$(KOKKOS_ARCH),Epyc)
KOKKOS_INTERNAL_USE_ARCH_EPYC := $(call kokkos_has_string,$(KOKKOS_ARCH),EPYC)
KOKKOS_INTERNAL_USE_ARCH_KAVERI := $(call kokkos_has_string,$(KOKKOS_ARCH),Kaveri)
KOKKOS_INTERNAL_USE_ARCH_CARRIZO := $(call kokkos_has_string,$(KOKKOS_ARCH),Carrizo)
KOKKOS_INTERNAL_USE_ARCH_FIJI := $(call kokkos_has_string,$(KOKKOS_ARCH),Fiji)
@ -341,12 +344,12 @@ KOKKOS_INTERNAL_USE_ARCH_GFX901 := $(call kokkos_has_string,$(KOKKOS_ARCH),gfx90
# Any AVX?
KOKKOS_INTERNAL_USE_ARCH_SSE42 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM))
KOKKOS_INTERNAL_USE_ARCH_AVX := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_AMDAVX))
KOKKOS_INTERNAL_USE_ARCH_AVX2 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW))
KOKKOS_INTERNAL_USE_ARCH_AVX2 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_EPYC))
KOKKOS_INTERNAL_USE_ARCH_AVX512MIC := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KNL))
KOKKOS_INTERNAL_USE_ARCH_AVX512XEON := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SKX))
# Decide what ISA level we are able to support.
KOKKOS_INTERNAL_USE_ISA_X86_64 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM) + $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_KNL) + $(KOKKOS_INTERNAL_USE_ARCH_SKX))
KOKKOS_INTERNAL_USE_ISA_X86_64 := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_WSM) + $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_HSW) + $(KOKKOS_INTERNAL_USE_ARCH_BDW) + $(KOKKOS_INTERNAL_USE_ARCH_KNL) + $(KOKKOS_INTERNAL_USE_ARCH_SKX) + $(KOKKOS_INTERNAL_USE_ARCH_EPYC))
KOKKOS_INTERNAL_USE_ISA_KNC := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KNC))
KOKKOS_INTERNAL_USE_ISA_POWERPCLE := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_POWER8) + $(KOKKOS_INTERNAL_USE_ARCH_POWER9))
KOKKOS_INTERNAL_USE_ISA_POWERPCBE := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_POWER7))
@ -658,6 +661,19 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV81), 1)
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_EPYC), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_AMD_EPYC")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_AMD_AVX2")
ifeq ($(KOKKOS_INTERNAL_COMPILER_INTEL), 1)
KOKKOS_CXXFLAGS += -mavx2
KOKKOS_LDFLAGS += -mavx2
else
KOKKOS_CXXFLAGS += -march=znver1 -mtune=znver1
KOKKOS_LDFLAGS += -march=znver1 -mtune=znver1
endif
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV80")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV8_THUNDERX")
@ -950,6 +966,11 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA72")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_72
endif
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_TURING75), 1)
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_TURING")
tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_TURING75")
KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_75
endif
ifneq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)

View File

@ -73,6 +73,8 @@ For specifics see the LICENSE file contained in the repository or distribution.
* NVCC 7.5 for CUDA (with gcc 4.8.4)
* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
* NVCC 9.1 for CUDA (with gcc 6.1.0)
* NVCC 9.2 for CUDA (with gcc 7.2.0)
* NVCC 10.0 for CUDA (with gcc 7.4.0)
### Primary tested compilers on Power 8 are:
* GCC 6.4.0 (OpenMP,Serial)
@ -109,7 +111,7 @@ GCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits
-Wignored-qualifiers -Wempty-body -Wclobbered -Wuninitialized
Intel: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
Clang: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
Other compilers are tested occasionally, in particular when pushing from develop to
master branch, without -Werror and only for a select set of backends.

View File

@ -308,6 +308,16 @@ do
shift
done
#Check if nvcc exists
if [ $host_only -ne 1 ]; then
var=$(which nvcc )
if [ $? -gt 0 ]; then
echo "Could not find nvcc in PATH"
exit $?
fi
fi
# Only print host compiler version
if [ $get_host_version -eq 1 ]; then
$host_compiler --version

View File

@ -104,6 +104,7 @@ list(APPEND KOKKOS_ARCH_LIST
Pascal61 # (GPU) NVIDIA Pascal generation CC 6.1
Volta70 # (GPU) NVIDIA Volta generation CC 7.0
Volta72 # (GPU) NVIDIA Volta generation CC 7.2
Turing75 # (GPU) NVIDIA Turing generation CC 7.5
)
# List of possible device architectures.

View File

@ -832,16 +832,14 @@ void
deep_copy (DualView<DT,DL,DD,DM> dst, // trust me, this must not be a reference
const DualView<ST,SL,SD,SM>& src )
{
if(src.modified_flags.data()==NULL || dst.modified_flags.data()==NULL) {
return deep_copy(dst.d_view, src.d_view);
}
if (src.modified_flags(1) >= src.modified_flags(0)) {
deep_copy (dst.d_view, src.d_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::device_type> ();
} else {
if ( src.need_sync_device() ) {
deep_copy (dst.h_view, src.h_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::host_mirror_space> ();
dst.modify_host();
}
else {
deep_copy (dst.d_view, src.d_view);
dst.modify_device();
}
}
template< class ExecutionSpace ,
@ -852,15 +850,12 @@ deep_copy (const ExecutionSpace& exec ,
DualView<DT,DL,DD,DM> dst, // trust me, this must not be a reference
const DualView<ST,SL,SD,SM>& src )
{
if(src.modified_flags.data()==NULL || dst.modified_flags.data()==NULL) {
return deep_copy(exec, dst.d_view, src.d_view);
}
if (src.modified_flags(1) >= src.modified_flags(0)) {
deep_copy (exec, dst.d_view, src.d_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::device_type> ();
} else {
if ( src.need_sync_device() ) {
deep_copy (exec, dst.h_view, src.h_view);
dst.template modify<typename DualView<DT,DL,DD,DM>::host_mirror_space> ();
dst.modify_host();
} else {
deep_copy (exec, dst.d_view, src.d_view);
dst.modify_device();
}
}

View File

@ -368,8 +368,8 @@ public:
enum { is_assignable = is_assignable_value_type &&
is_assignable_layout };
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
typedef ViewMapping< DstTraits , typename DstTraits::specialize > DstType ;
typedef ViewMapping< SrcTraits , typename SrcTraits::specialize > SrcType ;
template < typename DT , typename ... DP , typename ST , typename ... SP >
KOKKOS_INLINE_FUNCTION
@ -432,7 +432,7 @@ public:
private:
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
typedef Kokkos::Impl::ViewMapping< traits , typename traits::specialize > map_type ;
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
track_type m_track ;
@ -567,11 +567,11 @@ public:
// Allow specializations to query their specialized map
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
implementation_map() const { return m_map ; }
#endif
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
impl_map() const { return m_map ; }
//----------------------------------------
@ -952,7 +952,7 @@ public:
, m_rank(rhs.m_rank)
{
typedef typename DynRankView<RT,RP...> ::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
}
@ -962,7 +962,7 @@ public:
DynRankView & operator = (const DynRankView<RT,RP...> & rhs )
{
typedef typename DynRankView<RT,RP...> ::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
m_track.assign( rhs.m_track , traits::is_managed );
@ -980,7 +980,7 @@ public:
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
static_assert( Mapping::is_assignable , "Incompatible View to DynRankView copy construction" );
Mapping::assign( *this , rhs );
}
@ -1432,7 +1432,7 @@ public:
, Args ... args )
{
typedef ViewMapping< traits_type, void > DstType ;
typedef ViewMapping< traits_type, typename traits_type::specialize > DstType ;
typedef typename std::conditional< (rank==0) , ViewDimension<>
, typename std::conditional< (rank==1) , ViewDimension<0>

View File

@ -101,13 +101,98 @@ namespace Impl {
result = run_me< Kokkos::DualView<Scalar**,Kokkos::LayoutLeft,Device> >(size,3);
}
};
};
template < typename Scalar, class ViewType >
struct SumViewEntriesFunctor {
typedef Scalar value_type;
ViewType fv;
SumViewEntriesFunctor ( const ViewType & fv_ ) : fv(fv_) {}
KOKKOS_INLINE_FUNCTION
void operator() ( const int i , value_type & total ) const {
for ( size_t j = 0; j < fv.extent(1); ++j ) {
total += fv(i,j);
}
}
};
template <typename Scalar, class Device>
struct test_dual_view_deep_copy
{
typedef Scalar scalar_type;
typedef Device execution_space;
template <typename ViewType>
void run_me() {
const unsigned int n = 10;
const unsigned int m = 5;
const unsigned int sum_total = n * m;
ViewType a("A",n,m);
ViewType b("B",n,m);
Kokkos::deep_copy( a.d_view , 1 );
a.template modify<typename ViewType::execution_space>();
a.template sync<typename ViewType::host_mirror_space>();
// Check device view is initialized as expected
scalar_type a_d_sum = 0;
// Execute on the execution_space associated with t_dev's memory space
typedef typename ViewType::t_dev::memory_space::execution_space t_dev_exec_space;
Kokkos::parallel_reduce( Kokkos::RangePolicy<t_dev_exec_space>(0,n), SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(a.d_view), a_d_sum );
ASSERT_EQ(a_d_sum, sum_total);
// Check host view is synced as expected
scalar_type a_h_sum = 0;
for ( size_t i = 0; i < a.h_view.extent(0); ++i )
for ( size_t j = 0; j < a.h_view.extent(1); ++j ) {
a_h_sum += a.h_view(i,j);
}
ASSERT_EQ(a_h_sum, sum_total);
// Test deep_copy
Kokkos::deep_copy( b, a );
b.template sync<typename ViewType::host_mirror_space>();
// Perform same checks on b as done on a
// Check device view is initialized as expected
scalar_type b_d_sum = 0;
// Execute on the execution_space associated with t_dev's memory space
Kokkos::parallel_reduce( Kokkos::RangePolicy<t_dev_exec_space>(0,n), SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(b.d_view), b_d_sum );
ASSERT_EQ(b_d_sum, sum_total);
// Check host view is synced as expected
scalar_type b_h_sum = 0;
for ( size_t i = 0; i < b.h_view.extent(0); ++i )
for ( size_t j = 0; j < b.h_view.extent(1); ++j ) {
b_h_sum += b.h_view(i,j);
}
ASSERT_EQ(b_h_sum, sum_total);
} // end run_me
test_dual_view_deep_copy()
{
run_me< Kokkos::DualView<Scalar**,Kokkos::LayoutLeft,Device> >();
}
};
} // namespace Impl
template <typename Scalar, typename Device>
void test_dualview_combinations(unsigned int size)
{
@ -116,10 +201,21 @@ void test_dualview_combinations(unsigned int size)
}
template <typename Scalar, typename Device>
void test_dualview_deep_copy()
{
Impl::test_dual_view_deep_copy<Scalar,Device> ();
}
TEST_F( TEST_CATEGORY, dualview_combination) {
test_dualview_combinations<int,TEST_EXECSPACE>(10);
}
TEST_F( TEST_CATEGORY, dualview_deep_copy) {
test_dualview_deep_copy<int,TEST_EXECSPACE>();
test_dualview_deep_copy<double,TEST_EXECSPACE>();
}
} // namespace Test

View File

@ -829,7 +829,8 @@ void* cuda_resize_scratch_space(std::int64_t bytes, bool force_shrink) {
}
if(bytes > current_size) {
current_size = bytes;
ptr = Kokkos::kokkos_realloc<Kokkos::CudaSpace>(ptr,current_size);
Kokkos::kokkos_free<Kokkos::CudaSpace>(ptr);
ptr = Kokkos::kokkos_malloc<Kokkos::CudaSpace>("CudaSpace::ScratchMemory",current_size);
}
if((bytes < current_size) && (force_shrink)) {
current_size = bytes;

View File

@ -561,7 +561,11 @@ void CudaInternal::initialize( int cuda_device_id , int stream_count )
}
#endif
#ifdef KOKKOS_ENABLE_PRE_CUDA_10_DEPRECATION_API
cudaThreadSetCacheConfig(cudaFuncCachePreferShared);
#else
cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
#endif
// Init the array for used for arbitrarily sized atomics
Impl::initialize_host_cuda_lock_arrays();

View File

@ -525,6 +525,7 @@ public:
inline
void execute() const
{
if(m_rp.m_num_tiles==0) return;
const array_index_type maxblocks = static_cast<array_index_type>(Kokkos::Impl::CudaTraits::UpperBoundGridCount);
if ( RP::rank == 2 )
{
@ -685,7 +686,7 @@ public:
typename Policy::member_type( kokkos_impl_cuda_shared_memory<void>()
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size ) );
@ -1336,7 +1337,7 @@ public:
( Member( kokkos_impl_cuda_shared_memory<char>() + m_team_begin
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size )
@ -1378,7 +1379,7 @@ public:
( Member( kokkos_impl_cuda_shared_memory<char>() + m_team_begin
, m_shmem_begin
, m_shmem_size
, (void*) ( ((char*)m_scratch_ptr[1]) + threadid/(blockDim.x*blockDim.y) * m_scratch_size[1])
, (void*) ( ((char*)m_scratch_ptr[1]) + ptrdiff_t(threadid/(blockDim.x*blockDim.y)) * m_scratch_size[1])
, m_scratch_size[1]
, league_rank
, m_league_size )
@ -2064,7 +2065,7 @@ private:
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(MASK);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
if ( CudaTraits::WarpSize < word_count.value ) { __syncthreads(); } // Protect against large scan values.
@ -2291,7 +2292,7 @@ private:
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(MASK);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
if ( CudaTraits::WarpSize < word_count.value ) { __syncthreads(); } // Protect against large scan values.

View File

@ -321,7 +321,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 2) {
value_type tmp = Kokkos::shfl_down(value, 2,32);
@ -331,7 +331,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 4) {
value_type tmp = Kokkos::shfl_down(value, 4,32);
@ -341,7 +341,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 8) {
value_type tmp = Kokkos::shfl_down(value, 8,32);
@@ -351,7 +351,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 16) {
value_type tmp = Kokkos::shfl_down(value, 16,32);
@@ -361,7 +361,7 @@ bool cuda_inter_block_reduction( typename FunctorValueTraits< FunctorType , ArgT
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
}
}
@@ -506,7 +506,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 2) {
value_type tmp = Kokkos::shfl_down(value, 2,32);
@@ -516,7 +516,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 4) {
value_type tmp = Kokkos::shfl_down(value, 4,32);
@@ -526,7 +526,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 8) {
value_type tmp = Kokkos::shfl_down(value, 8,32);
@@ -536,7 +536,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
if (int(blockDim.x*blockDim.y) > 16) {
value_type tmp = Kokkos::shfl_down(value, 16,32);
@@ -546,7 +546,7 @@ cuda_inter_block_reduction( const ReducerType& reducer,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
active += KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
active += KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
}
}
@@ -578,7 +578,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, true> {
const int width, // How much of the warp participates
Scalar& result)
{
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)%(32/width))*width;
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)/width)*width;
for(int delta=skip_vector?blockDim.x:1; delta<width; delta*=2) {
Scalar tmp;
cuda_shfl_down(tmp,value,delta,width,mask);
@@ -683,7 +683,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
const int width) // How much of the warp participates
{
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)%(32/width))*width;
unsigned mask = width==32?0xffffffff:((1<<width)-1)<<((threadIdx.y*blockDim.x+threadIdx.x)/width)*width;
#endif
const int lane_id = (threadIdx.y*blockDim.x+threadIdx.x)%32;
for(int delta=skip_vector?blockDim.x:1; delta<width; delta*=2) {
@@ -693,7 +693,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(mask);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
}
*value=*(value-lane_id);
@@ -779,7 +779,7 @@ struct CudaReductionsFunctor<FunctorType, ArgTag, false, false> {
/*
* Algorithmic constraints:
* (a) blockDim.y is a power of two
* (b) blockDim.y <= 512
* (b) blockDim.y <= 1024
* (c) blockDim.x == blockDim.z == 1
*/
@@ -828,14 +828,26 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
{ // Inter-warp reduce-scan by a single warp to avoid extra synchronizations
const unsigned rtid_inter = ( threadIdx.y ^ BlockSizeMask ) << CudaTraits::WarpIndexShift ;
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
unsigned inner_mask = KOKKOS_IMPL_CUDA_BALLOT_MASK(0xffffffff,(rtid_inter<blockDim.y));
#endif
if ( rtid_inter < blockDim.y ) {
const pointer_type tdata_inter = base_data + value_count * ( rtid_inter ^ BlockSizeMask );
if ( (1<<5) < BlockSizeMask ) { BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { __threadfence_block(); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
if ( (1<<5) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
if ( (1<<9) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,9) }
#else
if ( (1<<5) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,5) }
if ( (1<<6) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,6) }
if ( (1<<7) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,7) }
if ( (1<<8) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,8) }
if ( (1<<9) < BlockSizeMask ) { KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_REDUCE_STEP(rtid_inter,tdata_inter,9) }
#endif
if ( DoScan ) {
@@ -846,10 +858,17 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
if ( ! ( rtid_inter + n < blockDim.y ) ) n = 0 ;
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,8)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,7)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,6)
__threadfence_block(); BLOCK_SCAN_STEP(tdata_inter,n,5)
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,8)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,7)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,6)
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(inner_mask); BLOCK_SCAN_STEP(tdata_inter,n,5)
#else
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,8)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,7)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,6)
KOKKOS_IMPL_CUDA_SYNCWARP; BLOCK_SCAN_STEP(tdata_inter,n,5)
#endif
}
}
}
@@ -864,19 +883,17 @@ void cuda_intra_block_reduce_scan( const FunctorType & functor ,
( rtid_intra & 16 ) ? 16 : 0 ))));
if ( ! ( rtid_intra + n < blockDim.y ) ) n = 0 ;
#ifdef KOKKOS_IMPL_CUDA_CLANG_WORKAROUND
BLOCK_SCAN_STEP(tdata_intra,n,4) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,3) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,2) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,1) __syncthreads();//__threadfence_block();
BLOCK_SCAN_STEP(tdata_intra,n,0) __syncthreads();
#else
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,4) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,3) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,2) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,1) __threadfence_block();
KOKKOS_IMPL_CUDA_SYNCWARP;
BLOCK_SCAN_STEP(tdata_intra,n,0) __threadfence_block();
#endif
KOKKOS_IMPL_CUDA_SYNCWARP;
}
#undef BLOCK_SCAN_STEP
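Editorial note on the mask fix in the hunks above: the sub-warp shuffle mask computation is changed from `(threadIdx.y*blockDim.x+threadIdx.x) % (32/width)` to `(threadIdx.y*blockDim.x+threadIdx.x) / width`. A minimal host-side model (function names here are illustrative, not Kokkos API) shows why the division is the correct group index:

```cpp
#include <cassert>
#include <cstdint>

// Host-side model of the corrected mask computation: lanes are grouped
// into contiguous sub-warps of `width` lanes, so a lane's group index is
// lane_id / width, and the mask covers [group*width, group*width + width).
inline uint32_t subwarp_mask(unsigned lane_id, unsigned width) {
  return width == 32u ? 0xffffffffu
                      : ((1u << width) - 1u) << ((lane_id / width) * width);
}

// The pre-fix expression used lane_id % (32/width), which picks the wrong
// group whenever lane_id / width differs from lane_id % (32/width).
inline uint32_t subwarp_mask_buggy(unsigned lane_id, unsigned width) {
  return width == 32u
             ? 0xffffffffu
             : ((1u << width) - 1u) << ((lane_id % (32u / width)) * width);
}
```

For `width == 8` and `lane_id == 10` the lane belongs to group 1 (lanes 8-15), so the mask must be `0xFF00`; the old expression selects group `10 % 4 == 2`, i.e. `0xFF0000`, which excludes the calling lane itself.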


@@ -290,7 +290,7 @@ public:
// Intra vector lane shuffle reduction:
typename ReducerType::value_type tmp ( reducer.reference() );
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x;
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<((threadIdx.y%(32/blockDim.x))*blockDim.x);
for ( int i = blockDim.x ; ( i >>= 1 ) ; ) {
cuda_shfl_down( reducer.reference() , tmp , i , blockDim.x , mask );
@@ -742,7 +742,7 @@ void parallel_for
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -915,7 +915,7 @@ void single(const Impl::VectorSingleStruct<Impl::CudaTeamMember>& , const Functo
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -928,7 +928,7 @@ void single(const Impl::ThreadSingleStruct<Impl::CudaTeamMember>& , const Functo
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
KOKKOS_IMPL_CUDA_SYNCWARP_MASK(blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x);
#else
KOKKOS_IMPL_CUDA_SYNCWARP_MASK;
KOKKOS_IMPL_CUDA_SYNCWARP;
#endif
#endif
}
@@ -938,7 +938,7 @@ KOKKOS_INLINE_FUNCTION
void single(const Impl::VectorSingleStruct<Impl::CudaTeamMember>& , const FunctorType& lambda, ValueType& val) {
#ifdef __CUDA_ARCH__
if(threadIdx.x == 0) lambda(val);
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<(threadIdx.y%(32/blockDim.x))*blockDim.x;
unsigned mask = blockDim.x==32?0xffffffff:((1<<blockDim.x)-1)<<((threadIdx.y%(32/blockDim.x))*blockDim.x);
Impl::cuda_shfl(val,val,0,blockDim.x,mask);
#endif
}


@@ -4,9 +4,9 @@
#if ( CUDA_VERSION < 9000 )
#define KOKKOS_IMPL_CUDA_ACTIVEMASK 0
#define KOKKOS_IMPL_CUDA_SYNCWARP __threadfence_block()
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK __threadfence_block()
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) if(m)__threadfence_block()
#define KOKKOS_IMPL_CUDA_BALLOT(x) __ballot(x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(x) __ballot(x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) __ballot(x)
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) __shfl(x,y,z)
#define KOKKOS_IMPL_CUDA_SHFL_MASK(m,x,y,z) __shfl(x,y,z)
#define KOKKOS_IMPL_CUDA_SHFL_UP(x,y,z) __shfl_up(x,y,z)
@@ -16,7 +16,7 @@
#else
#define KOKKOS_IMPL_CUDA_ACTIVEMASK __activemask()
#define KOKKOS_IMPL_CUDA_SYNCWARP __syncwarp(0xffffffff)
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) __syncwarp(m);
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) __syncwarp(m)
#define KOKKOS_IMPL_CUDA_BALLOT(x) __ballot_sync(__activemask(),x)
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) __ballot_sync(m,x)
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) __shfl_sync(0xffffffff,x,y,z)
@@ -29,9 +29,9 @@
#else
#define KOKKOS_IMPL_CUDA_ACTIVEMASK 0
#define KOKKOS_IMPL_CUDA_SYNCWARP
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK
#define KOKKOS_IMPL_CUDA_SYNCWARP_MASK(m) (void)m
#define KOKKOS_IMPL_CUDA_BALLOT(x) 0
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(x) 0
#define KOKKOS_IMPL_CUDA_BALLOT_MASK(m,x) 0
#define KOKKOS_IMPL_CUDA_SHFL(x,y,z) 0
#define KOKKOS_IMPL_CUDA_SHFL_MASK(m,x,y,z) 0
#define KOKKOS_IMPL_CUDA_SHFL_UP(x,y,z) 0


@@ -1401,7 +1401,33 @@ void deep_copy
typedef typename src_type::memory_space src_memory_space ;
typedef typename dst_type::value_type dst_value_type ;
typedef typename src_type::value_type src_value_type ;
if(dst.data() == NULL && src.data() == NULL) {
if(dst.data() == NULL || src.data() == NULL) {
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
// do nothing
#else
// throw if dimension mismatch
if ( (src.extent(0) != dst.extent(0)) ||
(src.extent(1) != dst.extent(1)) ||
(src.extent(2) != dst.extent(2)) ||
(src.extent(3) != dst.extent(3)) ||
(src.extent(4) != dst.extent(4)) ||
(src.extent(5) != dst.extent(5)) ||
(src.extent(6) != dst.extent(6)) ||
(src.extent(7) != dst.extent(7))
) {
std::string message("Deprecation Error: Kokkos::deep_copy extents of views don't match: ");
message += dst.label(); message += "(";
for(int r = 0; r<dst_type::Rank-1; r++)
{ message+= std::to_string(dst.extent(r)); message += ","; }
message+= std::to_string(dst.extent(dst_type::Rank-1)); message += ") ";
message += src.label(); message += "(";
for(int r = 0; r<src_type::Rank-1; r++)
{ message+= std::to_string(src.extent(r)); message += ","; }
message+= std::to_string(src.extent(src_type::Rank-1)); message += ") ";
Kokkos::Impl::throw_runtime_exception(message);
}
#endif
Kokkos::fence();
return;
}
@@ -1646,7 +1672,33 @@ void deep_copy
typedef typename dst_type::value_type dst_value_type ;
typedef typename src_type::value_type src_value_type ;
if(dst.data() == NULL && src.data() == NULL) {
if(dst.data() == NULL || src.data() == NULL) {
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
// do nothing
#else
// throw if dimension mismatch
if ( (src.extent(0) != dst.extent(0)) ||
(src.extent(1) != dst.extent(1)) ||
(src.extent(2) != dst.extent(2)) ||
(src.extent(3) != dst.extent(3)) ||
(src.extent(4) != dst.extent(4)) ||
(src.extent(5) != dst.extent(5)) ||
(src.extent(6) != dst.extent(6)) ||
(src.extent(7) != dst.extent(7))
) {
std::string message("Deprecation Error: Kokkos::deep_copy extents of views don't match: ");
message += dst.label(); message += "(";
for(int r = 0; r<dst_type::Rank-1; r++)
{ message+= std::to_string(dst.extent(r)); message += ","; }
message+= std::to_string(dst.extent(dst_type::Rank-1)); message += ") ";
message += src.label(); message += "(";
for(int r = 0; r<src_type::Rank-1; r++)
{ message+= std::to_string(src.extent(r)); message += ","; }
message+= std::to_string(src.extent(src_type::Rank-1)); message += ") ";
Kokkos::Impl::throw_runtime_exception(message);
}
#endif
exec_space.fence();
return;
}
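Editorial note on the `deep_copy` hunks above: when either view has null data and deprecated code is disabled, the new branch throws if the extents differ, building an error string of the form `label(e0,...,eN) `. The concatenation loop can be sketched on the host (the helper name is illustrative, not Kokkos API):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds "label(e0,e1,...,eN) " the same way the diff's loop does: all
// extents but the last joined with commas, then the last extent and ") ".
std::string format_extents(const std::string& label,
                           const std::vector<size_t>& extents) {
  std::string out = label + "(";
  for (size_t r = 0; r + 1 < extents.size(); ++r) {
    out += std::to_string(extents[r]);
    out += ",";
  }
  out += std::to_string(extents.back());
  out += ") ";
  return out;
}
```

The full exception message concatenates one such fragment for the destination and one for the source, so a rank-2 mismatch reads like `... dst(2,3) src(2,4) `.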


@@ -100,32 +100,27 @@ public:
row_map_type row_map;
entries_type entries;
//! Construct an empty view.
Crs() : row_map(), entries() {}
//! Copy constructor (shallow copy).
Crs(const Crs& rhs) : row_map(rhs.row_map), entries(rhs.entries)
{}
/*
* Default Constructors, operators and destructor
*/
KOKKOS_FUNCTION Crs() = default;
KOKKOS_FUNCTION Crs(Crs const &) = default;
KOKKOS_FUNCTION Crs(Crs &&) = default;
KOKKOS_FUNCTION Crs& operator=(Crs const &) = default;
KOKKOS_FUNCTION Crs& operator=(Crs &&) = default;
KOKKOS_FUNCTION ~Crs() = default;
/** \brief Assign to a view of the rhs array.
* If the old view is the last view
* then allocated memory is deallocated.
*/
template<class EntriesType, class RowMapType>
Crs(const RowMapType& row_map_, const EntriesType& entries_) : row_map(row_map_), entries(entries_)
{}
/** \brief Assign to a view of the rhs array.
* If the old view is the last view
* then allocated memory is deallocated.
*/
Crs& operator= (const Crs& rhs) {
row_map = rhs.row_map;
entries = rhs.entries;
return *this;
KOKKOS_INLINE_FUNCTION
Crs(const RowMapType& row_map_, const EntriesType& entries_)
: row_map(row_map_), entries(entries_)
{
}
/** \brief Destroy this view of the array.
* If the last view then allocated memory is deallocated.
*/
~Crs() {}
/** \brief Return number of rows in the graph
*/
KOKKOS_INLINE_FUNCTION


@@ -170,6 +170,10 @@
// see https://github.com/kokkos/kokkos/issues/1470
#define KOKKOS_CUDA_9_DEFAULTED_BUG_WORKAROUND
#endif
#if ( 10000 > CUDA_VERSION )
#define KOKKOS_ENABLE_PRE_CUDA_10_DEPRECATION_API
#endif
#endif // #if defined( KOKKOS_ENABLE_CUDA ) && defined( __CUDACC__ )
//----------------------------------------------------------------------------


@@ -505,7 +505,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -559,7 +559,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -637,7 +637,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}
@@ -727,7 +727,7 @@ public:
}
KOKKOS_INLINE_FUNCTION
value_type& reference() {
value_type& reference() const {
return *value;
}


@@ -198,6 +198,7 @@ struct ViewTraits< void >
typedef void HostMirrorSpace ;
typedef void array_layout ;
typedef void memory_traits ;
typedef void specialize ;
};
template< class ... Prop >
@@ -209,6 +210,7 @@ struct ViewTraits< void , void , Prop ... >
typedef typename ViewTraits<void,Prop...>::HostMirrorSpace HostMirrorSpace ;
typedef typename ViewTraits<void,Prop...>::array_layout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class ArrayLayout , class ... Prop >
@@ -221,6 +223,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_array_layout<ArrayL
typedef typename ViewTraits<void,Prop...>::HostMirrorSpace HostMirrorSpace ;
typedef ArrayLayout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class Space , class ... Prop >
@@ -239,6 +242,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_space<Space>::value
typedef typename Kokkos::Impl::HostMirror< Space >::Space HostMirrorSpace ;
typedef typename execution_space::array_layout array_layout ;
typedef typename ViewTraits<void,Prop...>::memory_traits memory_traits ;
typedef typename ViewTraits<void,Prop...>::specialize specialize ;
};
template< class MemoryTraits , class ... Prop >
@@ -257,6 +261,7 @@ struct ViewTraits< typename std::enable_if< Kokkos::Impl::is_memory_traits<Memor
typedef void HostMirrorSpace ;
typedef void array_layout ;
typedef MemoryTraits memory_traits ;
typedef void specialize ;
};
@@ -335,7 +340,12 @@ public:
typedef ArrayLayout array_layout ;
typedef typename data_analysis::dimension dimension ;
typedef typename data_analysis::specialize specialize /* mapping specialization tag */ ;
typedef typename std::conditional<
std::is_same<typename data_analysis::specialize,void>::value
,typename prop::specialize
,typename data_analysis::specialize>::type
specialize ; /* mapping specialization tag */
enum { rank = dimension::rank };
enum { rank_dynamic = dimension::rank_dynamic };
@@ -542,7 +552,7 @@ public:
private:
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
typedef Kokkos::Impl::ViewMapping< traits , typename traits::specialize > map_type ;
typedef Kokkos::Impl::SharedAllocationTracker track_type ;
track_type m_track ;
@@ -608,13 +618,18 @@ public:
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , size_t >::type
extent( const iType & r ) const
extent( const iType & r ) const noexcept
{ return m_map.extent(r); }
static KOKKOS_INLINE_FUNCTION constexpr
size_t
static_extent( const unsigned r ) noexcept
{ return map_type::static_extent(r); }
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , int >::type
extent_int( const iType & r ) const
extent_int( const iType & r ) const noexcept
{ return static_cast<int>(m_map.extent(r)); }
KOKKOS_INLINE_FUNCTION constexpr
@@ -709,11 +724,11 @@ public:
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
implementation_map() const { return m_map ; }
#endif
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::ViewMapping< traits , void > &
const Kokkos::Impl::ViewMapping< traits , typename traits::specialize > &
impl_map() const { return m_map ; }
KOKKOS_INLINE_FUNCTION
const Kokkos::Impl::SharedAllocationTracker &
@@ -1955,7 +1970,7 @@ public:
, m_map()
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy construction" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
}
@@ -1965,7 +1980,7 @@ public:
View & operator = ( const View<RT,RP...> & rhs )
{
typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy assignment" );
Mapping::assign( m_map , rhs.m_map , rhs.m_track );
m_track.assign( rhs.m_track , traits::is_managed );
@@ -1992,7 +2007,7 @@ public:
typedef typename Mapping::type DstType ;
static_assert( Kokkos::Impl::ViewMapping< traits , typename DstType::traits , void >::is_assignable
static_assert( Kokkos::Impl::ViewMapping< traits , typename DstType::traits , typename traits::specialize >::is_assignable
, "Subview construction requires compatible view and subview arguments" );
Mapping::assign( m_map, src_view.m_map, arg0 , args... );
@@ -2266,10 +2281,10 @@ public:
}
template <class Traits>
KOKKOS_INLINE_FUNCTION
View( const track_type & track, const Kokkos::Impl::ViewMapping< Traits , void > &map ) :
View( const track_type & track, const Kokkos::Impl::ViewMapping< Traits , typename Traits::specialize > &map ) :
m_track(track), m_map()
{
typedef Kokkos::Impl::ViewMapping< traits , Traits , void > Mapping ;
typedef Kokkos::Impl::ViewMapping< traits , Traits , typename traits::specialize > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View copy construction" );
Mapping::assign( m_map , map , track );
}


@@ -142,16 +142,15 @@ private:
WorkRange range( self.m_policy , exec.pool_rank() , exec.pool_size() );
exec.set_work_range(range.begin(),range.end(),self.m_policy.chunk_size());
exec.set_work_range(range.begin()-self.m_policy.begin(),range.end()-self.m_policy.begin(),self.m_policy.chunk_size());
exec.reset_steal_target();
exec.barrier();
long work_index = exec.get_work_index();
while(work_index != -1) {
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size();
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size()+self.m_policy.begin();
const Member end = begin + self.m_policy.chunk_size() < self.m_policy.end()?begin+self.m_policy.chunk_size():self.m_policy.end();
ParallelFor::template exec_range< WorkTag >
( self.m_functor , begin , end );
work_index = exec.get_work_index();
@@ -470,14 +469,14 @@ private:
const ParallelReduce & self = * ((const ParallelReduce *) arg );
const WorkRange range( self.m_policy, exec.pool_rank(), exec.pool_size() );
exec.set_work_range(range.begin(),range.end(),self.m_policy.chunk_size());
exec.set_work_range(range.begin()-self.m_policy.begin(),range.end()-self.m_policy.begin(),self.m_policy.chunk_size());
exec.reset_steal_target();
exec.barrier();
long work_index = exec.get_work_index();
reference_type update = ValueInit::init( ReducerConditional::select(self.m_functor , self.m_reducer) , exec.reduce_memory() );
while(work_index != -1) {
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size();
const Member begin = static_cast<Member>(work_index) * self.m_policy.chunk_size() + self.m_policy.begin();
const Member end = begin + self.m_policy.chunk_size() < self.m_policy.end()?begin+self.m_policy.chunk_size():self.m_policy.end();
ParallelReduce::template exec_range< WorkTag >
( self.m_functor , begin , end


@@ -111,7 +111,7 @@ T atomic_compare_exchange( volatile T * const dest , const T & compare ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -127,7 +127,7 @@ T atomic_compare_exchange( volatile T * const dest , const T & compare ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -308,6 +308,16 @@ T atomic_compare_exchange( volatile T * const dest_v, const T compare, const T v
#endif
#endif // !defined ROCM_ATOMICS
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template <typename T>
__inline__ __device__
T atomic_compare_exchange(volatile T * const, const Kokkos::Impl::identity_t<T>, const Kokkos::Impl::identity_t<T>)
{
return T();
}
#endif
template <typename T>
KOKKOS_INLINE_FUNCTION
bool atomic_compare_exchange_strong(volatile T* const dest, const T compare, const T val)


@@ -134,7 +134,7 @@ T atomic_exchange( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -149,7 +149,7 @@ T atomic_exchange( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -418,6 +418,23 @@ void atomic_assign( volatile T * const dest_v , const T val )
#endif
#endif
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template <typename T>
__inline__ __device__
T atomic_exchange(volatile T * const, const Kokkos::Impl::identity_t<T>)
{
return T();
}
template < typename T >
__inline__ __device__
void atomic_assign(volatile T * const, const Kokkos::Impl::identity_t<T>)
{
}
#endif
} // namespace Kokkos
#endif


@@ -147,7 +147,7 @@ T atomic_fetch_add( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -164,7 +164,7 @@ T atomic_fetch_add( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -384,6 +384,15 @@ T atomic_fetch_add( volatile T * const dest_v , typename std::add_const<T>::type
#endif // !defined ROCM_ATOMICS
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_add(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_add without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -149,6 +149,15 @@ T atomic_fetch_and( volatile T * const dest_v , const T val )
#endif
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_and(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_and without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -149,6 +149,15 @@ T atomic_fetch_or( volatile T * const dest_v , const T val )
#endif
//----------------------------------------------------------------------------
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_or(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_or without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -139,7 +139,7 @@ T atomic_fetch_sub( volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -154,7 +154,7 @@ T atomic_fetch_sub( volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -304,6 +304,15 @@ T atomic_fetch_sub( volatile T * const dest_v , const T val )
#endif
#endif // !defined ROCM_ATOMICS
// dummy for non-CUDA Kokkos headers being processed by NVCC
#if defined(__CUDA_ARCH__) && !defined(KOKKOS_ENABLE_CUDA)
template< typename T >
__inline__ __device__
T atomic_fetch_sub(volatile T* const, Kokkos::Impl::identity_t<T>) {
return T();
}
#endif
// Simpler version of atomic_fetch_sub without the fetch
template <typename T>
KOKKOS_INLINE_FUNCTION


@@ -230,9 +230,6 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
typename Kokkos::Impl::enable_if<
( sizeof(T) != 4 )
&& ( sizeof(T) != 8 )
#if defined(KOKKOS_ENABLE_ASM) && defined(KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_HOST)
&& ( sizeof(T) != 16 )
#endif
, const T >::type val )
{
@@ -250,7 +247,7 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -265,7 +262,7 @@ T atomic_fetch_oper( const Oper& op, volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;
@@ -298,7 +295,7 @@ T atomic_oper_fetch( const Oper& op, volatile T * const dest ,
unsigned int mask = KOKKOS_IMPL_CUDA_ACTIVEMASK;
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,1);
#else
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT_MASK(1);
unsigned int active = KOKKOS_IMPL_CUDA_BALLOT(1);
#endif
unsigned int done_active = 0;
while (active!=done_active) {
@@ -313,7 +310,7 @@ T atomic_oper_fetch( const Oper& op, volatile T * const dest ,
#ifdef KOKKOS_IMPL_CUDA_SYNCWARP_NEEDS_MASK
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(mask,done);
#else
done_active = KOKKOS_IMPL_CUDA_BALLOT_MASK(done);
done_active = KOKKOS_IMPL_CUDA_BALLOT(done);
#endif
}
return return_val;


@@ -49,6 +49,7 @@
#include <sstream>
#include <cstdlib>
#include <stack>
#include <cerrno>
//----------------------------------------------------------------------------
@@ -70,7 +71,6 @@ bool is_unsigned_int(const char* str)
}
return true;
}
void initialize_internal(const InitArguments& args)
{
// This is an experimental setting
@@ -99,6 +99,7 @@ setenv("MEMKIND_HBW_NODES", "1", 0);
if (use_gpu < 0 && ndevices >= 0) {
auto local_rank_str = std::getenv("OMPI_COMM_WORLD_LOCAL_RANK"); //OpenMPI
if (!local_rank_str) local_rank_str = std::getenv("MV2_COMM_WORLD_LOCAL_RANK"); //MVAPICH2
if (!local_rank_str) local_rank_str = std::getenv("SLURM_LOCALID"); //SLURM
if (local_rank_str) {
auto local_rank = std::atoi(local_rank_str);
use_gpu = local_rank % ndevices;
@@ -532,6 +533,85 @@ void initialize(int& narg, char* arg[])
iarg++;
}
//Read environment variables
char * endptr;
auto env_num_threads_str = std::getenv("KOKKOS_NUM_THREADS");
if (env_num_threads_str!=nullptr) {
errno = 0;
auto env_num_threads = std::strtol(env_num_threads_str,&endptr,10);
if (endptr== env_num_threads_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUM_THREADS to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUM_THREADS out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((num_threads != -1)&&(env_num_threads!=num_threads))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-threads and KOKKOS_NUM_THREADS if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
num_threads = env_num_threads;
}
auto env_numa_str = std::getenv("KOKKOS_NUMA");
if (env_numa_str!=nullptr) {
errno = 0;
auto env_numa = std::strtol(env_numa_str,&endptr,10);
if (endptr== env_numa_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUMA to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUMA out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((numa != -1)&&(env_numa!=numa))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-numa and KOKKOS_NUMA if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
numa = env_numa;
}
auto env_device_str = std::getenv("KOKKOS_DEVICE_ID");
if (env_device_str!=nullptr) {
errno = 0;
auto env_device = std::strtol(env_device_str,&endptr,10);
if (endptr== env_device_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_DEVICE_ID to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_DEVICE_ID out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((device != -1)&&(env_device!=device))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-device and KOKKOS_DEVICE_ID if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
device = env_device;
}
auto env_ndevices_str = std::getenv("KOKKOS_NUM_DEVICES");
if (env_ndevices_str!=nullptr) {
errno = 0;
auto env_ndevices = std::strtol(env_ndevices_str,&endptr,10);
if (endptr== env_ndevices_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_NUM_DEVICES to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_NUM_DEVICES out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((ndevices != -1)&&(env_ndevices!=ndevices))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-ndevices and KOKKOS_NUM_DEVICES if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
ndevices = env_ndevices;
//Skip device
auto env_skip_device_str = std::getenv("KOKKOS_SKIP_DEVICE");
if (env_skip_device_str!=nullptr) {
errno = 0;
auto env_skip_device = std::strtol(env_skip_device_str,&endptr,10);
if (endptr== env_skip_device_str)
Impl::throw_runtime_exception("Error: cannot convert KOKKOS_SKIP_DEVICE to an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if (errno == ERANGE)
Impl::throw_runtime_exception("Error: KOKKOS_SKIP_DEVICE out of range of representable values by an integer. Raised by Kokkos::initialize(int narg, char* argc[]).");
if ((skip_device != 9999)&&(env_skip_device!=skip_device))
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-ndevices and KOKKOS_SKIP_DEVICE if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
else
skip_device = env_skip_device;
}
}
char * env_disablewarnings_str = std::getenv("KOKKOS_DISABLE_WARNINGS");
if (env_disablewarnings_str!=nullptr) {
std::string env_str (env_disablewarnings_str); // deep-copies string
for (char& c : env_str) { c = toupper (c); }
if ((env_str == "TRUE") || (env_str == "ON") || (env_str == "1"))
disable_warnings = true;
else
if (disable_warnings)
Impl::throw_runtime_exception("Error: expecting a match between --kokkos-disable-warnings and KOKKOS_DISABLE_WARNINGS if both are set. Raised by Kokkos::initialize(int narg, char* argc[]).");
}
InitArguments arguments;
arguments.num_threads = num_threads;
arguments.num_numa = numa;
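Each environment variable above is validated the same way: `strtol` with an `endptr` check to reject non-numeric input, and an `errno`/`ERANGE` check to reject out-of-range values. A self-contained sketch of that pattern (the helper name is illustrative, not a Kokkos API):

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Parse a decimal integer from an environment-variable string.
// Returns true only on a clean parse; mirrors the checks used in
// Kokkos::initialize() above.
bool parse_env_long(const char* s, long& out) {
    if (s == nullptr) return false;     // variable not set
    char* endptr = nullptr;
    errno = 0;
    long v = std::strtol(s, &endptr, 10);
    if (endptr == s) return false;      // no digits consumed
    if (errno == ERANGE) return false;  // overflow or underflow
    out = v;
    return true;
}
```

On success the parsed value is then cross-checked against the matching `--kokkos-*` command-line flag, and a runtime exception is thrown if both are set and disagree.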

View File

@ -409,6 +409,9 @@ struct inclusive_scan_integer_sequence
static constexpr value_type value = helper::value ;
};
template <typename T>
using identity_t = T;
}} // namespace Kokkos::Impl

View File

@ -103,13 +103,7 @@ namespace Impl {
/** \brief View mapping for non-specialized data type and standard layout */
template< class Traits >
class ViewMapping< Traits ,
typename std::enable_if<(
std::is_same< typename Traits::specialize , Kokkos::Array<> >::value &&
( std::is_same< typename Traits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename Traits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename Traits::array_layout , Kokkos::LayoutStride >::value )
)>::type >
class ViewMapping< Traits , Kokkos::Array<> >
{
private:
@ -345,64 +339,6 @@ public:
}
};
//----------------------------------------------------------------------------
//----------------------------------------------------------------------------
/** \brief Assign compatible default mappings */
template< class DstTraits , class SrcTraits >
class ViewMapping< DstTraits , SrcTraits ,
typename std::enable_if<(
std::is_same< typename DstTraits::memory_space , typename SrcTraits::memory_space >::value
&&
std::is_same< typename DstTraits::specialize , Kokkos::Array<> >::value
&&
(
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutStride >::value
)
&&
std::is_same< typename SrcTraits::specialize , Kokkos::Array<> >::value
&&
(
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutLeft >::value ||
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutRight >::value ||
std::is_same< typename SrcTraits::array_layout , Kokkos::LayoutStride >::value
)
)>::type >
{
public:
enum { is_assignable = true };
typedef Kokkos::Impl::SharedAllocationTracker TrackType ;
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
KOKKOS_INLINE_FUNCTION
static void assign( DstType & dst , const SrcType & src , const TrackType & src_track )
{
static_assert( std::is_same< typename DstTraits::value_type , typename SrcTraits::value_type >::value ||
std::is_same< typename DstTraits::value_type , typename SrcTraits::const_value_type >::value
, "View assignment must have same value type or const = non-const" );
static_assert( ViewDimensionAssignable< typename DstTraits::dimension , typename SrcTraits::dimension >::value
, "View assignment must have compatible dimensions" );
static_assert( std::is_same< typename DstTraits::array_layout , typename SrcTraits::array_layout >::value ||
std::is_same< typename DstTraits::array_layout , Kokkos::LayoutStride >::value ||
( DstTraits::dimension::rank == 0 ) ||
( DstTraits::dimension::rank == 1 && DstTraits::dimension::rank_dynamic == 1 )
, "View assignment must have compatible layout or have rank <= 1" );
typedef typename DstType::offset_type dst_offset_type ;
dst.m_impl_offset = dst_offset_type( src.m_impl_offset );
dst.m_impl_handle = src.m_impl_handle ;
dst.m_stride = src.m_stride ;
}
};
/** \brief Assign Array to non-Array */
template< class DstTraits , class SrcTraits >
@ -436,7 +372,7 @@ public:
typedef Kokkos::Impl::SharedAllocationTracker TrackType ;
typedef ViewMapping< DstTraits , void > DstType ;
typedef ViewMapping< SrcTraits , void > SrcType ;
typedef ViewMapping< SrcTraits , Kokkos::Array<> > SrcType ;
KOKKOS_INLINE_FUNCTION
static void assign( DstType & dst , const SrcType & src , const TrackType & src_track )
@ -480,6 +416,7 @@ public:
}
};
//----------------------------------------------------------------------------
//----------------------------------------------------------------------------

View File

@ -195,7 +195,7 @@ struct ViewDimension
{}
KOKKOS_INLINE_FUNCTION
constexpr size_t extent( const unsigned r ) const
constexpr size_t extent( const unsigned r ) const noexcept
{
return r == 0 ? N0 : (
r == 1 ? N1 : (
@ -207,6 +207,19 @@ struct ViewDimension
r == 7 ? N7 : 0 )))))));
}
static KOKKOS_INLINE_FUNCTION
constexpr size_t static_extent( const unsigned r ) noexcept
{
return r == 0 ? ArgN0 : (
r == 1 ? ArgN1 : (
r == 2 ? ArgN2 : (
r == 3 ? ArgN3 : (
r == 4 ? ArgN4 : (
r == 5 ? ArgN5 : (
r == 6 ? ArgN6 : (
r == 7 ? ArgN7 : 0 )))))));
}
template< size_t N >
struct prepend { typedef ViewDimension< N , Vals... > type ; };
@ -2640,6 +2653,12 @@ public:
KOKKOS_INLINE_FUNCTION constexpr size_t extent( const iType & r ) const
{ return m_impl_offset.m_dim.extent(r); }
static KOKKOS_INLINE_FUNCTION constexpr size_t static_extent( const unsigned r ) noexcept
{
using dim_type = typename offset_type::dimension_type;
return dim_type::static_extent(r);
}
KOKKOS_INLINE_FUNCTION constexpr
typename Traits::array_layout layout() const
{ return m_impl_offset.layout(); }

View File

@ -63,6 +63,86 @@ struct CountFillFunctor {
}
};
/* RunUpdateCrsTest
* 4 test cases:
* 1. use the member object directly, constructed via the copy constructor
* 2. explicitly copy construct into a local variable
* 3. default construct, then assign from the input object
* 4. construct the object from its views
*/
template< class CrsType, class ExecSpace, class scalarType >
struct RunUpdateCrsTest {
struct TestOne {};
struct TestTwo {};
struct TestThree {};
struct TestFour {};
CrsType graph;
RunUpdateCrsTest( CrsType g_in ) : graph(g_in)
{
}
void run_test(int nTest) {
switch (nTest)
{
case 1:
parallel_for ("TestCrs1", Kokkos::RangePolicy<ExecSpace, TestOne>(0,graph.numRows()),*this);
break;
case 2:
parallel_for ("TestCrs2", Kokkos::RangePolicy<ExecSpace, TestTwo>(0,graph.numRows()),*this);
break;
case 3:
parallel_for ("TestCrs3", Kokkos::RangePolicy<ExecSpace, TestThree>(0,graph.numRows()),*this);
break;
case 4:
parallel_for ("TestCrs4", Kokkos::RangePolicy<ExecSpace, TestFour>(0,graph.numRows()),*this);
break;
default:
break;
}
}
KOKKOS_INLINE_FUNCTION
void updateGraph(const CrsType & g_in, const scalarType row) const {
auto row_map = g_in.row_map;
auto entries = g_in.entries;
auto j_start = row_map(row);
auto j_end = row_map(row+1)-j_start;
for (scalarType j = 0; j < j_end; ++j) {
entries(j_start+j) = (j+1)*(j+1);
}
}
// Test Crs class from class member
KOKKOS_INLINE_FUNCTION
void operator()(const TestOne &, const scalarType row) const {
updateGraph(graph, row);
}
// Test Crs class from copy constructor (local_graph(graph))
KOKKOS_INLINE_FUNCTION
void operator()(const TestTwo &, const scalarType row) const {
CrsType local_graph(graph);
updateGraph(local_graph, row);
}
// Test Crs class from default construction followed by assignment from the input parameter
KOKKOS_INLINE_FUNCTION
void operator()(const TestThree &, const scalarType row) const {
CrsType local_graph;
local_graph = graph;
updateGraph(local_graph, row);
}
// Test Crs class from a local graph constructed from the row_map and entries of the input parameter
KOKKOS_INLINE_FUNCTION
void operator()(const TestFour &, const scalarType row) const {
CrsType local_graph(graph.row_map, graph.entries);
updateGraph(local_graph, row);
}
};
template< class ExecSpace >
void test_count_fill(std::int32_t nrows) {
Kokkos::Crs<std::int32_t, ExecSpace, void, std::int32_t> graph;
@ -81,6 +161,38 @@ void test_count_fill(std::int32_t nrows) {
}
}
// Test Crs constructor / assignment operations:
// use count and fill to create and populate the initial graph,
// then use parallel_for with the Crs object directly to update its content,
// then verify the results
template< class ExecSpace >
void test_constructor(std::int32_t nrows) {
for (int nTest = 1; nTest < 5; nTest++)
{
typedef Kokkos::Crs<std::int32_t, ExecSpace, void, std::int32_t> crs_int32;
crs_int32 graph;
Kokkos::count_and_fill_crs(graph, nrows, CountFillFunctor<ExecSpace>());
ASSERT_EQ(graph.numRows(), nrows);
RunUpdateCrsTest<crs_int32, ExecSpace, std::int32_t> crstest(graph);
crstest.run_test(nTest);
auto row_map = Kokkos::create_mirror_view(graph.row_map);
Kokkos::deep_copy(row_map, graph.row_map);
auto entries = Kokkos::create_mirror_view(graph.entries);
Kokkos::deep_copy(entries, graph.entries);
for (std::int32_t row = 0; row < nrows; ++row) {
auto n = (row % 4) + 1;
ASSERT_EQ(row_map(row + 1) - row_map(row), n);
for (std::int32_t j = 0; j < n; ++j) {
ASSERT_EQ(entries(row_map(row) + j), (j + 1)*(j+1));
}
}
}
}
} // anonymous namespace
TEST_F( TEST_CATEGORY, crs_count_fill )
@ -95,4 +207,17 @@ TEST_F( TEST_CATEGORY, crs_count_fill )
test_count_fill<TEST_EXECSPACE>(10000);
}
TEST_F( TEST_CATEGORY, crs_copy_constructor )
{
test_constructor<TEST_EXECSPACE>(0);
test_constructor<TEST_EXECSPACE>(1);
test_constructor<TEST_EXECSPACE>(2);
test_constructor<TEST_EXECSPACE>(3);
test_constructor<TEST_EXECSPACE>(13);
test_constructor<TEST_EXECSPACE>(100);
test_constructor<TEST_EXECSPACE>(1000);
test_constructor<TEST_EXECSPACE>(10000);
}
} // namespace Test

View File

@ -956,7 +956,12 @@ struct TestMDRange_3D {
}
, Kokkos::Min<double>(min) );
ASSERT_EQ( min, 8.0 );
if((N0-1)*(N1-1)*(N2-1)>0)
ASSERT_EQ( min, 8.0 );
else {
double min_identity = Kokkos::reduction_identity<double>::min();
ASSERT_EQ( min, min_identity );
}
}
#endif
#endif

View File

@ -46,8 +46,10 @@
namespace Test {
TEST_F( TEST_CATEGORY , mdrange_3d) {
TestMDRange_3D< TEST_EXECSPACE >::test_for3( 1, 10, 100 );
TestMDRange_3D< TEST_EXECSPACE >::test_for3( 100, 10, 100 );
#if !defined( KOKKOS_ENABLE_ROCM ) // MDRange Reduced explicitly handled in its own cpp file
TestMDRange_3D< TEST_EXECSPACE >::test_reduce3( 1, 10, 100 );
TestMDRange_3D< TEST_EXECSPACE >::test_reduce3( 100, 10, 100 );
#endif
}

View File

@ -60,8 +60,11 @@ struct TestRange {
struct VerifyInitTag {};
struct ResetTag {};
struct VerifyResetTag {};
struct OffsetTag {};
struct VerifyOffsetTag {};
int N;
int N;
static const int offset = 13;
TestRange( const size_t N_ )
: m_flags( Kokkos::ViewAllocateWithoutInitializing( "flags" ), N_ ), N(N_)
{}
@ -117,6 +120,18 @@ struct TestRange {
if ( int( 2 * i ) != host_flags( i ) ) ++error_count;
}
ASSERT_EQ( error_count, int( 0 ) );
Kokkos::parallel_for( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag >( offset, N + offset ), *this );
Kokkos::parallel_for( std::string("TestKernelFor"), Kokkos::RangePolicy<ExecSpace, ScheduleType, VerifyOffsetTag>( 0, N ), *this);
Kokkos::deep_copy(host_flags, m_flags);
error_count = 0;
for (int i = 0; i < N; ++i) {
if (i + offset != host_flags(i))
++error_count;
}
ASSERT_EQ(error_count, int(0));
}
KOKKOS_INLINE_FUNCTION
@ -144,9 +159,19 @@ struct TestRange {
}
}
//----------------------------------------
KOKKOS_INLINE_FUNCTION
void operator()(const OffsetTag &, const int i) const {
m_flags(i - offset) = i;
}
struct OffsetTag {};
KOKKOS_INLINE_FUNCTION
void operator()(const VerifyOffsetTag &, const int i) const {
if (i + offset != m_flags(i)) {
printf("TestRange::test_for error at %d != %d\n", i + offset, m_flags(i));
}
}
//----------------------------------------
void test_reduce( )
{
@ -158,7 +183,7 @@ struct TestRange {
// sum( 0 .. N-1 )
ASSERT_EQ( size_t( ( N - 1 ) * ( N ) / 2 ), size_t( total ) );
Kokkos::parallel_reduce( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag>( 0, N ), *this, total );
Kokkos::parallel_reduce( Kokkos::RangePolicy< ExecSpace, ScheduleType, OffsetTag>( offset, N+offset ), *this, total );
// sum( 1 .. N )
ASSERT_EQ( size_t( ( N ) * ( N + 1 ) / 2 ), size_t( total ) );
}
@ -169,7 +194,7 @@ struct TestRange {
KOKKOS_INLINE_FUNCTION
void operator()( const OffsetTag &, const int i, value_type & update ) const
{ update += 1 + m_flags( i ); }
{ update += 1 + m_flags( i-offset ); }
//----------------------------------------

View File

@ -532,7 +532,11 @@ struct functor_vec_single {
typedef ExecutionSpace execution_space;
Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag;
functor_vec_single( Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag_ ) : flag( flag_ ) {}
int nStart;
int nEnd;
functor_vec_single( Kokkos::View< int, Kokkos::LayoutLeft, ExecutionSpace > flag_, const int start_, const int end_ ) :
flag( flag_ ), nStart(start_), nEnd(end_) {}
KOKKOS_INLINE_FUNCTION
void operator()( typename policy_type::member_type team ) const {
@ -541,7 +545,7 @@ struct functor_vec_single {
// inside a parallel_for and write to it.
Scalar value = 0;
Kokkos::parallel_for( Kokkos::ThreadVectorRange( team, 0, 13 ), [&] ( int i )
Kokkos::parallel_for( Kokkos::ThreadVectorRange( team, nStart, nEnd ), [&] ( int i )
{
value = i; // This write is violating Kokkos semantics for nested parallelism.
});
@ -552,12 +556,12 @@ struct functor_vec_single {
}, value );
Scalar value2 = 0;
Kokkos::parallel_reduce( Kokkos::ThreadVectorRange( team, 0, 13 ), [&] ( int i, Scalar & val )
Kokkos::parallel_reduce( Kokkos::ThreadVectorRange( team, nStart, nEnd ), [&] ( int i, Scalar & val )
{
val += value;
}, value2 );
if ( value2 != ( value * 13 ) ) {
if ( value2 != ( value * (nEnd-nStart) ) ) {
printf( "FAILED vector_single broadcast %i %i %f %f\n",
team.league_rank(), team.team_rank(), (double) value2, (double) value );
@ -746,12 +750,6 @@ bool test_scalar( int nteams, int team_size, int test ) {
functor_vec_red< Scalar, ExecutionSpace >( d_flag ) );
}
else if ( test == 1 ) {
// WORKAROUND CUDA
#if defined(KOKKOS_ENABLE_CUDA)
#if defined(KOKKOS_IMPL_CUDA_CLANG_WORKAROUND) || defined(KOKKOS_ARCH_PASCAL)
if(!std::is_same<ExecutionSpace,Kokkos::Cuda>::value)
#endif
#endif
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_red_reducer< Scalar, ExecutionSpace >( d_flag ) );
}
@ -765,7 +763,7 @@ bool test_scalar( int nteams, int team_size, int test ) {
}
else if ( test == 4 ) {
Kokkos::parallel_for( "B", Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_single< Scalar, ExecutionSpace >( d_flag ) );
functor_vec_single< Scalar, ExecutionSpace >( d_flag, 0, 13 ) );
}
else if ( test == 5 ) {
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size ),
@ -791,6 +789,10 @@ bool test_scalar( int nteams, int team_size, int test ) {
Kokkos::parallel_for( Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_team_vector_reduce_reducer< Scalar, ExecutionSpace >( d_flag ) );
}
else if ( test == 11 ) {
Kokkos::parallel_for( "B", Kokkos::TeamPolicy< ExecutionSpace >( nteams, team_size, 8 ),
functor_vec_single< Scalar, ExecutionSpace >( d_flag, 4, 13 ) );
}
Kokkos::deep_copy( h_flag, d_flag );
@ -938,6 +940,7 @@ TEST_F( TEST_CATEGORY, team_vector )
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 8 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 9 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 10 ) ) );
ASSERT_TRUE( ( TestTeamVector::Test< TEST_EXECSPACE >( 11 ) ) );
}
#endif

View File

@ -56,17 +56,13 @@ struct TestViewCopy {
using InExecSpace = ExecSpace;
static void test_view_copy()
static void test_view_copy(const int dim0, const int dim1, const int dim2)
{
#if defined( KOKKOS_ENABLE_CUDA ) || defined( KOKKOS_ENABLE_ROCM )
// ExecSpace = CudaUVM, CudaHostPinned
// This test will fail at runtime with an illegal memory access if something goes wrong
// Test 1: deep_copy from host_mirror_space to ExecSpace and ExecSpace back to host_mirror_space
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
@ -88,19 +84,21 @@ struct TestViewCopy {
// Test 2: deep_copy from Cuda to ExecSpace and ExecSpace back to Cuda
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
#if defined( KOKKOS_ENABLE_CUDA )
typedef Kokkos::Cuda space_type;
typedef typename std::conditional<
Kokkos::Impl::MemorySpaceAccess<Kokkos::CudaSpace,typename InExecSpace::memory_space>::accessible,
Kokkos::CudaSpace,
InExecSpace>::type space_type;
#endif
#if defined( KOKKOS_ENABLE_ROCM )
typedef Kokkos::Experimental::ROCm space_type;
typedef typename std::conditional<
Kokkos::Impl::MemorySpaceAccess<Kokkos::ROCmSpace,typename InExecSpace::memory_space>::accessible,
Kokkos::ROCmSpace,
InExecSpace>::type space_type;
#endif
Kokkos::View<double**,Kokkos::LayoutLeft,space_type> srcView("srcView", dim2, dim2);
@ -118,10 +116,6 @@ struct TestViewCopy {
// Test 3: deep_copy from host_space to ExecSpace and ExecSpace back to host_space
{
const int dim0 = 4;
const int dim1 = 2;
const int dim2 = 3;
typedef Kokkos::View<double****,InExecSpace> Rank4ViewType;
Rank4ViewType view_4;
view_4 = Rank4ViewType("view_4", dim0, dim1, dim2, dim2);
@ -149,7 +143,41 @@ struct TestViewCopy {
TEST_F( TEST_CATEGORY , view_copy_tests ) {
//Only include this file to be compiled with CudaUVM and CudaHostPinned
TestViewCopy< TEST_EXECSPACE >::test_view_copy();
TestViewCopy< TEST_EXECSPACE >::test_view_copy(4,2,3);
TestViewCopy< TEST_EXECSPACE >::test_view_copy(4,2,0);
}
TEST_F( TEST_CATEGORY , view_copy_degenerated ) {
//Only include this file to be compiled with CudaUVM and CudaHostPinned
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_def_1;
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_1( reinterpret_cast<int*>(-1), 0 );
Kokkos::View<int*> v_m_def_1;
Kokkos::View<int*> v_m_1("v_m_1", 0);
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_def_2;
Kokkos::View<int*, Kokkos::MemoryTraits<Kokkos::Unmanaged>> v_um_2( reinterpret_cast<int*>(-1), 0 );
Kokkos::View<int*> v_m_def_2;
Kokkos::View<int*> v_m_2("v_m_2", 0);
Kokkos::deep_copy(v_um_def_1, v_um_def_2);
Kokkos::deep_copy(v_um_def_1, v_um_2);
Kokkos::deep_copy(v_um_def_1, v_m_def_2);
Kokkos::deep_copy(v_um_def_1, v_m_2);
Kokkos::deep_copy(v_um_1, v_um_def_2);
Kokkos::deep_copy(v_um_1, v_um_2);
Kokkos::deep_copy(v_um_1, v_m_def_2);
Kokkos::deep_copy(v_um_1, v_m_2);
Kokkos::deep_copy(v_m_def_1, v_um_def_2);
Kokkos::deep_copy(v_m_def_1, v_um_2);
Kokkos::deep_copy(v_m_def_1, v_m_def_2);
Kokkos::deep_copy(v_m_def_1, v_m_2);
Kokkos::deep_copy(v_m_1, v_um_def_2);
Kokkos::deep_copy(v_m_1, v_um_2);
Kokkos::deep_copy(v_m_1, v_m_def_2);
Kokkos::deep_copy(v_m_1, v_m_2);
}
} // namespace Test

View File

@ -1245,5 +1245,12 @@ TEST_F( TEST_CATEGORY , view_mapping_operator )
test_view_mapping_operator< TEST_EXECSPACE >();
}
TEST_F( TEST_CATEGORY , static_extent )
{
using T = Kokkos::View<double*[2][3]>;
ASSERT_EQ( T::static_extent(1), 2 );
ASSERT_EQ( T::static_extent(2), 3 );
}
}

View File

@ -228,6 +228,10 @@ TEST_F( cuda, uvm )
}
}
/* Removing the UVM Allocs Test due to the time it adds to the overall unit test.
* The issue verified with this unit test appears to no longer be a
* problem. Refer to github issue 1880 for more details.
*
TEST_F( cuda, uvm_num_allocs )
{
// The max number of UVM allocations allowed is 65536.
@ -288,6 +292,7 @@ TEST_F( cuda, uvm_num_allocs )
#undef MAX_NUM_ALLOCS
}
*/
template< class MemSpace, class ExecSpace >
struct TestViewCudaAccessible {

View File

@ -43,3 +43,4 @@
#include <openmp/TestOpenMP_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -43,3 +43,5 @@
#include <serial/TestSerial_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -43,3 +43,4 @@
#include <threads/TestThreads_Category.hpp>
#include <TestViewAPI_e.hpp>
#include <TestViewCopy.hpp>

View File

@ -68,6 +68,9 @@ do
--cxxflags*)
CXXFLAGS="${key#*=}"
;;
--cxxstandard*)
KOKKOS_CXX_STANDARD="${key#*=}"
;;
--ldflags*)
LDFLAGS="${key#*=}"
;;
@ -127,6 +130,7 @@ do
echo "--arch=[OPT]: Set target architectures. Options are:"
echo " [AMD]"
echo " AMDAVX = AMD CPU"
echo " EPYC = AMD EPYC Zen-Core CPU"
echo " [ARM]"
echo " ARMv80 = ARMv8.0 Compatible CPU"
echo " ARMv81 = ARMv8.1 Compatible CPU"
@ -165,6 +169,8 @@ do
echo " build. This will still set certain required"
echo " flags via KOKKOS_CXXFLAGS (such as -fopenmp,"
echo " --std=c++11, etc.)."
echo "--cxxstandard=[FLAGS] Overwrite KOKKOS_CXX_STANDARD for library build and test"
echo " c++11 (default), c++14, c++17, c++1y, c++1z, c++2a"
echo "--ldflags=[FLAGS] Overwrite LDFLAGS for library build and test"
echo " build. This will still set certain required"
echo " flags via KOKKOS_LDFLAGS (such as -fopenmp,"
@ -243,6 +249,10 @@ if [ ${#CXXFLAGS} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} CXXFLAGS=\"${CXXFLAGS}\""
fi
if [ ${#KOKKOS_CXX_STANDARD} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} KOKKOS_CXX_STANDARD=\"${KOKKOS_CXX_STANDARD}\""
fi
if [ ${#LDFLAGS} -gt 0 ]; then
KOKKOS_SETTINGS="${KOKKOS_SETTINGS} LDFLAGS=\"${LDFLAGS}\""
fi
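The new `--cxxstandard` option is parsed with the same `${key#*=}` idiom as the other flags: the expansion strips the shortest prefix ending in `=`, leaving just the value. A minimal sketch (the argument value is illustrative):

```shell
# Parse a --cxxstandard=<value> argument the way generate_makefile.bash does.
key="--cxxstandard=c++14"           # example command-line argument
KOKKOS_CXX_STANDARD="${key#*=}"     # strip everything up to the first '='
echo "$KOKKOS_CXX_STANDARD"
```

The resulting value is later forwarded into `KOKKOS_SETTINGS` as `KOKKOS_CXX_STANDARD="..."`, overriding the library's default of c++11.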

View File

@ -15,3 +15,4 @@ tag: 2.5.00 date: 12:15:2017 master: dfe685f4 develop: ec7ad6d8
tag: 2.6.00 date: 03:07:2018 master: 62e760fa develop: d1ba7d71
tag: 2.7.00 date: 05:24:2018 master: e01945d0 develop: 2d13f608
tag: 2.7.24 date: 11:04:2018 master: d3a94192 develop: 7a06fc81
tag: 2.8.00 date: 02:05:2019 master: 34931a36 develop: d1659d1d

View File

@ -88,6 +88,8 @@ CXX_FLAGS_EXTRA=""
LD_FLAGS_EXTRA=""
KOKKOS_OPTIONS=""
CXX_STANDARD="c++11"
#
# Handle arguments.
#
@ -142,6 +144,9 @@ do
--cxxflags-extra*)
CXX_FLAGS_EXTRA="${key#*=}"
;;
--cxxstandard*)
CXX_STANDARD="${key#*=}"
;;
--ldflags-extra*)
LD_FLAGS_EXTRA="${key#*=}"
;;
@ -227,18 +232,30 @@ elif [ "$MACHINE" = "white" ]; then
export SLURM_TASKS_PER_NODE=32
BASE_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>"
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>"
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>,gcc/7.2.0"
CUDA_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/7.2.0,ibm/xl/16.1.0"
CUDA10_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/7.4.0,ibm/xl/16.1.0"
# Don't do pthread on white.
GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
if [ "$SPOT_CHECK" = "True" ]; then
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST "OpenMP_Serial" g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
else
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/6.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"gcc/7.2.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/16.1.0 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"ibm/16.1.1 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/9.2.88 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/10.0.130 $CUDA10_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
)
fi
if [ -z "$ARCH_FLAG" ]; then
ARCH_FLAG="--arch=Power8,Kepler37"
@ -323,6 +340,7 @@ elif [ "$MACHINE" = "apollo" ]; then
BASE_MODULE_LIST="sems-env,kokkos-env,sems-<COMPILER_NAME>/<COMPILER_VERSION>,kokkos-hwloc/1.10.1/base"
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
CUDA10_MODULE_LIST="sems-env,kokkos-env,<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.0.69"
CLANG7_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.1"
@ -344,6 +362,7 @@ elif [ "$MACHINE" = "apollo" ]; then
else
# Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("cuda/9.1 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/10.0 $CUDA10_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"clang/6.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
"clang/7.0 $CLANG7_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
"clang/3.9.0 $CLANG_MODULE_LIST $BUILD_LIST_CLANG clang++ $CLANG_WARNING_FLAGS"
@ -629,6 +648,8 @@ single_build_and_test() {
local cxxflags="${cxxflags} ${CXX_FLAGS_EXTRA}"
local ldflags="${ldflags} ${LD_FLAGS_EXTRA}"
local cxx_standard="${CXX_STANDARD}"
if [[ "$KOKKOS_CUDA_OPTIONS" != "" ]]; then
local extra_args="$extra_args $KOKKOS_CUDA_OPTIONS"
fi
@ -650,7 +671,7 @@ single_build_and_test() {
run_cmd ls fake_problem >& ${desc}.configure.log || { report_and_log_test_result 1 $desc configure && return 0; }
fi
else
run_cmd ${KOKKOS_PATH}/generate_makefile.bash --with-devices=$build $ARCH_FLAG --compiler=$(which $compiler_exe) --cxxflags=\"$cxxflags\" --ldflags=\"$ldflags\" $extra_args &>> ${desc}.configure.log || { report_and_log_test_result 1 ${desc} configure && return 0; }
run_cmd ${KOKKOS_PATH}/generate_makefile.bash --with-devices=$build $ARCH_FLAG --compiler=$(which $compiler_exe) --cxxflags=\"$cxxflags\" --cxxstandard=\"$cxx_standard\" --ldflags=\"$ldflags\" $extra_args &>> ${desc}.configure.log || { report_and_log_test_result 1 ${desc} configure && return 0; }
local -i build_start_time=$(date +%s)
run_cmd make -j 48 build-test >& ${desc}.build.log || { report_and_log_test_result 1 ${desc} build && return 0; }
local -i build_end_time=$(date +%s)

View File

@ -206,7 +206,7 @@ void FixWallBodyPolygon::setup(int vflag)
void FixWallBodyPolygon::post_force(int /*vflag*/)
{
double vwall[3],dx,dy,dz,del1,del2,delxy,delr,rsq,eradi,rradi,wall_pos;
double vwall[3],dx,dy,dz,del1,del2,delxy,delr,rsq,eradi,wall_pos;
int i,ni,npi,ifirst,nei,iefirst,side;
double facc[3];
@ -316,7 +316,6 @@ void FixWallBodyPolygon::post_force(int /*vflag*/)
nei = ednum[i];
iefirst = edfirst[i];
eradi = enclosing_radius[i];
rradi = rounded_radius[i];
// reset vertex and edge forces
@ -332,14 +331,14 @@ void FixWallBodyPolygon::post_force(int /*vflag*/)
edge[iefirst+ni][4] = 0;
}
int interact, num_contacts, done;
int num_contacts, done;
double delta_a, delta_ua, j_a;
Contact contact_list[MAX_CONTACTS];
num_contacts = 0;
facc[0] = facc[1] = facc[2] = 0;
interact = vertex_against_wall(i, wall_pos, x, f, torque, side,
contact_list, num_contacts, facc);
vertex_against_wall(i, wall_pos, x, f, torque, side,
contact_list, num_contacts, facc);
if (num_contacts >= 2) {
@ -475,12 +474,11 @@ int FixWallBodyPolygon::vertex_against_wall(int i, double wall_pos,
Contact* contact_list, int &num_contacts, double* /*facc*/)
{
int ni, npi, ifirst, interact;
double xpi[3], eradi, rradi;
double xpi[3], rradi;
double fx, fy, fz;
npi = dnum[i];
ifirst = dfirst[i];
eradi = enclosing_radius[i];
rradi = rounded_radius[i];
interact = 0;

View File

@@ -213,8 +213,8 @@ void FixWallBodyPolyhedron::setup(int vflag)
 void FixWallBodyPolyhedron::post_force(int /*vflag*/)
 {
-  double vwall[3],dx,dy,dz,del1,del2,rsq,eradi,rradi,wall_pos;
-  int i,ni,npi,ifirst,nei,iefirst,nfi,iffirst,side;
+  double vwall[3],dx,dy,dz,del1,del2,rsq,wall_pos;
+  int i,ni,npi,ifirst,nei,iefirst,side;
   double facc[3];
   // set position of wall to initial settings and velocity to 0.0
@@ -330,10 +330,6 @@ void FixWallBodyPolyhedron::post_force(int /*vflag*/)
   ifirst = dfirst[i];
   nei = ednum[i];
   iefirst = edfirst[i];
-  nfi = facnum[i];
-  iffirst = facfirst[i];
-  eradi = enclosing_radius[i];
-  rradi = rounded_radius[i];
   if (npi == 1) {
   sphere_against_wall(i, wall_pos, side, vwall, x, v, f, angmom, torque);
@@ -356,13 +352,13 @@ void FixWallBodyPolyhedron::post_force(int /*vflag*/)
   edge[iefirst+ni][5] = 0;
   }
-  int interact, num_contacts;
+  int num_contacts;
   Contact contact_list[MAX_CONTACTS];
   num_contacts = 0;
   facc[0] = facc[1] = facc[2] = 0;
-  interact = edge_against_wall(i, wall_pos, side, vwall, x, f, torque,
-                               contact_list, num_contacts, facc);
+  edge_against_wall(i, wall_pos, side, vwall, x, f, torque,
+                    contact_list, num_contacts, facc);
   } // group bit
 }
@@ -544,7 +540,7 @@ int FixWallBodyPolyhedron::edge_against_wall(int i, double wall_pos,
   int side, double* vwall, double** x, double** /*f*/, double** /*torque*/,
   Contact* /*contact_list*/, int &/*num_contacts*/, double* /*facc*/)
 {
-  int ni, nei, mode, contact;
+  int ni, nei, contact;
   double rradi;
   nei = ednum[i];
@@ -555,8 +551,7 @@ int FixWallBodyPolyhedron::edge_against_wall(int i, double wall_pos,
   // loop through body i's edges
   for (ni = 0; ni < nei; ni++)
-    mode = compute_distance_to_wall(i, ni, x[i], rradi, wall_pos, side, vwall,
-                                    contact);
+    compute_distance_to_wall(i, ni, x[i], rradi, wall_pos, side, vwall, contact);
   return contact;
 }

View File

@@ -212,8 +212,8 @@ double PairBornCoulLongCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBornCoulLongCSGPU::cpu_compute(int start, int inum, int eflag,
-                                        int vflag, int *ilist, int *numneigh,
-                                        int **firstneigh)
+                                        int /* vflag */, int *ilist,
+                                        int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -207,8 +207,8 @@ double PairBornCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBornCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                      int vflag, int *ilist, int *numneigh,
-                                      int **firstneigh)
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -193,9 +193,9 @@ double PairBornCoulWolfCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairBornCoulWolfCSGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                        int *ilist, int *numneigh,
-                                        int **firstneigh) {
+void PairBornCoulWolfCSGPU::cpu_compute(int start, int inum, int eflag,
+                                        int /* vflag */, int *ilist,
+                                        int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,qtmp,delx,dely,delz,evdwl,ecoul,fpair;
   double rsq,r2inv,r6inv,forcecoul,forceborn,factor_coul,factor_lj;

View File

@@ -201,8 +201,8 @@ double PairBuckCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairBuckCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                      int vflag, int *ilist, int *numneigh,
-                                      int **firstneigh)
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -184,8 +184,9 @@ double PairColloidGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairColloidGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                 int *ilist, int *numneigh, int **firstneigh)
+void PairColloidGPU::cpu_compute(int start, int inum, int eflag,
+                                 int /* vflag */, int *ilist,
+                                 int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -153,7 +153,7 @@ void PairCoulLongCSGPU::init_style()
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
       if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0)) {
-        double cut = init_one(i,j);
+        init_one(i,j);
       }
     }
   }
@@ -208,8 +208,8 @@ double PairCoulLongCSGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairCoulLongCSGPU::cpu_compute(int start, int inum, int eflag,
-                                    int vflag, int *ilist, int *numneigh,
-                                    int **firstneigh)
+                                    int /* vflag */, int *ilist,
+                                    int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,ecoul,fpair;

View File

@@ -148,7 +148,7 @@ void PairCoulLongGPU::init_style()
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
      if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0)) {
-        double cut = init_one(i,j);
+        init_one(i,j);
      }
    }
  }
@@ -203,8 +203,8 @@ double PairCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                  int vflag, int *ilist, int *numneigh,
-                                  int **firstneigh)
+                                  int /* vflag */, int *ilist,
+                                  int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,ecoul,fpair;

View File

@@ -192,8 +192,8 @@ void PairEAMAlloyGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMAlloyGPU::single(int i, int j, int itype, int jtype,
-                               double rsq, double factor_coul, double factor_lj,
-                               double &fforce)
+                               double rsq, double /* factor_coul */,
+                               double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -235,7 +235,7 @@ double PairEAMAlloyGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMAlloyGPU::pack_forward_comm(int n, int *list, double *buf,
-                                       int pbc_flag,int *pbc)
+                                       int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -192,8 +192,8 @@ void PairEAMFSGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMFSGPU::single(int i, int j, int itype, int jtype,
-                            double rsq, double factor_coul, double factor_lj,
-                            double &fforce)
+                            double rsq, double /* factor_coul */,
+                            double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -235,7 +235,7 @@ double PairEAMFSGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMFSGPU::pack_forward_comm(int n, int *list, double *buf,
-                                    int pbc_flag,int *pbc)
+                                    int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -195,8 +195,8 @@ void PairEAMGPU::init_style()
 /* ---------------------------------------------------------------------- */
 double PairEAMGPU::single(int i, int j, int itype, int jtype,
-                          double rsq, double factor_coul, double factor_lj,
-                          double &fforce)
+                          double rsq, double /* factor_coul */,
+                          double /* factor_lj */, double &fforce)
 {
   int m;
   double r,p,rhoip,rhojp,z2,z2p,recip,phi,phip,psip;
@@ -238,7 +238,7 @@ double PairEAMGPU::single(int i, int j, int itype, int jtype,
 /* ---------------------------------------------------------------------- */
 int PairEAMGPU::pack_forward_comm(int n, int *list, double *buf,
-                                  int pbc_flag,int *pbc)
+                                  int /* pbc_flag */, int * /* pbc */)
 {
   int i,j,m;

View File

@@ -221,8 +221,9 @@ double PairGayBerneGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairGayBerneGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                  int *ilist, int *numneigh, int **firstneigh)
+void PairGayBerneGPU::cpu_compute(int start, int inum, int eflag,
+                                  int /* vflag */, int *ilist,
+                                  int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double evdwl,one_eng,rsq,r2inv,r6inv,forcelj,factor_lj;

View File

@@ -151,11 +151,10 @@ void PairLJCharmmCoulLongGPU::init_style()
   // Repeat cutsq calculation because done after call to init_style
-  double cut;
   for (int i = 1; i <= atom->ntypes; i++) {
     for (int j = i; j <= atom->ntypes; j++) {
       if (setflag[i][j] != 0 || (setflag[i][i] != 0 && setflag[j][j] != 0))
-        cut = init_one(i,j);
+        init_one(i,j);
     }
   }
@@ -219,7 +218,7 @@ double PairLJCharmmCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJCharmmCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist,
+                                          int /* vflag */, int *ilist,
                                           int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;

View File

@@ -197,8 +197,8 @@ double PairLJClass2CoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJClass2CoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist, int *numneigh,
-                                          int **firstneigh)
+                                          int /* vflag */, int *ilist,
+                                          int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -215,8 +215,8 @@ double PairLJCutCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJCutCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                       int vflag, int *ilist, int *numneigh,
-                                       int **firstneigh)
+                                       int /* vflag */, int *ilist,
+                                       int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -192,8 +192,9 @@ double PairLJCutCoulMSMGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairLJCutCoulMSMGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                      int *ilist, int *numneigh, int **firstneigh) {
+void PairLJCutCoulMSMGPU::cpu_compute(int start, int inum, int eflag,
+                                      int /* vflag */, int *ilist,
+                                      int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;
   double fraction,table;

View File

@@ -216,7 +216,7 @@ void PairLJCutDipoleLongGPU::cpu_compute(int start, int inum, int eflag, int vfl
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz;
   double rsq,r,rinv,r2inv,r6inv;
   double forcecoulx,forcecouly,forcecoulz,fforce;
-  double tixcoul,tiycoul,tizcoul,tjxcoul,tjycoul,tjzcoul;
+  double tixcoul,tiycoul,tizcoul;
   double fx,fy,fz,fdx,fdy,fdz,fax,fay,faz;
   double pdotp,pidotr,pjdotr,pre1,pre2,pre3;
   double grij,expm2,t,erfc;
@@ -378,14 +378,9 @@ void PairLJCutDipoleLongGPU::cpu_compute(int start, int inum, int eflag, int vfl
   tixcoul = mu[i][1]*(zdiz + zaiz) - mu[i][2]*(zdiy + zaiy);
   tiycoul = mu[i][2]*(zdix + zaix) - mu[i][0]*(zdiz + zaiz);
   tizcoul = mu[i][0]*(zdiy + zaiy) - mu[i][1]*(zdix + zaix);
-  tjxcoul = mu[j][1]*(zdjz + zajz) - mu[j][2]*(zdjy + zajy);
-  tjycoul = mu[j][2]*(zdjx + zajx) - mu[j][0]*(zdjz + zajz);
-  tjzcoul = mu[j][0]*(zdjy + zajy) - mu[j][1]*(zdjx + zajx);
   } else {
   forcecoulx = forcecouly = forcecoulz = 0.0;
   tixcoul = tiycoul = tizcoul = 0.0;
-  tjxcoul = tjycoul = tjzcoul = 0.0;
   }
   // LJ interaction

View File

@@ -215,8 +215,8 @@ double PairLJExpandCoulLongGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairLJExpandCoulLongGPU::cpu_compute(int start, int inum, int eflag,
-                                          int vflag, int *ilist, int *numneigh,
-                                          int **firstneigh)
+                                          int /* vflag */, int *ilist,
+                                          int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double qtmp,xtmp,ytmp,ztmp,delx,dely,delz,evdwl,ecoul,fpair;

View File

@@ -172,8 +172,8 @@ double PairMIECutGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairMIECutGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                int *ilist, int *numneigh, int **firstneigh) {
+void PairMIECutGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,r2inv,rgamR,rgamA,forcemie,factor_mie;

View File

@@ -168,7 +168,7 @@ double PairMorseGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairMorseGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairMorseGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                                int *ilist, int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;

View File

@@ -219,8 +219,9 @@ double PairRESquaredGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairRESquaredGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                   int *ilist, int *numneigh, int **firstneigh)
+void PairRESquaredGPU::cpu_compute(int start, int inum, int eflag,
+                                   int /* vflag */, int *ilist,
+                                   int *numneigh, int **firstneigh)
 {
   int i,j,ii,jj,jnum,itype,jtype;
   double evdwl,one_eng,rsq,r2inv,r6inv,forcelj,factor_lj;

View File

@@ -182,8 +182,8 @@ double PairSoftGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairSoftGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                              int *ilist, int *numneigh, int **firstneigh) {
+void PairSoftGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                              int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double r,rsq,arg,factor_lj;

View File

@@ -246,7 +246,7 @@ double PairTableGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairTableGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairTableGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype,itable;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -186,8 +186,8 @@ double PairUFMGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairUFMGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                             int *ilist, int *numneigh, int **firstneigh) {
+void PairUFMGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                             int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,expuf,factor_lj;

View File

@@ -151,7 +151,7 @@ void PairVashishtaGPU::init_style()
   if (force->newton_pair != 0)
     error->all(FLERR,"Pair style vashishta/gpu requires newton pair off");
-  double *cutsq, *r0, *r0eps, *gamma, *eta;
+  double *cutsq, *r0, *gamma, *eta;
   double *lam1inv, *lam4inv, *zizj, *mbigd;
   double *dvrc, *big6w, *heta, *bigh;
   double *bigw, *c0, *costheta, *bigb;

View File

@@ -179,8 +179,8 @@ double PairYukawaColloidGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
 void PairYukawaColloidGPU::cpu_compute(int start, int inum, int eflag,
-                                       int vflag, int *ilist, int *numneigh,
-                                       int **firstneigh) {
+                                       int /* vflag */, int *ilist,
+                                       int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair,radi,radj;
   double r,rsq,rinv,screening,forceyukawa,factor;

View File

@@ -170,8 +170,8 @@ double PairYukawaGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairYukawaGPU::cpu_compute(int start, int inum, int eflag, int vflag,
-                                int *ilist, int *numneigh, int **firstneigh) {
+void PairYukawaGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
+                                int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;
   double rsq,r2inv,r,rinv,screening,forceyukawa,factor;

View File

@@ -177,7 +177,7 @@ double PairZBLGPU::memory_usage()
 /* ---------------------------------------------------------------------- */
-void PairZBLGPU::cpu_compute(int start, int inum, int eflag, int vflag,
+void PairZBLGPU::cpu_compute(int start, int inum, int eflag, int /* vflag */,
                              int *ilist, int *numneigh, int **firstneigh) {
   int i,j,ii,jj,jnum,itype,jtype;
   double xtmp,ytmp,ztmp,delx,dely,delz,evdwl,fpair;

View File

@@ -49,6 +49,8 @@ action angle_charmm_kokkos.cpp angle_charmm.cpp
 action angle_charmm_kokkos.h angle_charmm.h
 action angle_class2_kokkos.cpp angle_class2.cpp
 action angle_class2_kokkos.h angle_class2.h
+action angle_cosine_kokkos.cpp angle_cosine.cpp
+action angle_cosine_kokkos.h angle_cosine.h
 action angle_harmonic_kokkos.cpp angle_harmonic.cpp
 action angle_harmonic_kokkos.h angle_harmonic.h
 action atom_kokkos.cpp

View File

@@ -0,0 +1,394 @@
/* ----------------------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
http://lammps.sandia.gov, Sandia National Laboratories
Steve Plimpton, sjplimp@sandia.gov
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
/* ----------------------------------------------------------------------
Contributing author: Stan Moore (SNL)
------------------------------------------------------------------------- */
#include <cmath>
#include <cstdlib>
#include "angle_cosine_kokkos.h"
#include "atom_kokkos.h"
#include "neighbor_kokkos.h"
#include "domain.h"
#include "comm.h"
#include "force.h"
#include "math_const.h"
#include "memory_kokkos.h"
#include "error.h"
#include "atom_masks.h"
using namespace LAMMPS_NS;
using namespace MathConst;
#define SMALL 0.001
/* ---------------------------------------------------------------------- */
template<class DeviceType>
AngleCosineKokkos<DeviceType>::AngleCosineKokkos(LAMMPS *lmp) : AngleCosine(lmp)
{
atomKK = (AtomKokkos *) atom;
neighborKK = (NeighborKokkos *) neighbor;
execution_space = ExecutionSpaceFromDevice<DeviceType>::space;
datamask_read = X_MASK | F_MASK | ENERGY_MASK | VIRIAL_MASK;
datamask_modify = F_MASK | ENERGY_MASK | VIRIAL_MASK;
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
AngleCosineKokkos<DeviceType>::~AngleCosineKokkos()
{
if (!copymode) {
memoryKK->destroy_kokkos(k_eatom,eatom);
memoryKK->destroy_kokkos(k_vatom,vatom);
}
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::compute(int eflag_in, int vflag_in)
{
eflag = eflag_in;
vflag = vflag_in;
if (eflag || vflag) ev_setup(eflag,vflag,0);
else evflag = 0;
// reallocate per-atom arrays if necessary
if (eflag_atom) {
memoryKK->destroy_kokkos(k_eatom,eatom);
memoryKK->create_kokkos(k_eatom,eatom,maxeatom,"angle:eatom");
d_eatom = k_eatom.template view<DeviceType>();
}
if (vflag_atom) {
memoryKK->destroy_kokkos(k_vatom,vatom);
memoryKK->create_kokkos(k_vatom,vatom,maxvatom,6,"angle:vatom");
d_vatom = k_vatom.template view<DeviceType>();
}
//atomKK->sync(execution_space,datamask_read);
k_k.template sync<DeviceType>();
// if (eflag || vflag) atomKK->modified(execution_space,datamask_modify);
// else atomKK->modified(execution_space,F_MASK);
x = atomKK->k_x.template view<DeviceType>();
f = atomKK->k_f.template view<DeviceType>();
neighborKK->k_anglelist.template sync<DeviceType>();
anglelist = neighborKK->k_anglelist.template view<DeviceType>();
int nanglelist = neighborKK->nanglelist;
nlocal = atom->nlocal;
newton_bond = force->newton_bond;
copymode = 1;
// loop over neighbors of my atoms
EV_FLOAT ev;
if (evflag) {
if (newton_bond) {
Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<1,1> >(0,nanglelist),*this,ev);
} else {
Kokkos::parallel_reduce(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<0,1> >(0,nanglelist),*this,ev);
}
} else {
if (newton_bond) {
Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<1,0> >(0,nanglelist),*this);
} else {
Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagAngleCosineCompute<0,0> >(0,nanglelist),*this);
}
}
if (eflag_global) energy += ev.evdwl;
if (vflag_global) {
virial[0] += ev.v[0];
virial[1] += ev.v[1];
virial[2] += ev.v[2];
virial[3] += ev.v[3];
virial[4] += ev.v[4];
virial[5] += ev.v[5];
}
if (eflag_atom) {
k_eatom.template modify<DeviceType>();
k_eatom.template sync<LMPHostType>();
}
if (vflag_atom) {
k_vatom.template modify<DeviceType>();
k_vatom.template sync<LMPHostType>();
}
copymode = 0;
}
template<class DeviceType>
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int &n, EV_FLOAT& ev) const {
// The f array is atomic
Kokkos::View<F_FLOAT*[3], typename DAT::t_f_array::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > a_f = f;
const int i1 = anglelist(n,0);
const int i2 = anglelist(n,1);
const int i3 = anglelist(n,2);
const int type = anglelist(n,3);
// 1st bond
const F_FLOAT delx1 = x(i1,0) - x(i2,0);
const F_FLOAT dely1 = x(i1,1) - x(i2,1);
const F_FLOAT delz1 = x(i1,2) - x(i2,2);
const F_FLOAT rsq1 = delx1*delx1 + dely1*dely1 + delz1*delz1;
const F_FLOAT r1 = sqrt(rsq1);
// 2nd bond
const F_FLOAT delx2 = x(i3,0) - x(i2,0);
const F_FLOAT dely2 = x(i3,1) - x(i2,1);
const F_FLOAT delz2 = x(i3,2) - x(i2,2);
const F_FLOAT rsq2 = delx2*delx2 + dely2*dely2 + delz2*delz2;
const F_FLOAT r2 = sqrt(rsq2);
// c = cosine of angle
F_FLOAT c = delx1*delx2 + dely1*dely2 + delz1*delz2;
c /= r1*r2;
if (c > 1.0) c = 1.0;
if (c < -1.0) c = -1.0;
// force & energy
F_FLOAT eangle = 0.0;
if (eflag) eangle = d_k[type]*(1.0+c);
const F_FLOAT a = d_k[type];
const F_FLOAT a11 = a*c / rsq1;
const F_FLOAT a12 = -a / (r1*r2);
const F_FLOAT a22 = a*c / rsq2;
F_FLOAT f1[3],f3[3];
f1[0] = a11*delx1 + a12*delx2;
f1[1] = a11*dely1 + a12*dely2;
f1[2] = a11*delz1 + a12*delz2;
f3[0] = a22*delx2 + a12*delx1;
f3[1] = a22*dely2 + a12*dely1;
f3[2] = a22*delz2 + a12*delz1;
// apply force to each of 3 atoms
if (NEWTON_BOND || i1 < nlocal) {
a_f(i1,0) += f1[0];
a_f(i1,1) += f1[1];
a_f(i1,2) += f1[2];
}
if (NEWTON_BOND || i2 < nlocal) {
a_f(i2,0) -= f1[0] + f3[0];
a_f(i2,1) -= f1[1] + f3[1];
a_f(i2,2) -= f1[2] + f3[2];
}
if (NEWTON_BOND || i3 < nlocal) {
a_f(i3,0) += f3[0];
a_f(i3,1) += f3[1];
a_f(i3,2) += f3[2];
}
if (EVFLAG) ev_tally(ev,i1,i2,i3,eangle,f1,f3,
delx1,dely1,delz1,delx2,dely2,delz2);
}
template<class DeviceType>
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int &n) const {
EV_FLOAT ev;
this->template operator()<NEWTON_BOND,EVFLAG>(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>(), n, ev);
}
/* ---------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::allocate()
{
AngleCosine::allocate();
int n = atom->nangletypes;
k_k = typename ArrayTypes<DeviceType>::tdual_ffloat_1d("AngleCosine::k",n+1);
d_k = k_k.template view<DeviceType>();
}
/* ----------------------------------------------------------------------
set coeffs for one or more types
------------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::coeff(int narg, char **arg)
{
AngleCosine::coeff(narg, arg);
int n = atom->nangletypes;
for (int i = 1; i <= n; i++)
k_k.h_view[i] = k[i];
k_k.template modify<LMPHostType>();
}
/* ----------------------------------------------------------------------
proc 0 reads coeffs from restart file, bcasts them
------------------------------------------------------------------------- */
template<class DeviceType>
void AngleCosineKokkos<DeviceType>::read_restart(FILE *fp)
{
AngleCosine::read_restart(fp);
int n = atom->nangletypes;
for (int i = 1; i <= n; i++)
k_k.h_view[i] = k[i];
k_k.template modify<LMPHostType>();
}
/* ----------------------------------------------------------------------
tally energy and virial into global and per-atom accumulators
virial = r1F1 + r2F2 + r3F3 = (r1-r2) F1 + (r3-r2) F3 = del1*f1 + del2*f3
------------------------------------------------------------------------- */
template<class DeviceType>
//template<int NEWTON_BOND>
KOKKOS_INLINE_FUNCTION
void AngleCosineKokkos<DeviceType>::ev_tally(EV_FLOAT &ev, const int i, const int j, const int k,
F_FLOAT &eangle, F_FLOAT *f1, F_FLOAT *f3,
const F_FLOAT &delx1, const F_FLOAT &dely1, const F_FLOAT &delz1,
const F_FLOAT &delx2, const F_FLOAT &dely2, const F_FLOAT &delz2) const
{
E_FLOAT eanglethird;
F_FLOAT v[6];
// The eatom and vatom arrays are atomic
Kokkos::View<E_FLOAT*, typename DAT::t_efloat_1d::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > v_eatom = k_eatom.template view<DeviceType>();
Kokkos::View<F_FLOAT*[6], typename DAT::t_virial_array::array_layout,DeviceType,Kokkos::MemoryTraits<Kokkos::Atomic|Kokkos::Unmanaged> > v_vatom = k_vatom.template view<DeviceType>();
if (eflag_either) {
if (eflag_global) {
if (newton_bond) ev.evdwl += eangle;
else {
eanglethird = THIRD*eangle;
if (i < nlocal) ev.evdwl += eanglethird;
if (j < nlocal) ev.evdwl += eanglethird;
if (k < nlocal) ev.evdwl += eanglethird;
}
}
if (eflag_atom) {
eanglethird = THIRD*eangle;
if (newton_bond || i < nlocal) v_eatom[i] += eanglethird;
if (newton_bond || j < nlocal) v_eatom[j] += eanglethird;
if (newton_bond || k < nlocal) v_eatom[k] += eanglethird;
}
}
if (vflag_either) {
v[0] = delx1*f1[0] + delx2*f3[0];
v[1] = dely1*f1[1] + dely2*f3[1];
v[2] = delz1*f1[2] + delz2*f3[2];
v[3] = delx1*f1[1] + delx2*f3[1];
v[4] = delx1*f1[2] + delx2*f3[2];
v[5] = dely1*f1[2] + dely2*f3[2];
if (vflag_global) {
if (newton_bond) {
ev.v[0] += v[0];
ev.v[1] += v[1];
ev.v[2] += v[2];
ev.v[3] += v[3];
ev.v[4] += v[4];
ev.v[5] += v[5];
} else {
if (i < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
if (j < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
if (k < nlocal) {
ev.v[0] += THIRD*v[0];
ev.v[1] += THIRD*v[1];
ev.v[2] += THIRD*v[2];
ev.v[3] += THIRD*v[3];
ev.v[4] += THIRD*v[4];
ev.v[5] += THIRD*v[5];
}
}
}
if (vflag_atom) {
if (newton_bond || i < nlocal) {
v_vatom(i,0) += THIRD*v[0];
v_vatom(i,1) += THIRD*v[1];
v_vatom(i,2) += THIRD*v[2];
v_vatom(i,3) += THIRD*v[3];
v_vatom(i,4) += THIRD*v[4];
v_vatom(i,5) += THIRD*v[5];
}
if (newton_bond || j < nlocal) {
v_vatom(j,0) += THIRD*v[0];
v_vatom(j,1) += THIRD*v[1];
v_vatom(j,2) += THIRD*v[2];
v_vatom(j,3) += THIRD*v[3];
v_vatom(j,4) += THIRD*v[4];
v_vatom(j,5) += THIRD*v[5];
}
if (newton_bond || k < nlocal) {
v_vatom(k,0) += THIRD*v[0];
v_vatom(k,1) += THIRD*v[1];
v_vatom(k,2) += THIRD*v[2];
v_vatom(k,3) += THIRD*v[3];
v_vatom(k,4) += THIRD*v[4];
v_vatom(k,5) += THIRD*v[5];
}
}
}
}
/* ---------------------------------------------------------------------- */
namespace LAMMPS_NS {
template class AngleCosineKokkos<LMPDeviceType>;
#ifdef KOKKOS_HAVE_CUDA
template class AngleCosineKokkos<LMPHostType>;
#endif
}

View File

@@ -0,0 +1,90 @@
/* -*- c++ -*- ----------------------------------------------------------
LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
http://lammps.sandia.gov, Sandia National Laboratories
Steve Plimpton, sjplimp@sandia.gov
Copyright (2003) Sandia Corporation. Under the terms of Contract
DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
certain rights in this software. This software is distributed under
the GNU General Public License.
See the README file in the top-level LAMMPS directory.
------------------------------------------------------------------------- */
#ifdef ANGLE_CLASS
AngleStyle(cosine/kk,AngleCosineKokkos<LMPDeviceType>)
AngleStyle(cosine/kk/device,AngleCosineKokkos<LMPDeviceType>)
AngleStyle(cosine/kk/host,AngleCosineKokkos<LMPHostType>)
#else
#ifndef LMP_ANGLE_COSINE_KOKKOS_H
#define LMP_ANGLE_COSINE_KOKKOS_H
#include "angle_cosine.h"
#include "kokkos_type.h"
namespace LAMMPS_NS {
template<int NEWTON_BOND, int EVFLAG>
struct TagAngleCosineCompute{};
template<class DeviceType>
class AngleCosineKokkos : public AngleCosine {
public:
typedef DeviceType device_type;
typedef EV_FLOAT value_type;
AngleCosineKokkos(class LAMMPS *);
virtual ~AngleCosineKokkos();
void compute(int, int);
void coeff(int, char **);
void read_restart(FILE *);
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int&, EV_FLOAT&) const;
template<int NEWTON_BOND, int EVFLAG>
KOKKOS_INLINE_FUNCTION
void operator()(TagAngleCosineCompute<NEWTON_BOND,EVFLAG>, const int&) const;
//template<int NEWTON_BOND>
KOKKOS_INLINE_FUNCTION
void ev_tally(EV_FLOAT &ev, const int i, const int j, const int k,
F_FLOAT &eangle, F_FLOAT *f1, F_FLOAT *f3,
const F_FLOAT &delx1, const F_FLOAT &dely1, const F_FLOAT &delz1,
const F_FLOAT &delx2, const F_FLOAT &dely2, const F_FLOAT &delz2) const;
protected:
class NeighborKokkos *neighborKK;
typename ArrayTypes<DeviceType>::t_x_array_randomread x;
typename ArrayTypes<DeviceType>::t_f_array f;
typename ArrayTypes<DeviceType>::t_int_2d anglelist;
typename ArrayTypes<DeviceType>::tdual_efloat_1d k_eatom;
typename ArrayTypes<DeviceType>::tdual_virial_array k_vatom;
typename ArrayTypes<DeviceType>::t_efloat_1d d_eatom;
typename ArrayTypes<DeviceType>::t_virial_array d_vatom;
int nlocal,newton_bond;
int eflag,vflag;
typename ArrayTypes<DeviceType>::tdual_ffloat_1d k_k;
typename ArrayTypes<DeviceType>::t_ffloat_1d d_k;
void allocate();
};
}
#endif
#endif
/* ERROR/WARNING messages:
*/

View File

@@ -4988,7 +4988,7 @@ void PairBOP::read_table(char *filename)
   fgets(s,MAXLINE,fp);
   nws=0;
   ws=1;
-  for(j=0;j<strlen(s);j++) {
+  for(j=0;j<(int)strlen(s);j++) {
     if(ws==1) {
       if(isspace(s[j])) {
         ws=1;
@@ -5015,7 +5015,7 @@ void PairBOP::read_table(char *filename)
   nws=0;
   ws=1;
   fgets(s,MAXLINE,fp);
-  for(j=0;j<strlen(s);j++) {
+  for(j=0;j<(int)strlen(s);j++) {
     if(ws==1) {
       if(isspace(s[j])) {
         ws=1;

View File

@@ -36,7 +36,7 @@ AngleCosine::AngleCosine(LAMMPS *lmp) : Angle(lmp) {}
 AngleCosine::~AngleCosine()
 {
-  if (allocated) {
+  if (allocated && !copymode) {
     memory->destroy(setflag);
     memory->destroy(k);
   }

View File

@@ -30,17 +30,17 @@ class AngleCosine : public Angle {
   AngleCosine(class LAMMPS *);
   virtual ~AngleCosine();
   virtual void compute(int, int);
-  void coeff(int, char **);
+  virtual void coeff(int, char **);
   double equilibrium_angle(int);
   void write_restart(FILE *);
-  void read_restart(FILE *);
+  virtual void read_restart(FILE *);
   void write_data(FILE *);
   double single(int, int, int, int);
 protected:
   double *k;
-  void allocate();
+  virtual void allocate();
 };
 }

View File

@@ -24,7 +24,7 @@
 #include "update.h"
 #include "respa.h"
 #include "error.h"
-#include "python.h"
+#include "lmppython.h"
 #include "python_compat.h"
 using namespace LAMMPS_NS;

View File

@@ -25,7 +25,7 @@
 #include "force.h"
 #include "memory.h"
 #include "neigh_list.h"
-#include "python.h"
+#include "lmppython.h"
 #include "error.h"
 #include "python_compat.h"

View File

@@ -26,7 +26,7 @@
 #include "memory.h"
 #include "update.h"
 #include "neigh_list.h"
-#include "python.h"
+#include "lmppython.h"
 #include "error.h"
 #include "python_compat.h"
@@ -401,9 +401,9 @@ double PairPython::init_one(int, int)
 /* ---------------------------------------------------------------------- */
-double PairPython::single(int i, int j, int itype, int jtype, double rsq,
-                          double factor_coul, double factor_lj,
-                          double &fforce)
+double PairPython::single(int /* i */, int /* j */, int itype, int jtype,
+                          double rsq, double /* factor_coul */,
+                          double factor_lj, double &fforce)
 {
   // with hybrid/overlay we might get called for skipped types
   if (skip_types[itype] || skip_types[jtype]) {

Some files were not shown because too many files have changed in this diff.