Update Kokkos library in LAMMPS to v3.0
323
lib/kokkos/BUILD.md
Normal file
@@ -0,0 +1,323 @@

# Installing and Using Kokkos

## Kokkos Philosophy
Kokkos provides a modern CMake style build system.
As C++ continues to develop for C++20 and beyond, CMake is likely to provide the most robust support
for C++. Applications heavily leveraging Kokkos are strongly encouraged to use a CMake build system.

You can either use Kokkos as an installed package (encouraged) or use Kokkos in-tree in your project.
Modern CMake is exceedingly simple at a high level (with the devil in the details).
Once Kokkos is installed, simply add the following to your `CMakeLists.txt`:
````
find_package(Kokkos REQUIRED)
````
Then for every executable or library in your project:
````
target_link_libraries(myTarget Kokkos::kokkos)
````
That's it! There is no checking Kokkos preprocessor, compiler, or linker flags.
Kokkos propagates all the necessary flags to your project.
This means not only is linking to Kokkos easy, but Kokkos itself can actually configure compiler and linker flags for *your*
project. If building in-tree, there is no `find_package` and you link with `target_link_libraries(myTarget kokkos)`.
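
As an illustration, a minimal downstream `CMakeLists.txt` covering both modes might look like the sketch below. The project name `MyApp`, the source file `main.cpp`, the option `MYAPP_KOKKOS_IN_TREE`, and the `kokkos` subdirectory are hypothetical. It assumes the in-tree build also exposes the `Kokkos::kokkos` alias (the Kokkos 3.0 `CMakeLists.txt` in this commit adds one), so the link line is the same in both cases.

````cmake
cmake_minimum_required(VERSION 3.10)
project(MyApp CXX)

# Hypothetical switch: build Kokkos from a copy inside this source tree.
option(MYAPP_KOKKOS_IN_TREE "Build Kokkos from the kokkos/ subdirectory" OFF)

if(MYAPP_KOKKOS_IN_TREE)
  # In-tree: no find_package; the Kokkos targets are created directly.
  add_subdirectory(kokkos)
else()
  # Installed package: point CMAKE_PREFIX_PATH at the Kokkos install prefix.
  find_package(Kokkos REQUIRED)
endif()

add_executable(myTarget main.cpp)
# Compiler and linker flags are propagated through the Kokkos target.
target_link_libraries(myTarget PRIVATE Kokkos::kokkos)
````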

## Configuring CMake
A very basic installation is done with:
````
cmake ${srcdir} \
  -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_INSTALL_PREFIX=${my_install_folder}
````
which builds and installs a default Kokkos when you run `make install`.
There are numerous device backends, options, and architecture-specific optimizations that can be configured, e.g.
````
cmake ${srcdir} \
  -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_INSTALL_PREFIX=${my_install_folder} \
  -DKokkos_ENABLE_OPENMP=On
````
which activates the OpenMP backend. All of the options controlling device backends, options, architectures, and third-party libraries (TPLs) are given below.
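
The same settings can also be kept in a CMake initial-cache script and loaded with `cmake -C`, which may be convenient once the option list grows. The following is a minimal sketch; the file name `kokkos-openmp.cmake` and the install prefix are placeholders, not anything Kokkos prescribes.

````cmake
# kokkos-openmp.cmake (hypothetical file name)
# Load with: cmake -C kokkos-openmp.cmake ${srcdir}
set(CMAKE_CXX_COMPILER "g++" CACHE STRING "C++ compiler")
set(CMAKE_INSTALL_PREFIX "$ENV{HOME}/install/kokkos" CACHE PATH "Install prefix (placeholder)")
set(Kokkos_ENABLE_OPENMP ON CACHE BOOL "Build the OpenMP backend")
````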

## Spack
An alternative to manually building with CMake is to use the Spack package manager.
To do so, download the `kokkos-spack` git repo and add it to the package list:
````
spack repo add $path-to-kokkos-spack
````
A basic installation would be done as:
````
spack install kokkos
````
Spack allows options and compilers to be tuned in the install command.
````
spack install kokkos@3.0 %gcc@7.3.0 +openmp
````
This example illustrates the three most common parameters to Spack:
* Variants: specified with, e.g. `+openmp`, this activates (or deactivates with, e.g. `~openmp`) certain options.
* Version: immediately following `kokkos` the `@version` can specify a particular Kokkos to build.
* Compiler: a default compiler will be chosen if not specified, but an exact compiler version can be given with the `%` option.

For a complete list of Kokkos options, run:
````
spack info kokkos
````

#### Spack Development
Spack currently installs packages to a location determined by a unique hash. This hash name is not really "human readable".
Generally, Spack usage should never really require you to reference the computer-generated unique install folder.
If you must know, you can locate Spack Kokkos installations with:
````
spack find -p kokkos ...
````
where `...` is the unique spec identifying the particular Kokkos configuration and version.

A better way to use Spack for doing Kokkos development is the DIY feature of Spack.
If you wish to develop Kokkos itself, go to the Kokkos source folder and run:
````
spack diy -u cmake kokkos@diy ...
````
where `...` is a Spack spec identifying the exact Kokkos configuration.
This then creates a `spack-build` directory where you can run `make`.

If doing development on a downstream project, you can do almost exactly the same thing.
````
spack diy -u cmake ${myproject}@${myversion} ... ^kokkos...
````
where the `...` are the specs for your project and the desired Kokkos configuration.
Again, a `spack-build` directory will be created where you can run `make`.

Spack has a few idiosyncrasies, related to Spack forcing use of a compiler wrapper, that make building outside of Spack annoying. This can be worked around by passing `-DSpack_WORKAROUND=On` to CMake and adding the following block to your `CMakeLists.txt`:

````
if (Spack_WORKAROUND)
  set(SPACK_CXX $ENV{SPACK_CXX})
  if(SPACK_CXX)
    set(CMAKE_CXX_COMPILER ${SPACK_CXX} CACHE STRING "the C++ compiler" FORCE)
    set(ENV{CXX} ${SPACK_CXX})
  endif()
endif()
````

# Kokkos Keyword Listing

## Device Backends
Device backends can be enabled by specifying `-DKokkos_ENABLE_X`.

* Kokkos_ENABLE_CUDA
    * Whether to build CUDA backend
    * BOOL Default: OFF
* Kokkos_ENABLE_HPX
    * Whether to build HPX backend (experimental)
    * BOOL Default: OFF
* Kokkos_ENABLE_OPENMP
    * Whether to build OpenMP backend
    * BOOL Default: OFF
* Kokkos_ENABLE_PTHREAD
    * Whether to build Pthread backend
    * BOOL Default: OFF
* Kokkos_ENABLE_SERIAL
    * Whether to build serial backend
    * BOOL Default: ON
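
A downstream project may want to fail early when the Kokkos it found was built without a backend it relies on. The sketch below assumes the exported package configuration defines a `Kokkos_DEVICES` list holding upper-case backend names such as `OPENMP`. The 3.0 change log mentions that variable, but its exact contents are an assumption here and should be checked against the installed Kokkos package configuration files. `MyApp` is a hypothetical project name.

````cmake
cmake_minimum_required(VERSION 3.10)
project(MyApp CXX)

find_package(Kokkos REQUIRED)

# Assumption: Kokkos_DEVICES lists the enabled backends in upper case.
if(NOT "OPENMP" IN_LIST Kokkos_DEVICES)
  message(FATAL_ERROR
    "MyApp needs a Kokkos built with -DKokkos_ENABLE_OPENMP=On; "
    "found backends: ${Kokkos_DEVICES}")
endif()
````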

## Enable Options
Options can be enabled by specifying `-DKokkos_ENABLE_X`.

* Kokkos_ENABLE_AGGRESSIVE_VECTORIZATION
    * Whether to aggressively vectorize loops
    * BOOL Default: OFF
* Kokkos_ENABLE_COMPILER_WARNINGS
    * Whether to print all compiler warnings
    * BOOL Default: OFF
* Kokkos_ENABLE_CUDA_CONSTEXPR
    * Whether to activate experimental relaxed constexpr functions
    * BOOL Default: OFF
* Kokkos_ENABLE_CUDA_LAMBDA
    * Whether to activate experimental lambda features
    * BOOL Default: OFF
* Kokkos_ENABLE_CUDA_LDG_INTRINSIC
    * Whether to use CUDA LDG intrinsics
    * BOOL Default: OFF
* Kokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE
    * Whether to enable relocatable device code (RDC) for CUDA
    * BOOL Default: OFF
* Kokkos_ENABLE_CUDA_UVM
    * Whether to use unified memory (UM) by default for CUDA
    * BOOL Default: OFF
* Kokkos_ENABLE_DEBUG
    * Whether to activate extra debug features - may increase compile times
    * BOOL Default: OFF
* Kokkos_ENABLE_DEBUG_BOUNDS_CHECK
    * Whether to use bounds checking - will increase runtime
    * BOOL Default: OFF
* Kokkos_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK
    * Debug check on dual views
    * BOOL Default: OFF
* Kokkos_ENABLE_DEPRECATED_CODE
    * Whether to enable deprecated code
    * BOOL Default: OFF
* Kokkos_ENABLE_HPX_ASYNC_DISPATCH
    * Whether HPX supports asynchronous dispatch
    * BOOL Default: OFF
* Kokkos_ENABLE_LARGE_MEM_TESTS
    * Whether to perform extra large memory tests
    * BOOL Default: OFF
* Kokkos_ENABLE_PROFILING
    * Whether to create bindings for profiling tools
    * BOOL Default: ON
* Kokkos_ENABLE_PROFILING_LOAD_PRINT
    * Whether to print information about which profiling tools got loaded
    * BOOL Default: OFF
* Kokkos_ENABLE_TESTS
    * Whether to build tests
    * BOOL Default: OFF

## Other Options
* Kokkos_CXX_STANDARD
    * The C++ standard for Kokkos to use: c++11, c++14, c++17, or c++20. This should be given in CMake style as 11, 14, 17, or 20.
    * STRING Default: 11
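
For example, a debug-oriented configuration could combine several of the options above in an initial-cache script passed to `cmake -C`. This is only a sketch: the particular combination is an illustrative choice, and `CMAKE_BUILD_TYPE` is a standard CMake variable rather than a Kokkos option.

````cmake
# Hypothetical initial-cache script for a debug build.
set(CMAKE_BUILD_TYPE Debug CACHE STRING "Standard CMake build type")
set(Kokkos_ENABLE_DEBUG ON CACHE BOOL "Extra debug features")
set(Kokkos_ENABLE_DEBUG_BOUNDS_CHECK ON CACHE BOOL "Bounds checking (increases runtime)")
set(Kokkos_ENABLE_COMPILER_WARNINGS ON CACHE BOOL "Print all compiler warnings")
# The standard is given CMake style: 14, not c++14.
set(Kokkos_CXX_STANDARD 14 CACHE STRING "C++ standard for Kokkos")
````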

## Third-party Libraries (TPLs)
The following options control enabling TPLs:
* Kokkos_ENABLE_HPX
    * Whether to enable the HPX library
    * BOOL Default: OFF
* Kokkos_ENABLE_HWLOC
    * Whether to enable the HWLOC library
    * BOOL Default: OFF
* Kokkos_ENABLE_LIBNUMA
    * Whether to enable the LIBNUMA library
    * BOOL Default: OFF
* Kokkos_ENABLE_MEMKIND
    * Whether to enable the MEMKIND library
    * BOOL Default: OFF
* Kokkos_ENABLE_LIBDL
    * Whether to enable the LIBDL library
    * BOOL Default: ON
* Kokkos_ENABLE_LIBRT
    * Whether to enable the LIBRT library
    * BOOL Default: OFF

The following options control finding and configuring non-CMake TPLs:
* Kokkos_CUDA_DIR or CUDA_ROOT
    * Location of CUDA install prefix for libraries
    * PATH Default:
* Kokkos_HWLOC_DIR or HWLOC_ROOT
    * Location of HWLOC install prefix
    * PATH Default:
* Kokkos_LIBNUMA_DIR or LIBNUMA_ROOT
    * Location of LIBNUMA install prefix
    * PATH Default:
* Kokkos_MEMKIND_DIR or MEMKIND_ROOT
    * Location of MEMKIND install prefix
    * PATH Default:
* Kokkos_LIBDL_DIR or LIBDL_ROOT
    * Location of LIBDL install prefix
    * PATH Default:
* Kokkos_LIBRT_DIR or LIBRT_ROOT
    * Location of LIBRT install prefix
    * PATH Default:

The following options control `find_package` paths for CMake-based TPLs:
* HPX_DIR or HPX_ROOT
    * Location of HPX prefix (ROOT) or CMake config file (DIR)
    * PATH Default:
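
As a sketch of how the enable and location options pair up, the initial-cache fragment below turns on HWLOC and points Kokkos at a non-standard install prefix. The path `/opt/hwloc` is a placeholder.

````cmake
# Hypothetical initial-cache fragment, loaded with cmake -C.
set(Kokkos_ENABLE_HWLOC ON CACHE BOOL "Enable the HWLOC library")
# Install prefix of HWLOC; HWLOC_ROOT is listed above as the alternative spelling.
set(Kokkos_HWLOC_DIR "/opt/hwloc" CACHE PATH "HWLOC install prefix (placeholder)")
````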

## Architecture Keywords
Architecture-specific optimizations can be enabled by specifying `-DKokkos_ARCH_X`.

* Kokkos_ARCH_AMDAVX
    * Whether to optimize for the AMDAVX architecture
    * BOOL Default: OFF
* Kokkos_ARCH_ARMV80
    * Whether to optimize for the ARMV80 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_ARMV81
    * Whether to optimize for the ARMV81 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_ARMV8_THUNDERX
    * Whether to optimize for the ARMV8_THUNDERX architecture
    * BOOL Default: OFF
* Kokkos_ARCH_ARMV8_TX2
    * Whether to optimize for the ARMV8_TX2 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_BDW
    * Whether to optimize for the BDW architecture
    * BOOL Default: OFF
* Kokkos_ARCH_BGQ
    * Whether to optimize for the BGQ architecture
    * BOOL Default: OFF
* Kokkos_ARCH_EPYC
    * Whether to optimize for the EPYC architecture
    * BOOL Default: OFF
* Kokkos_ARCH_HSW
    * Whether to optimize for the HSW architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KEPLER30
    * Whether to optimize for the KEPLER30 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KEPLER32
    * Whether to optimize for the KEPLER32 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KEPLER35
    * Whether to optimize for the KEPLER35 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KEPLER37
    * Whether to optimize for the KEPLER37 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KNC
    * Whether to optimize for the KNC architecture
    * BOOL Default: OFF
* Kokkos_ARCH_KNL
    * Whether to optimize for the KNL architecture
    * BOOL Default: OFF
* Kokkos_ARCH_MAXWELL50
    * Whether to optimize for the MAXWELL50 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_MAXWELL52
    * Whether to optimize for the MAXWELL52 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_MAXWELL53
    * Whether to optimize for the MAXWELL53 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_PASCAL60
    * Whether to optimize for the PASCAL60 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_PASCAL61
    * Whether to optimize for the PASCAL61 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_POWER7
    * Whether to optimize for the POWER7 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_POWER8
    * Whether to optimize for the POWER8 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_POWER9
    * Whether to optimize for the POWER9 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_SKX
    * Whether to optimize for the SKX architecture
    * BOOL Default: OFF
* Kokkos_ARCH_SNB
    * Whether to optimize for the SNB architecture
    * BOOL Default: OFF
* Kokkos_ARCH_TURING75
    * Whether to optimize for the TURING75 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_VOLTA70
    * Whether to optimize for the VOLTA70 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_VOLTA72
    * Whether to optimize for the VOLTA72 architecture
    * BOOL Default: OFF
* Kokkos_ARCH_WSM
    * Whether to optimize for the WSM architecture
    * BOOL Default: OFF
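
Backend and architecture keywords are typically combined. The sketch below is an initial-cache script for a CUDA build with the VOLTA70 GPU and SKX host architectures. The choice of architectures and the `nvcc_wrapper` path are illustrative assumptions, as is the premise that CUDA builds use Kokkos' `nvcc_wrapper` as the C++ compiler.

````cmake
# Hypothetical initial-cache script, loaded with cmake -C.
# Assumption: CUDA builds use Kokkos' nvcc_wrapper as the C++ compiler; adjust the path.
set(CMAKE_CXX_COMPILER "/path/to/kokkos/bin/nvcc_wrapper" CACHE STRING "C++ compiler")
set(Kokkos_ENABLE_CUDA ON CACHE BOOL "Build the CUDA backend")
set(Kokkos_ARCH_VOLTA70 ON CACHE BOOL "Optimize for the VOLTA70 architecture")
set(Kokkos_ARCH_SKX ON CACHE BOOL "Optimize for the SKX architecture")
````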

##### [LICENSE](https://github.com/kokkos/kokkos/blob/devel/LICENSE)

[BSD 3-Clause License](https://opensource.org/licenses/BSD-3-Clause)

Under the terms of Contract DE-NA0003525 with NTESS,
the U.S. Government retains certain rights in this software.
@@ -1,5 +1,45 @@
# Change Log

## [3.0.00](https://github.com/kokkos/kokkos/tree/3.0.00) (2020-01-27)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.9.00...3.0.00)

**Implemented enhancements:**

- BuildSystem: Standalone Modern CMake Support [\#2104](https://github.com/kokkos/kokkos/issues/2104)
- StyleFormat: ClangFormat Style [\#2157](https://github.com/kokkos/kokkos/issues/2157)
- Documentation: Document build system and CMake philosophy [\#2263](https://github.com/kokkos/kokkos/issues/2263)
- BuildSystem: Add Alias with Namespace Kokkos:: to Internal Libraries [\#2530](https://github.com/kokkos/kokkos/issues/2530)
- BuildSystem: Universal Kokkos find\_package [\#2099](https://github.com/kokkos/kokkos/issues/2099)
- BuildSystem: Dropping support for Kokkos\_{DEVICES,OPTIONS,ARCH} in CMake [\#2329](https://github.com/kokkos/kokkos/issues/2329)
- BuildSystem: Set Kokkos\_DEVICES and Kokkos\_ARCH variables in exported CMake configuration [\#2193](https://github.com/kokkos/kokkos/issues/2193)
- BuildSystem: Drop support for CUDA 7 and CUDA 8 [\#2489](https://github.com/kokkos/kokkos/issues/2489)
- BuildSystem: Drop CMake option SEPARATE\_TESTS [\#2266](https://github.com/kokkos/kokkos/issues/2266)
- BuildSystem: Support expt-relaxed-constexpr same as expt-extended-lambda [\#2411](https://github.com/kokkos/kokkos/issues/2411)
- BuildSystem: Add Xnvlink to command line options allowed in nvcc\_wrapper [\#2197](https://github.com/kokkos/kokkos/issues/2197)
- BuildSystem: Install Kokkos config files and target files to lib/cmake/Kokkos [\#2162](https://github.com/kokkos/kokkos/issues/2162)
- BuildSystem: nvcc\_wrappers and c++ 14 [\#2035](https://github.com/kokkos/kokkos/issues/2035)
- BuildSystem: Kokkos version major/version minor \(Feature request\) [\#1930](https://github.com/kokkos/kokkos/issues/1930)
- BuildSystem: CMake namespaces \(and other modern cmake cleanup\) [\#1924](https://github.com/kokkos/kokkos/issues/1924)
- BuildSystem: Remove capability to install Kokkos via GNU Makefiles [\#2332](https://github.com/kokkos/kokkos/issues/2332)
- Documentation: Remove PDF ProgrammingGuide in Kokkos replace with link [\#2244](https://github.com/kokkos/kokkos/issues/2244)
- View: Add Method to Resize View without Initialization [\#2048](https://github.com/kokkos/kokkos/issues/2048)
- Vector: implement “insert” method for Kokkos\_Vector \(as a serial function on host\) [\#2437](https://github.com/kokkos/kokkos/issues/2437)

**Fixed bugs:**

- ParallelScan: Kokkos::parallel\_scan fix race condition seen in inter-block fence [\#2681](https://github.com/kokkos/kokkos/issues/2681)
- OffsetView: Kokkos::OffsetView missing constructor which takes pointer [\#2247](https://github.com/kokkos/kokkos/issues/2247)
- OffsetView: Kokkos::OffsetView: allow offset=0 [\#2246](https://github.com/kokkos/kokkos/issues/2246)
- DeepCopy: Missing DeepCopy instrumentation in Kokkos [\#2522](https://github.com/kokkos/kokkos/issues/2522)
- nvcc\_wrapper: --host-only fails with multiple -W\* flags [\#2484](https://github.com/kokkos/kokkos/issues/2484)
- nvcc\_wrapper: taking first -std option is counterintuitive [\#2553](https://github.com/kokkos/kokkos/issues/2553)
- Subview: Error taking subviews of views with static_extents of min rank [\#2448](https://github.com/kokkos/kokkos/issues/2448)
- TeamPolicy: reducers with valuetypes without += broken on CUDA [\#2410](https://github.com/kokkos/kokkos/issues/2410)
- Libs: Fix inconsistency of Kokkos library names in Kokkos and Trilinos [\#1902](https://github.com/kokkos/kokkos/issues/1902)
- Complex: operator\>\> for complex\<T\> uses std::ostream, not std::istream [\#2313](https://github.com/kokkos/kokkos/issues/2313)
- Macros: Restrict not honored for non-intel compilers [\#1922](https://github.com/kokkos/kokkos/issues/1922)


## [2.9.00](https://github.com/kokkos/kokkos/tree/2.9.00) (2019-06-24)
[Full Changelog](https://github.com/kokkos/kokkos/compare/2.8.00...2.9.00)
@@ -1,128 +1,218 @@
# Is this a build as part of Trilinos?

IF(COMMAND TRIBITS_PACKAGE_DECL)
  SET(KOKKOS_HAS_TRILINOS ON CACHE BOOL "")
ELSE()
  SET(KOKKOS_HAS_TRILINOS OFF CACHE BOOL "")
ENDIF()

IF(NOT KOKKOS_HAS_TRILINOS)
cmake_minimum_required(VERSION 3.3 FATAL_ERROR)

# Define Project Name if this is a standalone build
IF(NOT DEFINED ${PROJECT_NAME})
project(Kokkos CXX)
# We want to determine if options are given with the wrong case
# In order to detect which arguments are given to compare against
# the list of valid arguments, at the beginning here we need to
# form a list of all the given variables. If it begins with any
# case of KoKkOS, we add it to the list.


GET_CMAKE_PROPERTY(_variableNames VARIABLES)
SET(KOKKOS_GIVEN_VARIABLES)
FOREACH (var ${_variableNames})
  STRING(TOUPPER ${var} UC_VAR)
  STRING(FIND ${UC_VAR} KOKKOS IDX)
  IF (${IDX} EQUAL 0)
    LIST(APPEND KOKKOS_GIVEN_VARIABLES ${var})
  ENDIF()
ENDFOREACH()

# Basic initialization (Used in KOKKOS_SETTINGS)
set(KOKKOS_SRC_PATH ${Kokkos_SOURCE_DIR})
set(KOKKOS_PATH ${KOKKOS_SRC_PATH})
SET(Kokkos_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
SET(KOKKOS_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
SET(KOKKOS_SRC_PATH ${Kokkos_SOURCE_DIR})
SET(KOKKOS_PATH ${Kokkos_SOURCE_DIR})
SET(KOKKOS_TOP_BUILD_DIR ${CMAKE_CURRENT_BINARY_DIR})

#------------ COMPILER AND FEATURE CHECKS ------------------------------------
include(${KOKKOS_SRC_PATH}/cmake/kokkos_functions.cmake)
set_kokkos_cxx_compiler()
set_kokkos_cxx_standard()

#------------ GET OPTIONS AND KOKKOS_SETTINGS --------------------------------
# Add Kokkos' modules to CMake's module path.
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${Kokkos_SOURCE_DIR}/cmake/Modules/")

set(KOKKOS_CMAKE_VERBOSE True)
include(${KOKKOS_SRC_PATH}/cmake/kokkos_options.cmake)

include(${KOKKOS_SRC_PATH}/cmake/kokkos_settings.cmake)

#------------ GENERATE HEADER AND SOURCE FILES -------------------------------
execute_process(
  COMMAND ${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} PREFIX=${CMAKE_INSTALL_PREFIX} generate_build_settings
  WORKING_DIRECTORY "${Kokkos_BINARY_DIR}"
  OUTPUT_FILE ${Kokkos_BINARY_DIR}/core_src_make.out
  RESULT_VARIABLE GEN_SETTINGS_RESULT
)
if (GEN_SETTINGS_RESULT)
  message(FATAL_ERROR "Kokkos settings generation failed:\n"
    "${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings")
endif()
include(${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake)
install(FILES ${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake DESTINATION lib/cmake/Kokkos)
install(FILES ${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake DESTINATION ${CMAKE_INSTALL_PREFIX})
string(REPLACE " " ";" KOKKOS_TPL_INCLUDE_DIRS "${KOKKOS_GMAKE_TPL_INCLUDE_DIRS}")
string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_DIRS "${KOKKOS_GMAKE_TPL_LIBRARY_DIRS}")
string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_NAMES "${KOKKOS_GMAKE_TPL_LIBRARY_NAMES}")
list(REMOVE_ITEM KOKKOS_TPL_INCLUDE_DIRS "")
list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_DIRS "")
list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_NAMES "")
set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})

#------------ NOW BUILD ------------------------------------------------------
include(${KOKKOS_SRC_PATH}/cmake/kokkos_build.cmake)

#------------ Add in Fake Tribits Handling to allow unit test builds- --------

include(${KOKKOS_SRC_PATH}/cmake/tribits.cmake)

TRIBITS_PACKAGE_DECL(Kokkos)

ADD_SUBDIRECTORY(core)
ADD_SUBDIRECTORY(containers)
ADD_SUBDIRECTORY(algorithms)
# Needed to simplify syntax of if statements
CMAKE_POLICY(SET CMP0054 NEW)

# Is this a build as part of Trilinos?
IF(COMMAND TRIBITS_PACKAGE_DECL)
  SET(KOKKOS_HAS_TRILINOS ON)
ELSE()
  SET(KOKKOS_HAS_TRILINOS OFF)
ENDIF()

INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_functions.cmake)
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_pick_cxx_std.cmake)

SET(KOKKOS_ENABLED_OPTIONS) #exported in config file
SET(KOKKOS_ENABLED_DEVICES) #exported in config file
SET(KOKKOS_ENABLED_TPLS) #exported in config file
SET(KOKKOS_ENABLED_ARCH_LIST) #exported in config file

#These are helper flags used for sanity checks during config
#Certain features should depend on other features being configured first
SET(KOKKOS_CFG_DAG_NONE On) #sentinel to indicate no dependencies
SET(KOKKOS_CFG_DAG_DEVICES_DONE Off)
SET(KOKKOS_CFG_DAG_OPTIONS_DONE Off)
SET(KOKKOS_CFG_DAG_ARCH_DONE Off)
SET(KOKKOS_CFG_DAG_CXX_STD_DONE Off)
SET(KOKKOS_CFG_DAG_COMPILER_ID_DONE Off)
FUNCTION(KOKKOS_CFG_DEPENDS SUCCESSOR PRECURSOR)
  SET(PRE_FLAG KOKKOS_CFG_DAG_${PRECURSOR})
  SET(POST_FLAG KOKKOS_CFG_DAG_${SUCCESSOR})
  IF (NOT ${PRE_FLAG})
    MESSAGE(FATAL_ERROR "Bad CMake refactor: feature ${SUCCESSOR} cannot be configured until ${PRECURSOR} is configured")
  ENDIF()
  GLOBAL_SET(${POST_FLAG} On)
ENDFUNCTION()


LIST(APPEND CMAKE_MODULE_PATH cmake/Modules)

IF(NOT KOKKOS_HAS_TRILINOS)
  cmake_minimum_required(VERSION 3.10 FATAL_ERROR)
  set(CMAKE_DISABLE_SOURCE_CHANGES ON)
  set(CMAKE_DISABLE_IN_SOURCE_BUILD ON)
  IF (Spack_WORKAROUND)
    #if we are explicitly using Spack for development,
    #nuke the Spack compiler
    SET(SPACK_CXX $ENV{SPACK_CXX})
    IF(SPACK_CXX)
      SET(CMAKE_CXX_COMPILER ${SPACK_CXX} CACHE STRING "the C++ compiler" FORCE)
      SET(ENV{CXX} ${SPACK_CXX})
    ENDIF()
  ENDif()
  IF(NOT DEFINED ${PROJECT_NAME})
    PROJECT(Kokkos CXX)
  ENDIF()
ENDIF()

IF (NOT CMAKE_SIZEOF_VOID_P)
  STRING(FIND ${CMAKE_CXX_COMPILER} nvcc_wrapper FIND_IDX)
  IF (NOT FIND_IDX STREQUAL -1)
    MESSAGE(FATAL_ERROR "Kokkos did not configure correctly and failed to validate compiler. The most likely cause is CUDA linkage using nvcc_wrapper. Please ensure your CUDA environment is correctly configured.")
  ELSE()
    MESSAGE(FATAL_ERROR "Kokkos did not configure correctly and failed to validate compiler. The most likely cause is linkage errors during CMake compiler validation. Please consult the CMake error log shown below for the exact error during compiler validation")
  ENDIF()
ELSEIF (NOT CMAKE_SIZEOF_VOID_P EQUAL 8)
  MESSAGE(FATAL_ERROR "Kokkos assumes a 64-bit build; i.e., 8-byte pointers, but found ${CMAKE_SIZEOF_VOID_P}-byte pointers instead")
ENDIF()

set(Kokkos_VERSION_MAJOR 3)
set(Kokkos_VERSION_MINOR 0)
set(Kokkos_VERSION_PATCH 0)
set(Kokkos_VERSION "${Kokkos_VERSION_MAJOR}.${Kokkos_VERSION_MINOR}.${Kokkos_VERSION_PATCH}")

IF(${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.12.0")
  MESSAGE(STATUS "Setting policy CMP0074 to use <Package>_ROOT variables")
  CMAKE_POLICY(SET CMP0074 NEW)
ENDIF()

# Load either the real TriBITS or a TriBITS wrapper
# for certain utility functions that are universal (like GLOBAL_SET)
INCLUDE(${KOKKOS_SRC_PATH}/cmake/fake_tribits.cmake)

IF (Kokkos_ENABLE_CUDA AND ${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.14.0")
  #If we are building CUDA, we have tricked CMake because we declare a CXX project
  #If the default C++ standard for a given compiler matches the requested
  #standard, then CMake just omits the -std flag in later versions of CMake
  #This breaks CUDA compilation (CUDA compiler can have a different default
  #-std then the underlying host compiler by itself). Setting this variable
  #forces CMake to always add the -std flag even if it thinks it doesn't need it
  GLOBAL_SET(CMAKE_CXX_STANDARD_DEFAULT 98)
ENDIF()

# These are the variables we will append to as we go
# I really wish these were regular variables
# but scoping issues can make it difficult
GLOBAL_RESET(KOKKOS_COMPILE_OPTIONS)
GLOBAL_RESET(KOKKOS_LINK_OPTIONS)
GLOBAL_RESET(KOKKOS_CUDA_OPTIONS)
GLOBAL_RESET(KOKKOS_CUDAFE_OPTIONS)
GLOBAL_RESET(KOKKOS_XCOMPILER_OPTIONS)
# We need to append text here for making sure TPLs
# we import are available for an installed Kokkos
GLOBAL_RESET(KOKKOS_TPL_EXPORTS)
# We need these for controlling the exact -std flag
GLOBAL_RESET(KOKKOS_DONT_ALLOW_EXTENSIONS)
GLOBAL_RESET(KOKKOS_USE_CXX_EXTENSIONS)
GLOBAL_RESET(KOKKOS_CXX_STANDARD_FEATURE)

# Include a set of Kokkos-specific wrapper functions that
# will either call raw CMake or TriBITS
# These are functions like KOKKOS_INCLUDE_DIRECTORIES
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_tribits.cmake)


# The build environment setup goes in the following steps
# 1) Check all the enable options. This includes checking Kokkos_DEVICES
# 2) Check the compiler ID (type and version)
# 3) Check the CXX standard and select important CXX flags
# 4) Check for any third-party libraries (TPLs) like hwloc
# 5) Check if optimizing for a particular architecture and add arch-specific flags
KOKKOS_SETUP_BUILD_ENVIRONMENT()

# Finish off the build
# 6) Recurse into subdirectories and configure individual libraries
# 7) Export and install targets

OPTION(BUILD_SHARED_LIBS "Build shared libraries" OFF)
# Workaround for building position independent code.
IF(BUILD_SHARED_LIBS)
  SET(CMAKE_POSITION_INDEPENDENT_CODE ON)
ENDIF()

SET(KOKKOS_EXT_LIBRARIES Kokkos::kokkos Kokkos::kokkoscore Kokkos::kokkoscontainers Kokkos::kokkosalgorithms)
SET(KOKKOS_INT_LIBRARIES kokkos kokkoscore kokkoscontainers kokkosalgorithms)
SET_PROPERTY(GLOBAL PROPERTY KOKKOS_INT_LIBRARIES ${KOKKOS_INT_LIBRARIES})

GET_DIRECTORY_PROPERTY(HAS_PARENT PARENT_DIRECTORY)
IF (KOKKOS_HAS_TRILINOS)
  SET(TRILINOS_INCDIR ${CMAKE_INSTALL_PREFIX}/${${PROJECT_NAME}_INSTALL_INCLUDE_DIR})
  SET(KOKKOS_HEADER_DIR ${TRILINOS_INCDIR})
  SET(KOKKOS_IS_SUBDIRECTORY TRUE)
ELSEIF(HAS_PARENT)
  SET(KOKKOS_HEADER_DIR "include/kokkos")
  SET(KOKKOS_IS_SUBDIRECTORY TRUE)
ELSE()
  SET(KOKKOS_HEADER_DIR "${CMAKE_INSTALL_INCLUDEDIR}")
  SET(KOKKOS_IS_SUBDIRECTORY FALSE)
ENDIF()


#------------------------------------------------------------------------------
#
# A) Forward declare the package so that certain options are also defined for
# subpackages
#

TRIBITS_PACKAGE_DECL(Kokkos) # ENABLE_SHADOWING_WARNINGS)
## This restores the old behavior of ProjectCompilerPostConfig.cmake
# It sets the CMAKE_CXX_FLAGS globally to those used by Kokkos
# We must do this before KOKKOS_PACKAGE_DECL
IF (KOKKOS_HAS_TRILINOS)
  # Overwrite the old flags at the top-level
  # Because Tribits doesn't use lists, it uses spaces for the list of CXX flags
  # we have to match the annoying behavior
  STRING(REPLACE ";" " " KOKKOSCORE_COMPILE_OPTIONS "${KOKKOS_COMPILE_OPTIONS}")
  STRING(REPLACE ";" " " KOKKOSCORE_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONS}")
  FOREACH(CUDAFE_FLAG ${KOKKOS_CUDAFE_OPTIONS})
    SET(KOKKOSCORE_CUDAFE_OPTIONS "${KOKKOSCORE_CUDAFE_OPTIONS} -Xcudafe ${CUDAFE_FLAG}")
  ENDFOREACH()
  FOREACH(XCOMP_FLAG ${KOKKOS_XCOMPILER_OPTIONS})
    SET(KOKKOSCORE_XCOMPILER_OPTIONS "${KOKKOSCORE_XCOMPILER_OPTIONS} -Xcompiler ${XCOMP_FLAG}")
  ENDFOREACH()
  SET(KOKKOSCORE_CXX_FLAGS "${KOKKOSCORE_COMPILE_OPTIONS} ${CMAKE_CXX${KOKKOS_CXX_STANDARD}_STANDARD_COMPILE_OPTION} ${KOKKOSCORE_CUDA_OPTIONS} ${KOKKOSCORE_CUDAFE_OPTIONS} ${KOKKOSCORE_XCOMPILER_OPTIONS}")
  # Both parent scope and this package
  # In ProjectCompilerPostConfig.cmake, we capture the "global" flags Trilinos wants in
  # TRILINOS_TOPLEVEL_CXX_FLAGS
  SET(CMAKE_CXX_FLAGS "${TRILINOS_TOPLEVEL_CXX_FLAGS} ${KOKKOSCORE_CXX_FLAGS}" PARENT_SCOPE)
  SET(CMAKE_CXX_FLAGS "${TRILINOS_TOPLEVEL_CXX_FLAGS} ${KOKKOSCORE_CXX_FLAGS}")
  #CMAKE_CXX_FLAGS will get added to Kokkos and Kokkos dependencies automatically here
  #These flags get set up in KOKKOS_PACKAGE_DECL, which means they
  #must be configured before KOKKOS_PACKAGE_DECL
ENDIF()

KOKKOS_PACKAGE_DECL()
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
#
|
||||
# B) Install Kokkos' build files
|
||||
# D) Process the subpackages (subdirectories) for Kokkos
|
||||
#
|
||||
# If using the Makefile-generated files, then need to set things up.
|
||||
# Here, assume that TriBITS has been run from ProjectCompilerPostConfig.cmake
|
||||
# and already generated KokkosCore_config.h and kokkos_generated_settings.cmake
|
||||
# in the previously define Kokkos_GEN_DIR
|
||||
# We need to copy them over to the correct place and source the cmake file
|
||||
|
||||
if(NOT KOKKOS_LEGACY_TRIBITS)
|
||||
set(Kokkos_GEN_DIR ${CMAKE_BINARY_DIR})
|
||||
file(COPY "${Kokkos_GEN_DIR}/KokkosCore_config.h"
|
||||
DESTINATION "${CMAKE_CURRENT_BINARY_DIR}" USE_SOURCE_PERMISSIONS)
|
||||
install(FILES "${Kokkos_GEN_DIR}/KokkosCore_config.h"
|
||||
DESTINATION include)
|
||||
file(COPY "${Kokkos_GEN_DIR}/kokkos_generated_settings.cmake"
|
||||
DESTINATION "${CMAKE_CURRENT_BINARY_DIR}" USE_SOURCE_PERMISSIONS)
|
||||
|
||||
include(${CMAKE_CURRENT_BINARY_DIR}/kokkos_generated_settings.cmake)
|
||||
# Sources come from makefile-generated kokkos_generated_settings.cmake file
|
||||
# Enable using the individual sources if needed
|
||||
set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})
|
||||
endif ()
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
#
|
||||
# C) Install Kokkos' executable scripts
|
||||
#
|
||||
|
||||
# nvcc_wrapper is Kokkos' wrapper for NVIDIA's NVCC CUDA compiler.
|
||||
# Kokkos needs nvcc_wrapper in order to build. Other libraries and
|
||||
# executables also need nvcc_wrapper. Thus, we need to install it.
|
||||
# If the argument of DESTINATION is a relative path, CMake computes it
|
||||
# as relative to ${CMAKE_INSTALL_PREFIX}.
|
||||
|
||||
INSTALL(PROGRAMS ${CMAKE_CURRENT_SOURCE_DIR}/bin/nvcc_wrapper DESTINATION bin)
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
#
|
||||
# D) Process the subpackages for Kokkos
|
||||
#
|
||||
|
||||
TRIBITS_PROCESS_SUBPACKAGES()
|
||||
KOKKOS_PROCESS_SUBPACKAGES()
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
@ -130,10 +220,39 @@ TRIBITS_PROCESS_SUBPACKAGES()
|
||||
# E) If Kokkos itself is enabled, process the Kokkos package
|
||||
#
|
||||
|
||||
TRIBITS_PACKAGE_DEF()
|
||||
KOKKOS_PACKAGE_DEF()
|
||||
KOKKOS_EXCLUDE_AUTOTOOLS_FILES()
|
||||
KOKKOS_PACKAGE_POSTPROCESS()
|
||||
|
||||
TRIBITS_EXCLUDE_AUTOTOOLS_FILES()
|
||||
|
||||
TRIBITS_PACKAGE_POSTPROCESS()
|
||||
#We are ready to configure the header
|
||||
CONFIGURE_FILE(cmake/KokkosCore_config.h.in KokkosCore_config.h @ONLY)
|
||||
|
||||
IF (NOT KOKKOS_HAS_TRILINOS)
|
||||
ADD_LIBRARY(kokkos INTERFACE)
|
||||
#Make sure in-tree projects can reference this as Kokkos::
|
||||
#to match the installed target names
|
||||
ADD_LIBRARY(Kokkos::kokkos ALIAS kokkos)
|
||||
TARGET_LINK_LIBRARIES(kokkos INTERFACE kokkoscore kokkoscontainers kokkosalgorithms)
|
||||
KOKKOS_INTERNAL_ADD_LIBRARY_INSTALL(kokkos)
|
||||
ENDIF()
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_install.cmake)
|
||||
|
||||
# nvcc_wrapper is Kokkos' wrapper for NVIDIA's NVCC CUDA compiler.
|
||||
# Kokkos needs nvcc_wrapper in order to build. Other libraries and
|
||||
# executables also need nvcc_wrapper. Thus, we need to install it.
|
||||
# If the argument of DESTINATION is a relative path, CMake computes it
|
||||
# as relative to ${CMAKE_INSTALL_PREFIX}.
|
||||
INSTALL(PROGRAMS ${CMAKE_CURRENT_SOURCE_DIR}/bin/nvcc_wrapper DESTINATION ${CMAKE_INSTALL_BINDIR})
|
||||
INSTALL(FILES "${CMAKE_CURRENT_BINARY_DIR}/KokkosCore_config.h" DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
|
||||
|
||||
|
||||
# Finally - if we are a subproject - make sure the enabled devices are visible
|
||||
IF (HAS_PARENT)
|
||||
FOREACH(DEV ${Kokkos_ENABLED_DEVICES})
|
||||
#I would much rather not make these cache variables or global properties, but I can't
|
||||
#make any guarantees on whether PARENT_SCOPE is good enough to make
|
||||
#these variables visible where I need them
|
||||
SET(Kokkos_ENABLE_${DEV} ON PARENT_SCOPE)
|
||||
SET_PROPERTY(GLOBAL PROPERTY Kokkos_ENABLE_${DEV} ON)
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
|
||||
14
lib/kokkos/CONTRIBUTING.md
Normal file
@ -0,0 +1,14 @@
|
||||
# Contributing to Kokkos
|
||||
|
||||
## Pull Requests
|
||||
We actively welcome pull requests.
|
||||
1. Fork the repo and create your branch from `develop`.
|
||||
2. If you've added code that should be tested, add tests.
|
||||
3. If you've changed APIs, update the documentation.
|
||||
4. Ensure the test suite passes.
|
||||
|
||||
## Issues
|
||||
We use GitHub issues to track public bugs. Please ensure your description is clear and has sufficient instructions to be able to reproduce the issue.
|
||||
|
||||
## License
|
||||
By contributing to Kokkos, you agree that your contributions will be licensed under the LICENSE file in the root directory of this source tree.
|
||||
@ -1,10 +1,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -22,10 +23,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
|
||||
@ -1,10 +1,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Kokkos is licensed under 3-clause BSD terms of use:
|
||||
@ -24,10 +25,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
|
||||
@ -23,14 +23,16 @@ KOKKOS_DEBUG ?= "no"
|
||||
KOKKOS_USE_TPLS ?= ""
|
||||
# Options: c++11,c++14,c++1y,c++17,c++1z,c++2a
|
||||
KOKKOS_CXX_STANDARD ?= "c++11"
|
||||
# Options: aggressive_vectorization,disable_profiling,enable_deprecated_code,disable_deprecated_code,enable_large_mem_tests
|
||||
# Options: aggressive_vectorization,disable_profiling,enable_deprecated_code,disable_deprecated_code,enable_large_mem_tests,disable_complex_align
|
||||
KOKKOS_OPTIONS ?= ""
|
||||
# Option for setting ETI path
|
||||
KOKKOS_ETI_PATH ?= ${KOKKOS_PATH}/core/src/eti
|
||||
KOKKOS_CMAKE ?= "no"
|
||||
KOKKOS_TRIBITS ?= "no"
|
||||
KOKKOS_STANDALONE_CMAKE ?= "no"
|
||||
|
||||
# Default settings specific options.
|
||||
# Options: force_uvm,use_ldg,rdc,enable_lambda
|
||||
# Options: force_uvm,use_ldg,rdc,enable_lambda,enable_constexpr
|
||||
KOKKOS_CUDA_OPTIONS ?= "enable_lambda"
|
||||
|
||||
# Default settings specific options.
|
||||
@ -47,7 +49,8 @@ kokkos_has_string=$(if $(findstring $2,$1),1,0)
|
||||
# Will return a 1 if /path/to/file exists
|
||||
kokkos_path_exists=$(if $(wildcard $1),1,0)
|
||||
|
||||
# Check for general settings.
|
||||
# Check for general settings
|
||||
|
||||
KOKKOS_INTERNAL_ENABLE_DEBUG := $(call kokkos_has_string,$(KOKKOS_DEBUG),yes)
|
||||
KOKKOS_INTERNAL_ENABLE_CXX11 := $(call kokkos_has_string,$(KOKKOS_CXX_STANDARD),c++11)
|
||||
KOKKOS_INTERNAL_ENABLE_CXX14 := $(call kokkos_has_string,$(KOKKOS_CXX_STANDARD),c++14)
|
||||
@ -67,6 +70,7 @@ KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION := $(call kokkos_has_string,$
|
||||
KOKKOS_INTERNAL_DISABLE_PROFILING := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_profiling)
|
||||
KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_deprecated_code)
|
||||
KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_deprecated_code)
|
||||
KOKKOS_INTERNAL_DISABLE_COMPLEX_ALIGN := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_complex_align)
|
||||
KOKKOS_INTERNAL_DISABLE_DUALVIEW_MODIFY_CHECK := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_dualview_modify_check)
|
||||
KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_profile_load_print)
|
||||
KOKKOS_INTERNAL_ENABLE_LARGE_MEM_TESTS := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_large_mem_tests)
|
||||
@ -74,6 +78,7 @@ KOKKOS_INTERNAL_CUDA_USE_LDG := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),
|
||||
KOKKOS_INTERNAL_CUDA_USE_UVM := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),force_uvm)
|
||||
KOKKOS_INTERNAL_CUDA_USE_RELOC := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),rdc)
|
||||
KOKKOS_INTERNAL_CUDA_USE_LAMBDA := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),enable_lambda)
|
||||
KOKKOS_INTERNAL_CUDA_USE_CONSTEXPR := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),enable_constexpr)
|
||||
KOKKOS_INTERNAL_HPX_ENABLE_ASYNC_DISPATCH := $(call kokkos_has_string,$(KOKKOS_HPX_OPTIONS),enable_async_dispatch)
|
||||
KOKKOS_INTERNAL_ENABLE_ETI := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_eti)
|
||||
|
||||
@ -123,7 +128,7 @@ KOKKOS_INTERNAL_COMPILER_INTEL := $(call kokkos_has_string,$(KOKKOS_CXX_VE
|
||||
KOKKOS_INTERNAL_COMPILER_PGI := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),PGI)
|
||||
KOKKOS_INTERNAL_COMPILER_XL := $(strip $(shell $(CXX) -qversion 2>&1 | grep XL | wc -l))
|
||||
KOKKOS_INTERNAL_COMPILER_CRAY := $(strip $(shell $(CXX) -craype-verbose 2>&1 | grep "CC-" | wc -l))
|
||||
KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
|
||||
KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); echo "$(shell $(CXX) --version 2>&1 | grep nvcc | wc -l)>0" | bc))
|
||||
KOKKOS_INTERNAL_COMPILER_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),clang)
|
||||
KOKKOS_INTERNAL_COMPILER_APPLE_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),Apple LLVM)
|
||||
KOKKOS_INTERNAL_COMPILER_HCC := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),HCC)
|
||||
@ -383,10 +388,10 @@ endif
|
||||
|
||||
# Generating the list of Flags.
|
||||
|
||||
#CPPFLAGS is now unused
|
||||
KOKKOS_CPPFLAGS =
|
||||
KOKKOS_LIBDIRS =
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS = -I./ -I$(KOKKOS_PATH)/core/src -I$(KOKKOS_PATH)/containers/src -I$(KOKKOS_PATH)/algorithms/src -I$(KOKKOS_ETI_PATH)
|
||||
KOKKOS_CPPFLAGS = -I./ -I$(KOKKOS_PATH)/core/src -I$(KOKKOS_PATH)/containers/src -I$(KOKKOS_PATH)/algorithms/src -I$(KOKKOS_ETI_PATH)
|
||||
endif
|
||||
KOKKOS_TPL_INCLUDE_DIRS =
|
||||
KOKKOS_TPL_LIBRARY_DIRS =
|
||||
@ -399,7 +404,7 @@ endif
|
||||
KOKKOS_LIBS = -ldl
|
||||
KOKKOS_TPL_LIBRARY_NAMES += dl
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_LDFLAGS = -L$(shell pwd)
|
||||
KOKKOS_LIBDIRS = -L$(shell pwd)
|
||||
# CXXLDFLAGS is used together with CXXFLAGS in a combined compile/link command
|
||||
KOKKOS_CXXLDFLAGS = -L$(shell pwd)
|
||||
endif
|
||||
@ -492,28 +497,38 @@ ifeq ($(KOKKOS_INTERNAL_USE_ISA_POWERPCBE), 1)
|
||||
tmp := $(call kokkos_append_header,"\#endif")
|
||||
endif
|
||||
|
||||
#only add the c++ standard flags if this is not CMake
|
||||
tmp := $(call kokkos_append_header,"/* General Settings */")
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX11), 1)
|
||||
ifneq ($(KOKKOS_STANDALONE_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX11_FLAG)
|
||||
endif
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX11")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX14), 1)
|
||||
ifneq ($(KOKKOS_STANDALONE_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX14_FLAG)
|
||||
endif
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX14")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX1Y), 1)
|
||||
#I cannot make CMake add this in a good way - so add it here
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX1Y_FLAG)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX14")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX17), 1)
|
||||
ifneq ($(KOKKOS_STANDALONE_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX17_FLAG)
|
||||
endif
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX17")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX1Z), 1)
|
||||
#I cannot make CMake add this in a good way - so add it here
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX1Z_FLAG)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX17")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_CXX2A), 1)
|
||||
#I cannot make CMake add this in a good way - so add it here
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CXX2A_FLAG)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CXX20")
|
||||
endif
|
||||
@ -531,23 +546,26 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_DEBUG), 1)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK")
|
||||
endif
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_DISABLE_COMPLEX_ALIGN), 0)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_COMPLEX_ALIGN")
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT), 1)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_PROFILING_LOAD_PRINT")
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_USE_HWLOC), 1)
|
||||
ifneq ($(HWLOC_PATH),)
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += -I$(HWLOC_PATH)/include
|
||||
endif
|
||||
KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
|
||||
ifneq ($(HWLOC_PATH),)
|
||||
KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
|
||||
KOKKOS_LIBDIRS += -L$(HWLOC_PATH)/lib
|
||||
KOKKOS_CXXLDFLAGS += -L$(HWLOC_PATH)/lib
|
||||
KOKKOS_TPL_INCLUDE_DIRS += $(HWLOC_PATH)/include
|
||||
KOKKOS_TPL_LIBRARY_DIRS += $(HWLOC_PATH)/lib
|
||||
endif
|
||||
KOKKOS_LIBS += -lhwloc
|
||||
KOKKOS_TPL_LIBRARY_NAMES += hwloc
|
||||
endif
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_HWLOC")
|
||||
endif
|
||||
|
||||
@ -558,17 +576,17 @@ ifeq ($(KOKKOS_INTERNAL_USE_LIBRT), 1)
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
|
||||
ifneq ($(MEMKIND_PATH),)
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += -I$(MEMKIND_PATH)/include
|
||||
endif
|
||||
KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
|
||||
ifneq ($(MEMKIND_PATH),)
|
||||
KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
|
||||
KOKKOS_LIBDIRS += -L$(MEMKIND_PATH)/lib
|
||||
KOKKOS_CXXLDFLAGS += -L$(MEMKIND_PATH)/lib
|
||||
KOKKOS_TPL_INCLUDE_DIRS += $(MEMKIND_PATH)/include
|
||||
KOKKOS_TPL_LIBRARY_DIRS += $(MEMKIND_PATH)/lib
|
||||
endif
|
||||
KOKKOS_LIBS += -lmemkind -lnuma
|
||||
KOKKOS_TPL_LIBRARY_NAMES += memkind numa
|
||||
endif
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_HBWSPACE")
|
||||
endif
|
||||
|
||||
@ -580,9 +598,6 @@ ifeq ($(KOKKOS_INTERNAL_USE_HPX), 0)
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE), 1)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEPRECATED_CODE")
|
||||
endif
|
||||
ifeq ($(KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE), 0)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEPRECATED_CODE")
|
||||
endif
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_ETI), 1)
|
||||
@ -648,6 +663,21 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
|
||||
endif
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_CUDA_USE_CONSTEXPR), 1)
|
||||
ifeq ($(KOKKOS_INTERNAL_COMPILER_NVCC), 1)
|
||||
ifeq ($(shell test $(KOKKOS_INTERNAL_COMPILER_NVCC_VERSION) -ge 80; echo $$?),0)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CUDA_CONSTEXPR")
|
||||
KOKKOS_CXXFLAGS += -expt-relaxed-constexpr
|
||||
else
|
||||
$(warning Warning: Cuda relaxed constexpr support was requested but NVCC version is too low. This requires NVCC for Cuda version 8.0 or higher. Disabling relaxed constexpr support now.)
|
||||
endif
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_CUDA_CONSTEXPR")
|
||||
endif
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
|
||||
tmp := $(call kokkos_append_header,"\#define KOKKOS_IMPL_CUDA_CLANG_WORKAROUND")
|
||||
endif
|
||||
@ -1089,15 +1119,13 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_ETI), 1)
|
||||
endif
|
||||
KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.hpp)
|
||||
ifneq ($(CUDA_PATH),)
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += -I$(CUDA_PATH)/include
|
||||
endif
|
||||
KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
|
||||
ifeq ($(call kokkos_path_exists,$(CUDA_PATH)/lib64), 1)
|
||||
KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
|
||||
KOKKOS_LIBDIRS += -L$(CUDA_PATH)/lib64
|
||||
KOKKOS_CXXLDFLAGS += -L$(CUDA_PATH)/lib64
|
||||
KOKKOS_TPL_LIBRARY_DIRS += $(CUDA_PATH)/lib64
|
||||
else ifeq ($(call kokkos_path_exists,$(CUDA_PATH)/lib), 1)
|
||||
KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib
|
||||
KOKKOS_LIBDIRS += -L$(CUDA_PATH)/lib
|
||||
KOKKOS_CXXLDFLAGS += -L$(CUDA_PATH)/lib
|
||||
KOKKOS_TPL_LIBRARY_DIRS += $(CUDA_PATH)/lib
|
||||
else
|
||||
@ -1153,11 +1181,10 @@ endif
|
||||
ifeq ($(KOKKOS_INTERNAL_USE_QTHREADS), 1)
|
||||
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.cpp)
|
||||
KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.hpp)
|
||||
ifneq ($(QTHREADS_PATH),)
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += -I$(QTHREADS_PATH)/include
|
||||
endif
|
||||
KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
|
||||
ifneq ($(QTHREADS_PATH),)
|
||||
KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
|
||||
KOKKOS_LIBDIRS += -L$(QTHREADS_PATH)/lib
|
||||
KOKKOS_CXXLDFLAGS += -L$(QTHREADS_PATH)/lib
|
||||
KOKKOS_TPL_INCLUDE_DIRS += $(QTHREADS_PATH)/include
|
||||
KOKKOS_TPL_LIBRARY_DIRS += $(QTHREADS_PATH)/lib64
|
||||
@ -1165,6 +1192,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_QTHREADS), 1)
|
||||
KOKKOS_LIBS += -lqthread
|
||||
KOKKOS_TPL_LIBRARY_NAMES += qthread
|
||||
endif
|
||||
endif
|
||||
|
||||
ifeq ($(KOKKOS_INTERNAL_USE_HPX), 1)
|
||||
KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/HPX/*.cpp)
|
||||
@ -1173,21 +1201,21 @@ ifeq ($(KOKKOS_INTERNAL_USE_HPX), 1)
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_DEBUG), 1)
|
||||
KOKKOS_CXXFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --cflags hpx_application_debug)
|
||||
KOKKOS_CXXLDFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application_debug)
|
||||
KOKKOS_LDFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application_debug)
|
||||
KOKKOS_LIBS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application_debug)
|
||||
else
|
||||
KOKKOS_CXXFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --cflags hpx_application)
|
||||
KOKKOS_CXXLDFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application)
|
||||
KOKKOS_LDFLAGS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application)
|
||||
KOKKOS_LIBS += $(shell PKG_CONFIG_PATH=$(HPX_PATH)/lib64/pkgconfig pkg-config --libs hpx_application)
|
||||
endif
|
||||
else
|
||||
ifeq ($(KOKKOS_INTERNAL_ENABLE_DEBUG), 1)
|
||||
KOKKOS_CXXFLAGS += $(shell pkg-config --cflags hpx_application_debug)
|
||||
KOKKOS_CXXLDFLAGS += $(shell pkg-config --libs hpx_application_debug)
|
||||
KOKKOS_LDFLAGS += $(shell pkg-config --libs hpx_application_debug)
|
||||
KOKKOS_LIBS += $(shell pkg-config --libs hpx_application_debug)
|
||||
else
|
||||
KOKKOS_CXXFLAGS += $(shell pkg-config --cflags hpx_application)
|
||||
KOKKOS_CXXLDFLAGS += $(shell pkg-config --libs hpx_application)
|
||||
KOKKOS_LDFLAGS += $(shell pkg-config --libs hpx_application)
|
||||
KOKKOS_LIBS += $(shell pkg-config --libs hpx_application)
|
||||
endif
|
||||
endif
|
||||
KOKKOS_TPL_LIBRARY_NAMES += hpx
|
||||
@ -1248,4 +1276,16 @@ libkokkos.a: $(KOKKOS_OBJ_LINK) $(KOKKOS_SRC) $(KOKKOS_HEADERS)
|
||||
ar cr libkokkos.a $(KOKKOS_OBJ_LINK)
|
||||
ranlib libkokkos.a
|
||||
|
||||
print-cxx-flags:
|
||||
echo "$(KOKKOS_CXXFLAGS)"
|
||||
|
||||
KOKKOS_LINK_DEPENDS=libkokkos.a
|
||||
|
||||
#we have carefully separated LDFLAGS from LIBS and LIBDIRS
|
||||
#we have also separated CPPFLAGS from CXXFLAGS
|
||||
#if this is not cmake, for backwards compatibility
|
||||
#we just jam everything together into the CXXFLAGS and LDFLAGS
|
||||
ifneq ($(KOKKOS_CMAKE), yes)
|
||||
KOKKOS_CXXFLAGS += $(KOKKOS_CPPFLAGS)
|
||||
KOKKOS_LDFLAGS += $(KOKKOS_LIBDIRS)
|
||||
endif
|
||||
|
||||
@ -6,6 +6,8 @@ Kokkos_CPUDiscovery.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/impl/Kokkos
|
||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/impl/Kokkos_CPUDiscovery.cpp
|
||||
Kokkos_Error.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/impl/Kokkos_Error.cpp
|
||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/impl/Kokkos_Error.cpp
|
||||
Kokkos_Stacktrace.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/impl/Kokkos_Stacktrace.cpp
|
||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/impl/Kokkos_Stacktrace.cpp
|
||||
Kokkos_ExecPolicy.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/impl/Kokkos_ExecPolicy.cpp
|
||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/impl/Kokkos_ExecPolicy.cpp
|
||||
Kokkos_HostSpace.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/impl/Kokkos_HostSpace.cpp
|
||||
|
||||
@ -1,193 +0,0 @@
|
||||
Kokkos Core implements a programming model in C++ for writing performance portable
|
||||
applications targeting all major HPC platforms. For that purpose it provides
|
||||
abstractions for both parallel execution of code and data management.
|
||||
Kokkos is designed to target complex node architectures with N-level memory
|
||||
hierarchies and multiple types of execution resources. It currently can use
|
||||
OpenMP, Pthreads and CUDA as backend programming models.
|
||||
|
||||
Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
|
||||
which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as
|
||||
profiling and debugging tools (https://github.com/kokkos/kokkos-tools).
|
||||
|
||||
# Learning about Kokkos
|
||||
|
||||
A programming guide can be found on the Wiki; the API reference is under development.
|
||||
|
||||
For questions, find us on Slack: https://kokkosteam.slack.com or open a GitHub issue.
|
||||
|
||||
For non-public questions send an email to
|
||||
crtrott(at)sandia.gov
|
||||
|
||||
A separate repository with extensive tutorial material can be found under
|
||||
https://github.com/kokkos/kokkos-tutorials.
|
||||
|
||||
Furthermore, the 'example/tutorial' directory provides step by step tutorial
|
||||
examples which explain many of the features of Kokkos. They work with
|
||||
simple Makefiles. To build with g++ and OpenMP simply type 'make'
|
||||
in the 'example/tutorial' directory. This will build all examples in the
|
||||
subfolders. To change the build options refer to the Programming Guide
|
||||
in the compilation section.
|
||||
|
||||
To learn more about Kokkos consider watching one of our presentations:
|
||||
* GTC 2015:
|
||||
- http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
|
||||
- http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
|
||||
|
||||
|
||||
# Contributing to Kokkos
|
||||
|
||||
We are open and try to encourage contributions from external developers.
|
||||
To do so please first open an issue describing the contribution and then issue
|
||||
a pull request against the develop branch. For larger features it may be good
|
||||
to get guidance from the core development team first through the GitHub issue.
|
||||
|
||||
Note that Kokkos Core is licensed under standard 3-clause BSD terms of use.
|
||||
This means that contributing to Kokkos allows anyone else to use your contributions
|
||||
not just for public purposes but also for closed source commercial projects.
|
||||
For specifics see the LICENSE file contained in the repository or distribution.
|
||||
|
||||
# Requirements
|
||||
|
||||
### Primary tested compilers on X86 are:
|
||||
* GCC 4.8.4
|
||||
* GCC 4.9.3
|
||||
* GCC 5.1.0
|
||||
* GCC 5.5.0
|
||||
* GCC 6.1.0
|
||||
* GCC 7.2.0
|
||||
* GCC 7.3.0
|
||||
* GCC 8.1.0
|
||||
* Intel 15.0.2
|
||||
* Intel 16.0.1
|
||||
* Intel 17.0.1
|
||||
* Intel 17.4.196
|
||||
* Intel 18.2.128
|
||||
* Clang 3.6.1
|
||||
* Clang 3.7.1
|
||||
* Clang 3.8.1
|
||||
* Clang 3.9.0
|
||||
* Clang 4.0.0
|
||||
* Clang 6.0.0 for CUDA (CUDA Toolkit 9.0)
|
||||
* Clang 7.0.0 for CUDA (CUDA Toolkit 9.1)
|
||||
* PGI 18.7
|
||||
* NVCC 7.5 for CUDA (with gcc 4.8.4)
|
||||
* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
|
||||
* NVCC 9.1 for CUDA (with gcc 6.1.0)
|
||||
* NVCC 9.2 for CUDA (with gcc 7.2.0)
|
||||
* NVCC 10.0 for CUDA (with gcc 7.4.0)
|
||||
|
||||
### Primary tested compilers on Power 8 are:
|
||||
* GCC 6.4.0 (OpenMP,Serial)
|
||||
* GCC 7.2.0 (OpenMP,Serial)
|
||||
* IBM XL 16.1.0 (OpenMP, Serial)
|
||||
* NVCC 9.2.88 for CUDA (with gcc 7.2.0 and XL 16.1.0)
|
||||
|
||||
### Primary tested compilers on Intel KNL are:
|
||||
* Intel 16.4.258 (with gcc 4.7.2)
|
||||
* Intel 17.2.174 (with gcc 4.9.3)
|
||||
* Intel 18.2.199 (with gcc 4.9.3)
|
||||
|
||||
### Primary tested compilers on ARM (Cavium ThunderX2)
|
||||
* GCC 7.2.0
|
||||
* ARM/Clang 18.4.0
|
||||
|
||||
### Other compilers working:
|
||||
* X86:
|
||||
- Cygwin 2.1.0 64bit with gcc 4.9.3
|
||||
- GCC 8.1.0 (not warning free)
|
||||
|
||||
### Known non-working combinations:
|
||||
* Power8:
|
||||
- Pthreads backend
|
||||
* ARM
|
||||
- Pthreads backend
|
||||
|
||||
|
||||
Primary tested compilers pass in release mode
|
||||
with warnings as errors. They also are tested with a comprehensive set of
|
||||
backend combinations (i.e. OpenMP, Pthreads, Serial, OpenMP+Serial, ...).
|
||||
We are using the following set of flags:
|
||||
GCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits
|
||||
-Wignored-qualifiers -Wempty-body -Wclobbered -Wuninitialized
|
||||
Intel: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
|
||||
Clang: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
|
||||
NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitialized
|
||||
|
||||
Other compilers are tested occasionally, in particular when pushing from develop to
|
||||
master branch, without -Werror and only for a select set of backends.
|
||||
|
||||
# Running Unit Tests
|
||||
|
||||
To run the unit tests create a build directory and run the following commands
|
||||
|
||||
KOKKOS_PATH/generate_makefile.bash
|
||||
make build-test
|
||||
make test
|
||||
|
||||
Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
|
||||
changing the device type for which to build.
|
||||
|
||||
# Installing the library
|
||||
|
||||
To install Kokkos as a library create a build directory and run the following
|
||||
|
||||
KOKKOS_PATH/generate_makefile.bash --prefix=INSTALL_PATH
|
||||
make kokkoslib
|
||||
make install
|
||||
|
||||
Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
|
||||
changing the device type for which to build.
|
||||
|
||||
Note that in many cases it is preferable to build Kokkos inline with an
|
||||
application. The main reason is that you may otherwise need many different
|
||||
configurations of Kokkos installed depending on the required compile time
|
||||
features an application needs. For example there is only one default
|
||||
execution space, which means you need different installations to have OpenMP
|
||||
or Pthreads as the default space. Also for the CUDA backend there are certain
|
||||
choices, such as allowing relocatable device code, which must be made at
|
||||
installation time. Building Kokkos inline uses largely the same process
|
||||
as compiling an application against an installed Kokkos library. See for
|
||||
example benchmarks/bytes_and_flops/Makefile which can be used with an installed
|
||||
library and for an inline build.
|
||||
|
||||
### CMake
|
||||
|
||||
Kokkos supports being built as part of a CMake application. An example can
|
||||
be found in example/cmake_build.
|
||||
|
||||
# Kokkos and CUDA UVM
|
||||
|
||||
Kokkos does support UVM as a specific memory space called CudaUVMSpace.
|
||||
Allocations made with that space are accessible from host and device.
|
||||
You can tell Kokkos to use that as the default space for Cuda allocations.
|
||||
In either case UVM comes with a number of restrictions:
|
||||
(i) You can't access allocations on the host while a kernel is potentially
|
||||
running. This will lead to segfaults. To avoid that you either need to
|
||||
call Kokkos::Cuda::fence() (or just Kokkos::fence()), after kernels, or
|
||||
you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
|
||||
Furthermore in multi socket multi GPU machines without NVLINK, UVM defaults
|
||||
to using zero copy allocations for technical reasons related to using multiple
|
||||
GPUs from the same process. If an executable doesn't do that (e.g. each
|
||||
MPI rank of an application uses a single GPU [can be the same GPU for
|
||||
multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
|
||||
This will enforce proper UVM allocations, but can lead to errors if
|
||||
more than a single GPU is used by a single process.
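
The two environment variables mentioned above can be set for a single run rather than for the whole session. The following is a minimal sketch, not part of Kokkos: `run_with_uvm_env` is a hypothetical helper name, and `env` stands in for a real Kokkos executable. Setting `CUDA_LAUNCH_BLOCKING=1` is the alternative to calling `Kokkos::fence()` after kernels, and `CUDA_MANAGED_FORCE_DEVICE_ALLOC=1` is assumed to be safe only when each process uses a single GPU, as described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kokkos): runs a command with the two
# UVM-related environment variables named above set for that command only.
# CUDA_MANAGED_FORCE_DEVICE_ALLOC=1 is only appropriate when each process
# uses a single GPU.
run_with_uvm_env() {
  CUDA_LAUNCH_BLOCKING=1 CUDA_MANAGED_FORCE_DEVICE_ALLOC=1 "$@"
}

# `env` stands in for a real Kokkos executable here; it prints the
# environment that the launched command would see.
run_with_uvm_env env | grep '^CUDA_'
```

Scoping the variables to one command keeps them from leaking into unrelated runs in the same shell, which matters because the second variable can cause errors for multi-GPU processes.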
|
||||
|
||||
|
||||
# Citing Kokkos
|
||||
|
||||
If you publish work which mentions Kokkos, please cite the following paper:
|
||||
|
||||
@article{CarterEdwards20143202,
|
||||
title = "Kokkos: Enabling manycore performance portability through polymorphic memory access patterns ",
|
||||
journal = "Journal of Parallel and Distributed Computing ",
|
||||
volume = "74",
|
||||
number = "12",
|
||||
pages = "3202 - 3216",
|
||||
year = "2014",
|
||||
note = "Domain-Specific Languages and High-Level Frameworks for High-Performance Computing ",
|
||||
issn = "0743-7315",
|
||||
doi = "https://doi.org/10.1016/j.jpdc.2014.07.003",
|
||||
url = "http://www.sciencedirect.com/science/article/pii/S0743731514001257",
|
||||
author = "H. Carter Edwards and Christian R. Trott and Daniel Sunderland"
|
||||
}
|
||||
299
lib/kokkos/README.md
Normal file
@ -0,0 +1,299 @@
|
||||

|
||||
|
||||
# Kokkos: Core Libraries
|
||||
|
||||
Kokkos Core implements a programming model in C++ for writing performance portable
|
||||
applications targeting all major HPC platforms. For that purpose it provides
|
||||
abstractions for both parallel execution of code and data management.
|
||||
Kokkos is designed to target complex node architectures with N-level memory
|
||||
hierarchies and multiple types of execution resources. It currently can use
|
||||
CUDA, HPX, OpenMP and Pthreads as backend programming models with several other
|
||||
backends in development.
|
||||
|
||||
Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
|
||||
which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as
|
||||
profiling and debugging tools (https://github.com/kokkos/kokkos-tools).
|
||||
|
||||
# Learning about Kokkos
|
||||
|
||||
A programming guide can be found on the Wiki; the API reference is under development.
|
||||
|
||||
For questions, find us on Slack: https://kokkosteam.slack.com or open a GitHub issue.
|
||||
|
||||
For non-public questions send an email to
|
||||
crtrott(at)sandia.gov
|
||||
|
||||
A separate repository with extensive tutorial material can be found under
|
||||
https://github.com/kokkos/kokkos-tutorials.
|
||||
|
||||
Furthermore, the 'example/tutorial' directory provides step by step tutorial
|
||||
examples which explain many of the features of Kokkos. They work with
|
||||
simple Makefiles. To build with g++ and OpenMP simply type 'make'
|
||||
in the 'example/tutorial' directory. This will build all examples in the
|
||||
subfolders. To change the build options refer to the Programming Guide
|
||||
in the compilation section.
|
||||
|
||||
To learn more about Kokkos consider watching one of our presentations:
|
||||
* GTC 2015:
|
||||
- http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
|
||||
- http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
|
||||
|
||||
|
||||
# Contributing to Kokkos
|
||||
|
||||
We welcome and encourage contributions from external developers.
|
||||
To do so please first open an issue describing the contribution and then issue
|
||||
a pull request against the develop branch. For larger features it may be good
|
||||
to get guidance from the core development team first through the GitHub issue.
|
||||
|
||||
Note that Kokkos Core is licensed under standard 3-clause BSD terms of use,
|
||||
which means contributing to Kokkos allows anyone else to use your contributions
|
||||
not just for public purposes but also for closed source commercial projects.
|
||||
For specifics see the LICENSE file contained in the repository or distribution.
|
||||
|
||||
# Requirements
|
||||
|
||||
### Primary tested compilers on X86 are:
|
||||
* GCC 4.8.4
|
||||
* GCC 4.9.3
|
||||
* GCC 5.1.0
|
||||
* GCC 5.4.0
|
||||
* GCC 5.5.0
|
||||
* GCC 6.1.0
|
||||
* GCC 7.2.0
|
||||
* GCC 7.3.0
|
||||
* GCC 8.1.0
|
||||
* Intel 15.0.2
|
||||
* Intel 16.0.1
|
||||
* Intel 17.0.1
|
||||
* Intel 17.4.196
|
||||
* Intel 18.2.128
|
||||
* Clang 3.6.1
|
||||
* Clang 3.7.1
|
||||
* Clang 3.8.1
|
||||
* Clang 3.9.0
|
||||
* Clang 4.0.0
|
||||
* Clang 6.0.0 for CUDA (CUDA Toolkit 9.0)
|
||||
* Clang 7.0.0 for CUDA (CUDA Toolkit 9.1)
|
||||
* Clang 8.0.0 for CUDA (CUDA Toolkit 9.2)
|
||||
* PGI 18.7
|
||||
* NVCC 9.1 for CUDA (with gcc 6.1.0)
|
||||
* NVCC 9.2 for CUDA (with gcc 7.2.0)
|
||||
* NVCC 10.0 for CUDA (with gcc 7.4.0)
|
||||
* NVCC 10.1 for CUDA (with gcc 7.4.0)
|
||||
|
||||
### Primary tested compilers on Power 8 are:
|
||||
* GCC 6.4.0 (OpenMP,Serial)
|
||||
* GCC 7.2.0 (OpenMP,Serial)
|
||||
* IBM XL 16.1.0 (OpenMP, Serial)
|
||||
* NVCC 9.2.88 for CUDA (with gcc 7.2.0 and XL 16.1.0)
|
||||
|
||||
### Primary tested compilers on Intel KNL are:
|
||||
* Intel 16.4.258 (with gcc 4.7.2)
|
||||
* Intel 17.2.174 (with gcc 4.9.3)
|
||||
* Intel 18.2.199 (with gcc 4.9.3)
|
||||
|
||||
### Primary tested compilers on ARM (Cavium ThunderX2) are:
|
||||
* GCC 7.2.0
|
||||
* ARM/Clang 18.4.0
|
||||
|
||||
### Other compilers working:
|
||||
* X86:
|
||||
* Cygwin 2.1.0 64bit with gcc 4.9.3
|
||||
* GCC 8.1.0 (not warning free)
|
||||
|
||||
### Known non-working combinations:
|
||||
* Power8:
|
||||
* Pthreads backend
|
||||
* ARM
|
||||
* Pthreads backend
|
||||
|
||||
|
||||
Primary tested compilers pass in release mode
|
||||
with warnings as errors. They also are tested with a comprehensive set of
|
||||
backend combinations (i.e. OpenMP, Pthreads, Serial, OpenMP+Serial, ...).
|
||||
We are using the following set of flags:
|
||||
* GCC:
|
||||
````
|
||||
-Wall -Wshadow -pedantic
|
||||
-Werror -Wsign-compare -Wtype-limits
|
||||
-Wignored-qualifiers -Wempty-body
|
||||
-Wclobbered -Wuninitialized
|
||||
````
|
||||
* Intel:
|
||||
````
|
||||
-Wall -Wshadow -pedantic
|
||||
-Werror -Wsign-compare -Wtype-limits
|
||||
-Wuninitialized
|
||||
````
|
||||
* Clang:
|
||||
````
|
||||
-Wall -Wshadow -pedantic
|
||||
-Werror -Wsign-compare -Wtype-limits
|
||||
-Wuninitialized
|
||||
````
|
||||
|
||||
* NVCC:
|
||||
````
|
||||
-Wall -Wshadow -pedantic
|
||||
-Werror -Wsign-compare -Wtype-limits
|
||||
-Wuninitialized
|
||||
````
|
||||
|
||||
Other compilers are tested occasionally, in particular when pushing from develop to
|
||||
master branch. These are tested less rigorously without `-Werror` and only for a select set of backends.
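
As a rough illustration of what the warning flags listed above enforce, the following sketch shows a small function written to stay clean under `-Wall -Wshadow -Wsign-compare -Werror`. The function name `sum_prefix` and its parameters are hypothetical and are not part of Kokkos.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Kokkos): sums the first `count` entries
// of `values`, or all of them if fewer than `count` exist.
inline long sum_prefix(const std::vector<int>& values, std::size_t count) {
  long total = 0;
  // The index is std::size_t: comparing an `int` index against
  // values.size() would mix signed and unsigned operands, which
  // -Wsign-compare reports and -Werror turns into a build failure.
  for (std::size_t i = 0; i < count && i < values.size(); ++i) {
    // The loop-local variable has its own name: declaring another `total`
    // or `count` here would hide the outer one, which -Wshadow reports.
    const int entry = values[i];
    total += entry;
  }
  return total;
}
```

Calling `sum_prefix({1, 2, 3, 4}, 3)` adds the first three entries. Code contributed to a project built with these flags generally needs the same care with index types and variable names.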
|
||||
|
||||
# Building and Installing Kokkos
|
||||
Kokkos provides a CMake build system and a raw Makefile build system.
|
||||
The CMake build system is strongly encouraged and will be the most rigorously supported in future releases.
|
||||
Full details are given in the [build instructions](BUILD.md). Basic setups are shown here:
|
||||
|
||||
## CMake
|
||||
|
||||
The best way to install Kokkos is using the CMake build system. Assuming Kokkos lives in `$srcdir`:
|
||||
````
|
||||
cmake $srcdir \
|
||||
-DCMAKE_CXX_COMPILER=$path_to_compiler \
|
||||
-DCMAKE_INSTALL_PREFIX=$path_to_install \
|
||||
-DKokkos_ENABLE_OPENMP=On \
|
||||
-DKokkos_ARCH_HSW=On \
|
||||
-DKokkos_ENABLE_HWLOC=On \
|
||||
-DKokkos_HWLOC_DIR=$path_to_hwloc
|
||||
````
|
||||
then simply type `make install`. The Kokkos CMake package will then be installed in `$path_to_install` to be used by downstream packages.
|
||||
|
||||
To validate the Kokkos build, configure with
|
||||
````
|
||||
-DKokkos_ENABLE_TESTS=On
|
||||
````
|
||||
and run `make test` after completing the build.
|
||||
|
||||
For your CMake project using Kokkos, code such as the following:
|
||||
|
||||
````
|
||||
find_package(Kokkos)
|
||||
...
|
||||
target_link_libraries(myTarget Kokkos::kokkos)
|
||||
````
|
||||
should be added to your CMakeLists.txt. Your configure should additionally include
|
||||
````
|
||||
-DKokkos_DIR=$path_to_install/lib/cmake/Kokkos
|
||||
````
|
||||
or
|
||||
````
|
||||
-DKokkos_ROOT=$path_to_install
|
||||
````
|
||||
for the install location given above.
|
||||
|
||||
## Spack
|
||||
An alternative to manually building with CMake is to use the Spack package manager.
|
||||
To do so, download the `kokkos-spack` git repo and add it to the package list:
|
||||
````
|
||||
spack repo add $path-to-kokkos-spack
|
||||
````
|
||||
A basic installation would be done as:
|
||||
````
|
||||
spack install kokkos
|
||||
````
|
||||
Spack allows options and compilers to be tuned in the install command:
|
||||
````
|
||||
spack install kokkos@3.0 %gcc@7.3.0 +openmp
|
||||
````
|
||||
This example illustrates the three most common parameters to Spack:
|
||||
* Variants: specified with, e.g. `+openmp`, this activates (or deactivates with, e.g. `~openmp`) certain options.
|
||||
* Version: immediately following `kokkos`, the `@version` suffix selects a particular Kokkos version to build.
|
||||
* Compiler: a default compiler will be chosen if not specified, but an exact compiler version can be given with the `%` option.
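
To make the anatomy of such a spec concrete, here is a toy sketch that splits a spec string into the three kinds of parameters described above. It is only an illustration of the syntax: `SpecParts` and `split_spec` are hypothetical names, and this is not how Spack itself parses specs.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Toy model of the pieces of a spec; not Spack's real data structure.
struct SpecParts {
  std::string package;                // e.g. "kokkos"
  std::string version;                // text after '@' on the package token
  std::string compiler;               // token introduced by '%'
  std::vector<std::string> enabled;   // variants given with '+'
  std::vector<std::string> disabled;  // variants given with '~'
};

// Splits a whitespace-separated spec such as
// "kokkos@3.0 %gcc@7.3.0 +openmp" into its parts.
inline SpecParts split_spec(const std::string& spec) {
  SpecParts parts;
  std::istringstream in(spec);
  std::string token;
  while (in >> token) {  // operator>> never yields an empty token
    if (token[0] == '%') {
      parts.compiler = token.substr(1);
    } else if (token[0] == '+') {
      parts.enabled.push_back(token.substr(1));
    } else if (token[0] == '~') {
      parts.disabled.push_back(token.substr(1));
    } else if (parts.package.empty()) {
      // First plain token: package name with an optional @version suffix.
      const std::string::size_type at = token.find('@');
      parts.package = token.substr(0, at);
      if (at != std::string::npos) parts.version = token.substr(at + 1);
    }
  }
  return parts;
}
```

For the install command shown above, the sketch would report package `kokkos`, version `3.0`, compiler `gcc@7.3.0`, and one enabled variant, `openmp`.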
|
||||
|
||||
For a complete list of Kokkos options, run:
|
||||
````
|
||||
spack info kokkos
|
||||
````
|
||||
Spack currently installs packages to a location determined by a unique hash. This hash name is not really "human readable".
|
||||
Generally, Spack usage should never really require you to reference the computer-generated unique install folder.
|
||||
More details are given in the [build instructions](BUILD.md). If you must know, you can locate Spack Kokkos installations with:
|
||||
````
|
||||
spack find -p kokkos ...
|
||||
````
|
||||
where `...` is the unique spec identifying the particular Kokkos configuration and version.
|
||||
|
||||
|
||||
## Raw Makefile
|
||||
A bash script is provided to generate raw makefiles.
|
||||
To install Kokkos as a library, create a build directory and run the following:
|
||||
````
|
||||
$KOKKOS_PATH/generate_makefile.bash --prefix=$path_to_install
|
||||
````
|
||||
Once the Makefile is generated, run:
|
||||
````
|
||||
make kokkoslib
|
||||
make install
|
||||
````
|
||||
To additionally run the unit tests:
|
||||
````
|
||||
make build-test
|
||||
make test
|
||||
````
|
||||
Run `generate_makefile.bash --help` for more detailed options such as
|
||||
changing the device type for which to build.
|
||||
|
||||
## Inline Builds vs. Installed Package
|
||||
For individual projects, it may be preferable to build Kokkos inline rather than link to an installed package.
|
||||
The main reason is that you may otherwise need many different
|
||||
configurations of Kokkos installed depending on the required compile time
|
||||
features an application needs. For example there is only one default
|
||||
execution space, which means you need different installations to have OpenMP
|
||||
or Pthreads as the default space. Also for the CUDA backend there are certain
|
||||
choices, such as allowing relocatable device code, which must be made at
|
||||
installation time. Building Kokkos inline uses largely the same process
|
||||
as compiling an application against an installed Kokkos library.
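
To make the compile-time nature of these choices concrete, here is a minimal sketch, independent of Kokkos itself, of code whose behavior is fixed by configuration macros at the moment it is compiled. The macro names `KOKKOS_ENABLE_CUDA`, `KOKKOS_ENABLE_OPENMP`, and `KOKKOS_ENABLE_THREADS` are assumed to match what a configured Kokkos defines; the helper `default_backend_name` and the priority order it encodes are hypothetical.

```cpp
#include <string>

// Hypothetical helper (not part of Kokkos): reports which backend this
// translation unit would treat as the default, based only on which
// configuration macros were defined when it was compiled.
// Assumed priority order: Cuda, then OpenMP, then Threads, then Serial.
inline std::string default_backend_name() {
#if defined(KOKKOS_ENABLE_CUDA)
  return "Cuda";
#elif defined(KOKKOS_ENABLE_OPENMP)
  return "OpenMP";
#elif defined(KOKKOS_ENABLE_THREADS)
  return "Threads";
#else
  return "Serial";
#endif
}
```

Compiled with none of these macros defined, the helper reports `Serial`. Getting a different answer requires recompiling with different definitions, which mirrors why each configuration needs either its own installation or an inline build.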
|
||||
|
||||
For CMake, this means copying over the Kokkos source code into your project and adding `add_subdirectory(kokkos)` to your CMakeLists.txt.
|
||||
|
||||
For raw Makefiles, see the example `benchmarks/bytes_and_flops/Makefile`, which can be used with either an installed library or an inline build.
|
||||
|
||||
# Kokkos and CUDA UVM
|
||||
|
||||
Kokkos supports UVM through a dedicated memory space called `CudaUVMSpace`.
|
||||
Allocations made with that space are accessible from host and device.
|
||||
You can tell Kokkos to use that as the default space for Cuda allocations.
|
||||
In either case UVM comes with a number of restrictions:
|
||||
* You can't access allocations on the host while a kernel is potentially
|
||||
running. This will lead to segfaults. To avoid that you either need to
|
||||
call Kokkos::Cuda::fence() (or just Kokkos::fence()), after kernels, or
|
||||
you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
|
||||
* In multi-socket, multi-GPU machines without NVLINK, UVM defaults
|
||||
to using zero copy allocations for technical reasons related to using multiple
|
||||
GPUs from the same process. If an executable doesn't do that (e.g. each
|
||||
MPI rank of an application uses a single GPU [can be the same GPU for
|
||||
multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
|
||||
This will enforce proper UVM allocations, but can lead to errors if
|
||||
more than a single GPU is used by a single process.
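
The first restriction is an instance of a general rule: the host must synchronize with asynchronous work before touching memory that the work may still be writing. The sketch below is only an analogy in standard C++, with no Kokkos or CUDA involved. `std::async` stands in for an asynchronous kernel launch and `wait()` plays the role of `Kokkos::fence()`; the names `launch_fill` and `sum_after_fence` are hypothetical.

```cpp
#include <future>
#include <vector>

// Stand-in for a device kernel: fills the buffer on another thread and
// returns immediately with a handle to the running work.
inline std::future<void> launch_fill(std::vector<int>& data, int value) {
  return std::async(std::launch::async, [&data, value] {
    for (int& x : data) x = value;
  });
}

// Host-side read that is only safe after synchronization.
inline long long sum_after_fence(std::vector<int>& data, int value) {
  std::future<void> kernel = launch_fill(data, value);
  // Reading `data` before this point would race with the writer thread.
  kernel.wait();  // analogous to calling Kokkos::fence() after a kernel
  long long sum = 0;
  for (int x : data) sum += x;
  return sum;
}
```

On some toolchains this needs to be linked with thread support (for example `-pthread`). With real UVM allocations the failure mode of skipping the synchronization is the segfault described above rather than a silent data race.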
|
||||
|
||||
|
||||
# Citing Kokkos
|
||||
|
||||
If you publish work which mentions Kokkos, please cite the following paper:
|
||||
|
||||
````
|
||||
@article{CarterEdwards20143202,
|
||||
title = "Kokkos: Enabling manycore performance portability through polymorphic memory access patterns ",
|
||||
journal = "Journal of Parallel and Distributed Computing ",
|
||||
volume = "74",
|
||||
number = "12",
|
||||
pages = "3202 - 3216",
|
||||
year = "2014",
|
||||
note = "Domain-Specific Languages and High-Level Frameworks for High-Performance Computing ",
|
||||
issn = "0743-7315",
|
||||
doi = "https://doi.org/10.1016/j.jpdc.2014.07.003",
|
||||
url = "http://www.sciencedirect.com/science/article/pii/S0743731514001257",
|
||||
author = "H. Carter Edwards and Christian R. Trott and Daniel Sunderland"
|
||||
}
|
||||
````
|
||||
|
||||
##### [LICENSE](https://github.com/kokkos/kokkos/blob/master/LICENSE)
|
||||
|
||||
[](https://opensource.org/licenses/BSD-3-Clause)
|
||||
|
||||
Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
the U.S. Government retains certain rights in this software.
|
||||
|
||||
@ -1,12 +1,12 @@
|
||||
|
||||
|
||||
TRIBITS_SUBPACKAGE(Algorithms)
|
||||
KOKKOS_SUBPACKAGE(Algorithms)
|
||||
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
ADD_SUBDIRECTORY(src)
|
||||
ENDIF()
|
||||
|
||||
TRIBITS_ADD_TEST_DIRECTORIES(unit_tests)
|
||||
#TRIBITS_ADD_TEST_DIRECTORIES(performance_tests)
|
||||
KOKKOS_ADD_TEST_DIRECTORIES(unit_tests)
|
||||
|
||||
KOKKOS_SUBPACKAGE_POSTPROCESS()
|
||||
|
||||
|
||||
|
||||
TRIBITS_SUBPACKAGE_POSTPROCESS()
|
||||
|
||||
@ -1,8 +1,9 @@
|
||||
|
||||
TRIBITS_CONFIGURE_FILE(${PACKAGE_NAME}_config.h)
|
||||
KOKKOS_CONFIGURE_FILE(${PACKAGE_NAME}_config.h)
|
||||
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR})
|
||||
#I have to leave these here for tribits
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR})
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
@ -12,10 +13,18 @@ LIST(APPEND HEADERS ${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME}_config.h)
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
|
||||
TRIBITS_ADD_LIBRARY(
|
||||
# We have to pass the sources in here for Tribits
|
||||
# These will get ignored for standalone CMake and a true interface library made
|
||||
KOKKOS_ADD_INTERFACE_LIBRARY(
|
||||
kokkosalgorithms
|
||||
HEADERS ${HEADERS}
|
||||
SOURCES ${SOURCES}
|
||||
DEPLIBS
|
||||
)
|
||||
KOKKOS_LIB_INCLUDE_DIRECTORIES(kokkosalgorithms
|
||||
${KOKKOS_TOP_BUILD_DIR}
|
||||
${CMAKE_CURRENT_BINARY_DIR}
|
||||
${CMAKE_CURRENT_SOURCE_DIR}
|
||||
)
|
||||
|
||||
|
||||
|
||||
|
||||
File diff suppressed because it is too large
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -41,7 +42,6 @@
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
|
||||
#ifndef KOKKOS_SORT_HPP_
|
||||
#define KOKKOS_SORT_HPP_
|
||||
|
||||
@ -53,15 +53,14 @@ namespace Kokkos {
|
||||
|
||||
namespace Impl {
|
||||
|
||||
template< class DstViewType , class SrcViewType
|
||||
, int Rank = DstViewType::Rank >
|
||||
template <class DstViewType, class SrcViewType, int Rank = DstViewType::Rank>
|
||||
struct CopyOp;
|
||||
|
||||
template <class DstViewType, class SrcViewType>
|
||||
struct CopyOp<DstViewType, SrcViewType, 1> {
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static void copy(DstViewType const& dst, size_t i_dst,
|
||||
SrcViewType const& src, size_t i_src ) {
|
||||
static void copy(DstViewType const& dst, size_t i_dst, SrcViewType const& src,
|
||||
size_t i_src) {
|
||||
dst(i_dst) = src(i_src);
|
||||
}
|
||||
};
|
||||
@ -69,38 +68,33 @@ namespace Kokkos {
|
||||
template <class DstViewType, class SrcViewType>
|
||||
struct CopyOp<DstViewType, SrcViewType, 2> {
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static void copy(DstViewType const& dst, size_t i_dst,
|
||||
SrcViewType const& src, size_t i_src ) {
|
||||
for(int j = 0;j< (int) dst.extent(1); j++)
|
||||
dst(i_dst,j) = src(i_src,j);
|
||||
static void copy(DstViewType const& dst, size_t i_dst, SrcViewType const& src,
|
||||
size_t i_src) {
|
||||
for (int j = 0; j < (int)dst.extent(1); j++) dst(i_dst, j) = src(i_src, j);
|
||||
}
|
||||
};
|
||||
|
||||
template <class DstViewType, class SrcViewType>
|
||||
struct CopyOp<DstViewType, SrcViewType, 3> {
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
static void copy(DstViewType const& dst, size_t i_dst,
|
||||
SrcViewType const& src, size_t i_src ) {
|
||||
static void copy(DstViewType const& dst, size_t i_dst, SrcViewType const& src,
|
||||
size_t i_src) {
|
||||
for (int j = 0; j < dst.extent(1); j++)
|
||||
for (int k = 0; k < dst.extent(2); k++)
|
||||
dst(i_dst, j, k) = src(i_src, j, k);
|
||||
}
|
||||
};
|
||||
}
|
||||
} // namespace Impl
|
||||
|
||||
//----------------------------------------------------------------------------
|
||||
|
||||
template< class KeyViewType
|
||||
, class BinSortOp
|
||||
, class Space = typename KeyViewType::device_type
|
||||
, class SizeType = typename KeyViewType::memory_space::size_type
|
||||
>
|
||||
template <class KeyViewType, class BinSortOp,
|
||||
class Space = typename KeyViewType::device_type,
|
||||
class SizeType = typename KeyViewType::memory_space::size_type>
|
||||
class BinSort {
|
||||
public:
|
||||
|
||||
template <class DstViewType, class SrcViewType>
|
||||
struct copy_functor {
|
||||
|
||||
typedef typename SrcViewType::const_type src_view_type;
|
||||
|
||||
typedef Impl::CopyOp<DstViewType, src_view_type> copy_op;
|
||||
@ -109,14 +103,11 @@ public:
|
||||
src_view_type src_values;
|
||||
int dst_offset;
|
||||
|
||||
copy_functor( DstViewType const & dst_values_
|
||||
, int const & dst_offset_
|
||||
, SrcViewType const & src_values_
|
||||
)
|
||||
: dst_values( dst_values_ )
|
||||
, src_values( src_values_ )
|
||||
, dst_offset( dst_offset_ )
|
||||
{}
|
||||
copy_functor(DstViewType const& dst_values_, int const& dst_offset_,
|
||||
SrcViewType const& src_values_)
|
||||
: dst_values(dst_values_),
|
||||
src_values(src_values_),
|
||||
dst_offset(dst_offset_) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int& i) const {
|
||||
@ -124,24 +115,18 @@ public:
|
||||
}
|
||||
};
|
||||
|
||||
template< class DstViewType
|
||||
, class PermuteViewType
|
||||
, class SrcViewType
|
||||
>
|
||||
template <class DstViewType, class PermuteViewType, class SrcViewType>
|
||||
struct copy_permute_functor {
|
||||
|
||||
// If a Kokkos::View then can generate constant random access
|
||||
// otherwise can only use the constant type.
|
||||
|
||||
typedef typename std::conditional
|
||||
< Kokkos::is_view< SrcViewType >::value
|
||||
, Kokkos::View< typename SrcViewType::const_data_type
|
||||
, typename SrcViewType::array_layout
|
||||
, typename SrcViewType::device_type
|
||||
, Kokkos::MemoryTraits<Kokkos::RandomAccess>
|
||||
>
|
||||
, typename SrcViewType::const_type
|
||||
>::type src_view_type ;
|
||||
typedef typename std::conditional<
|
||||
Kokkos::is_view<SrcViewType>::value,
|
||||
Kokkos::View<typename SrcViewType::const_data_type,
|
||||
typename SrcViewType::array_layout,
|
||||
typename SrcViewType::device_type,
|
||||
Kokkos::MemoryTraits<Kokkos::RandomAccess> >,
|
||||
typename SrcViewType::const_type>::type src_view_type;
|
||||
|
||||
typedef typename PermuteViewType::const_type perm_view_type;
|
||||
|
||||
@ -152,16 +137,13 @@ public:
|
||||
src_view_type src_values;
|
||||
int src_offset;
|
||||
|
||||
copy_permute_functor( DstViewType const & dst_values_
|
||||
, PermuteViewType const & sort_order_
|
||||
, SrcViewType const & src_values_
|
||||
, int const & src_offset_
|
||||
)
|
||||
: dst_values( dst_values_ )
|
||||
, sort_order( sort_order_ )
|
||||
, src_values( src_values_ )
|
||||
, src_offset( src_offset_ )
|
||||
{}
|
||||
copy_permute_functor(DstViewType const& dst_values_,
|
||||
PermuteViewType const& sort_order_,
|
||||
SrcViewType const& src_values_, int const& src_offset_)
|
||||
: dst_values(dst_values_),
|
||||
sort_order(sort_order_),
|
||||
src_values(src_values_),
|
||||
src_offset(src_offset_) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const int& i) const {
|
||||
@ -178,7 +160,6 @@ public:
|
||||
struct bin_sort_bins_tag {};
|
||||
|
||||
public:
|
||||
|
||||
typedef SizeType size_type;
|
||||
typedef size_type value_type;
|
||||
|
||||
@ -190,27 +171,25 @@ public:
|
||||
// If a Kokkos::View then can generate constant random access
|
||||
// otherwise can only use the constant type.
|
||||
|
||||
typedef typename std::conditional
|
||||
< Kokkos::is_view< KeyViewType >::value
|
||||
, Kokkos::View< typename KeyViewType::const_data_type,
|
||||
typedef typename std::conditional<
|
||||
Kokkos::is_view<KeyViewType>::value,
|
||||
Kokkos::View<typename KeyViewType::const_data_type,
|
||||
typename KeyViewType::array_layout,
|
||||
typename KeyViewType::device_type,
|
||||
Kokkos::MemoryTraits<Kokkos::RandomAccess> >
|
||||
, const_key_view_type
|
||||
>::type const_rnd_key_view_type;
|
||||
Kokkos::MemoryTraits<Kokkos::RandomAccess> >,
|
||||
const_key_view_type>::type const_rnd_key_view_type;
|
||||
|
||||
typedef typename KeyViewType::non_const_value_type non_const_key_scalar;
|
||||
typedef typename KeyViewType::const_value_type const_key_scalar;
|
||||
|
||||
typedef Kokkos::View<int*, Space, Kokkos::MemoryTraits<Kokkos::Atomic> > bin_count_atomic_type ;
|
||||
typedef Kokkos::View<int*, Space, Kokkos::MemoryTraits<Kokkos::Atomic> >
|
||||
bin_count_atomic_type;
|
||||
|
||||
private:
|
||||
|
||||
const_key_view_type keys;
|
||||
const_rnd_key_view_type keys_rnd;
|
||||
|
||||
public:
|
||||
|
||||
BinSortOp bin_op;
|
||||
offset_type bin_offsets;
|
||||
bin_count_atomic_type bin_count_atomic;
|
||||
@ -222,62 +201,72 @@ public:
|
||||
bool sort_within_bins;
|
||||
|
||||
public:
|
||||
|
||||
BinSort() {}
|
||||
|
||||
//----------------------------------------
|
||||
// Constructor: takes the keys, the binning_operator and optionally whether to sort within bins (default false)
|
||||
BinSort( const_key_view_type keys_
|
||||
, int range_begin_
|
||||
, int range_end_
|
||||
, BinSortOp bin_op_
|
||||
, bool sort_within_bins_ = false
|
||||
)
|
||||
: keys(keys_)
|
||||
, keys_rnd(keys_)
|
||||
, bin_op(bin_op_)
|
||||
, bin_offsets()
|
||||
, bin_count_atomic()
|
||||
, bin_count_const()
|
||||
, sort_order()
|
||||
, range_begin( range_begin_ )
|
||||
, range_end( range_end_ )
|
||||
, sort_within_bins( sort_within_bins_ )
|
||||
{
|
||||
bin_count_atomic = Kokkos::View<int*, Space >("Kokkos::SortImpl::BinSortFunctor::bin_count",bin_op.max_bins());
|
||||
// Constructor: takes the keys, the binning_operator and optionally whether to
|
||||
// sort within bins (default false)
|
||||
BinSort(const_key_view_type keys_, int range_begin_, int range_end_,
|
||||
BinSortOp bin_op_, bool sort_within_bins_ = false)
|
||||
: keys(keys_),
|
||||
keys_rnd(keys_),
|
||||
bin_op(bin_op_),
|
||||
bin_offsets(),
|
||||
bin_count_atomic(),
|
||||
bin_count_const(),
|
||||
sort_order(),
|
||||
range_begin(range_begin_),
|
||||
range_end(range_end_),
|
||||
sort_within_bins(sort_within_bins_) {
|
||||
bin_count_atomic = Kokkos::View<int*, Space>(
|
||||
"Kokkos::SortImpl::BinSortFunctor::bin_count", bin_op.max_bins());
|
||||
bin_count_const = bin_count_atomic;
|
||||
bin_offsets = offset_type(ViewAllocateWithoutInitializing("Kokkos::SortImpl::BinSortFunctor::bin_offsets"),bin_op.max_bins());
|
||||
sort_order = offset_type(ViewAllocateWithoutInitializing("Kokkos::SortImpl::BinSortFunctor::sort_order"),range_end-range_begin);
|
||||
bin_offsets =
|
||||
offset_type(ViewAllocateWithoutInitializing(
|
||||
"Kokkos::SortImpl::BinSortFunctor::bin_offsets"),
|
||||
bin_op.max_bins());
|
||||
sort_order =
|
||||
offset_type(ViewAllocateWithoutInitializing(
|
||||
"Kokkos::SortImpl::BinSortFunctor::sort_order"),
|
||||
range_end - range_begin);
|
||||
}
|
||||
|
||||
BinSort( const_key_view_type keys_
|
||||
, BinSortOp bin_op_
|
||||
, bool sort_within_bins_ = false
|
||||
)
|
||||
BinSort(const_key_view_type keys_, BinSortOp bin_op_,
|
||||
bool sort_within_bins_ = false)
|
||||
: BinSort(keys_, 0, keys_.extent(0), bin_op_, sort_within_bins_) {}
|
||||
|
||||
//----------------------------------------
|
||||
// Create the permutation vector, the bin_offset array and the bin_count array. Can be called again if keys changed
|
||||
// Create the permutation vector, the bin_offset array and the bin_count
|
||||
// array. Can be called again if keys changed
|
||||
void create_permute_vector() {
|
||||
const size_t len = range_end - range_begin;
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinCount",Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
|
||||
Kokkos::parallel_scan("Kokkos::Sort::BinOffset",Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
|
||||
Kokkos::parallel_for(
|
||||
"Kokkos::Sort::BinCount",
|
||||
Kokkos::RangePolicy<execution_space, bin_count_tag>(0, len), *this);
|
||||
Kokkos::parallel_scan("Kokkos::Sort::BinOffset",
|
||||
Kokkos::RangePolicy<execution_space, bin_offset_tag>(
|
||||
0, bin_op.max_bins()),
|
||||
*this);
|
||||
|
||||
Kokkos::deep_copy(bin_count_atomic, 0);
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinBinning",Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
|
||||
Kokkos::parallel_for(
|
||||
"Kokkos::Sort::BinBinning",
|
||||
Kokkos::RangePolicy<execution_space, bin_binning_tag>(0, len), *this);
|
||||
|
||||
if (sort_within_bins)
|
||||
Kokkos::parallel_for ("Kokkos::Sort::BinSort",Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
|
||||
Kokkos::parallel_for(
|
||||
"Kokkos::Sort::BinSort",
|
||||
Kokkos::RangePolicy<execution_space, bin_sort_bins_tag>(
|
||||
0, bin_op.max_bins()),
|
||||
*this);
|
||||
}
|
||||
|
||||
// Sort a subset of a view with respect to the first dimension using the permutation array
|
||||
// Sort a subset of a view with respect to the first dimension using the
|
||||
// permutation array
|
||||
template <class ValuesViewType>
|
||||
void sort( ValuesViewType const & values
|
||||
, int values_range_begin
|
||||
, int values_range_end) const
|
||||
{
|
||||
typedef
|
||||
Kokkos::View< typename ValuesViewType::data_type,
|
||||
void sort(ValuesViewType const& values, int values_range_begin,
|
||||
int values_range_end) const {
|
||||
typedef Kokkos::View<typename ValuesViewType::data_type,
|
||||
typename ValuesViewType::array_layout,
|
||||
typename ValuesViewType::device_type>
|
||||
scratch_view_type;
|
||||
@ -285,56 +274,64 @@ public:
|
||||
const size_t len = range_end - range_begin;
|
||||
const size_t values_len = values_range_end - values_range_begin;
|
||||
if (len != values_len) {
|
||||
Kokkos::abort("BinSort::sort: values range length != permutation vector length");
|
||||
Kokkos::abort(
|
||||
"BinSort::sort: values range length != permutation vector length");
|
||||
}
|
||||
|
||||
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
|
||||
scratch_view_type
|
||||
sorted_values(ViewAllocateWithoutInitializing("Kokkos::SortImpl::BinSortFunctor::sorted_values"),
|
||||
len,
|
||||
values.extent(1),
|
||||
values.extent(2),
|
||||
values.extent(3),
|
||||
values.extent(4),
|
||||
values.extent(5),
|
||||
values.extent(6),
|
||||
values.extent(7));
|
||||
scratch_view_type sorted_values(
|
||||
ViewAllocateWithoutInitializing(
|
||||
"Kokkos::SortImpl::BinSortFunctor::sorted_values"),
|
||||
len, values.extent(1), values.extent(2), values.extent(3),
|
||||
values.extent(4), values.extent(5), values.extent(6), values.extent(7));
|
||||
#else
|
||||
scratch_view_type
|
||||
sorted_values(ViewAllocateWithoutInitializing("Kokkos::SortImpl::BinSortFunctor::sorted_values"),
|
||||
scratch_view_type sorted_values(
|
||||
ViewAllocateWithoutInitializing(
|
||||
"Kokkos::SortImpl::BinSortFunctor::sorted_values"),
|
||||
values.rank_dynamic > 0 ? len : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 1 ? values.extent(1) : KOKKOS_IMPL_CTOR_DEFAULT_ARG ,
|
||||
values.rank_dynamic > 2 ? values.extent(2) : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 3 ? values.extent(3) : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 4 ? values.extent(4) : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 5 ? values.extent(5) : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 6 ? values.extent(6) : KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 7 ? values.extent(7) : KOKKOS_IMPL_CTOR_DEFAULT_ARG);
|
||||
values.rank_dynamic > 1 ? values.extent(1)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 2 ? values.extent(2)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 3 ? values.extent(3)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 4 ? values.extent(4)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 5 ? values.extent(5)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 6 ? values.extent(6)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||
values.rank_dynamic > 7 ? values.extent(7)
|
||||
: KOKKOS_IMPL_CTOR_DEFAULT_ARG);
|
||||
#endif
|
||||
|
||||
{
|
||||
copy_permute_functor<scratch_view_type /* DstViewType */
|
||||
, offset_type /* PermuteViewType */
|
||||
, ValuesViewType /* SrcViewType */
|
||||
,
|
||||
offset_type /* PermuteViewType */
|
||||
,
|
||||
ValuesViewType /* SrcViewType */
|
||||
>
|
||||
functor( sorted_values , sort_order , values, values_range_begin - range_begin );
|
||||
functor(sorted_values, sort_order, values,
|
||||
values_range_begin - range_begin);
|
||||
|
||||
parallel_for("Kokkos::Sort::CopyPermute", Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
parallel_for("Kokkos::Sort::CopyPermute",
|
||||
Kokkos::RangePolicy<execution_space>(0, len), functor);
|
||||
}
|
||||
|
||||
{
|
||||
copy_functor< ValuesViewType , scratch_view_type >
|
||||
functor( values , range_begin , sorted_values );
|
||||
copy_functor<ValuesViewType, scratch_view_type> functor(
|
||||
values, range_begin, sorted_values);
|
||||
|
||||
parallel_for("Kokkos::Sort::Copy", Kokkos::RangePolicy<execution_space>(0,len),functor);
|
||||
parallel_for("Kokkos::Sort::Copy",
|
||||
Kokkos::RangePolicy<execution_space>(0, len), functor);
|
||||
}
|
||||
|
||||
Kokkos::fence();
|
||||
}
|
||||
|
||||
template <class ValuesViewType>
|
||||
void sort( ValuesViewType const & values ) const
|
||||
{
|
||||
void sort(ValuesViewType const& values) const {
|
||||
this->sort(values, 0, /*values.extent(0)*/ range_end - range_begin);
|
||||
}
|
||||
|
||||
@ -351,7 +348,6 @@ public:
|
||||
bin_count_type get_bin_count() const { return bin_count_const; }
|
||||
|
||||
public:
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(const bin_count_tag& tag, const int& i) const {
|
||||
const int j = range_begin + i;
|
||||
@ -359,7 +355,8 @@ public:
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator() (const bin_offset_tag& tag, const int& i, value_type& offset, const bool& final) const {
|
||||
void operator()(const bin_offset_tag& tag, const int& i, value_type& offset,
|
||||
const bool& final) const {
|
||||
if (final) {
|
||||
bin_offsets(i) = offset;
|
||||
}
|
||||
@ -410,32 +407,34 @@ struct BinOp1D {
|
||||
typename KeyViewType::const_value_type range_;
|
||||
typename KeyViewType::const_value_type min_;
|
||||
|
||||
BinOp1D():max_bins_(0),mul_(0.0),
|
||||
BinOp1D()
|
||||
: max_bins_(0),
|
||||
mul_(0.0),
|
||||
range_(typename KeyViewType::const_value_type()),
|
||||
min_(typename KeyViewType::const_value_type()) {}
|
||||
|
||||
// Construct BinOp with number of bins, minimum value and maximum value
|
||||
BinOp1D(int max_bins__, typename KeyViewType::const_value_type min,
|
||||
typename KeyViewType::const_value_type max)
|
||||
:max_bins_(max_bins__+1),mul_(1.0*max_bins__/(max-min)),range_(max-min),min_(min) {}
|
||||
: max_bins_(max_bins__ + 1),
|
||||
mul_(1.0 * max_bins__ / (max - min)),
|
||||
range_(max - min),
|
||||
min_(min) {}
|
||||
|
||||
// Determine bin index from key value
|
||||
template <class ViewType>
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
int bin(ViewType& keys, const int& i) const {
|
||||
KOKKOS_INLINE_FUNCTION int bin(ViewType& keys, const int& i) const {
|
||||
return int(mul_ * (keys(i) - min_));
|
||||
}
|
||||
|
||||
// Return maximum bin index + 1
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
int max_bins() const {
|
||||
return max_bins_;
|
||||
}
|
||||
int max_bins() const { return max_bins_; }
|
||||
|
||||
// Compare two keys within a bin; if true, new_val will be put before old_val
|
||||
template <class ViewType, typename iType1, typename iType2>
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
bool operator()(ViewType& keys, iType1& i1, iType2& i2) const {
|
||||
KOKKOS_INLINE_FUNCTION bool operator()(ViewType& keys, iType1& i1,
|
||||
iType2& i2) const {
|
||||
return keys(i1) < keys(i2);
|
||||
}
|
||||
};
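The bin mapping used by `BinOp1D` can be sketched without Kokkos. The block below is a minimal illustration, not the library's code: `SimpleBinOp1D` is a hypothetical stand-in that reads keys from a `std::vector<double>` instead of a `Kokkos::View`, and it reuses only the two formulas visible in the hunk above (`mul_ = max_bins / (max - min)` and `bin = int(mul_ * (key - min_))`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for BinOp1D (assumption: keys are plain doubles
// held in a std::vector rather than a Kokkos::View).
struct SimpleBinOp1D {
  int max_bins_;  // one more than requested, see note below
  double mul_;    // bins per unit of key range
  double min_;    // smallest key value

  SimpleBinOp1D(int max_bins, double min, double max)
      : max_bins_(max_bins + 1),
        mul_(1.0 * max_bins / (max - min)),
        min_(min) {
    assert(max > min);  // a zero range would divide by zero
  }

  // Same formula as BinOp1D::bin in the diff: scale, then truncate.
  int bin(const std::vector<double>& keys, int i) const {
    return int(mul_ * (keys[i] - min_));
  }

  int max_bins() const { return max_bins_; }
};
```

With `SimpleBinOp1D op(4, 0.0, 10.0)`, a key equal to the maximum maps to index 4, which appears to be why the constructor stores `max_bins__ + 1` bins rather than `max_bins__`.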
|
||||
@ -450,8 +449,7 @@ struct BinOp3D {
|
||||
BinOp3D() {}
|
||||
|
||||
BinOp3D(int max_bins__[], typename KeyViewType::const_value_type min[],
|
||||
typename KeyViewType::const_value_type max[] )
|
||||
{
|
||||
typename KeyViewType::const_value_type max[]) {
|
||||
max_bins_[0] = max_bins__[0];
|
||||
max_bins_[1] = max_bins__[1];
|
||||
max_bins_[2] = max_bins__[2];
|
||||
@ -467,24 +465,24 @@ struct BinOp3D {
|
||||
}
|
||||
|
||||
template <class ViewType>
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
int bin(ViewType& keys, const int& i) const {
|
||||
KOKKOS_INLINE_FUNCTION int bin(ViewType& keys, const int& i) const {
|
||||
return int((((int(mul_[0] * (keys(i, 0) - min_[0])) * max_bins_[1]) +
|
||||
int(mul_[1]*(keys(i,1)-min_[1])))*max_bins_[2]) +
|
||||
int(mul_[1] * (keys(i, 1) - min_[1]))) *
|
||||
max_bins_[2]) +
|
||||
int(mul_[2] * (keys(i, 2) - min_[2])));
|
||||
}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
int max_bins() const {
|
||||
return max_bins_[0]*max_bins_[1]*max_bins_[2];
|
||||
}
|
||||
int max_bins() const { return max_bins_[0] * max_bins_[1] * max_bins_[2]; }
|
||||
|
||||
template <class ViewType, typename iType1, typename iType2>
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
bool operator()(ViewType& keys, iType1& i1 , iType2& i2) const {
|
||||
if (keys(i1,0)>keys(i2,0)) return true;
|
||||
KOKKOS_INLINE_FUNCTION bool operator()(ViewType& keys, iType1& i1,
|
||||
iType2& i2) const {
|
||||
if (keys(i1, 0) > keys(i2, 0))
|
||||
return true;
|
||||
else if (keys(i1, 0) == keys(i2, 0)) {
|
||||
if (keys(i1,1)>keys(i2,1)) return true;
|
||||
if (keys(i1, 1) > keys(i2, 1))
|
||||
return true;
|
||||
else if (keys(i1, 1) == keys(i2, 1)) {
|
||||
if (keys(i1, 2) > keys(i2, 2)) return true;
|
||||
}
|
||||
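`BinOp3D::bin` in the hunk above combines three per-dimension bin indices into one flat index. A minimal sketch of that arithmetic, assuming the per-dimension indices have already been computed (`flatten_bin` is a hypothetical helper name, not part of Kokkos):

```cpp
#include <cassert>

// Row-major flattening of a 3D bin index, mirroring the expression
// ((b0 * max_bins[1]) + b1) * max_bins[2] + b2 from BinOp3D::bin.
// flatten_bin is an illustrative helper, not a Kokkos function.
inline int flatten_bin(const int b[3], const int max_bins[3]) {
  for (int d = 0; d < 3; ++d) {
    assert(b[d] >= 0 && b[d] < max_bins[d]);  // each index must be in range
  }
  return (b[0] * max_bins[1] + b[1]) * max_bins[2] + b[2];
}
```

For bin counts `{4, 5, 6}` the largest flat index is `4 * 5 * 6 - 1`, which matches the product returned by `max_bins()` in the same struct.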
@ -498,16 +496,11 @@ namespace Impl {
|
||||
template <class ViewType>
|
||||
bool try_std_sort(ViewType view) {
|
||||
bool possible = true;
|
||||
size_t stride[8] = { view.stride_0()
|
||||
, view.stride_1()
|
||||
, view.stride_2()
|
||||
, view.stride_3()
|
||||
, view.stride_4()
|
||||
, view.stride_5()
|
||||
, view.stride_6()
|
||||
, view.stride_7()
|
||||
};
|
||||
possible = possible && std::is_same<typename ViewType::memory_space, HostSpace>::value;
|
||||
size_t stride[8] = {view.stride_0(), view.stride_1(), view.stride_2(),
|
||||
view.stride_3(), view.stride_4(), view.stride_5(),
|
||||
view.stride_6(), view.stride_7()};
|
||||
possible = possible &&
|
||||
std::is_same<typename ViewType::memory_space, HostSpace>::value;
|
||||
possible = possible && (ViewType::Rank == 1);
|
||||
possible = possible && (stride[0] == 1);
|
||||
if (possible) {
|
||||
@ -518,7 +511,8 @@ bool try_std_sort(ViewType view) {
|
||||
|
||||
template <class ViewType>
|
||||
struct min_max_functor {
|
||||
typedef Kokkos::MinMaxScalar<typename ViewType::non_const_value_type> minmax_scalar;
|
||||
typedef Kokkos::MinMaxScalar<typename ViewType::non_const_value_type>
|
||||
minmax_scalar;
|
||||
|
||||
ViewType view;
|
||||
min_max_functor(const ViewType& view_) : view(view_) {}
|
||||
@ -530,11 +524,10 @@ struct min_max_functor {
|
||||
}
|
||||
};
|
||||
|
||||
}
|
||||
} // namespace Impl
|
||||
|
||||
template <class ViewType>
|
||||
void sort( ViewType const & view , bool const always_use_kokkos_sort = false)
|
||||
{
|
||||
void sort(ViewType const& view, bool const always_use_kokkos_sort = false) {
|
||||
if (!always_use_kokkos_sort) {
|
||||
if (Impl::try_std_sort(view)) return;
|
||||
}
|
||||
@ -542,38 +535,38 @@ void sort( ViewType const & view , bool const always_use_kokkos_sort = false)
|
||||
|
||||
Kokkos::MinMaxScalar<typename ViewType::non_const_value_type> result;
|
||||
Kokkos::MinMax<typename ViewType::non_const_value_type> reducer(result);
|
||||
parallel_reduce("Kokkos::Sort::FindExtent",Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
|
||||
parallel_reduce("Kokkos::Sort::FindExtent",
|
||||
Kokkos::RangePolicy<typename ViewType::execution_space>(
|
||||
0, view.extent(0)),
|
||||
Impl::min_max_functor<ViewType>(view), reducer);
|
||||
if (result.min_val == result.max_val) return;
|
||||
BinSort<ViewType, CompType> bin_sort(view,CompType(view.extent(0)/2,result.min_val,result.max_val),true);
|
||||
BinSort<ViewType, CompType> bin_sort(
|
||||
view, CompType(view.extent(0) / 2, result.min_val, result.max_val), true);
|
||||
bin_sort.create_permute_vector();
|
||||
bin_sort.sort(view);
|
||||
}
|
||||
|
||||
template <class ViewType>
|
||||
void sort( ViewType view
|
||||
, size_t const begin
|
||||
, size_t const end
|
||||
)
|
||||
{
|
||||
void sort(ViewType view, size_t const begin, size_t const end) {
|
||||
typedef Kokkos::RangePolicy<typename ViewType::execution_space> range_policy;
|
||||
typedef BinOp1D<ViewType> CompType;
|
||||
|
||||
Kokkos::MinMaxScalar<typename ViewType::non_const_value_type> result;
|
||||
Kokkos::MinMax<typename ViewType::non_const_value_type> reducer(result);
|
||||
|
||||
parallel_reduce("Kokkos::Sort::FindExtent", range_policy( begin , end )
|
||||
, Impl::min_max_functor<ViewType>(view),reducer );
|
||||
parallel_reduce("Kokkos::Sort::FindExtent", range_policy(begin, end),
|
||||
Impl::min_max_functor<ViewType>(view), reducer);
|
||||
|
||||
if (result.min_val == result.max_val) return;
|
||||
|
||||
BinSort<ViewType, CompType>
|
||||
bin_sort(view,begin,end,CompType((end-begin)/2,result.min_val,result.max_val),true);
|
||||
BinSort<ViewType, CompType> bin_sort(
|
||||
view, begin, end,
|
||||
CompType((end - begin) / 2, result.min_val, result.max_val), true);
|
||||
|
||||
bin_sort.create_permute_vector();
|
||||
bin_sort.sort(view, begin, end);
|
||||
}
|
||||
|
||||
}
|
||||
} // namespace Kokkos
|
||||
|
||||
#endif
|
||||
|
||||
@ -1,18 +1,12 @@
|
||||
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
INCLUDE_DIRECTORIES(REQUIRED_DURING_INSTALLATION_TESTING ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}/../src )
|
||||
#Leave these here for now - I don't need transitive deps anyway
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(REQUIRED_DURING_INSTALLATION_TESTING ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}/../src )
|
||||
|
||||
IF(NOT KOKKOS_HAS_TRILINOS)
|
||||
IF(KOKKOS_SEPARATE_LIBS)
|
||||
set(TEST_LINK_TARGETS kokkoscore)
|
||||
ELSE()
|
||||
set(TEST_LINK_TARGETS kokkos)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
SET(GTEST_SOURCE_DIR ${${PARENT_PACKAGE_NAME}_SOURCE_DIR}/tpls/gtest)
|
||||
INCLUDE_DIRECTORIES(${GTEST_SOURCE_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${GTEST_SOURCE_DIR})
|
||||
|
||||
# mfh 03 Nov 2017: The gtest library used here must have a different
|
||||
# name than that of the gtest library built in KokkosCore. We can't
|
||||
@ -20,23 +14,20 @@ INCLUDE_DIRECTORIES(${GTEST_SOURCE_DIR})
|
||||
# possible to build only (e.g.,) KokkosAlgorithms tests, without
|
||||
# building KokkosCore tests.
|
||||
|
||||
SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DGTEST_HAS_PTHREAD=0")
|
||||
|
||||
TRIBITS_ADD_LIBRARY(
|
||||
KOKKOS_ADD_TEST_LIBRARY(
|
||||
kokkosalgorithms_gtest
|
||||
HEADERS ${GTEST_SOURCE_DIR}/gtest/gtest.h
|
||||
SOURCES ${GTEST_SOURCE_DIR}/gtest/gtest-all.cc
|
||||
TESTONLY
|
||||
)
|
||||
KOKKOS_TARGET_COMPILE_DEFINITIONS(kokkosalgorithms_gtest PUBLIC "-DGTEST_HAS_PTHREAD=0")
|
||||
|
||||
SET(SOURCES
|
||||
UnitTestMain.cpp
|
||||
TestCuda.cpp
|
||||
)
|
||||
|
||||
SET(LIBRARIES kokkoscore)
|
||||
|
||||
IF(Kokkos_ENABLE_OpenMP)
|
||||
IF(Kokkos_ENABLE_OPENMP)
|
||||
LIST( APPEND SOURCES
|
||||
TestOpenMP.cpp
|
||||
)
|
||||
@ -48,23 +39,19 @@ IF(Kokkos_ENABLE_HPX)
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF(Kokkos_ENABLE_Serial)
|
||||
IF(Kokkos_ENABLE_SERIAL)
|
||||
LIST( APPEND SOURCES
|
||||
TestSerial.cpp
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF(Kokkos_ENABLE_Pthread)
|
||||
IF(Kokkos_ENABLE_PTHREAD)
|
||||
LIST( APPEND SOURCES
|
||||
TestThreads.cpp
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
TRIBITS_ADD_EXECUTABLE_AND_TEST(
|
||||
KOKKOS_ADD_EXECUTABLE_AND_TEST(
|
||||
UnitTest
|
||||
SOURCES ${SOURCES}
|
||||
COMM serial mpi
|
||||
NUM_MPI_PROCS 1
|
||||
FAIL_REGULAR_EXPRESSION " FAILED "
|
||||
TESTONLYLIBS kokkosalgorithms_gtest ${TEST_LINK_TARGETS}
|
||||
)
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -57,41 +58,22 @@
|
||||
|
||||
namespace Test {
|
||||
|
||||
class cuda : public ::testing::Test {
|
||||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
}
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
}
|
||||
};
|
||||
|
||||
void cuda_test_random_xorshift64( int num_draws )
|
||||
{
|
||||
void cuda_test_random_xorshift64(int num_draws) {
|
||||
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Cuda> >(num_draws);
|
||||
}
|
||||
|
||||
void cuda_test_random_xorshift1024( int num_draws )
|
||||
{
|
||||
void cuda_test_random_xorshift1024(int num_draws) {
|
||||
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Cuda> >(num_draws);
|
||||
}
|
||||
|
||||
|
||||
#define CUDA_RANDOM_XORSHIFT64(num_draws) \
|
||||
TEST_F( cuda, Random_XorShift64 ) { \
|
||||
cuda_test_random_xorshift64(num_draws); \
|
||||
}
|
||||
TEST(cuda, Random_XorShift64) { cuda_test_random_xorshift64(num_draws); }
|
||||
|
||||
#define CUDA_RANDOM_XORSHIFT1024(num_draws) \
|
||||
TEST_F( cuda, Random_XorShift1024 ) { \
|
||||
cuda_test_random_xorshift1024(num_draws); \
|
||||
}
|
||||
TEST(cuda, Random_XorShift1024) { cuda_test_random_xorshift1024(num_draws); }
|
||||
|
||||
#define CUDA_SORT_UNSIGNED(size) \
|
||||
TEST_F( cuda, SortUnsigned ) { \
|
||||
Impl::test_sort< Kokkos::Cuda, unsigned >(size); \
|
||||
}
|
||||
TEST(cuda, SortUnsigned) { Impl::test_sort<Kokkos::Cuda, unsigned>(size); }
|
||||
|
||||
CUDA_RANDOM_XORSHIFT64(132141141)
|
||||
CUDA_RANDOM_XORSHIFT1024(52428813)
|
||||
@ -100,8 +82,7 @@ CUDA_SORT_UNSIGNED(171)
|
||||
#undef CUDA_RANDOM_XORSHIFT64
|
||||
#undef CUDA_RANDOM_XORSHIFT1024
|
||||
#undef CUDA_SORT_UNSIGNED
|
||||
}
|
||||
} // namespace Test
|
||||
#else
|
||||
void KOKKOS_ALGORITHMS_UNITTESTS_TESTCUDA_PREVENT_LINK_ERROR() {}
|
||||
#endif /* #ifdef KOKKOS_ENABLE_CUDA */
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -41,7 +42,6 @@
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
|
||||
#include <Kokkos_Macros.hpp>
|
||||
#ifdef KOKKOS_ENABLE_HPX
|
||||
|
||||
@ -55,30 +55,22 @@
|
||||
|
||||
namespace Test {
|
||||
|
||||
class hpx : public ::testing::Test {
|
||||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
}
|
||||
};
|
||||
|
||||
#define HPX_RANDOM_XORSHIFT64(num_draws) \
|
||||
TEST_F( hpx, Random_XorShift64 ) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Experimental::HPX> >(num_draws); \
|
||||
TEST(hpx, Random_XorShift64) { \
|
||||
Impl::test_random< \
|
||||
Kokkos::Random_XorShift64_Pool<Kokkos::Experimental::HPX> >( \
|
||||
num_draws); \
|
||||
}
|
||||
|
||||
#define HPX_RANDOM_XORSHIFT1024(num_draws) \
|
||||
TEST_F( hpx, Random_XorShift1024 ) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Experimental::HPX> >(num_draws); \
|
||||
TEST(hpx, Random_XorShift1024) { \
|
||||
Impl::test_random< \
|
||||
Kokkos::Random_XorShift1024_Pool<Kokkos::Experimental::HPX> >( \
|
||||
num_draws); \
|
||||
}
|
||||
|
||||
#define HPX_SORT_UNSIGNED(size) \
|
||||
TEST_F( hpx, SortUnsigned ) { \
|
||||
TEST(hpx, SortUnsigned) { \
|
||||
Impl::test_sort<Kokkos::Experimental::HPX, unsigned>(size); \
|
||||
}
|
||||
|
||||
@ -89,8 +81,7 @@ HPX_SORT_UNSIGNED(171)
|
||||
#undef HPX_RANDOM_XORSHIFT64
|
||||
#undef HPX_RANDOM_XORSHIFT1024
|
||||
#undef HPX_SORT_UNSIGNED
|
||||
} // namespace test
|
||||
} // namespace Test
|
||||
#else
|
||||
void KOKKOS_ALGORITHMS_UNITTESTS_TESTHPX_PREVENT_LINK_ERROR() {}
|
||||
#endif
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -41,7 +42,6 @@
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
|
||||
#include <Kokkos_Macros.hpp>
|
||||
#ifdef KOKKOS_ENABLE_OPENMP
|
||||
|
||||
@ -55,30 +55,20 @@
|
||||
|
||||
namespace Test {
|
||||
|
||||
class openmp : public ::testing::Test {
|
||||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
}
|
||||
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
}
|
||||
};
|
||||
|
||||
#define OPENMP_RANDOM_XORSHIFT64(num_draws) \
|
||||
TEST_F( openmp, Random_XorShift64 ) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::OpenMP> >(num_draws); \
|
||||
TEST(openmp, Random_XorShift64) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::OpenMP> >( \
|
||||
num_draws); \
|
||||
}
|
||||
|
||||
#define OPENMP_RANDOM_XORSHIFT1024(num_draws) \
|
||||
TEST_F( openmp, Random_XorShift1024 ) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::OpenMP> >(num_draws); \
|
||||
TEST(openmp, Random_XorShift1024) { \
|
||||
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::OpenMP> >( \
|
||||
num_draws); \
|
||||
}
|
||||
|
||||
#define OPENMP_SORT_UNSIGNED(size) \
|
||||
TEST_F( openmp, SortUnsigned ) { \
|
||||
TEST(openmp, SortUnsigned) { \
|
||||
Impl::test_sort<Kokkos::OpenMP, unsigned>(size); \
|
||||
}
|
||||
|
||||
@ -89,8 +79,7 @@ OPENMP_SORT_UNSIGNED(171)
|
||||
#undef OPENMP_RANDOM_XORSHIFT64
|
||||
#undef OPENMP_RANDOM_XORSHIFT1024
|
||||
#undef OPENMP_SORT_UNSIGNED
|
||||
} // namespace test
|
||||
} // namespace Test
|
||||
#else
|
||||
void KOKKOS_ALGORITHMS_UNITTESTS_TESTOPENMP_PREVENT_LINK_ERROR() {}
|
||||
#endif
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -57,40 +58,24 @@
|
||||
|
||||
namespace Test {
|
||||
|
||||
class rocm : public ::testing::Test {
|
||||
protected:
|
||||
static void SetUpTestCase()
|
||||
{
|
||||
std::cout << std::setprecision(5) << std::scientific;
|
||||
}
|
||||
static void TearDownTestCase()
|
||||
{
|
||||
}
|
||||
};
|
||||
|
||||
void rocm_test_random_xorshift64( int num_draws )
|
||||
{
|
||||
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Experimental::ROCm> >(num_draws);
|
||||
void rocm_test_random_xorshift64(int num_draws) {
|
||||
Impl::test_random<
|
||||
Kokkos::Random_XorShift64_Pool<Kokkos::Experimental::ROCm> >(num_draws);
|
||||
}
|
||||
|
||||
void rocm_test_random_xorshift1024( int num_draws )
|
||||
{
|
||||
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Experimental::ROCm> >(num_draws);
|
||||
void rocm_test_random_xorshift1024(int num_draws) {
|
||||
Impl::test_random<
|
||||
Kokkos::Random_XorShift1024_Pool<Kokkos::Experimental::ROCm> >(num_draws);
|
||||
}
|
||||
|
||||
|
||||
#define ROCM_RANDOM_XORSHIFT64(num_draws) \
|
||||
TEST_F( rocm, Random_XorShift64 ) { \
|
||||
rocm_test_random_xorshift64(num_draws); \
|
||||
}
|
||||
TEST(rocm, Random_XorShift64) { rocm_test_random_xorshift64(num_draws); }
|
||||
|
||||
#define ROCM_RANDOM_XORSHIFT1024(num_draws) \
|
||||
TEST_F( rocm, Random_XorShift1024 ) { \
|
||||
rocm_test_random_xorshift1024(num_draws); \
|
||||
}
|
||||
TEST(rocm, Random_XorShift1024) { rocm_test_random_xorshift1024(num_draws); }
|
||||
|
||||
#define ROCM_SORT_UNSIGNED(size) \
|
||||
TEST_F( rocm, SortUnsigned ) { \
|
||||
TEST(rocm, SortUnsigned) { \
|
||||
Impl::test_sort<Kokkos::Experimental::ROCm, unsigned>(size); \
|
||||
}
|
||||
|
||||
@ -101,8 +86,7 @@ ROCM_SORT_UNSIGNED(171)
|
||||
#undef ROCM_RANDOM_XORSHIFT64
|
||||
#undef ROCM_RANDOM_XORSHIFT1024
|
||||
#undef ROCM_SORT_UNSIGNED
|
||||
}
|
||||
} // namespace Test
|
||||
#else
|
||||
void KOKKOS_ALGORITHMS_UNITTESTS_TESTROCM_PREVENT_LINK_ERROR() {}
|
||||
#endif /* #ifdef KOKKOS_ENABLE_ROCM */
|
||||
|
||||
|
||||
@ -1,10 +1,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -22,10 +23,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -61,8 +62,9 @@ namespace Impl{
|
||||
// (i) mean: the mean is expected to be 0.5*RAND_MAX
|
||||
// (ii) variance: the variance is 1/3*mean*mean
|
||||
// (iii) covariance: the covariance is 0
|
||||
// (iv) 1-tupledistr: the mean, variance and covariance of a 1D Histogram of random numbers
|
||||
// (v) 3-tupledistr: the mean, variance and covariance of a 3D Histogram of random numbers
|
||||
// (iv) 1-tupledistr: the mean, variance and covariance of a 1D Histogram
|
||||
//      of random numbers
|
||||
// (v) 3-tupledistr: the mean, variance and covariance of a 3D Histogram of random numbers
|
||||
|
||||
#define HIST_DIM3D 24
|
||||
#define HIST_DIM1D (HIST_DIM3D * HIST_DIM3D * HIST_DIM3D)
|
||||
@ -123,17 +125,19 @@ struct test_random_functor {
|
||||
// implementations might violate this upper bound, due to rounding
|
||||
// error. Just in case, we leave an extra space at the end of each
|
||||
// dimension, in the View types below.
|
||||
typedef Kokkos::View<int[HIST_DIM1D+1],typename GeneratorPool::device_type> type_1d;
|
||||
typedef Kokkos::View<int[HIST_DIM1D + 1], typename GeneratorPool::device_type>
|
||||
type_1d;
|
||||
type_1d density_1d;
|
||||
typedef Kokkos::View<int[HIST_DIM3D+1][HIST_DIM3D+1][HIST_DIM3D+1],typename GeneratorPool::device_type> type_3d;
|
||||
typedef Kokkos::View<int[HIST_DIM3D + 1][HIST_DIM3D + 1][HIST_DIM3D + 1],
|
||||
typename GeneratorPool::device_type>
|
||||
type_3d;
|
||||
type_3d density_3d;
|
||||
|
||||
test_random_functor (GeneratorPool rand_pool_, type_1d d1d, type_3d d3d) :
|
||||
rand_pool (rand_pool_),
|
||||
test_random_functor(GeneratorPool rand_pool_, type_1d d1d, type_3d d3d)
|
||||
: rand_pool(rand_pool_),
|
||||
mean(0.5 * Kokkos::rand<rnd_type, Scalar>::max()),
|
||||
density_1d(d1d),
|
||||
density_3d (d3d)
|
||||
{}
|
||||
density_3d(d3d) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()(int i, RandomProperties& prop) const {
|
||||
@ -171,13 +175,19 @@ struct test_random_functor {
|
||||
|
||||
const Scalar theMax = Kokkos::rand<rnd_type, Scalar>::max();
|
||||
|
||||
const uint64_t ind1_1d = static_cast<uint64_t> (1.0 * HIST_DIM1D * tmp / theMax);
|
||||
const uint64_t ind2_1d = static_cast<uint64_t> (1.0 * HIST_DIM1D * tmp2 / theMax);
|
||||
const uint64_t ind3_1d = static_cast<uint64_t> (1.0 * HIST_DIM1D * tmp3 / theMax);
|
||||
const uint64_t ind1_1d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM1D * tmp / theMax);
|
||||
const uint64_t ind2_1d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM1D * tmp2 / theMax);
|
||||
const uint64_t ind3_1d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM1D * tmp3 / theMax);
|
||||
|
||||
const uint64_t ind1_3d = static_cast<uint64_t> (1.0 * HIST_DIM3D * tmp / theMax);
|
||||
const uint64_t ind2_3d = static_cast<uint64_t> (1.0 * HIST_DIM3D * tmp2 / theMax);
|
||||
const uint64_t ind3_3d = static_cast<uint64_t> (1.0 * HIST_DIM3D * tmp3 / theMax);
|
||||
const uint64_t ind1_3d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM3D * tmp / theMax);
|
||||
const uint64_t ind2_3d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM3D * tmp2 / theMax);
|
||||
const uint64_t ind3_3d =
|
||||
static_cast<uint64_t>(1.0 * HIST_DIM3D * tmp3 / theMax);
|
||||
|
||||
atomic_fetch_add(&density_1d(ind1_1d), 1);
|
||||
atomic_fetch_add(&density_1d(ind2_1d), 1);
|
||||
@ -204,16 +214,11 @@ struct test_histogram1d_functor {
|
||||
type_1d density_1d;
|
||||
double mean;
|
||||
|
||||
test_histogram1d_functor (type_1d d1d, int num_draws) :
|
||||
density_1d (d1d),
|
||||
mean (1.0*num_draws/HIST_DIM1D*3)
|
||||
{
|
||||
}
|
||||
test_histogram1d_functor(type_1d d1d, int num_draws)
|
||||
: density_1d(d1d), mean(1.0 * num_draws / HIST_DIM1D * 3) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION void
|
||||
operator() (const typename memory_space::size_type i,
|
||||
RandomProperties& prop) const
|
||||
{
|
||||
KOKKOS_INLINE_FUNCTION void operator()(
|
||||
const typename memory_space::size_type i, RandomProperties& prop) const {
|
||||
typedef typename memory_space::size_type size_type;
|
||||
const double count = density_1d(i);
|
||||
prop.mean += count;
|
||||
@ -239,27 +244,26 @@ struct test_histogram3d_functor {
|
||||
// implementations might violate this upper bound, due to rounding
|
||||
// error. Just in case, we leave an extra space at the end of each
|
||||
// dimension, in the View type below.
|
||||
typedef Kokkos::View<int[HIST_DIM3D+1][HIST_DIM3D+1][HIST_DIM3D+1], memory_space> type_3d;
|
||||
typedef Kokkos::View<int[HIST_DIM3D + 1][HIST_DIM3D + 1][HIST_DIM3D + 1],
|
||||
memory_space>
|
||||
type_3d;
|
||||
type_3d density_3d;
|
||||
double mean;
|
||||
|
||||
test_histogram3d_functor (type_3d d3d, int num_draws) :
|
||||
density_3d (d3d),
|
||||
mean (1.0*num_draws/HIST_DIM1D)
|
||||
{}
|
||||
test_histogram3d_functor(type_3d d3d, int num_draws)
|
||||
: density_3d(d3d), mean(1.0 * num_draws / HIST_DIM1D) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION void
|
||||
operator() (const typename memory_space::size_type i,
|
||||
RandomProperties& prop) const
|
||||
{
|
||||
KOKKOS_INLINE_FUNCTION void operator()(
|
||||
const typename memory_space::size_type i, RandomProperties& prop) const {
|
||||
typedef typename memory_space::size_type size_type;
|
||||
const double count = density_3d(i/(HIST_DIM3D*HIST_DIM3D),
|
||||
(i % (HIST_DIM3D*HIST_DIM3D))/HIST_DIM3D,
|
||||
i % HIST_DIM3D);
|
||||
const double count = density_3d(
|
||||
i / (HIST_DIM3D * HIST_DIM3D),
|
||||
(i % (HIST_DIM3D * HIST_DIM3D)) / HIST_DIM3D, i % HIST_DIM3D);
|
||||
prop.mean += count;
|
||||
prop.variance += (count - mean) * (count - mean);
|
||||
if (i < static_cast<size_type>(HIST_DIM1D - 1)) {
|
||||
const double count_next = density_3d((i+1)/(HIST_DIM3D*HIST_DIM3D),
|
||||
const double count_next =
|
||||
density_3d((i + 1) / (HIST_DIM3D * HIST_DIM3D),
|
||||
((i + 1) % (HIST_DIM3D * HIST_DIM3D)) / HIST_DIM3D,
|
||||
(i + 1) % HIST_DIM3D);
|
||||
prop.covariance += (count - mean) * (count_next - mean);
|
||||
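`test_histogram3d_functor` walks the 3D histogram with a single flat index `i` and recovers the three coordinates with integer division and modulo. A minimal sketch of that decomposition, with `unflatten` and its `dim` parameter as hypothetical names standing in for the inline expressions and `HIST_DIM3D`:

```cpp
#include <array>
#include <cassert>

// Split a flat index into (x, y, z) for a cube with `dim` cells per side,
// using the same expressions as the functor above:
//   i / (dim * dim), (i % (dim * dim)) / dim, i % dim
// unflatten is an illustrative helper, not part of the Kokkos tests.
inline std::array<int, 3> unflatten(int i, int dim) {
  assert(dim > 0 && i >= 0 && i < dim * dim * dim);
  return {i / (dim * dim), (i % (dim * dim)) / dim, i % dim};
}
```

With `dim = 24` (the value of `HIST_DIM3D` in this file), the flat range covers `24 * 24 * 24` cells, which is the value `HIST_DIM1D` is defined as.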
@ -278,122 +282,128 @@ struct test_random_scalar {
|
||||
int pass_hist1d_mean, pass_hist1d_var, pass_hist1d_covar;
|
||||
int pass_hist3d_mean, pass_hist3d_var, pass_hist3d_covar;
|
||||
|
||||
test_random_scalar (typename test_random_functor<RandomGenerator,int>::type_1d& density_1d,
|
||||
test_random_scalar(
|
||||
typename test_random_functor<RandomGenerator, int>::type_1d& density_1d,
|
||||
typename test_random_functor<RandomGenerator, int>::type_3d& density_3d,
|
||||
RandomGenerator& pool,
|
||||
unsigned int num_draws)
|
||||
{
|
||||
RandomGenerator& pool, unsigned int num_draws) {
|
||||
using Kokkos::parallel_reduce;
|
||||
using std::cout;
|
||||
using std::endl;
|
||||
using Kokkos::parallel_reduce;
|
||||
|
||||
{
cout << " -- Testing randomness properties" << endl;

RandomProperties result;
typedef test_random_functor<RandomGenerator, Scalar> functor_type;
parallel_reduce (num_draws/1024, functor_type (pool, density_1d, density_3d), result);
parallel_reduce(num_draws / 1024,
functor_type(pool, density_1d, density_3d), result);

//printf("Result: %lf %lf %lf\n",result.mean/num_draws/3,result.variance/num_draws/3,result.covariance/num_draws/2);
// printf("Result: %lf %lf
// %lf\n",result.mean/num_draws/3,result.variance/num_draws/3,result.covariance/num_draws/2);
double tolerance = 1.6 * std::sqrt(1.0 / num_draws);
double mean_expect = 0.5 * Kokkos::rand<rnd_type, Scalar>::max();
double variance_expect = 1.0 / 3.0 * mean_expect * mean_expect;
double mean_eps = mean_expect / (result.mean / num_draws / 3) - 1.0;
double variance_eps = variance_expect/(result.variance/num_draws/3)-1.0;
double covariance_eps = result.covariance/num_draws/2/variance_expect;
pass_mean = ((-tolerance < mean_eps) &&
( tolerance > mean_eps)) ? 1:0;
double variance_eps =
variance_expect / (result.variance / num_draws / 3) - 1.0;
double covariance_eps =
result.covariance / num_draws / 2 / variance_expect;
pass_mean = ((-tolerance < mean_eps) && (tolerance > mean_eps)) ? 1 : 0;
pass_var = ((-1.5 * tolerance < variance_eps) &&
( 1.5*tolerance > variance_eps)) ? 1:0;
(1.5 * tolerance > variance_eps))
? 1
: 0;
pass_covar = ((-2.0 * tolerance < covariance_eps) &&
( 2.0*tolerance > covariance_eps)) ? 1:0;
cout << "Pass: " << pass_mean
<< " " << pass_var
<< " " << mean_eps
<< " " << variance_eps
<< " " << covariance_eps
<< " || " << tolerance << endl;
(2.0 * tolerance > covariance_eps))
? 1
: 0;
cout << "Pass: " << pass_mean << " " << pass_var << " " << mean_eps << " "
<< variance_eps << " " << covariance_eps << " || " << tolerance
<< endl;
}
{
cout << " -- Testing 1-D histogram" << endl;

RandomProperties result;
typedef test_histogram1d_functor<typename RandomGenerator::device_type> functor_type;
typedef test_histogram1d_functor<typename RandomGenerator::device_type>
functor_type;
parallel_reduce(HIST_DIM1D, functor_type(density_1d, num_draws), result);

double tolerance = 6 * std::sqrt(1.0 / HIST_DIM1D);
double mean_expect = 1.0 * num_draws * 3 / HIST_DIM1D;
double variance_expect = 1.0*num_draws*3/HIST_DIM1D*(1.0-1.0/HIST_DIM1D);
double variance_expect =
1.0 * num_draws * 3 / HIST_DIM1D * (1.0 - 1.0 / HIST_DIM1D);
double covariance_expect = -1.0 * num_draws * 3 / HIST_DIM1D / HIST_DIM1D;
double mean_eps = mean_expect / (result.mean / HIST_DIM1D) - 1.0;
double variance_eps = variance_expect/(result.variance/HIST_DIM1D)-1.0;
double covariance_eps = (result.covariance/HIST_DIM1D - covariance_expect)/mean_expect;
pass_hist1d_mean = ((-0.0001 < mean_eps) &&
( 0.0001 > mean_eps)) ? 1:0;
pass_hist1d_var = ((-0.07 < variance_eps) &&
( 0.07 > variance_eps)) ? 1:0;
pass_hist1d_covar = ((-0.06 < covariance_eps) &&
( 0.06 > covariance_eps)) ? 1:0;
double variance_eps =
variance_expect / (result.variance / HIST_DIM1D) - 1.0;
double covariance_eps =
(result.covariance / HIST_DIM1D - covariance_expect) / mean_expect;
pass_hist1d_mean = ((-0.0001 < mean_eps) && (0.0001 > mean_eps)) ? 1 : 0;
pass_hist1d_var =
((-0.07 < variance_eps) && (0.07 > variance_eps)) ? 1 : 0;
pass_hist1d_covar =
((-0.06 < covariance_eps) && (0.06 > covariance_eps)) ? 1 : 0;

cout << "Density 1D: " << mean_eps
<< " " << variance_eps
<< " " << (result.covariance/HIST_DIM1D/HIST_DIM1D)
<< " || " << tolerance
<< " " << result.min
<< " " << result.max
<< " || " << result.variance/HIST_DIM1D
<< " " << 1.0*num_draws*3/HIST_DIM1D*(1.0-1.0/HIST_DIM1D)
<< " || " << result.covariance/HIST_DIM1D
<< " " << -1.0*num_draws*3/HIST_DIM1D/HIST_DIM1D
<< endl;
cout << "Density 1D: " << mean_eps << " " << variance_eps << " "
<< (result.covariance / HIST_DIM1D / HIST_DIM1D) << " || "
<< tolerance << " " << result.min << " " << result.max << " || "
<< result.variance / HIST_DIM1D << " "
<< 1.0 * num_draws * 3 / HIST_DIM1D * (1.0 - 1.0 / HIST_DIM1D)
<< " || " << result.covariance / HIST_DIM1D << " "
<< -1.0 * num_draws * 3 / HIST_DIM1D / HIST_DIM1D << endl;
}
{
cout << " -- Testing 3-D histogram" << endl;

RandomProperties result;
typedef test_histogram3d_functor<typename RandomGenerator::device_type> functor_type;
typedef test_histogram3d_functor<typename RandomGenerator::device_type>
functor_type;
parallel_reduce(HIST_DIM1D, functor_type(density_3d, num_draws), result);

double tolerance = 6 * std::sqrt(1.0 / HIST_DIM1D);
double mean_expect = 1.0 * num_draws / HIST_DIM1D;
double variance_expect = 1.0*num_draws/HIST_DIM1D*(1.0-1.0/HIST_DIM1D);
double variance_expect =
1.0 * num_draws / HIST_DIM1D * (1.0 - 1.0 / HIST_DIM1D);
double covariance_expect = -1.0 * num_draws / HIST_DIM1D / HIST_DIM1D;
double mean_eps = mean_expect / (result.mean / HIST_DIM1D) - 1.0;
double variance_eps = variance_expect/(result.variance/HIST_DIM1D)-1.0;
double covariance_eps = (result.covariance/HIST_DIM1D - covariance_expect)/mean_expect;
pass_hist3d_mean = ((-tolerance < mean_eps) &&
( tolerance > mean_eps)) ? 1:0;
double variance_eps =
variance_expect / (result.variance / HIST_DIM1D) - 1.0;
double covariance_eps =
(result.covariance / HIST_DIM1D - covariance_expect) / mean_expect;
pass_hist3d_mean =
((-tolerance < mean_eps) && (tolerance > mean_eps)) ? 1 : 0;
pass_hist3d_var = ((-1.2 * tolerance < variance_eps) &&
( 1.2*tolerance > variance_eps)) ? 1:0;
pass_hist3d_covar = ((-tolerance < covariance_eps) &&
( tolerance > covariance_eps)) ? 1:0;
(1.2 * tolerance > variance_eps))
? 1
: 0;
pass_hist3d_covar =
((-tolerance < covariance_eps) && (tolerance > covariance_eps)) ? 1
: 0;

cout << "Density 3D: " << mean_eps
<< " " << variance_eps
<< " " << result.covariance/HIST_DIM1D/HIST_DIM1D
<< " || " << tolerance
<< " " << result.min
<< " " << result.max << endl;
cout << "Density 3D: " << mean_eps << " " << variance_eps << " "
<< result.covariance / HIST_DIM1D / HIST_DIM1D << " || " << tolerance
<< " " << result.min << " " << result.max << endl;
}
}
};

template <class RandomGenerator>
void test_random(unsigned int num_draws)
{
void test_random(unsigned int num_draws) {
using std::cout;
using std::endl;
typename test_random_functor<RandomGenerator, int>::type_1d density_1d("D1d");
typename test_random_functor<RandomGenerator, int>::type_3d density_3d("D3d");

uint64_t ticks = std::chrono::high_resolution_clock::now().time_since_epoch().count();
uint64_t ticks =
std::chrono::high_resolution_clock::now().time_since_epoch().count();
cout << "Test Seed:" << ticks << endl;

RandomGenerator pool(ticks);

cout << "Test Scalar=int" << endl;
test_random_scalar<RandomGenerator,int> test_int(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, int> test_int(density_1d, density_3d,
pool, num_draws);
ASSERT_EQ(test_int.pass_mean, 1);
ASSERT_EQ(test_int.pass_var, 1);
ASSERT_EQ(test_int.pass_covar, 1);
@ -407,7 +417,8 @@ void test_random(unsigned int num_draws)
deep_copy(density_3d, 0);

cout << "Test Scalar=unsigned int" << endl;
test_random_scalar<RandomGenerator,unsigned int> test_uint(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, unsigned int> test_uint(
density_1d, density_3d, pool, num_draws);
ASSERT_EQ(test_uint.pass_mean, 1);
ASSERT_EQ(test_uint.pass_var, 1);
ASSERT_EQ(test_uint.pass_covar, 1);
@ -421,7 +432,8 @@ void test_random(unsigned int num_draws)
deep_copy(density_3d, 0);

cout << "Test Scalar=int64_t" << endl;
test_random_scalar<RandomGenerator,int64_t> test_int64(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, int64_t> test_int64(
density_1d, density_3d, pool, num_draws);
ASSERT_EQ(test_int64.pass_mean, 1);
ASSERT_EQ(test_int64.pass_var, 1);
ASSERT_EQ(test_int64.pass_covar, 1);
@ -435,7 +447,8 @@ void test_random(unsigned int num_draws)
deep_copy(density_3d, 0);

cout << "Test Scalar=uint64_t" << endl;
test_random_scalar<RandomGenerator,uint64_t> test_uint64(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, uint64_t> test_uint64(
density_1d, density_3d, pool, num_draws);
ASSERT_EQ(test_uint64.pass_mean, 1);
ASSERT_EQ(test_uint64.pass_var, 1);
ASSERT_EQ(test_uint64.pass_covar, 1);
@ -449,7 +462,8 @@ void test_random(unsigned int num_draws)
deep_copy(density_3d, 0);

cout << "Test Scalar=float" << endl;
test_random_scalar<RandomGenerator,float> test_float(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, float> test_float(density_1d, density_3d,
pool, num_draws);
ASSERT_EQ(test_float.pass_mean, 1);
ASSERT_EQ(test_float.pass_var, 1);
ASSERT_EQ(test_float.pass_covar, 1);
@ -463,7 +477,8 @@ void test_random(unsigned int num_draws)
deep_copy(density_3d, 0);

cout << "Test Scalar=double" << endl;
test_random_scalar<RandomGenerator,double> test_double(density_1d,density_3d,pool,num_draws);
test_random_scalar<RandomGenerator, double> test_double(
density_1d, density_3d, pool, num_draws);
ASSERT_EQ(test_double.pass_mean, 1);
ASSERT_EQ(test_double.pass_var, 1);
ASSERT_EQ(test_double.pass_covar, 1);
@ -474,7 +489,7 @@ void test_random(unsigned int num_draws)
ASSERT_EQ(test_double.pass_hist3d_var, 1);
ASSERT_EQ(test_double.pass_hist3d_covar, 1);
}
}
} // namespace Impl

} // namespace Test
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -52,35 +53,24 @@
#include <TestSort.hpp>
#include <iomanip>

//----------------------------------------------------------------------------

namespace Test {

class serial : public ::testing::Test {
protected:
static void SetUpTestCase()
{
}

static void TearDownTestCase ()
{
}
};

#define SERIAL_RANDOM_XORSHIFT64(num_draws) \
TEST_F( serial, Random_XorShift64 ) { \
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Serial> >(num_draws); \
TEST(serial, Random_XorShift64) { \
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Serial> >( \
num_draws); \
}

#define SERIAL_RANDOM_XORSHIFT1024(num_draws) \
TEST_F( serial, Random_XorShift1024 ) { \
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Serial> >(num_draws); \
TEST(serial, Random_XorShift1024) { \
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Serial> >( \
num_draws); \
}

#define SERIAL_SORT_UNSIGNED(size) \
TEST_F( serial, SortUnsigned ) { \
TEST(serial, SortUnsigned) { \
Impl::test_sort<Kokkos::Serial, unsigned>(size); \
}

@ -96,5 +86,3 @@ SERIAL_SORT_UNSIGNED(171)
#else
void KOKKOS_ALGORITHMS_UNITTESTS_TESTSERIAL_PREVENT_LINK_ERROR() {}
#endif // KOKKOS_ENABLE_SERIAL
@ -1,10 +1,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -22,10 +23,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -75,9 +76,7 @@ struct sum {

sum(Kokkos::View<Scalar*, ExecutionSpace> keys_) : keys(keys_) {}
KOKKOS_INLINE_FUNCTION
void operator() (int i, double& count) const {
count+=keys(i);
}
void operator()(int i, double& count) const { count += keys(i); }
};

template <class ExecutionSpace, class Scalar>
@ -91,9 +90,9 @@ struct bin3d_is_sorted_struct {
Scalar min;
Scalar max;

bin3d_is_sorted_struct(Kokkos::View<Scalar*[3],ExecutionSpace> keys_,int max_bins_,Scalar min_,Scalar max_):
keys(keys_),max_bins(max_bins_),min(min_),max(max_) {
}
bin3d_is_sorted_struct(Kokkos::View<Scalar * [3], ExecutionSpace> keys_,
int max_bins_, Scalar min_, Scalar max_)
: keys(keys_), max_bins(max_bins_), min(min_), max(max_) {}
KOKKOS_INLINE_FUNCTION
void operator()(int i, unsigned int& count) const {
int ix1 = int((keys(i, 0) - min) / max * max_bins);
@ -103,10 +102,13 @@ struct bin3d_is_sorted_struct {
int iy2 = int((keys(i + 1, 1) - min) / max * max_bins);
int iz2 = int((keys(i + 1, 2) - min) / max * max_bins);

if (ix1>ix2) count++;
if (ix1 > ix2)
count++;
else if (ix1 == ix2) {
if (iy1>iy2) count++;
else if ((iy1==iy2) && (iz1>iz2)) count++;
if (iy1 > iy2)
count++;
else if ((iy1 == iy2) && (iz1 > iz2))
count++;
}
}
};
@ -137,7 +139,9 @@ void test_1D_sort(unsigned int n,bool force_kokkos) {
Kokkos::sort(keys, force_kokkos);

Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931);
Kokkos::fill_random(keys,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND);
Kokkos::fill_random(keys, g,
Kokkos::Random_XorShift64_Pool<
ExecutionSpace>::generator_type::MAX_URAND);

double sum_before = 0.0;
double sum_after = 0.0;
@ -148,11 +152,13 @@ void test_1D_sort(unsigned int n,bool force_kokkos) {
Kokkos::sort(keys, force_kokkos);

Kokkos::parallel_reduce(n, sum<ExecutionSpace, KeyType>(keys), sum_after);
Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys),sort_fails);
Kokkos::parallel_reduce(
n - 1, is_sorted_struct<ExecutionSpace, KeyType>(keys), sort_fails);

double ratio = sum_before / sum_after;
double epsilon = 1e-10;
unsigned int equal_sum = (ratio > (1.0-epsilon)) && (ratio < (1.0+epsilon)) ? 1 : 0;
unsigned int equal_sum =
(ratio > (1.0 - epsilon)) && (ratio < (1.0 + epsilon)) ? 1 : 0;

ASSERT_EQ(sort_fails, 0);
ASSERT_EQ(equal_sum, 1);
@ -171,7 +177,8 @@ void test_3D_sort(unsigned int n) {
double sum_after = 0.0;
unsigned int sort_fails = 0;

Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
Kokkos::parallel_reduce(keys.extent(0), sum3D<ExecutionSpace, KeyType>(keys),
sum_before);

int bin_1d = 1;
while (bin_1d * bin_1d * bin_1d * 4 < (int)keys.extent(0)) bin_1d *= 2;
@ -181,17 +188,21 @@ void test_3D_sort(unsigned int n) {

typedef Kokkos::BinOp3D<KeyViewType> BinOp;
BinOp bin_op(bin_max, min, max);
Kokkos::BinSort< KeyViewType , BinOp >
Sorter(keys,bin_op,false);
Kokkos::BinSort<KeyViewType, BinOp> Sorter(keys, bin_op, false);
Sorter.create_permute_vector();
Sorter.template sort<KeyViewType>(keys);

Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
Kokkos::parallel_reduce(keys.extent(0)-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
Kokkos::parallel_reduce(keys.extent(0), sum3D<ExecutionSpace, KeyType>(keys),
sum_after);
Kokkos::parallel_reduce(keys.extent(0) - 1,
bin3d_is_sorted_struct<ExecutionSpace, KeyType>(
keys, bin_1d, min[0], max[0]),
sort_fails);

double ratio = sum_before / sum_after;
double epsilon = 1e-10;
unsigned int equal_sum = (ratio > (1.0-epsilon)) && (ratio < (1.0+epsilon)) ? 1 : 0;
unsigned int equal_sum =
(ratio > (1.0 - epsilon)) && (ratio < (1.0 + epsilon)) ? 1 : 0;

if (sort_fails)
printf("3D Sort Sum: %f %f Fails: %u\n", sum_before, sum_after, sort_fails);
@ -203,9 +214,9 @@ void test_3D_sort(unsigned int n) {
//----------------------------------------------------------------------------

template <class ExecutionSpace, typename KeyType>
void test_dynamic_view_sort(unsigned int n )
{
typedef Kokkos::Experimental::DynamicView<KeyType*,ExecutionSpace> KeyDynamicViewType;
void test_dynamic_view_sort(unsigned int n) {
typedef Kokkos::Experimental::DynamicView<KeyType*, ExecutionSpace>
KeyDynamicViewType;
typedef Kokkos::View<KeyType*, ExecutionSpace> KeyViewType;

const size_t upper_bound = 2 * n;
@ -223,7 +234,9 @@ void test_dynamic_view_sort(unsigned int n )
Kokkos::sort(keys, 0 /* begin */, n /* end */);

Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931);
Kokkos::fill_random(keys_view,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND);
Kokkos::fill_random(keys_view, g,
Kokkos::Random_XorShift64_Pool<
ExecutionSpace>::generator_type::MAX_URAND);

ExecutionSpace().fence();
Kokkos::deep_copy(keys, keys_view);
@ -233,7 +246,8 @@ void test_dynamic_view_sort(unsigned int n )
double sum_after = 0.0;
unsigned int sort_fails = 0;

Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_before);
Kokkos::parallel_reduce(n, sum<ExecutionSpace, KeyType>(keys_view),
sum_before);

Kokkos::sort(keys, 0 /* begin */, n /* end */);

@ -241,18 +255,19 @@ void test_dynamic_view_sort(unsigned int n )
Kokkos::deep_copy(keys_view, keys);
// ExecutionSpace().fence();

Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_after);
Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys_view),sort_fails);
Kokkos::parallel_reduce(n, sum<ExecutionSpace, KeyType>(keys_view),
sum_after);
Kokkos::parallel_reduce(
n - 1, is_sorted_struct<ExecutionSpace, KeyType>(keys_view), sort_fails);

double ratio = sum_before / sum_after;
double epsilon = 1e-10;
unsigned int equal_sum = (ratio > (1.0-epsilon)) && (ratio < (1.0+epsilon)) ? 1 : 0;
unsigned int equal_sum =
(ratio > (1.0 - epsilon)) && (ratio < (1.0 + epsilon)) ? 1 : 0;

if (sort_fails != 0 || equal_sum != 1) {
std::cout << " N = " << n
<< " ; sum_before = " << sum_before
<< " ; sum_after = " << sum_after
<< " ; ratio = " << ratio
std::cout << " N = " << n << " ; sum_before = " << sum_before
<< " ; sum_after = " << sum_after << " ; ratio = " << ratio
<< std::endl;
}

@ -263,8 +278,7 @@ void test_dynamic_view_sort(unsigned int n )
//----------------------------------------------------------------------------

template <class ExecutionSpace>
void test_issue_1160()
{
void test_issue_1160() {
Kokkos::View<int*, ExecutionSpace> element_("element", 10);
Kokkos::View<double*, ExecutionSpace> x_("x", 10);
Kokkos::View<double*, ExecutionSpace> v_("y", 10);
@ -300,7 +314,8 @@ void test_issue_1160()
auto min = h_element(end - 1);
BinOp binner(end - begin, min, max);

Kokkos::BinSort<KeyViewType , BinOp > Sorter(element_,begin,end,binner,false);
Kokkos::BinSort<KeyViewType, BinOp> Sorter(element_, begin, end, binner,
false);
Sorter.create_permute_vector();
Sorter.sort(element_, begin, end);

@ -331,8 +346,7 @@ void test_issue_1160()
//----------------------------------------------------------------------------

template <class ExecutionSpace, typename KeyType>
void test_sort(unsigned int N)
{
void test_sort(unsigned int N) {
test_1D_sort<ExecutionSpace, KeyType>(N * N * N, true);
test_1D_sort<ExecutionSpace, KeyType>(N * N * N, false);
#if !defined(KOKKOS_ENABLE_ROCM)
@ -342,6 +356,6 @@ void test_sort(unsigned int N)
test_issue_1160<ExecutionSpace>();
}

}
}
} // namespace Impl
} // namespace Test
#endif /* KOKKOS_ALGORITHMS_UNITTESTS_TESTSORT_HPP */
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -52,40 +53,27 @@
#include <TestSort.hpp>
#include <iomanip>

//----------------------------------------------------------------------------

namespace Test {

class threads : public ::testing::Test {
protected:
static void SetUpTestCase()
{
std::cout << std::setprecision(5) << std::scientific;
}

static void TearDownTestCase()
{
}
};

#define THREADS_RANDOM_XORSHIFT64(num_draws) \
TEST_F( threads, Random_XorShift64 ) { \
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Threads> >(num_draws); \
TEST(threads, Random_XorShift64) { \
Impl::test_random<Kokkos::Random_XorShift64_Pool<Kokkos::Threads> >( \
num_draws); \
}

#define THREADS_RANDOM_XORSHIFT1024(num_draws) \
TEST_F( threads, Random_XorShift1024 ) { \
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Threads> >(num_draws); \
TEST(threads, Random_XorShift1024) { \
Impl::test_random<Kokkos::Random_XorShift1024_Pool<Kokkos::Threads> >( \
num_draws); \
}

#define THREADS_SORT_UNSIGNED(size) \
TEST_F( threads, SortUnsigned ) { \
TEST(threads, SortUnsigned) { \
Impl::test_sort<Kokkos::Threads, double>(size); \
}

THREADS_RANDOM_XORSHIFT64(10240000)
THREADS_RANDOM_XORSHIFT1024(10130144)
THREADS_SORT_UNSIGNED(171)
@ -98,5 +86,3 @@ THREADS_SORT_UNSIGNED(171)
#else
void KOKKOS_ALGORITHMS_UNITTESTS_TESTTHREADS_PREVENT_LINK_ERROR() {}
#endif
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -51,4 +52,3 @@ int main(int argc, char *argv[]) {
Kokkos::finalize();
return result;
}
@ -3,16 +3,17 @@
#include <Kokkos_Random.hpp>

template <class Scalar>
double test_atomic(int L, int N, int M,int K,int R,Kokkos::View<const int*> offsets) {
double test_atomic(int L, int N, int M, int K, int R,
Kokkos::View<const int*> offsets) {
Kokkos::View<Scalar*> output("Output", N);
Kokkos::Impl::Timer timer;

for (int r = 0; r < R; r++)
Kokkos::parallel_for(L, KOKKOS_LAMBDA (const int&i) {
Kokkos::parallel_for(
L, KOKKOS_LAMBDA(const int& i) {
Scalar s = 2;
for (int m = 0; m < M; m++) {
for(int k=0;k<K;k++)
s=s*s+s;
for (int k = 0; k < K; k++) s = s * s + s;
const int idx = (i + offsets(i, m)) % N;
Kokkos::atomic_add(&output(idx), s);
}
@ -24,15 +25,16 @@ double test_atomic(int L, int N, int M,int K,int R,Kokkos::View<const int*> offs
}

template <class Scalar>
double test_no_atomic(int L, int N, int M,int K,int R,Kokkos::View<const int*> offsets) {
double test_no_atomic(int L, int N, int M, int K, int R,
Kokkos::View<const int*> offsets) {
Kokkos::View<Scalar*> output("Output", N);
Kokkos::Impl::Timer timer;
for (int r = 0; r < R; r++)
Kokkos::parallel_for(L, KOKKOS_LAMBDA (const int&i) {
Kokkos::parallel_for(
L, KOKKOS_LAMBDA(const int& i) {
Scalar s = 2;
for (int m = 0; m < M; m++) {
for(int k=0;k<K;k++)
s=s*s+s;
for (int k = 0; k < K; k++) s = s * s + s;
const int idx = (i + offsets(i, m)) % N;
output(idx) += s;
}
@ -67,7 +69,6 @@ int main(int argc, char* argv[]) {
return 0;
}

int L = atoi(argv[1]);
int N = atoi(argv[2]);
int M = atoi(argv[3]);
@ -80,26 +81,18 @@ int main(int argc, char* argv[]) {
Kokkos::Random_XorShift64_Pool<> pool(12371);
Kokkos::fill_random(offsets, pool, D);
double time = 0;
if(type==1)
time = test_atomic<int>(L,N,M,K,R,offsets);
if(type==2)
time = test_atomic<long>(L,N,M,K,R,offsets);
if(type==3)
time = test_atomic<float>(L,N,M,K,R,offsets);
if(type==4)
time = test_atomic<double>(L,N,M,K,R,offsets);
if (type == 1) time = test_atomic<int>(L, N, M, K, R, offsets);
if (type == 2) time = test_atomic<long>(L, N, M, K, R, offsets);
if (type == 3) time = test_atomic<float>(L, N, M, K, R, offsets);
if (type == 4) time = test_atomic<double>(L, N, M, K, R, offsets);
if (type == 5)
time = test_atomic<Kokkos::complex<double> >(L, N, M, K, R, offsets);

double time2 = 1;
if(type==1)
time2 = test_no_atomic<int>(L,N,M,K,R,offsets);
if(type==2)
time2 = test_no_atomic<long>(L,N,M,K,R,offsets);
if(type==3)
time2 = test_no_atomic<float>(L,N,M,K,R,offsets);
if(type==4)
time2 = test_no_atomic<double>(L,N,M,K,R,offsets);
if (type == 1) time2 = test_no_atomic<int>(L, N, M, K, R, offsets);
if (type == 2) time2 = test_no_atomic<long>(L, N, M, K, R, offsets);
if (type == 3) time2 = test_no_atomic<float>(L, N, M, K, R, offsets);
if (type == 4) time2 = test_no_atomic<double>(L, N, M, K, R, offsets);
if (type == 5)
time2 = test_no_atomic<Kokkos::complex<double> >(L, N, M, K, R, offsets);

@ -111,14 +104,17 @@ int main(int argc, char* argv[]) {
if (type == 5) size = sizeof(Kokkos::complex<double>);

printf("%i\n", size);
printf("Time: %s %i %i %i %i %i %i (t_atomic: %e t_nonatomic: %e ratio: %lf )( GUpdates/s: %lf GB/s: %lf )\n",
(type==1)?"int": (
(type==2)?"long": (
(type==3)?"float": (
(type==4)?"double":"complex"))),
L,N,M,D,K,R,time,time2,time/time2,
1.e-9*L*R*M/time, 1.0*L*R*M*2*size/time/1024/1024/1024);
printf(
"Time: %s %i %i %i %i %i %i (t_atomic: %e t_nonatomic: %e ratio: %lf "
")( GUpdates/s: %lf GB/s: %lf )\n",
(type == 1)
? "int"
: ((type == 2)
? "long"
: ((type == 3) ? "float"
: ((type == 4) ? "double" : "complex"))),
L, N, M, D, K, R, time, time2, time / time2, 1.e-9 * L * R * M / time,
1.0 * L * R * M * 2 * size / time / 1024 / 1024 / 1024);
}
Kokkos::finalize();
}
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -83,17 +84,10 @@ static void run(int N, int K, int R, int U, int F, int T, int S);
|
||||
|
||||
template <class Scalar>
|
||||
void run_stride_unroll(int N, int K, int R, int D, int U, int F, int T, int S) {
|
||||
if(D == 1)
|
||||
RunStride<Scalar,1>::run(N,K,R,U,F,T,S);
|
||||
if(D == 2)
|
||||
RunStride<Scalar,2>::run(N,K,R,U,F,T,S);
|
||||
if(D == 4)
|
||||
RunStride<Scalar,4>::run(N,K,R,U,F,T,S);
|
||||
if(D == 8)
|
||||
RunStride<Scalar,8>::run(N,K,R,U,F,T,S);
|
||||
if(D == 16)
|
||||
RunStride<Scalar,16>::run(N,K,R,U,F,T,S);
|
||||
if(D == 32)
|
||||
RunStride<Scalar,32>::run(N,K,R,U,F,T,S);
|
||||
if (D == 1) RunStride<Scalar, 1>::run(N, K, R, U, F, T, S);
|
||||
if (D == 2) RunStride<Scalar, 2>::run(N, K, R, U, F, T, S);
|
||||
if (D == 4) RunStride<Scalar, 4>::run(N, K, R, U, F, T, S);
|
||||
if (D == 8) RunStride<Scalar, 8>::run(N, K, R, U, F, T, S);
|
||||
if (D == 16) RunStride<Scalar, 16>::run(N, K, R, U, F, T, S);
|
||||
if (D == 32) RunStride<Scalar, 32>::run(N, K, R, U, F, T, S);
|
||||
}
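Note on the hunk above: the chain of `if (D == ...)` statements maps a runtime stride onto a template instantiated for a fixed set of compile-time values. The following is a minimal standalone sketch of that pattern, not the benchmark's code. `RunStrideSketch` and `dispatch_stride` are hypothetical names, the sketch uses a `switch` rather than the `if` chain in the diff, and it returns `-1` for unsupported values where the original simply does nothing.

```cpp
// Hypothetical stand-in for RunStride<Scalar, STRIDE>: it only reports a value
// derived from the compile-time stride so the selected branch is observable.
template <class Scalar, int STRIDE>
struct RunStrideSketch {
  static int run() { return STRIDE * static_cast<int>(sizeof(Scalar)); }
};

// Map a runtime stride D onto one of the pre-instantiated templates.
// Unsupported strides return -1 (an assumption of this sketch).
template <class Scalar>
int dispatch_stride(int D) {
  switch (D) {
    case 1: return RunStrideSketch<Scalar, 1>::run();
    case 2: return RunStrideSketch<Scalar, 2>::run();
    case 4: return RunStrideSketch<Scalar, 4>::run();
    case 8: return RunStrideSketch<Scalar, 8>::run();
    case 16: return RunStrideSketch<Scalar, 16>::run();
    case 32: return RunStrideSketch<Scalar, 32>::run();
    default: return -1;
  }
}
```

Calling `dispatch_stride<float>(4)` selects the `STRIDE == 4` instantiation, while a value such as `3` reaches the default branch.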
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -41,7 +42,6 @@
|
||||
//@HEADER
|
||||
*/
|
||||
|
||||
|
||||
#define UNROLL 1
|
||||
#include <bench_unroll_stride.hpp>
|
||||
#undef UNROLL
|
||||
@ -121,4 +121,3 @@ static void run(int N, int K, int R, int U, int F, int T, int S) {
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -53,11 +54,14 @@ static void run(int N, int K, int R, int F, int T, int S) {
|
||||
Kokkos::deep_copy(C, Scalar(3.5));
|
||||
|
||||
Kokkos::Timer timer;
|
||||
Kokkos::parallel_for("BenchmarkKernel",Kokkos::TeamPolicy<>(N,T).set_scratch_size(0,Kokkos::PerTeam(S)),
|
||||
Kokkos::parallel_for(
|
||||
"BenchmarkKernel",
|
||||
Kokkos::TeamPolicy<>(N, T).set_scratch_size(0, Kokkos::PerTeam(S)),
|
||||
KOKKOS_LAMBDA(const Kokkos::TeamPolicy<>::member_type& team) {
|
||||
const int n = team.league_rank();
|
||||
for (int r = 0; r < R; r++) {
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,0,K), [&] (const int& i) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::TeamThreadRange(team, 0, K), [&](const int& i) {
|
||||
Scalar a1 = A(n, i, 0);
|
||||
const Scalar b = B(n, i, 0);
|
||||
#if (UNROLL > 1)
|
||||
@ -82,7 +86,6 @@ static void run(int N, int K, int R, int F, int T, int S) {
|
||||
Scalar a8 = a7 * 1.1;
|
||||
#endif
|
||||
|
||||
|
||||
for (int f = 0; f < F; f++) {
|
||||
a1 += b * a1;
|
||||
#if (UNROLL > 1)
|
||||
@ -106,8 +109,6 @@ static void run(int N, int K, int R, int F, int T, int S) {
|
||||
#if (UNROLL > 7)
|
||||
a8 += b * a8;
|
||||
#endif
|
||||
|
||||
|
||||
}
|
||||
#if (UNROLL == 1)
|
||||
C(n, i, 0) = a1;
|
||||
@ -133,7 +134,6 @@ static void run(int N, int K, int R, int F, int T, int S) {
|
||||
#if (UNROLL == 8)
|
||||
C(n, i, 0) = a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8;
|
||||
#endif
|
||||
|
||||
});
|
||||
}
|
||||
});
|
||||
@ -142,7 +142,10 @@ static void run(int N, int K, int R, int F, int T, int S) {
|
||||
|
||||
double bytes = 1.0 * N * K * R * 3 * sizeof(Scalar);
|
||||
double flops = 1.0 * N * K * R * (F * 2 * UNROLL + 2 * (UNROLL - 1));
|
||||
printf("NKRUFTS: %i %i %i %i %i %i %i Time: %lfs Bandwidth: %lfGiB/s GFlop/s: %lf\n",N,K,R,UNROLL,F,T,S,seconds,1.0*bytes/seconds/1024/1024/1024,1.e-9*flops/seconds);
|
||||
printf(
|
||||
"NKRUFTS: %i %i %i %i %i %i %i Time: %lfs Bandwidth: %lfGiB/s GFlop/s: "
|
||||
"%lf\n",
|
||||
N, K, R, UNROLL, F, T, S, seconds,
|
||||
1.0 * bytes / seconds / 1024 / 1024 / 1024, 1.e-9 * flops / seconds);
|
||||
}
|
||||
};
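Note on the hunk above: the reformatted `printf` reports bandwidth and flop rate derived from the `bytes` and `flops` expressions. The sketch below restates those two expressions as a standalone function so they can be checked by hand. `StrideMetrics` and `stride_metrics` are hypothetical names. Reading the factor 3 as one load each from `A` and `B` plus one store to `C` is an interpretation of the kernel body shown in the diff, not something the source states.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this sketch.
struct StrideMetrics {
  double gib_per_s;    // bandwidth in GiB/s
  double gflop_per_s;  // floating point rate in GFlop/s
};

// Same arithmetic as the diff: 3 Scalars of traffic per (n, i, r) element and
// F * 2 * UNROLL + 2 * (UNROLL - 1) flops per element.
StrideMetrics stride_metrics(int N, int K, int R, int F, int unroll,
                             std::size_t scalar_size, double seconds) {
  assert(seconds > 0.0);
  const double bytes = 1.0 * N * K * R * 3 * scalar_size;
  const double flops = 1.0 * N * K * R * (F * 2 * unroll + 2 * (unroll - 1));
  return {bytes / seconds / 1024 / 1024 / 1024, 1.e-9 * flops / seconds};
}
```

For example, with `N = K = 1024`, `R = F = 1`, an unroll factor of 1, 8-byte scalars and a run time of one second, the byte count is 1024 * 1024 * 24, so the reported bandwidth is 0.0234375 GiB/s.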
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -49,7 +50,6 @@
|
||||
int main(int argc, char* argv[]) {
|
||||
Kokkos::initialize();
|
||||
|
||||
|
||||
if (argc < 10) {
|
||||
printf("Arguments: N K R D U F T S\n");
|
||||
printf(" P: Precision (1==float, 2==double)\n");
|
||||
@ -57,9 +57,12 @@ int main(int argc, char* argv[]) {
|
||||
printf(" R: how often to loop through the K dimension with each team\n");
|
||||
printf(" D: distance between loaded elements (stride)\n");
|
||||
printf(" U: how many independent flops to do per load\n");
|
||||
printf(" F: how many times to repeat the U unrolled operations before reading next element\n");
|
||||
printf(
|
||||
" F: how many times to repeat the U unrolled operations before "
|
||||
"reading next element\n");
|
||||
printf(" T: team size\n");
|
||||
printf(" S: shared memory per team (used to control occupancy on GPUs)\n");
|
||||
printf(
|
||||
" S: shared memory per team (used to control occupancy on GPUs)\n");
|
||||
printf("Example Input GPU:\n");
|
||||
printf(" Bandwidth Bound : 2 100000 1024 1 1 1 1 256 6000\n");
|
||||
printf(" Cache Bound : 2 100000 1024 64 1 1 1 512 20000\n");
|
||||
@ -70,7 +73,6 @@ int main(int argc, char* argv[]) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
int P = atoi(argv[1]);
|
||||
int N = atoi(argv[2]);
|
||||
int K = atoi(argv[3]);
|
||||
@ -81,9 +83,18 @@ int main(int argc, char* argv[]) {
|
||||
int T = atoi(argv[8]);
|
||||
int S = atoi(argv[9]);
|
||||
|
||||
if(U>8) {printf("U must be 1-8\n"); return 0;}
|
||||
if( (D!=1) && (D!=2) && (D!=4) && (D!=8) && (D!=16) && (D!=32)) {printf("D must be one of 1,2,4,8,16,32\n"); return 0;}
|
||||
if( (P!=1) && (P!=2) ) {printf("P must be one of 1,2\n"); return 0;}
|
||||
if (U > 8) {
|
||||
printf("U must be 1-8\n");
|
||||
return 0;
|
||||
}
|
||||
if ((D != 1) && (D != 2) && (D != 4) && (D != 8) && (D != 16) && (D != 32)) {
|
||||
printf("D must be one of 1,2,4,8,16,32\n");
|
||||
return 0;
|
||||
}
|
||||
if ((P != 1) && (P != 2)) {
|
||||
printf("P must be one of 1,2\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
if (P == 1) {
|
||||
run_stride_unroll<float>(N, K, R, D, U, F, T, S);
|
||||
@ -94,4 +105,3 @@ int main(int argc, char* argv[]) {
|
||||
|
||||
Kokkos::finalize();
|
||||
}
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -73,20 +74,12 @@ struct RunGather {
|
||||
|
||||
template <class Scalar>
|
||||
void run_gather_test(int N, int K, int D, int R, int U, int F) {
|
||||
if(U == 1)
|
||||
RunGather<Scalar,1>::run(N,K,D,R,F);
|
||||
if(U == 2)
|
||||
RunGather<Scalar,2>::run(N,K,D,R,F);
|
||||
if(U == 3)
|
||||
RunGather<Scalar,3>::run(N,K,D,R,F);
|
||||
if(U == 4)
|
||||
RunGather<Scalar,4>::run(N,K,D,R,F);
|
||||
if(U == 5)
|
||||
RunGather<Scalar,5>::run(N,K,D,R,F);
|
||||
if(U == 6)
|
||||
RunGather<Scalar,6>::run(N,K,D,R,F);
|
||||
if(U == 7)
|
||||
RunGather<Scalar,7>::run(N,K,D,R,F);
|
||||
if(U == 8)
|
||||
RunGather<Scalar,8>::run(N,K,D,R,F);
|
||||
if (U == 1) RunGather<Scalar, 1>::run(N, K, D, R, F);
|
||||
if (U == 2) RunGather<Scalar, 2>::run(N, K, D, R, F);
|
||||
if (U == 3) RunGather<Scalar, 3>::run(N, K, D, R, F);
|
||||
if (U == 4) RunGather<Scalar, 4>::run(N, K, D, R, F);
|
||||
if (U == 5) RunGather<Scalar, 5>::run(N, K, D, R, F);
|
||||
if (U == 6) RunGather<Scalar, 6>::run(N, K, D, R, F);
|
||||
if (U == 7) RunGather<Scalar, 7>::run(N, K, D, R, F);
|
||||
if (U == 8) RunGather<Scalar, 8>::run(N, K, D, R, F);
|
||||
}
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -57,11 +58,13 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
Kokkos::deep_copy(A_in, 1.5);
|
||||
Kokkos::deep_copy(B_in, 2.0);
|
||||
|
||||
Kokkos::View<const Scalar*, Kokkos::MemoryTraits<Kokkos::RandomAccess> > A(A_in);
|
||||
Kokkos::View<const Scalar*, Kokkos::MemoryTraits<Kokkos::RandomAccess> > B(B_in);
|
||||
Kokkos::View<const Scalar*, Kokkos::MemoryTraits<Kokkos::RandomAccess> > A(
|
||||
A_in);
|
||||
Kokkos::View<const Scalar*, Kokkos::MemoryTraits<Kokkos::RandomAccess> > B(
|
||||
B_in);
|
||||
|
||||
Kokkos::parallel_for("InitKernel",N,
|
||||
KOKKOS_LAMBDA (const int& i) {
|
||||
Kokkos::parallel_for(
|
||||
"InitKernel", N, KOKKOS_LAMBDA(const int& i) {
|
||||
auto rand_gen = rand_pool.get_state();
|
||||
for (int jj = 0; jj < K; jj++) {
|
||||
connectivity(i, jj) = (rand_gen.rand(D) + i - D / 2 + N) % N;
|
||||
@ -70,11 +73,10 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
});
|
||||
Kokkos::fence();
|
||||
|
||||
|
||||
Kokkos::Timer timer;
|
||||
for (int r = 0; r < R; r++) {
|
||||
Kokkos::parallel_for("BenchmarkKernel",N,
|
||||
KOKKOS_LAMBDA (const int& i) {
|
||||
Kokkos::parallel_for(
|
||||
"BenchmarkKernel", N, KOKKOS_LAMBDA(const int& i) {
|
||||
Scalar c = Scalar(0.0);
|
||||
for (int jj = 0; jj < K; jj++) {
|
||||
const int j = connectivity(i, jj);
|
||||
@ -102,7 +104,6 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
Scalar a8 = a7 * Scalar(1.1);
|
||||
#endif
|
||||
|
||||
|
||||
for (int f = 0; f < F; f++) {
|
||||
a1 += b * a1;
|
||||
#if (UNROLL > 1)
|
||||
@ -126,8 +127,6 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
#if (UNROLL > 7)
|
||||
a8 += b * a8;
|
||||
#endif
|
||||
|
||||
|
||||
}
|
||||
#if (UNROLL == 1)
|
||||
c += a1;
|
||||
@ -153,7 +152,6 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
#if (UNROLL == 8)
|
||||
c += a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8;
|
||||
#endif
|
||||
|
||||
}
|
||||
C(i) = c;
|
||||
});
|
||||
@ -161,9 +159,15 @@ static void run(int N, int K, int D, int R, int F) {
|
||||
}
|
||||
double seconds = timer.seconds();
|
||||
|
||||
double bytes = 1.0*N*K*R*(2*sizeof(Scalar)+sizeof(int)) + 1.0*N*R*sizeof(Scalar);
|
||||
double bytes = 1.0 * N * K * R * (2 * sizeof(Scalar) + sizeof(int)) +
|
||||
1.0 * N * R * sizeof(Scalar);
|
||||
double flops = 1.0 * N * K * R * (F * 2 * UNROLL + 2 * (UNROLL - 1));
|
||||
double gather_ops = 1.0 * N * K * R * 2;
|
||||
printf("SNKDRUF: %i %i %i %i %i %i %i Time: %lfs Bandwidth: %lfGiB/s GFlop/s: %lf GGather/s: %lf\n",sizeof(Scalar)/4,N,K,D,R,UNROLL,F,seconds,1.0*bytes/seconds/1024/1024/1024,1.e-9*flops/seconds,1.e-9*gather_ops/seconds);
|
||||
printf(
|
||||
"SNKDRUF: %i %i %i %i %i %i %i Time: %lfs Bandwidth: %lfGiB/s GFlop/s: "
|
||||
"%lf GGather/s: %lf\n",
|
||||
sizeof(Scalar) / 4, N, K, D, R, UNROLL, F, seconds,
|
||||
1.0 * bytes / seconds / 1024 / 1024 / 1024, 1.e-9 * flops / seconds,
|
||||
1.e-9 * gather_ops / seconds);
|
||||
}
|
||||
};
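Note on the hunks above: `InitKernel` fills `connectivity(i, jj)` with `(rand_gen.rand(D) + i - D / 2 + N) % N`, that is, an index within a window of width `D` around `i`, wrapped into the valid range. The sketch below isolates that arithmetic. `gather_index` is a hypothetical name, and `r` stands in for `rand_gen.rand(D)` under the assumption that it lies in `[0, D)`. Adding `N` before taking the remainder keeps the operand non-negative whenever `N >= D`, which is consistent with (though not confirmed as the reason for) the `N < D` check in the driver.

```cpp
#include <cassert>

// Index formula as in the diff. The preconditions are assumptions of this
// sketch: r in [0, D), i in [0, N), and N >= D.
int gather_index(int r, int i, int D, int N) {
  assert(N >= D && r >= 0 && r < D && i >= 0 && i < N);
  return (r + i - D / 2 + N) % N;
}
```

With `D = 8` and `N = 10`, element `i = 0` and `r = 0` gives `(0 + 0 - 4 + 10) % 10 = 6`, so the window wraps around to the end of the range instead of producing a negative index.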
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -51,13 +52,16 @@ int main(int argc, char* argv[]) {
|
||||
|
||||
if (argc < 8) {
|
||||
printf("Arguments: S N K D\n");
|
||||
printf(" S: Scalar Type Size (1==float, 2==double, 4=complex<double>)\n");
|
||||
printf(
|
||||
" S: Scalar Type Size (1==float, 2==double, 4=complex<double>)\n");
|
||||
printf(" N: Number of entities\n");
|
||||
printf(" K: Number of things to gather per entity\n");
|
||||
printf(" D: Max distance of gathered things of an entity\n");
|
||||
printf(" R: how often to loop through the K dimension with each team\n");
|
||||
printf(" U: how many independent flops to do per load\n");
|
||||
printf(" F: how many times to repeat the U unrolled operations before reading next element\n");
|
||||
printf(
|
||||
" F: how many times to repeat the U unrolled operations before "
|
||||
"reading next element\n");
|
||||
printf("Example Input GPU:\n");
|
||||
printf(" Bandwidth Bound : 2 10000000 1 1 10 1 1\n");
|
||||
printf(" Cache Bound : 2 10000000 64 1 10 1 1\n");
|
||||
@ -68,7 +72,6 @@ int main(int argc, char* argv[]) {
|
||||
return 0;
|
||||
}
|
||||
|
||||
|
||||
int S = atoi(argv[1]);
|
||||
int N = atoi(argv[2]);
|
||||
int K = atoi(argv[3]);
|
||||
@ -77,8 +80,14 @@ int main(int argc, char* argv[]) {
|
||||
int U = atoi(argv[6]);
|
||||
int F = atoi(argv[7]);
|
||||
|
||||
if( (S!=1) && (S!=2) && (S!=4)) {printf("S must be one of 1,2,4\n"); return 0;}
|
||||
if( N<D ) {printf("N must be larger or equal to D\n"); return 0; }
|
||||
if ((S != 1) && (S != 2) && (S != 4)) {
|
||||
printf("S must be one of 1,2,4\n");
|
||||
return 0;
|
||||
}
|
||||
if (N < D) {
|
||||
printf("N must be larger or equal to D\n");
|
||||
return 0;
|
||||
}
|
||||
if (S == 1) {
|
||||
run_gather_test<float>(N, K, D, R, U, F);
|
||||
}
|
||||
@ -90,4 +99,3 @@ int main(int argc, char* argv[]) {
|
||||
}
|
||||
Kokkos::finalize();
|
||||
}
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -49,24 +50,42 @@ int main(int argc, char* argv[] ) {
|
||||
|
||||
if (argc < 10) {
|
||||
printf(" Ten arguments are needed to run this program:\n");
|
||||
printf(" (1)team_range, (2)thread_range, (3)vector_range, (4)outer_repeat, (5)thread_repeat, (6)vector_repeat, (7)team_size, (8)vector_size, (9)schedule, (10)test_type\n");
|
||||
printf(
|
||||
" (1)team_range, (2)thread_range, (3)vector_range, (4)outer_repeat, "
|
||||
"(5)thread_repeat, (6)vector_repeat, (7)team_size, (8)vector_size, "
|
||||
"(9)schedule, (10)test_type\n");
|
||||
printf(" team_range: number of teams (league_size)\n");
|
||||
printf(" thread_range: range for nested TeamThreadRange parallel_*\n");
|
||||
printf(" vector_range: range for nested ThreadVectorRange parallel_*\n");
|
||||
printf(" outer_repeat: number of repeats for outer parallel_* call\n");
|
||||
printf(" thread_repeat: number of repeats for TeamThreadRange parallel_* call\n");
|
||||
printf(" vector_repeat: number of repeats for ThreadVectorRange parallel_* call\n");
|
||||
printf(
|
||||
" thread_repeat: number of repeats for TeamThreadRange parallel_* "
|
||||
"call\n");
|
||||
printf(
|
||||
" vector_repeat: number of repeats for ThreadVectorRange parallel_* "
|
||||
"call\n");
|
||||
printf(" team_size: number of team members (team_size)\n");
|
||||
printf(" vector_size: desired vectorization (if possible)\n");
|
||||
printf(" schedule: 1 == Static 2 == Dynamic\n");
|
||||
printf(" test_type: 3-digit code XYZ for testing (nested) parallel_*\n");
|
||||
printf(" code key: XYZ X in {1,2,3,4,5}, Y in {0,1,2}, Z in {0,1,2}\n");
|
||||
printf(
|
||||
" test_type: 3-digit code XYZ for testing (nested) parallel_*\n");
|
||||
printf(
|
||||
" code key: XYZ X in {1,2,3,4,5}, Y in {0,1,2}, Z in "
|
||||
"{0,1,2}\n");
|
||||
printf(" TeamPolicy:\n");
|
||||
printf(" X: 0 = none (never used, makes no sense); 1 = parallel_for; 2 = parallel_reduce\n");
|
||||
printf(" Y: 0 = none; 1 = parallel_for; 2 = parallel_reduce\n");
|
||||
printf(" Z: 0 = none; 1 = parallel_for; 2 = parallel_reduce\n");
|
||||
printf(
|
||||
" X: 0 = none (never used, makes no sense); 1 = "
|
||||
"parallel_for; 2 = parallel_reduce\n");
|
||||
printf(
|
||||
" Y: 0 = none; 1 = parallel_for; 2 = "
|
||||
"parallel_reduce\n");
|
||||
printf(
|
||||
" Z: 0 = none; 1 = parallel_for; 2 = "
|
||||
"parallel_reduce\n");
|
||||
printf(" RangePolicy:\n");
|
||||
printf(" X: 3 = parallel_for; 4 = parallel_reduce; 5 = parallel_scan\n");
|
||||
printf(
|
||||
" X: 3 = parallel_for; 4 = parallel_reduce; 5 = "
|
||||
"parallel_scan\n");
|
||||
printf(" Y: 0 = none\n");
|
||||
printf(" Z: 0 = none\n");
|
||||
printf(" Example Input:\n");
|
||||
@ -100,11 +119,12 @@ int main(int argc, char* argv[] ) {
|
||||
return -1;
|
||||
}
|
||||
|
||||
if ( test_type != 100 && test_type != 110 && test_type != 111 && test_type != 112 && test_type != 120 && test_type != 121 && test_type != 122
|
||||
&& test_type != 200 && test_type != 210 && test_type != 211 && test_type != 212 && test_type != 220 && test_type != 221 && test_type != 222
|
||||
&& test_type != 300 && test_type != 400 && test_type != 500
|
||||
)
|
||||
{
|
||||
if (test_type != 100 && test_type != 110 && test_type != 111 &&
|
||||
test_type != 112 && test_type != 120 && test_type != 121 &&
|
||||
test_type != 122 && test_type != 200 && test_type != 210 &&
|
||||
test_type != 211 && test_type != 212 && test_type != 220 &&
|
||||
test_type != 221 && test_type != 222 && test_type != 300 &&
|
||||
test_type != 400 && test_type != 500) {
|
||||
printf("Incorrect test_type option\n");
|
||||
Kokkos::finalize();
|
||||
return -2;
|
||||
@ -112,21 +132,26 @@ int main(int argc, char* argv[] ) {
|
||||
|
||||
double result = 0.0;
|
||||
|
||||
Kokkos::parallel_reduce( "parallel_reduce warmup", Kokkos::TeamPolicy<>(10,1),
|
||||
KOKKOS_LAMBDA(const Kokkos::TeamPolicy<>::member_type team, double& lval) {
|
||||
lval += 1;
|
||||
}, result);
|
||||
Kokkos::parallel_reduce(
|
||||
"parallel_reduce warmup", Kokkos::TeamPolicy<>(10, 1),
|
||||
KOKKOS_LAMBDA(const Kokkos::TeamPolicy<>::member_type team,
|
||||
double& lval) { lval += 1; },
|
||||
result);
|
||||
|
||||
typedef Kokkos::View<double*, Kokkos::LayoutRight> view_type_1d;
|
||||
typedef Kokkos::View<double**, Kokkos::LayoutRight> view_type_2d;
|
||||
typedef Kokkos::View<double***, Kokkos::LayoutRight> view_type_3d;
|
||||
|
||||
// Allocate view without initializing
|
||||
// Call a 'warmup' test with 1 repeat - this will initialize the corresponding view appropriately for test and should obey first-touch etc
|
||||
// Second call to test is the one we actually care about and time
|
||||
view_type_1d v_1( Kokkos::ViewAllocateWithoutInitializing("v_1"), team_range*team_size);
|
||||
view_type_2d v_2( Kokkos::ViewAllocateWithoutInitializing("v_2"), team_range*team_size, thread_range);
|
||||
view_type_3d v_3( Kokkos::ViewAllocateWithoutInitializing("v_3"), team_range*team_size, thread_range, vector_range);
|
||||
// Call a 'warmup' test with 1 repeat - this will initialize the corresponding
|
||||
// view appropriately for test and should obey first-touch etc Second call to
|
||||
// test is the one we actually care about and time
|
||||
view_type_1d v_1(Kokkos::ViewAllocateWithoutInitializing("v_1"),
|
||||
team_range * team_size);
|
||||
view_type_2d v_2(Kokkos::ViewAllocateWithoutInitializing("v_2"),
|
||||
team_range * team_size, thread_range);
|
||||
view_type_3d v_3(Kokkos::ViewAllocateWithoutInitializing("v_3"),
|
||||
team_range * team_size, thread_range, vector_range);
|
||||
|
||||
double result_computed = 0.0;
|
||||
double result_expect = 0.0;
|
||||
@ -135,32 +160,56 @@ int main(int argc, char* argv[] ) {
|
||||
if (schedule == 1) {
|
||||
if (test_type != 500) {
|
||||
// warmup - no repeat of loops
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,1,1,1,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,outer_repeat,thread_repeat,vector_repeat,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
}
|
||||
else {
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, 1, 1, 1, team_size,
|
||||
vector_size, test_type, v_1, v_2, v_3, result_computed, result_expect,
|
||||
time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, outer_repeat, thread_repeat,
|
||||
vector_repeat, team_size, vector_size, test_type, v_1, v_2, v_3,
|
||||
result_computed, result_expect, time);
|
||||
} else {
|
||||
// parallel_scan: initialize 1d view for parallel_scan
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,1,1,1,team_size,vector_size,100,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,outer_repeat,thread_repeat,vector_repeat,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, 1, 1, 1, team_size,
|
||||
vector_size, 100, v_1, v_2, v_3, result_computed, result_expect,
|
||||
time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, outer_repeat, thread_repeat,
|
||||
vector_repeat, team_size, vector_size, test_type, v_1, v_2, v_3,
|
||||
result_computed, result_expect, time);
|
||||
}
|
||||
}
|
||||
if (schedule == 2) {
|
||||
if (test_type != 500) {
|
||||
// warmup - no repeat of loops
|
||||
test_policy<Kokkos::Schedule<Kokkos::Dynamic>,int>(team_range,thread_range,vector_range,1,1,1,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Dynamic>,int>(team_range,thread_range,vector_range,outer_repeat,thread_repeat,vector_repeat,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
}
|
||||
else {
|
||||
test_policy<Kokkos::Schedule<Kokkos::Dynamic>, int>(
|
||||
team_range, thread_range, vector_range, 1, 1, 1, team_size,
|
||||
vector_size, test_type, v_1, v_2, v_3, result_computed, result_expect,
|
||||
time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Dynamic>, int>(
|
||||
team_range, thread_range, vector_range, outer_repeat, thread_repeat,
|
||||
vector_repeat, team_size, vector_size, test_type, v_1, v_2, v_3,
|
||||
result_computed, result_expect, time);
|
||||
} else {
|
||||
// parallel_scan: initialize 1d view for parallel_scan
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,1,1,1,team_size,vector_size,100,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>,int>(team_range,thread_range,vector_range,outer_repeat,thread_repeat,vector_repeat,team_size,vector_size,test_type,v_1,v_2,v_3,result_computed,result_expect,time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, 1, 1, 1, team_size,
|
||||
vector_size, 100, v_1, v_2, v_3, result_computed, result_expect,
|
||||
time);
|
||||
test_policy<Kokkos::Schedule<Kokkos::Static>, int>(
|
||||
team_range, thread_range, vector_range, outer_repeat, thread_repeat,
|
||||
vector_repeat, team_size, vector_size, test_type, v_1, v_2, v_3,
|
||||
result_computed, result_expect, time);
|
||||
}
|
||||
}
|
||||
|
||||
if (disable_verbose_output == 0) {
|
||||
printf("%7i %4i %2i %9i %4i %4i %4i %2i %1i %3i %e %e %lf\n",team_range,thread_range,vector_range,outer_repeat,thread_repeat,vector_repeat,team_size,vector_size,schedule,test_type,result_computed,result_expect,time);
|
||||
}
|
||||
else {
|
||||
printf("%7i %4i %2i %9i %4i %4i %4i %2i %1i %3i %e %e %lf\n", team_range,
|
||||
thread_range, vector_range, outer_repeat, thread_repeat,
|
||||
vector_repeat, team_size, vector_size, schedule, test_type,
|
||||
result_computed, result_expect, time);
|
||||
} else {
|
||||
printf("%lf\n", time);
|
||||
}
|
||||
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
@ -48,13 +49,10 @@ struct ParallelScanFunctor {
|
||||
using value_type = double;
|
||||
ViewType v;
|
||||
|
||||
ParallelScanFunctor( const ViewType & v_ )
|
||||
: v(v_)
|
||||
{}
|
||||
ParallelScanFunctor(const ViewType& v_) : v(v_) {}
|
||||
|
||||
KOKKOS_INLINE_FUNCTION
|
||||
void operator()( const int idx, value_type& val, const bool& final ) const
|
||||
{
|
||||
void operator()(const int idx, value_type& val, const bool& final) const {
|
||||
// inclusive scan
|
||||
val += v(idx);
|
||||
if (final) {
|
||||
@ -63,21 +61,21 @@ struct ParallelScanFunctor {
|
||||
}
|
||||
};
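The `ParallelScanFunctor` shown in this hunk accumulates `val += v(idx)` before reaching the `final` branch, which is what makes the scan inclusive. The body of that branch is elided in the diff. The host-only sketch below assumes it writes the running total back into the view; the names `InclusiveScanBody` and `inclusive_scan_serial` are hypothetical and use `std::vector` in place of a Kokkos view.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only stand-in for the functor's operator().
// Assumption: the elided `if (final)` branch stores the running total.
struct InclusiveScanBody {
  std::vector<double>& v;

  void operator()(std::size_t idx, double& val, bool final_pass) const {
    assert(idx < v.size());
    val += v[idx];                  // inclusive scan: add own element first
    if (final_pass) v[idx] = val;   // write back only on the final pass
  }
};

// Serial driver: a single pass with final_pass == true. A parallel backend
// may call the body more than once per index (with final_pass == false
// first); this sketch does not model that.
std::vector<double> inclusive_scan_serial(std::vector<double> data) {
  InclusiveScanBody body{data};
  double running = 0.0;
  for (std::size_t i = 0; i < data.size(); ++i) body(i, running, true);
  return data;
}
```

Calling `inclusive_scan_serial({1, 2, 3, 4})` leaves each slot holding the sum of all elements up to and including itself.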
|
||||
|
||||
template<class ScheduleType,class IndexType,class ViewType1, class ViewType2, class ViewType3>
|
||||
template <class ScheduleType, class IndexType, class ViewType1, class ViewType2,
|
||||
class ViewType3>
|
||||
void test_policy(int team_range, int thread_range, int vector_range,
|
||||
int outer_repeat, int thread_repeat, int inner_repeat,
|
||||
int team_size, int vector_size, int test_type,
|
||||
ViewType1 &v1, ViewType2 &v2, ViewType3 &v3,
|
||||
double &result, double &result_expect, double &time) {
|
||||
|
||||
int team_size, int vector_size, int test_type, ViewType1& v1,
|
||||
ViewType2& v2, ViewType3& v3, double& result,
|
||||
double& result_expect, double& time) {
|
||||
typedef Kokkos::TeamPolicy<ScheduleType, IndexType> t_policy;
|
||||
typedef typename t_policy::member_type t_team;
|
||||
Kokkos::Timer timer;
|
||||
|
||||
for (int orep = 0; orep < outer_repeat; orep++) {
|
||||
|
||||
if (test_type == 100) {
|
||||
Kokkos::parallel_for("100 outer for", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_for(
|
||||
"100 outer for", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
v1(idx) = idx;
|
||||
@ -86,12 +84,15 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
}
|
||||
|
||||
if (test_type == 110) {
|
||||
Kokkos::parallel_for("110 outer for", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_for(
|
||||
"110 outer for", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned among team members
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned
|
||||
// among team members
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
v2(idx, t) = t;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
@ -99,14 +100,20 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
});
|
||||
}
|
||||
if (test_type == 111) {
|
||||
Kokkos::parallel_for("111 outer for", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_for(
|
||||
"111 outer for", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned among team members
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned
|
||||
// among team members
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
for (int vr = 0; vr < inner_repeat; ++vr)
|
||||
Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,vector_range), [&] (const int vi) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi) {
|
||||
v3(idx, t, vi) = vi;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
@ -115,18 +122,23 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
});
|
||||
}
|
||||
if (test_type == 112) {
|
||||
Kokkos::parallel_for("112 outer for", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_for(
|
||||
"112 outer for", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned among team members
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
// Each team launches a parallel_for; thread_range is partitioned
|
||||
// among team members
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
double vector_result = 0.0;
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
vector_result = 0.0;
|
||||
Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team,vector_range), [&] (const int vi, double &vval) {
|
||||
vval += 1;
|
||||
}, vector_result);
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi, double& vval) { vval += 1; },
|
||||
vector_result);
|
||||
}
|
||||
v2(idx, t) = vector_result;
|
||||
// prevent compiler optimizing loop away
|
||||
@ -135,188 +147,255 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
});
|
||||
}
|
||||
if (test_type == 120) {
|
||||
Kokkos::parallel_for("120 outer for", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_for(
|
||||
"120 outer for", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double team_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
team_result = 0.0;
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double &lval) {
|
||||
lval += 1;
|
||||
}, team_result);
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& lval) { lval += 1; }, team_result);
|
||||
}
|
||||
v1(idx) = team_result;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
if (test_type == 121) {
|
||||
Kokkos::parallel_for("121 outer for", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_for(
|
||||
"121 outer for", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double team_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
team_result = 0.0;
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double &lval) {
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& lval) {
|
||||
lval += 1;
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
Kokkos::parallel_for(Kokkos::ThreadVectorRange(team,vector_range), [&] (const int vi) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi) {
|
||||
v3(idx, t, vi) = vi;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
}, team_result);
|
||||
},
|
||||
team_result);
|
||||
}
|
||||
v3(idx, 0, 0) = team_result;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
if (test_type == 122) {
|
||||
Kokkos::parallel_for("122 outer for", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_for(
|
||||
"122 outer for", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double team_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; ++tr) {
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double &lval) {
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& lval) {
|
||||
double vector_result = 0.0;
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
vector_result = 0.0;
|
||||
Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team,vector_range), [&] (const int vi, double &vval) {
|
||||
vval += 1;
|
||||
}, vector_result);
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi, double& vval) { vval += 1; },
|
||||
vector_result);
|
||||
lval += vector_result;
|
||||
}
|
||||
}, team_result);
|
||||
},
|
||||
team_result);
|
||||
}
|
||||
v1(idx) = team_result;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
if (test_type == 200) {
|
||||
Kokkos::parallel_reduce("200 outer reduce", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"200 outer reduce", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
lval += team.team_size() * team.league_rank() + team.team_rank();
|
||||
},result);
|
||||
result_expect = 0.5* (team_range*team_size)*(team_range*team_size-1);
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range * team_size) * (team_range * team_size - 1);
|
||||
// sum ( seq( [0, team_range*team_size) )
|
||||
}
|
||||
if (test_type == 210) {
|
||||
Kokkos::parallel_reduce("210 outer reduce", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"210 outer reduce", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double thread_for = 1.0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
v2(idx, t) = t;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
lval+=(team.team_size()*team.league_rank() + team.team_rank() + thread_for);
|
||||
},result);
|
||||
result_expect = 0.5* (team_range*team_size)*(team_range*team_size-1) + (team_range*team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + 1 per team_member (total of team_range*team_size) )
|
||||
lval += (team.team_size() * team.league_rank() + team.team_rank() +
|
||||
thread_for);
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range * team_size) * (team_range * team_size - 1) +
|
||||
(team_range * team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + 1 per team_member (total of
|
||||
// team_range*team_size) )
|
||||
}
|
||||
if (test_type == 211) {
|
||||
Kokkos::parallel_reduce("211 outer reduce", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"211 outer reduce", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double thread_for = 1.0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
for (int vr = 0; vr < inner_repeat; ++vr)
|
||||
Kokkos::parallel_for(Kokkos::ThreadVectorRange(team, vector_range), [&] (const int vi) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi) {
|
||||
v3(idx, t, vi) = vi;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
});
|
||||
}
|
||||
lval += idx + thread_for;
|
||||
},result);
|
||||
result_expect = 0.5*(team_range*team_size)*(team_range*team_size-1) + (team_range*team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + 1 per team_member (total of team_range*team_size) )
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range * team_size) * (team_range * team_size - 1) +
|
||||
(team_range * team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + 1 per team_member (total of
|
||||
// team_range*team_size) )
|
||||
}
|
||||
if (test_type == 212) {
|
||||
Kokkos::parallel_reduce("212 outer reduce", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"212 outer reduce", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double vector_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
// This parallel_for is executed by each team; the thread_range is partitioned among the team members
|
||||
Kokkos::parallel_for(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t) {
|
||||
// This parallel_for is executed by each team; the thread_range is
|
||||
// partitioned among the team members
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t) {
|
||||
v2(idx, t) = t;
|
||||
// prevent compiler optimizing loop away
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
vector_result = 0.0;
|
||||
Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team, vector_range), [&] (const int vi, double &vval) {
|
||||
vval += vi;
|
||||
}, vector_result );
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi, double& vval) { vval += vi; },
|
||||
vector_result);
|
||||
}
|
||||
});
|
||||
}
|
||||
lval += idx + vector_result;
|
||||
},result);
|
||||
result_expect = 0.5*(team_range*team_size)*(team_range*team_size-1) + (0.5*vector_range*(vector_range-1)*team_range*team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + sum( seq( [0, vector_range) ) per team_member (total of team_range*team_size) )
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range * team_size) * (team_range * team_size - 1) +
|
||||
(0.5 * vector_range * (vector_range - 1) * team_range * team_size);
|
||||
// sum ( seq( [0, team_range*team_size) + sum( seq( [0, vector_range) )
|
||||
// per team_member (total of team_range*team_size) )
|
||||
}
|
||||
if (test_type == 220) {
|
||||
Kokkos::parallel_reduce("220 outer reduce", t_policy(team_range,team_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"220 outer reduce", t_policy(team_range, team_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
double team_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double& tval) {
|
||||
tval += t;
|
||||
},team_result);
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& tval) { tval += t; }, team_result);
|
||||
}
|
||||
lval += team_result * team.league_rank(); // constant * league_rank
|
||||
},result);
|
||||
result_expect = 0.5*(team_range)*(team_range-1) * team_size * 0.5*(thread_range)*(thread_range-1);
|
||||
// sum ( seq( [0, team_range) * constant ); constant = sum( seq( [0, thread_range) )*team_size (1 per member, result for each team)
|
||||
},
|
||||
result);
|
||||
result_expect = 0.5 * (team_range) * (team_range - 1) * team_size * 0.5 *
|
||||
(thread_range) * (thread_range - 1);
|
||||
// sum ( seq( [0, team_range) * constant ); constant = sum( seq( [0,
|
||||
// thread_range) )*team_size (1 per member, result for each team)
|
||||
}
|
||||
if (test_type == 221) {
|
||||
Kokkos::parallel_reduce("221 outer reduce", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"221 outer reduce", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
long idx = team.league_rank() * team.team_size() + team.team_rank();
|
||||
double team_result = 0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double& tval) {
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& tval) {
|
||||
double vector_for = 1.0;
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
Kokkos::parallel_for(Kokkos::ThreadVectorRange(team, vector_range), [&] (const int vi) {
|
||||
Kokkos::parallel_for(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi) {
|
||||
v3(idx, t, vi) = vi;
|
||||
// prevent compiler optimizing loop away
|
||||
});
|
||||
}
|
||||
tval += t + vector_for;
|
||||
},team_result);
|
||||
},
|
||||
team_result);
|
||||
}
|
||||
lval += team_result * team.league_rank();
|
||||
},result);
|
||||
result_expect = 0.5* (team_range)*(team_range-1) * team_size * (0.5*(thread_range) * (thread_range-1) + thread_range);
|
||||
// sum ( seq( [0, team_range) * constant ) + 1 per member per team; constant = sum( seq( [0, thread_range) )*team_size (1 per member, result for each team)
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range) * (team_range - 1) * team_size *
|
||||
(0.5 * (thread_range) * (thread_range - 1) + thread_range);
|
||||
// sum ( seq( [0, team_range) * constant ) + 1 per member per team;
|
||||
// constant = sum( seq( [0, thread_range) )*team_size (1 per member,
|
||||
// result for each team)
|
||||
}
|
||||
if (test_type == 222) {
|
||||
Kokkos::parallel_reduce("222 outer reduce", t_policy(team_range,team_size,vector_size),
|
||||
Kokkos::parallel_reduce(
|
||||
"222 outer reduce", t_policy(team_range, team_size, vector_size),
|
||||
KOKKOS_LAMBDA(const t_team& team, double& lval) {
|
||||
double team_result = 0.0;
|
||||
for (int tr = 0; tr < thread_repeat; tr++) {
|
||||
Kokkos::parallel_reduce(Kokkos::TeamThreadRange(team,thread_range), [&] (const int t, double& tval) {
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::TeamThreadRange(team, thread_range),
|
||||
[&](const int t, double& tval) {
|
||||
double vector_result = 0.0;
|
||||
for (int vr = 0; vr < inner_repeat; ++vr) {
|
||||
Kokkos::parallel_reduce(Kokkos::ThreadVectorRange(team, vector_range), [&] (const int vi, double& vval) {
|
||||
vval += vi;
|
||||
}, vector_result);
|
||||
Kokkos::parallel_reduce(
|
||||
Kokkos::ThreadVectorRange(team, vector_range),
|
||||
[&](const int vi, double& vval) { vval += vi; },
|
||||
vector_result);
|
||||
}
|
||||
tval += t + vector_result;
|
||||
},team_result);
|
||||
},
|
||||
team_result);
|
||||
}
|
||||
lval += team_result * team.league_rank();
|
||||
},result);
|
||||
result_expect = 0.5* (team_range)*(team_range-1) * team_size * (0.5*(thread_range) * (thread_range-1) + thread_range*0.5*(vector_range)*(vector_range-1));
|
||||
// sum ( seq( [0, team_range) * constant ) + 1 + sum( seq([0,vector_range) ) per member per team; constant = sum( seq( [0, thread_range) )*team_size (1 per member, result for each team)
|
||||
},
|
||||
result);
|
||||
result_expect =
|
||||
0.5 * (team_range) * (team_range - 1) * team_size *
|
||||
(0.5 * (thread_range) * (thread_range - 1) +
|
||||
thread_range * 0.5 * (vector_range) * (vector_range - 1));
|
||||
// sum ( seq( [0, team_range) * constant ) + 1 + sum( seq([0,vector_range)
|
||||
// ) per member per team; constant = sum( seq( [0, thread_range)
|
||||
// )*team_size (1 per member, result for each team)
|
||||
}
|
||||
|
||||
// parallel_for RangePolicy: range = team_size*team_range
|
||||
if (test_type == 300) {
|
||||
Kokkos::parallel_for("300 outer for", team_size*team_range,
|
||||
Kokkos::parallel_for(
|
||||
"300 outer for", team_size * team_range,
|
||||
KOKKOS_LAMBDA(const int idx) {
|
||||
v1(idx) = idx;
|
||||
// prevent compiler from optimizing away the loop
|
||||
@ -324,11 +403,11 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
}
|
||||
// parallel_reduce RangePolicy: range = team_size*team_range
|
||||
if (test_type == 400) {
|
||||
Kokkos::parallel_reduce("400 outer reduce", team_size*team_range,
|
||||
KOKKOS_LAMBDA (const int idx, double& val) {
|
||||
val += idx;
|
||||
}, result);
|
||||
result_expect = 0.5*(team_size*team_range)*(team_size*team_range-1);
|
||||
Kokkos::parallel_reduce(
|
||||
"400 outer reduce", team_size * team_range,
|
||||
KOKKOS_LAMBDA(const int idx, double& val) { val += idx; }, result);
|
||||
result_expect =
|
||||
0.5 * (team_size * team_range) * (team_size * team_range - 1);
|
||||
}
|
||||
// parallel_scan RangePolicy: range = team_size*team_range
|
||||
if (test_type == 500) {
|
||||
@ -345,8 +424,9 @@ void test_policy(int team_range, int thread_range, int vector_range,
|
||||
}
|
||||
#endif
|
||||
);
|
||||
// result = v1( team_size*team_range - 1 ); // won't work with Cuda - need to copy result back to host to print
|
||||
// result_expect = 0.5*(team_size*team_range)*(team_size*team_range-1);
|
||||
// result = v1( team_size*team_range - 1 ); // won't work with Cuda - need
|
||||
// to copy result back to host to print result_expect =
|
||||
// 0.5*(team_size*team_range)*(team_size*team_range-1);
|
||||
}
|
||||
|
||||
} // end outer for loop
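The `result_expect` expressions in `test_policy` are closed forms of sums over the flattened index `idx = league_rank * team_size + team_rank`. The host-only sketch below (plain C++, no Kokkos) compares two of them with brute-force loops. The function names are hypothetical, and the sketch assumes each team member runs the outer lambda exactly once, as the source comments ("1 per member") suggest.

```cpp
#include <cassert>

// Closed form used for result_expect in tests 200 and 400:
// sum of idx over [0, N) with N = team_range * team_size.
double expected_flat_sum(int team_range, int team_size) {
  assert(team_range >= 0 && team_size >= 0);
  const double n = static_cast<double>(team_range) * team_size;
  return 0.5 * n * (n - 1.0);
}

// Brute-force counterpart: every team member adds its flattened index.
double brute_force_flat_sum(int team_range, int team_size) {
  double sum = 0.0;
  for (int league_rank = 0; league_rank < team_range; ++league_rank)
    for (int team_rank = 0; team_rank < team_size; ++team_rank)
      sum += static_cast<double>(league_rank) * team_size + team_rank;
  return sum;
}

// Closed form used for result_expect in test 220.
double expected_220(int team_range, int team_size, int thread_range) {
  const double per_team = 0.5 * thread_range * (thread_range - 1.0);
  return 0.5 * team_range * (team_range - 1.0) * team_size * per_team;
}

// Brute force for test 220: the nested reduce yields sum(t) for
// t in [0, thread_range); each member then adds team_result * league_rank.
double brute_force_220(int team_range, int team_size, int thread_range) {
  double team_result = 0.0;
  for (int t = 0; t < thread_range; ++t) team_result += t;

  double sum = 0.0;
  for (int league_rank = 0; league_rank < team_range; ++league_rank)
    for (int team_rank = 0; team_rank < team_size; ++team_rank)
      sum += team_result * league_rank;
  return sum;
}
```

For small inputs the two versions agree exactly, since every intermediate value is an integer that `double` represents without rounding.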
|
||||
|
||||
@ -2,7 +2,7 @@
|
||||
|
||||
# Sample script for benchmarking policy performance
|
||||
|
||||
# Suggested environment variables to export prior to executing script:
|
||||
# Suggested enviroment variables to export prior to executing script:
|
||||
# KNL:
|
||||
# OMP_NUM_THREADS=256 KMP_AFFINITY=compact
|
||||
# Power:
|
||||
|
||||
@ -2,10 +2,11 @@
|
||||
//@HEADER
|
||||
// ************************************************************************
|
||||
//
|
||||
// Kokkos v. 2.0
|
||||
// Copyright (2014) Sandia Corporation
|
||||
// Kokkos v. 3.0
|
||||
// Copyright (2020) National Technology & Engineering
|
||||
// Solutions of Sandia, LLC (NTESS).
|
||||
//
|
||||
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
|
||||
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
// the U.S. Government retains certain rights in this software.
|
||||
//
|
||||
// Redistribution and use in source and binary forms, with or without
|
||||
@ -23,10 +24,10 @@
|
||||
// contributors may be used to endorse or promote products derived from
|
||||
// this software without specific prior written permission.
|
||||
//
|
||||
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
|
||||
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
|
||||
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
|
||||
@ -383,7 +383,7 @@ fi
|
||||
# Check unknown arguments
|
||||
################################################################################
|
||||
if [[ ${#UNKNOWN_ARGS[*]} > 0 ]]; then
|
||||
echo "HPCBIND Unknown options: ${UNKNOWN_ARGS[*]}" > >(tee -a ${HPCBIND_LOG})
|
||||
echo "HPCBIND Uknown options: ${UNKNOWN_ARGS[*]}" > >(tee -a ${HPCBIND_LOG})
|
||||
exit 1
|
||||
fi
|
||||
|
||||
|
||||
@ -85,11 +85,11 @@ first_xcompiler_arg=1
|
||||
|
||||
temp_dir=${TMPDIR:-/tmp}
|
||||
|
||||
# Check if we have an optimization argument already
|
||||
optimization_applied=0
|
||||
# optimization flag added as a command-line argument
|
||||
optimization_flag=""
|
||||
|
||||
# Check if we have -std=c++X or --std=c++X already
|
||||
stdcxx_applied=0
|
||||
# std standard flag added as a command-line argument
|
||||
std_flag=""
|
||||
|
||||
# Run nvcc a second time to generate dependencies if needed
|
||||
depfile_separate=0
|
||||
@ -99,6 +99,10 @@ depfile_target_arg=""
|
||||
# Option to remove duplicate libraries and object files
|
||||
remove_duplicate_link_files=0
|
||||
|
||||
function warn_std_flag() {
|
||||
echo "nvcc_wrapper - *warning* you have set multiple standard flags (-std=c++1* or --std=c++1*), only the last is used because nvcc can only accept a single std setting"
|
||||
}
|
||||
|
||||
#echo "Arguments: $# $@"
|
||||
|
||||
while [ $# -gt 0 ]
|
||||
@ -130,12 +134,16 @@ do
|
||||
;;
|
||||
# Ensure we only have one optimization flag because NVCC doesn't allow muliple
|
||||
-O*)
|
||||
if [ $optimization_applied -eq 1 ]; then
|
||||
echo "nvcc_wrapper - *warning* you have set multiple optimization flags (-O*), only the first is used because nvcc can only accept a single optimization setting."
|
||||
else
|
||||
shared_args="$shared_args $1"
|
||||
optimization_applied=1
|
||||
if [ -n "$optimization_flag" ]; then
|
||||
echo "nvcc_wrapper - *warning* you have set multiple optimization flags (-O*), only the last is used because nvcc can only accept a single optimization setting."
|
||||
shared_args=${shared_args/ $optimization_flag/}
|
||||
fi
|
||||
if [ "$1" = "-O" ]; then
|
||||
optimization_flag="-O2"
|
||||
else
|
||||
optimization_flag=$1
|
||||
fi
|
||||
shared_args="$shared_args $optimization_flag"
|
||||
;;
|
||||
#Handle shared args (valid for both nvcc and the host compiler)
|
||||
-D*)
|
||||
@ -171,7 +179,7 @@ do
|
||||
shift
|
||||
;;
|
||||
#Handle known nvcc args
|
||||
--dryrun|--verbose|--keep|--keep-dir*|-G|--relocatable-device-code*|-lineinfo|-expt-extended-lambda|--resource-usage|-Xptxas*)
|
||||
--dryrun|--verbose|--keep|--keep-dir*|-G|--relocatable-device-code*|-lineinfo|-expt-extended-lambda|--resource-usage|-Xptxas*|--fmad*)
|
||||
cuda_args="$cuda_args $1"
|
||||
;;
|
||||
#Handle more known nvcc args
|
||||
@ -179,21 +187,43 @@ do
|
||||
cuda_args="$cuda_args $1"
|
||||
;;
|
||||
#Handle known nvcc args that have an argument
|
||||
-rdc|-maxrregcount|--default-stream)
|
||||
-rdc|-maxrregcount|--default-stream|-Xnvlink|--fmad)
|
||||
cuda_args="$cuda_args $1 $2"
|
||||
shift
|
||||
;;
|
||||
-rdc=*|-maxrregcount*|--maxrregcount*)
|
||||
cuda_args="$cuda_args $1"
|
||||
;;
|
||||
#Handle c++11
|
||||
--std=c++11|-std=c++11|--std=c++14|-std=c++14|--std=c++1y|-std=c++1y|--std=c++17|-std=c++17|--std=c++1z|-std=c++1z)
|
||||
if [ $stdcxx_applied -eq 1 ]; then
|
||||
echo "nvcc_wrapper - *warning* you have set multiple optimization flags (-std=c++1* or --std=c++1*), only the first is used because nvcc can only accept a single std setting"
|
||||
else
|
||||
shared_args="$shared_args $1"
|
||||
stdcxx_applied=1
|
||||
#Handle unsupported standard flags
|
||||
--std=c++1y|-std=c++1y|--std=c++1z|-std=c++1z|--std=gnu++1y|-std=gnu++1y|--std=gnu++1z|-std=gnu++1z|--std=c++2a|-std=c++2a|--std=c++17|-std=c++17)
|
||||
fallback_std_flag="-std=c++14"
|
||||
# this is hopefully just occurring in a downstream project during CMake feature tests
|
||||
# we really have no choice here but to accept the flag and change to an accepted C++ standard
|
||||
echo "nvcc_wrapper does not accept standard flags $1 since partial standard flags and standards after C++14 are not supported. nvcc_wrapper will use $fallback_std_flag instead. It is undefined behavior to use this flag. This should only be occurring during CMake configuration."
|
||||
if [ -n "$std_flag" ]; then
|
||||
warn_std_flag
|
||||
shared_args=${shared_args/ $std_flag/}
|
||||
fi
|
||||
std_flag=$fallback_std_flag
|
||||
shared_args="$shared_args $std_flag"
|
||||
;;
|
||||
-std=gnu*)
|
||||
corrected_std_flag=${1/gnu/c}
|
||||
echo "nvcc_wrapper has been given GNU extension standard flag $1 - reverting flag to $corrected_std_flag"
|
||||
if [ -n "$std_flag" ]; then
|
||||
warn_std_flag
|
||||
shared_args=${shared_args/ $std_flag/}
|
||||
fi
|
||||
std_flag=$corrected_std_flag
|
||||
shared_args="$shared_args $std_flag"
|
||||
;;
|
||||
--std=c++11|-std=c++11|--std=c++14|-std=c++14)
|
||||
if [ -n "$std_flag" ]; then
|
||||
warn_std_flag
|
||||
shared_args=${shared_args/ $std_flag/}
|
||||
fi
|
||||
std_flag=$1
|
||||
shared_args="$shared_args $std_flag"
|
||||
;;
|
||||
|
||||
#strip off -std=c++98 due to nvcc warnings; TriBITS will place both -std=c++11 and -std=c++98
|
||||
@ -308,16 +338,6 @@ do
|
||||
shift
|
||||
done
|
||||
|
||||
#Check if nvcc exists
|
||||
if [ $host_only -ne 1 ]; then
|
||||
var=$(which nvcc )
|
||||
if [ $? -gt 0 ]; then
|
||||
echo "Could not find nvcc in PATH"
|
||||
exit $?
|
||||
fi
|
||||
fi
|
||||
|
||||
|
||||
# Only print host compiler version
|
||||
if [ $get_host_version -eq 1 ]; then
|
||||
$host_compiler --version
|
||||
@ -372,6 +392,9 @@ if [ $first_xcompiler_arg -eq 0 ]; then
|
||||
nvcc_command="$nvcc_command -Xcompiler $xcompiler_args"
|
||||
fi
|
||||
|
||||
#Replace all commas in xcompiler_args with a space for the host only command
|
||||
xcompiler_args=${xcompiler_args//,/" "}
|
||||
|
||||
#Compose host only command
|
||||
host_command="$host_compiler $shared_args $host_only_args $compile_arg $output_arg $xcompiler_args $host_linker_args $shared_versioned_libraries_host"
|
||||
|
||||
|
||||
339
lib/kokkos/cm_generate_makefile.bash
Executable file
@ -0,0 +1,339 @@
|
||||
#!/bin/bash
|
||||
|
||||
update_kokkos_devices() {
|
||||
SEARCH_TEXT="*$1*"
|
||||
if [[ $KOKKOS_DEVICES == $SEARCH_TEXT ]]; then
|
||||
echo kokkos devices already includes $SEARCH_TEXT
|
||||
else
|
||||
if [ "$KOKKOS_DEVICES" = "" ]; then
|
||||
KOKKOS_DEVICES="$1"
|
||||
echo resetting kokkos devices to $KOKKOS_DEVICES
|
||||
else
|
||||
KOKKOS_DEVICES="${KOKKOS_DEVICES},$1"
|
||||
echo appending to kokkos devices $KOKKOS_DEVICES
|
||||
fi
|
||||
fi
|
||||
}
|
||||
|
||||
get_kokkos_device_list() {
|
||||
KOKKOS_DEVICE_CMD=
|
||||
PARSE_DEVICES_LST=$(echo $KOKKOS_DEVICES | tr "," "\n")
|
||||
for DEVICE_ in $PARSE_DEVICES_LST
|
||||
do
|
||||
UC_DEVICE=$(echo $DEVICE_ | tr "[:lower:]" "[:upper:]")
|
||||
KOKKOS_DEVICE_CMD="-DKokkos_ENABLE_${UC_DEVICE}=ON ${KOKKOS_DEVICE_CMD}"
|
||||
done
|
||||
}
|
||||
|
||||
get_kokkos_arch_list() {
|
||||
KOKKOS_ARCH_CMD=
|
||||
PARSE_ARCH_LST=$(echo $KOKKOS_ARCH | tr "," "\n")
|
||||
for ARCH_ in $PARSE_ARCH_LST
|
||||
do
|
||||
UC_ARCH=$(echo $ARCH_ | tr "[:lower:]" "[:upper:]")
|
||||
KOKKOS_ARCH_CMD="-DKokkos_ARCH_${UC_ARCH}=ON ${KOKKOS_ARCH_CMD}"
|
||||
done
|
||||
}
|
||||
|
||||
get_kokkos_cuda_option_list() {
|
||||
echo parsing KOKKOS_CUDA_OPTIONS=$KOKKOS_CUDA_OPTIONS
|
||||
KOKKOS_CUDA_OPTION_CMD=
|
||||
PARSE_CUDA_LST=$(echo $KOKKOS_CUDA_OPTIONS | tr "," "\n")
|
||||
for CUDA_ in $PARSE_CUDA_LST
|
||||
do
|
||||
CUDA_OPT_NAME=
|
||||
if [ "${CUDA_}" == "enable_lambda" ]; then
|
||||
CUDA_OPT_NAME=CUDA_LAMBDA
|
||||
elif [ "${CUDA_}" == "rdc" ]; then
|
||||
CUDA_OPT_NAME=CUDA_RELOCATABLE_DEVICE_CODE
|
||||
elif [ "${CUDA_}" == "force_uvm" ]; then
|
||||
CUDA_OPT_NAME=CUDA_UVM
|
||||
elif [ "${CUDA_}" == "use_ldg" ]; then
|
||||
CUDA_OPT_NAME=CUDA_LDG_INTRINSIC
|
||||
else
|
||||
echo "${CUDA_} is not a valid CUDA option..."
|
||||
fi
|
||||
if [ "${CUDA_OPT_NAME}" != "" ]; then
|
||||
KOKKOS_CUDA_OPTION_CMD="-DKokkos_ENABLE_${CUDA_OPT_NAME}=ON ${KOKKOS_CUDA_OPTION_CMD}"
|
||||
fi
|
||||
done
|
||||
}
|
||||
|
||||
get_kokkos_option_list() {
|
||||
echo parsing KOKKOS_OPTIONS=$KOKKOS_OPTIONS
|
||||
KOKKOS_OPTION_CMD=
|
||||
PARSE_OPTIONS_LST=$(echo $KOKKOS_OPTIONS | tr "," "\n")
|
||||
for OPT_ in $PARSE_OPTIONS_LST
|
||||
do
|
||||
UC_OPT_=$(echo $OPT_ | tr "[:lower:]" "[:upper:]")
|
||||
if [[ "$UC_OPT_" == *DISABLE* ]]; then
|
||||
FLIP_OPT_=${UC_OPT_/DISABLE/ENABLE}
|
||||
KOKKOS_OPTION_CMD="-DKokkos_${FLIP_OPT_}=OFF ${KOKKOS_OPTION_CMD}"
|
||||
elif [[ "$UC_OPT_" == *ENABLE* ]]; then
|
||||
KOKKOS_OPTION_CMD="-DKokkos_${UC_OPT_}=ON ${KOKKOS_OPTION_CMD}"
|
||||
else
|
||||
KOKKOS_OPTION_CMD="-DKokkos_ENABLE_${UC_OPT_}=ON ${KOKKOS_OPTION_CMD}"
|
||||
fi
|
||||
done
|
||||
}
|
||||
|
||||
display_help_text() {
|
||||
|
||||
echo "Kokkos configure options:"
|
||||
echo ""
|
||||
echo "--kokkos-path=/Path/To/Kokkos: Path to the Kokkos root directory."
|
||||
echo "--prefix=/Install/Path: Path to install the Kokkos library."
|
||||
echo ""
|
||||
echo "--with-cuda[=/Path/To/Cuda]: Enable Cuda and set path to Cuda Toolkit."
|
||||
echo "--with-openmp: Enable OpenMP backend."
|
||||
echo "--with-pthread: Enable Pthreads backend."
|
||||
echo "--with-serial: Enable Serial backend."
|
||||
echo "--with-devices: Explicitly add a set of backends."
|
||||
echo ""
|
||||
echo "--arch=[OPT]: Set target architectures. Options are:"
|
||||
echo " [AMD]"
|
||||
echo " AMDAVX = AMD CPU"
|
||||
echo " EPYC = AMD EPYC Zen-Core CPU"
|
||||
echo " [ARM]"
|
||||
echo " ARMv80 = ARMv8.0 Compatible CPU"
|
||||
echo " ARMv81 = ARMv8.1 Compatible CPU"
|
||||
echo " ARMv8-ThunderX = ARMv8 Cavium ThunderX CPU"
|
||||
echo " ARMv8-TX2 = ARMv8 Cavium ThunderX2 CPU"
|
||||
echo " [IBM]"
|
||||
echo " BGQ = IBM Blue Gene Q"
|
||||
echo " Power7 = IBM POWER7 and POWER7+ CPUs"
|
||||
echo " Power8 = IBM POWER8 CPUs"
|
||||
echo " Power9 = IBM POWER9 CPUs"
|
||||
echo " [Intel]"
|
||||
echo " WSM = Intel Westmere CPUs"
|
||||
echo " SNB = Intel Sandy/Ivy Bridge CPUs"
|
||||
echo " HSW = Intel Haswell CPUs"
|
||||
echo " BDW = Intel Broadwell Xeon E-class CPUs"
|
||||
echo " SKX = Intel Sky Lake Xeon E-class HPC CPUs (AVX512)"
|
||||
echo " [Intel Xeon Phi]"
|
||||
echo " KNC = Intel Knights Corner Xeon Phi"
|
||||
echo " KNL = Intel Knights Landing Xeon Phi"
|
||||
echo " [NVIDIA]"
|
||||
echo " Kepler30 = NVIDIA Kepler generation CC 3.0"
|
||||
echo " Kepler32 = NVIDIA Kepler generation CC 3.2"
|
||||
echo " Kepler35 = NVIDIA Kepler generation CC 3.5"
|
||||
echo " Kepler37 = NVIDIA Kepler generation CC 3.7"
|
||||
echo " Maxwell50 = NVIDIA Maxwell generation CC 5.0"
|
||||
echo " Maxwell52 = NVIDIA Maxwell generation CC 5.2"
|
||||
echo " Maxwell53 = NVIDIA Maxwell generation CC 5.3"
|
||||
echo " Pascal60 = NVIDIA Pascal generation CC 6.0"
|
||||
echo " Pascal61 = NVIDIA Pascal generation CC 6.1"
|
||||
echo " Volta70 = NVIDIA Volta generation CC 7.0"
|
||||
echo " Volta72 = NVIDIA Volta generation CC 7.2"
|
||||
echo ""
|
||||
echo "--compiler=/Path/To/Compiler Set the compiler."
|
||||
echo "--debug,-dbg: Enable Debugging."
|
||||
echo "--cxxflags=[FLAGS] Overwrite CXXFLAGS for library build and test"
|
||||
echo " build. This will still set certain required"
|
||||
echo " flags via KOKKOS_CXXFLAGS (such as -fopenmp,"
|
||||
echo " --std=c++11, etc.)."
|
||||
echo "--cxxstandard=[FLAGS] Overwrite KOKKOS_CXX_STANDARD for library build and test"
|
||||
echo " c++11 (default), c++14, c++17, c++1y, c++1z, c++2a"
|
||||
echo "--ldflags=[FLAGS] Overwrite LDFLAGS for library build and test"
|
||||
echo " build. This will still set certain required"
|
||||
echo " flags via KOKKOS_LDFLAGS (such as -fopenmp,"
|
||||
echo " -lpthread, etc.)."
|
||||
echo "--with-gtest=/Path/To/Gtest: Set path to gtest. (Used in unit and performance"
|
||||
echo " tests.)"
|
||||
echo "--with-hwloc=/Path/To/Hwloc: Set path to hwloc library."
|
||||
echo "--with-memkind=/Path/To/MemKind: Set path to memkind library."
|
||||
echo "--with-options=[OPT]: Additional options to Kokkos:"
|
||||
echo " compiler_warnings"
|
||||
echo " aggressive_vectorization = add ivdep on loops"
|
||||
echo " disable_profiling = do not compile with profiling hooks"
|
||||
echo " "
|
||||
echo "--with-cuda-options=[OPT]: Additional options to CUDA:"
|
||||
echo " force_uvm, use_ldg, enable_lambda, rdc"
|
||||
echo "--with-hpx-options=[OPT]: Additional options to HPX:"
|
||||
echo " enable_async_dispatch"
|
||||
echo "--gcc-toolchain=/Path/To/GccRoot: Set the gcc toolchain to use with clang (e.g. /usr)"
|
||||
echo "--make-j=[NUM]: DEPRECATED: call make with appropriate"
|
||||
echo " -j flag"
|
||||
|
||||
}
|
||||
|
||||
while [[ $# > 0 ]]
|
||||
do
|
||||
key="$1"
|
||||
|
||||
case $key in
|
||||
--kokkos-path*)
|
||||
KOKKOS_PATH="${key#*=}"
|
||||
;;
|
||||
--hpx-path*)
|
||||
HPX_PATH="${key#*=}"
|
||||
;;
|
||||
--prefix*)
|
||||
PREFIX="${key#*=}"
|
||||
;;
|
||||
--with-cuda)
|
||||
update_kokkos_devices Cuda
|
||||
CUDA_PATH_NVCC=$(command -v nvcc)
|
||||
CUDA_PATH=${CUDA_PATH_NVCC%/bin/nvcc}
|
||||
;;
|
||||
# Catch this before '--with-cuda*'
|
||||
--with-cuda-options*)
|
||||
KOKKOS_CUDA_OPTIONS="${key#*=}"
|
||||
;;
|
||||
--with-cuda*)
|
||||
update_kokkos_devices Cuda
|
||||
CUDA_PATH="${key#*=}"
|
||||
;;
|
||||
--with-openmp)
|
||||
update_kokkos_devices OpenMP
|
||||
;;
|
||||
--with-pthread)
|
||||
update_kokkos_devices Pthread
|
||||
;;
|
||||
--with-serial)
|
||||
update_kokkos_devices Serial
|
||||
;;
|
||||
--with-hpx-options*)
|
||||
KOKKOS_HPX_OPT="${key#*=}"
|
||||
;;
|
||||
--with-hpx*)
|
||||
update_kokkos_devices HPX
|
||||
if [ -z "$HPX_PATH" ]; then
|
||||
HPX_PATH="${key#*=}"
|
||||
fi
|
||||
;;
|
||||
--with-devices*)
|
||||
DEVICES="${key#*=}"
|
||||
PARSE_DEVICES=$(echo $DEVICES | tr "," "\n")
|
||||
for DEVICE_ in $PARSE_DEVICES
|
||||
do
|
||||
update_kokkos_devices $DEVICE_
|
||||
done
|
||||
;;
|
||||
--with-gtest*)
|
||||
GTEST_PATH="${key#*=}"
|
||||
;;
|
||||
--with-hwloc*)
|
||||
HWLOC_PATH="${key#*=}"
|
||||
;;
|
||||
--with-memkind*)
|
||||
MEMKIND_PATH="${key#*=}"
|
||||
;;
|
||||
--arch*)
|
||||
KOKKOS_ARCH="${key#*=}"
|
||||
;;
|
||||
--cxxflags*)
|
||||
KOKKOS_CXXFLAGS="${key#*=}"
|
||||
KOKKOS_CXXFLAGS=${KOKKOS_CXXFLAGS//,/ }
|
||||
;;
|
||||
--cxxstandard*)
|
||||
KOKKOS_CXX_STANDARD="${key#*=}"
|
||||
;;
|
||||
--ldflags*)
|
||||
KOKKOS_LDFLAGS="${key#*=}"
|
||||
;;
|
||||
--debug|-dbg)
|
||||
KOKKOS_DEBUG=yes
|
||||
;;
|
||||
--make-j*)
|
||||
echo "Warning: ${key} is deprecated"
|
||||
echo "Call make with appropriate -j flag"
|
||||
;;
|
||||
--compiler*)
|
||||
COMPILER="${key#*=}"
|
||||
CNUM=$(command -v ${COMPILER} 2>&1 >/dev/null | grep "no ${COMPILER}" | wc -l)
|
||||
if [ ${CNUM} -gt 0 ]; then
|
||||
echo "Invalid compiler by --compiler command: '${COMPILER}'"
|
||||
exit
|
||||
fi
|
||||
if [[ ! -n ${COMPILER} ]]; then
|
||||
echo "Empty compiler specified by --compiler command."
|
||||
exit
|
||||
fi
|
||||
CNUM=$(command -v ${COMPILER} | grep ${COMPILER} | wc -l)
|
||||
if [ ${CNUM} -eq 0 ]; then
|
||||
echo "Invalid compiler by --compiler command: '${COMPILER}'"
|
||||
exit
|
||||
fi
|
||||
# ... valid compiler, ensure absolute path set
|
||||
WCOMPATH=$(command -v $COMPILER)
|
||||
COMPDIR=$(dirname $WCOMPATH)
|
||||
COMPNAME=$(basename $WCOMPATH)
|
||||
COMPILER=${COMPDIR}/${COMPNAME}
|
||||
;;
|
||||
--with-options*)
|
||||
KOKKOS_OPTIONS="${key#*=}"
|
||||
;;
|
||||
--gcc-toolchain*)
|
||||
KOKKOS_GCC_TOOLCHAIN="${key#*=}"
|
||||
;;
|
||||
--help)
|
||||
display_help_text
|
||||
exit 0
|
||||
;;
|
||||
*)
|
||||
echo "warning: ignoring unknown option $key"
|
||||
;;
|
||||
esac
|
||||
|
||||
shift
|
||||
done
|
||||
|
||||
|
||||
if [ "$COMPILER" == "" ]; then
|
||||
COMPILER_CMD=
|
||||
else
|
||||
COMPILER_CMD=-DCMAKE_CXX_COMPILER=$COMPILER
|
||||
fi
|
||||
|
||||
if [ "$KOKKOS_DEBUG" == "" ]; then
|
||||
KOKKOS_DEBUG_CMD=-DCMAKE_BUILD_TYPE=RELEASE
|
||||
else
|
||||
KOKKOS_DEBUG_CMD=-DCMAKE_BUILD_TYPE=DEBUG
|
||||
fi
|
||||
|
||||
if [ ! -e ${KOKKOS_PATH}/CMakeLists.txt ]; then
|
||||
if [ "${KOKKOS_PATH}" == "" ]; then
|
||||
CM_SCRIPT=$0
|
||||
KOKKOS_PATH=`dirname $CM_SCRIPT`
|
||||
if [ ! -e ${KOKKOS_PATH}/CMakeLists.txt ]; then
|
||||
echo "${KOKKOS_PATH} repository appears to not be complete. please verify and try again"
|
||||
exit 0
|
||||
fi
|
||||
else
|
||||
echo "KOKKOS_PATH does not appear to be set properly. Please specify the directory that contains CMakeLists.txt."
|
||||
display_help_text
|
||||
exit 0
|
||||
fi
|
||||
fi
|
||||
|
||||
get_kokkos_device_list
|
||||
get_kokkos_option_list
|
||||
get_kokkos_arch_list
|
||||
get_kokkos_cuda_option_list
|
||||
|
||||
## if HPX is enabled, we need to enforce cxx standard = 14
|
||||
if [[ ${KOKKOS_DEVICE_CMD} == *Kokkos_ENABLE_HPX* ]]; then
|
||||
if [ "${KOKKOS_CXX_STANDARD}" == "" ] || [ "${KOKKOS_CXX_STANDARD}" -lt 14 ]; then
|
||||
echo CXX Standard must be 14 or higher for HPX to work.
|
||||
KOKKOS_CXX_STANDARD=14
|
||||
fi
|
||||
fi
|
||||
|
||||
if [ "$KOKKOS_CXX_STANDARD" == "" ]; then
|
||||
STANDARD_CMD=
|
||||
else
|
||||
STANDARD_CMD=-DKokkos_CXX_STANDARD=${KOKKOS_CXX_STANDARD}
|
||||
fi
|
||||
|
||||
if [[ ${COMPILER} == *clang* ]]; then
|
||||
gcc_path=$(which g++ | awk --field-separator='/bin/g++' '{printf $1}' )
|
||||
KOKKOS_CXXFLAGS="${KOKKOS_CXXFLAGS} --gcc-toolchain=${gcc_path}"
|
||||
|
||||
if [ ! "${CUDA_PATH}" == "" ]; then
|
||||
KOKKOS_CXXFLAGS="${KOKKOS_CXXFLAGS} --cuda-path=${CUDA_PATH}"
|
||||
fi
|
||||
fi
|
||||
|
||||
echo cmake $COMPILER_CMD -DCMAKE_CXX_FLAGS="${KOKKOS_CXXFLAGS}" -DCMAKE_EXE_LINKER_FLAGS="${KOKKOS_LDFLAGS}" -DCMAKE_INSTALL_PREFIX=${PREFIX} ${KOKKOS_DEVICE_CMD} ${KOKKOS_ARCH_CMD} -DKokkos_ENABLE_TESTS=ON ${KOKKOS_OPTION_CMD} ${KOKKOS_CUDA_OPTION_CMD} -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_CXX_EXTENSIONS=OFF ${STANDARD_CMD} ${KOKKOS_DEBUG_CMD} ${KOKKOS_PATH}
|
||||
cmake $COMPILER_CMD -DCMAKE_CXX_FLAGS="${KOKKOS_CXXFLAGS//\"}" -DCMAKE_EXE_LINKER_FLAGS="${KOKKOS_LDFLAGS//\"}" -DCMAKE_INSTALL_PREFIX=${PREFIX} ${KOKKOS_DEVICE_CMD} ${KOKKOS_ARCH_CMD} -DKokkos_ENABLE_TESTS=ON ${KOKKOS_OPTION_CMD} ${KOKKOS_CUDA_OPTION_CMD} -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_CXX_EXTENSIONS=OFF ${STANDARD_CMD} ${KOKKOS_DEBUG_CMD} ${KOKKOS_PATH}
|
||||
@ -1,18 +1,14 @@
|
||||
# - Config file for the Kokkos package
|
||||
# It defines the following variables
|
||||
# Kokkos_INCLUDE_DIRS - include directories for Kokkos
|
||||
# Kokkos_LIBRARIES - libraries to link against
|
||||
|
||||
# Compute paths
|
||||
@PACKAGE_INIT@
|
||||
|
||||
#Find dependencies
|
||||
INCLUDE(CMakeFindDependencyMacro)
|
||||
|
||||
#This needs to go above the KokkosTargets in case
|
||||
#the Kokkos targets depend in some way on the TPL imports
|
||||
@KOKKOS_TPL_EXPORTS@
|
||||
|
||||
GET_FILENAME_COMPONENT(Kokkos_CMAKE_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
|
||||
SET(Kokkos_INCLUDE_DIRS "@CONF_INCLUDE_DIRS@")
|
||||
|
||||
# Our library dependencies (contains definitions for IMPORTED targets)
|
||||
IF(NOT TARGET kokkos AND NOT Kokkos_BINARY_DIR)
|
||||
INCLUDE("${Kokkos_CMAKE_DIR}/KokkosTargets.cmake")
|
||||
ENDIF()
|
||||
|
||||
# These are IMPORTED targets created by KokkosTargets.cmake
|
||||
SET(Kokkos_LIBRARY_DIRS @INSTALL_LIB_DIR@)
|
||||
SET(Kokkos_LIBRARIES @Kokkos_LIBRARIES_NAMES@)
|
||||
SET(Kokkos_TPL_LIBRARIES @KOKKOS_LIBS@)
|
||||
INCLUDE("${Kokkos_CMAKE_DIR}/KokkosConfigCommon.cmake")
|
||||
UNSET(Kokkos_CMAKE_DIR)
|
||||
|
||||
87
lib/kokkos/cmake/KokkosConfigCommon.cmake.in
Normal file
@ -0,0 +1,87 @@
|
||||
SET(Kokkos_DEVICES @KOKKOS_ENABLED_DEVICES@)
|
||||
SET(Kokkos_OPTIONS @KOKKOS_ENABLED_OPTIONS@)
|
||||
SET(Kokkos_TPLS @KOKKOS_ENABLED_TPLS@)
|
||||
SET(Kokkos_ARCH @KOKKOS_ENABLED_ARCH_LIST@)
|
||||
|
||||
# These are needed by KokkosKernels
|
||||
FOREACH(DEV ${Kokkos_DEVICES})
|
||||
SET(Kokkos_ENABLE_${DEV} ON)
|
||||
ENDFOREACH()
|
||||
|
||||
IF(NOT Kokkos_FIND_QUIETLY)
|
||||
MESSAGE(STATUS "Enabled Kokkos devices: ${Kokkos_DEVICES}")
|
||||
ENDIF()
|
||||
|
||||
IF (Kokkos_ENABLE_CUDA AND ${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.14.0")
|
||||
#If we are building CUDA, we have tricked CMake because we declare a CXX project
|
||||
#If the default C++ standard for a given compiler matches the requested
|
||||
#standard, then CMake just omits the -std flag in later versions of CMake
|
||||
#This breaks CUDA compilation (CUDA compiler can have a different default
|
||||
#-std than the underlying host compiler by itself). Setting this variable
|
||||
#forces CMake to always add the -std flag even if it thinks it doesn't need it
|
||||
SET(CMAKE_CXX_STANDARD_DEFAULT 98 CACHE INTERNAL "" FORCE)
|
||||
ENDIF()
|
||||
|
||||
SET(KOKKOS_USE_CXX_EXTENSIONS @KOKKOS_USE_CXX_EXTENSIONS@)
|
||||
IF (NOT DEFINED CMAKE_CXX_EXTENSIONS OR CMAKE_CXX_EXTENSIONS)
|
||||
IF (NOT KOKKOS_USE_CXX_EXTENSIONS)
|
||||
MESSAGE(WARNING "The installed Kokkos configuration does not support CXX extensions. Forcing -DCMAKE_CXX_EXTENSIONS=Off")
|
||||
SET(CMAKE_CXX_EXTENSIONS OFF CACHE BOOL "" FORCE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
|
||||
# This function makes sure that Kokkos was built with the requested backends
|
||||
# and target architectures and generates a fatal error if it was not.
|
||||
#
|
||||
# kokkos_check(
|
||||
# [DEVICES <devices>...] # Set of backends (e.g. "OpenMP" and/or "Cuda")
|
||||
# [ARCH <archs>...] # Target architectures (e.g. "Power9" and/or "Volta70")
|
||||
# [OPTIONS <options>...] # Optional settings (e.g. "PROFILING")
|
||||
# [TPLS <tpls>...] # Third party libraries
|
||||
# [RETURN_VALUE <result>] # Set a variable that indicates the result of the
|
||||
# # check instead of a fatal error
|
||||
# )
|
||||
function(kokkos_check)
|
||||
set(ALLOWED_ARGS DEVICES ARCH OPTIONS TPLS)
|
||||
cmake_parse_arguments(KOKKOS_CHECK "" "RETURN_VALUE" "${ALLOWED_ARGS}" ${ARGN})
|
||||
foreach(_arg ${KOKKOS_CHECK_UNPARSED_ARGUMENTS})
|
||||
message(SEND_ERROR "Argument '${_arg}' passed to kokkos_check() was not recognized")
|
||||
endforeach()
|
||||
# Get the list of keywords that were actually passed to the function.
|
||||
set(REQUESTED_ARGS)
|
||||
foreach(arg ${ALLOWED_ARGS})
|
||||
if(KOKKOS_CHECK_${arg})
|
||||
list(APPEND REQUESTED_ARGS ${arg})
|
||||
endif()
|
||||
endforeach()
|
||||
set(KOKKOS_CHECK_SUCCESS TRUE)
|
||||
foreach(arg ${REQUESTED_ARGS})
|
||||
# Define variables named after the required arguments that are provided by
|
||||
# the Kokkos install.
|
||||
foreach(requested ${KOKKOS_CHECK_${arg}})
|
||||
foreach(provided ${Kokkos_${arg}})
|
||||
STRING(TOUPPER ${requested} REQUESTED_UC)
|
||||
STRING(TOUPPER ${provided} PROVIDED_UC)
|
||||
if(PROVIDED_UC STREQUAL REQUESTED_UC)
|
||||
string(REPLACE ";" " " ${requested} "${KOKKOS_CHECK_${arg}}")
|
||||
endif()
|
||||
endforeach()
|
||||
endforeach()
|
||||
# Somewhat divert the CMake function below from its original purpose and
|
||||
# use it to check that there are variables defined for all required
|
||||
# arguments. Success or failure messages will be displayed but we are
|
||||
# responsible for signaling failure and skipping the build system generation.
|
||||
find_package_handle_standard_args("Kokkos_${arg}" DEFAULT_MSG
|
||||
${KOKKOS_CHECK_${arg}})
|
||||
if(NOT Kokkos_${arg}_FOUND)
|
||||
set(KOKKOS_CHECK_SUCCESS FALSE)
|
||||
endif()
|
||||
endforeach()
|
||||
if(NOT KOKKOS_CHECK_SUCCESS AND NOT KOKKOS_CHECK_RETURN_VALUE)
|
||||
message(FATAL_ERROR "Kokkos does NOT provide all backends and/or architectures requested")
|
||||
else()
|
||||
set(${KOKKOS_CHECK_RETURN_VALUE} ${KOKKOS_CHECK_SUCCESS} PARENT_SCOPE)
|
||||
endif()
|
||||
endfunction()
|
||||
89
lib/kokkos/cmake/KokkosCore_config.h.in
Normal file
@ -0,0 +1,89 @@
|
||||
|
||||
#if !defined(KOKKOS_MACROS_HPP) || defined(KOKKOS_CORE_CONFIG_H)
|
||||
#error "Do not include KokkosCore_config.h directly; include Kokkos_Macros.hpp instead."
|
||||
#else
|
||||
#define KOKKOS_CORE_CONFIG_H
|
||||
#endif
|
||||
|
||||
/* Execution Spaces */
|
||||
#cmakedefine KOKKOS_ENABLE_SERIAL
|
||||
#cmakedefine KOKKOS_ENABLE_OPENMP
|
||||
#cmakedefine KOKKOS_ENABLE_THREADS
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA
|
||||
#cmakedefine KOKKOS_ENABLE_HPX
|
||||
#cmakedefine KOKKOS_ENABLE_MEMKIND
|
||||
#cmakedefine KOKKOS_ENABLE_LIBRT
|
||||
|
||||
#ifndef __CUDA_ARCH__
|
||||
#cmakedefine KOKKOS_ENABLE_TM
|
||||
#cmakedefine KOKKOS_USE_ISA_X86_64
|
||||
#cmakedefine KOKKOS_USE_ISA_KNC
|
||||
#cmakedefine KOKKOS_USE_ISA_POWERPCLE
|
||||
#cmakedefine KOKKOS_USE_ISA_POWERPCBE
|
||||
#endif
|
||||
|
||||
/* General Settings */
|
||||
#cmakedefine KOKKOS_ENABLE_CXX11
|
||||
#cmakedefine KOKKOS_ENABLE_CXX14
|
||||
#cmakedefine KOKKOS_ENABLE_CXX17
|
||||
#cmakedefine KOKKOS_ENABLE_CXX20
|
||||
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA_UVM
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA_LAMBDA
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA_CONSTEXPR
|
||||
#cmakedefine KOKKOS_ENABLE_CUDA_LDG_INTRINSIC
|
||||
#cmakedefine KOKKOS_ENABLE_HPX_ASYNC_DISPATCH
|
||||
#cmakedefine KOKKOS_ENABLE_DEBUG
|
||||
#cmakedefine KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK
|
||||
#cmakedefine KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK
|
||||
#cmakedefine KOKKOS_ENABLE_COMPILER_WARNINGS
|
||||
#cmakedefine KOKKOS_ENABLE_PROFILING
|
||||
#cmakedefine KOKKOS_ENABLE_PROFILING_LOAD_PRINT
|
||||
#cmakedefine KOKKOS_ENABLE_DEPRECATED_CODE
|
||||
#cmakedefine KOKKOS_ENABLE_ETI
|
||||
#cmakedefine KOKKOS_ENABLE_LARGE_MEM_TESTS
|
||||
#cmakedefine KOKKOS_ENABLE_DUALVIEW_MODIFY_CHECK
|
||||
#cmakedefine KOKKOS_ENABLE_COMPLEX_ALIGN
|
||||
#cmakedefine KOKKOS_OPT_RANGE_AGGRESSIVE_VECTORIZATION
|
||||
|
||||
/* TPL Settings */
|
||||
#cmakedefine KOKKOS_ENABLE_HWLOC
|
||||
#cmakedefine KOKKOS_USE_LIBRT
|
||||
#cmakedefine KOKKOS_ENABLE_HWBSPACE
|
||||
|
||||
#cmakedefine KOKKOS_IMPL_CUDA_CLANG_WORKAROUND
|
||||
|
||||
#cmakedefine KOKKOS_COMPILER_CUDA_VERSION @KOKKOS_COMPILER_CUDA_VERSION@
|
||||
|
||||
#cmakedefine KOKKOS_ARCH_SSE42
|
||||
#cmakedefine KOKKOS_ARCH_ARMV80
|
||||
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX
|
||||
#cmakedefine KOKKOS_ARCH_ARMV81
|
||||
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX2
|
||||
#cmakedefine KOKKOS_ARCH_AMD_AVX2
|
||||
#cmakedefine KOKKOS_ARCH_AVX
|
||||
#cmakedefine KOKKOS_ARCH_AVX2
|
||||
#cmakedefine KOKKOS_ARCH_AVX512XEON
|
||||
#cmakedefine KOKKOS_ARCH_KNC
|
||||
#cmakedefine KOKKOS_ARCH_AVX512MIC
|
||||
#cmakedefine KOKKOS_ARCH_POWER7
|
||||
#cmakedefine KOKKOS_ARCH_POWER8
|
||||
#cmakedefine KOKKOS_ARCH_POWER9
|
||||
#cmakedefine KOKKOS_ARCH_KEPLER
|
||||
#cmakedefine KOKKOS_ARCH_KEPLER30
|
||||
#cmakedefine KOKKOS_ARCH_KEPLER32
|
||||
#cmakedefine KOKKOS_ARCH_KEPLER35
|
||||
#cmakedefine KOKKOS_ARCH_KEPLER37
|
||||
#cmakedefine KOKKOS_ARCH_MAXWELL
|
||||
#cmakedefine KOKKOS_ARCH_MAXWELL50
|
||||
#cmakedefine KOKKOS_ARCH_MAXWELL52
|
||||
#cmakedefine KOKKOS_ARCH_MAXWELL53
|
||||
#cmakedefine KOKKOS_ARCH_PASCAL
|
||||
#cmakedefine KOKKOS_ARCH_PASCAL60
|
||||
#cmakedefine KOKKOS_ARCH_PASCAL61
|
||||
#cmakedefine KOKKOS_ARCH_VOLTA
|
||||
#cmakedefine KOKKOS_ARCH_VOLTA70
|
||||
#cmakedefine KOKKOS_ARCH_VOLTA72
|
||||
#cmakedefine KOKKOS_ARCH_TURING75
|
||||
#cmakedefine KOKKOS_ARCH_AMD_EPYC
|
||||
@ -1,8 +0,0 @@
|
||||
ifndef KOKKOS_PATH
|
||||
MAKEFILE_PATH := $(abspath $(lastword $(MAKEFILE_LIST)))
|
||||
KOKKOS_PATH = $(subst Makefile,,$(MAKEFILE_PATH))..
|
||||
endif
|
||||
|
||||
include $(KOKKOS_PATH)/Makefile.kokkos
|
||||
include $(KOKKOS_PATH)/core/src/Makefile.generate_header_lists
|
||||
include $(KOKKOS_PATH)/core/src/Makefile.generate_build_files
|
||||
@ -1,20 +0,0 @@
|
||||
#.rst:
|
||||
# FindHWLOC
|
||||
# ----------
|
||||
#
|
||||
# Try to find HWLOC, based on KOKKOS_HWLOC_DIR
|
||||
#
|
||||
# The following variables are defined:
|
||||
#
|
||||
# HWLOC_FOUND - System has HWLOC
|
||||
# HWLOC_INCLUDE_DIR - HWLOC include directory
|
||||
# HWLOC_LIBRARIES - Libraries needed to use HWLOC
|
||||
|
||||
find_path(HWLOC_INCLUDE_DIR hwloc.h PATHS "${KOKKOS_HWLOC_DIR}/include")
|
||||
find_library(HWLOC_LIBRARIES hwloc PATHS "${KOKKOS_HWLOC_DIR}/lib")
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
find_package_handle_standard_args(HWLOC DEFAULT_MSG
|
||||
HWLOC_INCLUDE_DIR HWLOC_LIBRARIES)
|
||||
|
||||
mark_as_advanced(HWLOC_INCLUDE_DIR HWLOC_LIBRARIES)
|
||||
@ -1,20 +0,0 @@
|
||||
#.rst:
|
||||
# FindMemkind
|
||||
# ----------
|
||||
#
|
||||
# Try to find Memkind.
|
||||
#
|
||||
# The following variables are defined:
|
||||
#
|
||||
# MEMKIND_FOUND - System has Memkind
|
||||
# MEMKIND_INCLUDE_DIR - Memkind include directory
|
||||
# MEMKIND_LIBRARIES - Libraries needed to use Memkind
|
||||
|
||||
find_path(MEMKIND_INCLUDE_DIR memkind.h)
|
||||
find_library(MEMKIND_LIBRARIES memkind)
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
find_package_handle_standard_args(Memkind DEFAULT_MSG
|
||||
MEMKIND_INCLUDE_DIR MEMKIND_LIBRARIES)
|
||||
|
||||
mark_as_advanced(MEMKIND_INCLUDE_DIR MEMKIND_LIBRARIES)
|
||||
@ -1,20 +0,0 @@
|
||||
#.rst:
|
||||
# FindQthreads
|
||||
# ----------
|
||||
#
|
||||
# Try to find Qthreads.
|
||||
#
|
||||
# The following variables are defined:
|
||||
#
|
||||
# QTHREADS_FOUND - System has Qthreads
|
||||
# QTHREADS_INCLUDE_DIR - Qthreads include directory
|
||||
# QTHREADS_LIBRARIES - Libraries needed to use Qthreads
|
||||
|
||||
find_path(QTHREADS_INCLUDE_DIR qthread.h)
|
||||
find_library(QTHREADS_LIBRARIES qthread)
|
||||
|
||||
include(FindPackageHandleStandardArgs)
|
||||
find_package_handle_standard_args(Qthreads DEFAULT_MSG
|
||||
QTHREADS_INCLUDE_DIR QTHREADS_LIBRARIES)
|
||||
|
||||
mark_as_advanced(QTHREADS_INCLUDE_DIR QTHREADS_LIBRARIES)
|
||||
13
lib/kokkos/cmake/Modules/FindTPLCUDA.cmake
Normal file
@ -0,0 +1,13 @@
|
||||
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
||||
KOKKOS_FIND_IMPORTED(CUDA INTERFACE
|
||||
LIBRARIES cudart cuda
|
||||
LIBRARY_PATHS ENV LD_LIBRARY_PATH ENV CUDA_PATH
|
||||
ALLOW_SYSTEM_PATH_FALLBACK
|
||||
)
|
||||
ELSE()
|
||||
KOKKOS_CREATE_IMPORTED_TPL(CUDA INTERFACE
|
||||
LINK_LIBRARIES cuda
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
15
lib/kokkos/cmake/Modules/FindTPLHPX.cmake
Normal file
@ -0,0 +1,15 @@
|
||||
|
||||
FIND_PACKAGE(HPX REQUIRED)
|
||||
#as of right now, HPX doesn't export correctly
|
||||
#so let's convert it to an interface target
|
||||
KOKKOS_CREATE_IMPORTED_TPL(HPX INTERFACE
|
||||
LINK_LIBRARIES ${HPX_LIBRARIES}
|
||||
INCLUDES ${HPX_INCLUDE_DIRS}
|
||||
)
|
||||
#this is a bit funky since this is a CMake target
|
||||
#but HPX doesn't export itself correctly
|
||||
KOKKOS_EXPORT_CMAKE_TPL(HPX)
|
||||
|
||||
#I would prefer all of this gets replaced with
|
||||
#KOKKOS_IMPORT_CMAKE_TPL(HPX)
|
||||
|
||||
1
lib/kokkos/cmake/Modules/FindTPLHWLOC.cmake
Normal file
@ -0,0 +1 @@
|
||||
KOKKOS_FIND_IMPORTED(HWLOC HEADER hwloc.h LIBRARY hwloc)
|
||||
1
lib/kokkos/cmake/Modules/FindTPLLIBDL.cmake
Normal file
@ -0,0 +1 @@
|
||||
KOKKOS_FIND_IMPORTED(LIBDL HEADER dlfcn.h LIBRARY dl)
|
||||
1
lib/kokkos/cmake/Modules/FindTPLLIBNUMA.cmake
Normal file
@ -0,0 +1 @@
|
||||
KOKKOS_FIND_IMPORTED(LIBNUMA HEADER numa.h LIBRARY numa)
|
||||
1
lib/kokkos/cmake/Modules/FindTPLLIBRT.cmake
Normal file
@ -0,0 +1 @@
|
||||
KOKKOS_FIND_IMPORTED(LIBRT HEADER time.h LIBRARY rt)
|
||||
1
lib/kokkos/cmake/Modules/FindTPLMEMKIND.cmake
Normal file
@ -0,0 +1 @@
|
||||
KOKKOS_FIND_IMPORTED(MEMKIND HEADER memkind.h LIBRARY memkind)
|
||||
17
lib/kokkos/cmake/Modules/FindTPLPTHREAD.cmake
Normal file
@ -0,0 +1,17 @@
|
||||
|
||||
TRY_COMPILE(KOKKOS_HAS_PTHREAD_ARG
|
||||
${KOKKOS_TOP_BUILD_DIR}/tpl_tests
|
||||
${KOKKOS_SOURCE_DIR}/cmake/compile_tests/pthread.cpp
|
||||
LINK_LIBRARIES -pthread
|
||||
COMPILE_DEFINITIONS -pthread)
|
||||
|
||||
INCLUDE(FindPackageHandleStandardArgs)
|
||||
FIND_PACKAGE_HANDLE_STANDARD_ARGS(PTHREAD DEFAULT_MSG KOKKOS_HAS_PTHREAD_ARG)
|
||||
|
||||
KOKKOS_CREATE_IMPORTED_TPL(PTHREAD
|
||||
INTERFACE #this is not a real library with a real location
|
||||
COMPILE_OPTIONS -pthread
|
||||
LINK_OPTIONS -pthread)
|
||||
|
||||
|
||||
|
||||
331
lib/kokkos/cmake/README.md
Normal file
@ -0,0 +1,331 @@
|
||||

|
||||
|
||||
# Developing Kokkos
|
||||
|
||||
This document contains a build system overview for developers with information on adding new CMake options that could influence
|
||||
* Header configuration macros
|
||||
* Optional features
|
||||
* Third-partly libraries
|
||||
* Compiler and linker flags
|
||||
For build system details for users, refer to the [build instructions](../BUILD.md).

## Build System

Kokkos uses CMake to configure, build, and install.
Rather than being a completely straightforward use of modern CMake,
Kokkos has several extra complications, primarily due to:
* Kokkos must support linking to an installed version or in-tree builds as a subdirectory of a larger project.
* Kokkos must configure a special compiler `nvcc_wrapper` that allows `nvcc` to accept all C++ flags (which `nvcc` currently does not).
* Kokkos must work as a part of TriBITS, a CMake library providing a particular build idiom for Trilinos.
* Kokkos has many pre-existing users. We need to be careful about breaking previous versions, or generate meaningful error messages if we do break backwards compatibility.

If you are looking at the build system code wondering why certain decisions were made: we have had to balance many competing requirements and certain technical debt. Everything in the build system was done for a reason, trying to adhere as closely as possible to modern CMake best practices while meeting all pre-existing customer requirements.
|
||||
|
||||
### Modern CMake Philosophy
|
||||
|
||||
Modern CMake relies on understanding the principle of *building* and *using* a code project.
|
||||
What preprocessor, compiler, and linker flags do I need to *build* my project?
|
||||
What flags does a downstream project that links to me need to *use* my project?
|
||||
In CMake terms, flags that are only needed for building are `PRIVATE`.
|
||||
Only Kokkos needs these flags, not a package that depends on Kokkos.
|
||||
Flags that must be used in a downstream project are `PUBLIC`.
|
||||
Kokkos must tell other projects to use them.
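
As a rough illustration of the distinction (a sketch with hypothetical target and file names, not taken from the Kokkos build files), a private flag stays with the library while a public definition follows it downstream:

````
# Hypothetical targets for illustration only
add_library(mylib mylib.cpp)

# Needed only to build mylib itself; consumers never see it
target_compile_options(mylib PRIVATE -Wall)

# Needed by mylib and by anything that includes its headers
target_compile_definitions(mylib PUBLIC MYLIB_USE_OPENMP)

add_executable(app main.cpp)
# app is compiled with -DMYLIB_USE_OPENMP but not with -Wall
target_link_libraries(app PRIVATE mylib)
````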
|
||||
|
||||
In Kokkos, almost everything is a public flag since Kokkos is driven by headers and Kokkos is in charge of optimizing your code to achieve performance portability!
|
||||
Include paths, C++ standard flags, architecture-specific optimizations, or OpenMP and CUDA flags are all examples of flags that Kokkos configures and adds to your project.
|
||||
|
||||
Modern CMake now automatically propagates flags through the `target_link_libraries` command.
|
||||
Suppose you have a library `stencil` that needs to build with Kokkos.
|
||||
Consider the following CMake code:
|
||||
|
||||
````
|
||||
find_package(Kokkos)
|
||||
add_library(stencil stencil.cpp)
|
||||
target_link_libraries(stencil Kokkos::kokkos)
|
||||
````
|
||||
|
||||
This locates the Kokkos package, adds your library, and tells CMake to link Kokkos to your library.
|
||||
All public build flags get added automatically through the `target_link_libraries` command.
|
||||
There is nothing to do. You can be happily oblivious to how Kokkos was configured.
|
||||
Everything should just work.
|
||||
|
||||
As a Kokkos developer who wants to add new public compiler flags, how do you ensure that CMake does this properly? Modern CMake works through targets and properties.
|
||||
Each target has a set of standard properties:
|
||||
* `INTERFACE_COMPILE_OPTIONS` contains all the compiler options that Kokkos should add to downstream projects
|
||||
* `INTERFACE_INCLUDE_DIRECTORIES` contains all the directories downstream projects must include from Kokkos
|
||||
* `INTERFACE_COMPILE_DEFINITIONS` contains the list of preprocessor `-D` flags
|
||||
* `INTERFACE_LINK_LIBRARIES` contains all the libraries downstream projects need to link
|
||||
* `INTERFACE_COMPILE_FEATURES` essentially adds compiler flags, but with extra complications. Feature names are specific to CMake. More later.
|
||||
|
||||
CMake makes it easy to append to these properties using:
|
||||
* `target_compile_options(kokkos PUBLIC -fmyflag)`
|
||||
* `target_include_directories(kokkos PUBLIC mySpecialFolder)`
|
||||
* `target_compile_definitions(kokkos PUBLIC -DmySpecialFlag=0)`
|
||||
* `target_link_libraries(kokkos PUBLIC mySpecialLibrary)`
|
||||
* `target_compile_features(kokkos PUBLIC mySpecialFeature)`
|
||||
Note that all of these use `PUBLIC`! Almost every Kokkos flag is not private to Kokkos, but must also be used by downstream projects.
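
To check what a target will hand to downstream projects, the properties can be read back at configure time. The following is a minimal sketch, assuming an installed Kokkos that provides the `Kokkos::kokkos` target. A property prints as `VALUE-NOTFOUND` when it is not set directly on that target (for example, when the flags live on a target it links to).

````
find_package(Kokkos REQUIRED)

# Print each interface property that downstream projects would inherit
foreach(PROP COMPILE_OPTIONS INCLUDE_DIRECTORIES COMPILE_DEFINITIONS
             LINK_LIBRARIES COMPILE_FEATURES)
  get_target_property(VALUE Kokkos::kokkos INTERFACE_${PROP})
  message(STATUS "INTERFACE_${PROP} = ${VALUE}")
endforeach()
````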
|
||||
|
||||
|
||||
### Compiler Features and Compiler Options
|
||||
Compiler options are flags like `-fopenmp` that do not need to be "resolved."
|
||||
The flag is either on or off.
|
||||
Compiler features are more fine-grained and require conflicting requests to be resolved.
|
||||
Suppose I have
|
||||
````
|
||||
add_library(A a.cpp)
|
||||
target_compile_features(A PUBLIC cxx_std_11)
|
||||
````
|
||||
then another target
|
||||
````
|
||||
add_library(B b.cpp)
|
||||
target_compile_features(B PUBLIC cxx_std_14)
|
||||
target_link_libraries(A B)
|
||||
````
|
||||
I have requested two different features.
|
||||
CMake understands the requests and knows that `cxx_std_11` is a subset of `cxx_std_14`.
|
||||
CMake then picks C++14 for library `A`, since `A` links to `B` and inherits its public requirement.
|
||||
CMake would not have been able to do feature resolution if we had directly done:
|
||||
````
|
||||
target_compile_options(A PUBLIC -std=c++11)
|
||||
````
|
||||
|
||||
### Adding Kokkos Options
|
||||
After configuring for the first time,
|
||||
CMake creates a cache of configure variables in `CMakeCache.txt`.
|
||||
Reconfiguring in the folder "restarts" from those variables.
|
||||
All flags passed as `-DKokkos_SOME_OPTION=X` to `cmake` become variables in the cache.
|
||||
All Kokkos options begin with camel case `Kokkos_` followed by an upper case option name.
|
||||
|
||||
CMake best practice is to avoid cache variables, if possible.
|
||||
In essence, you want the minimal amount of state cached between configurations.
|
||||
And never, ever have behavior influenced by multiple cache variables.
|
||||
If you want to change the Kokkos configuration, have a single unique variable that needs to be changed.
|
||||
Never require two cache variables to be changed.
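
One way to follow this rule is to expose a single cache variable and derive everything else from it as ordinary variables. A minimal sketch with hypothetical names (`MyProj_BACKEND` and the `MYPROJ_*` variables are not Kokkos options):

````
# Single source of truth stored in the cache
set(MyProj_BACKEND "SERIAL" CACHE STRING "Backend to build: SERIAL, OPENMP, or CUDA")
set_property(CACHE MyProj_BACKEND PROPERTY STRINGS SERIAL OPENMP CUDA)

# Derived state is recomputed on every configure and never cached
set(MYPROJ_USE_OPENMP OFF)
set(MYPROJ_USE_CUDA OFF)
if(MyProj_BACKEND STREQUAL "OPENMP")
  set(MYPROJ_USE_OPENMP ON)
elseif(MyProj_BACKEND STREQUAL "CUDA")
  set(MYPROJ_USE_CUDA ON)
elseif(NOT MyProj_BACKEND STREQUAL "SERIAL")
  message(FATAL_ERROR "Unknown MyProj_BACKEND: ${MyProj_BACKEND}")
endif()
````

Changing the configuration then means changing exactly one `-D` flag; the derived variables cannot drift out of sync with it.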
|
||||
|
||||
Kokkos provides a function `KOKKOS_OPTION` for defining valid cache-level variables,
|
||||
proofreading them, and defining local project variables.
|
||||
The most common variables are called `Kokkos_ENABLE_X`,
|
||||
for which a helper function `KOKKOS_ENABLE_OPTION` is provided, e.g.
|
||||
````
|
||||
KOKKOS_ENABLE_OPTION(TESTS OFF "Whether to build tests")
|
||||
````
|
||||
The function checks if `-DKokkos_ENABLE_TESTS` was given,
|
||||
whether it was given with the wrong case, e.g. `-DKokkos_Enable_Tests`,
|
||||
and then defines a regular (non-cache) variable `KOKKOS_ENABLE_TESTS` to `ON` or `OFF`
|
||||
depending on the given default and whether the option was specified.
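
To make the described behavior concrete, here is a sketch of how such a helper could be written in plain CMake. It is a hypothetical function (`my_enable_option`), not the actual implementation of `KOKKOS_ENABLE_OPTION`, and it covers only the case check and the derived variable.

````
# Hypothetical helper; not the Kokkos implementation
function(my_enable_option SUFFIX DEFAULT DOCSTRING)
  set(CACHE_NAME "Kokkos_ENABLE_${SUFFIX}")
  string(TOUPPER "${CACHE_NAME}" UC_NAME)

  # Reject cache entries that match only when case is ignored,
  # e.g. -DKokkos_Enable_Tests=ON
  get_cmake_property(ALL_CACHE_VARS CACHE_VARIABLES)
  foreach(VAR ${ALL_CACHE_VARS})
    string(TOUPPER "${VAR}" UC_VAR)
    if(UC_VAR STREQUAL UC_NAME AND NOT VAR STREQUAL CACHE_NAME)
      message(FATAL_ERROR "Found option ${VAR}; did you mean ${CACHE_NAME}?")
    endif()
  endforeach()

  # Create the cache entry with the default unless the user already set it
  option(${CACHE_NAME} "${DOCSTRING}" ${DEFAULT})

  # Expose a regular (non-cache) upper-case variable to the caller
  if(${CACHE_NAME})
    set(${UC_NAME} ON PARENT_SCOPE)
  else()
    set(${UC_NAME} OFF PARENT_SCOPE)
  endif()
endfunction()

my_enable_option(TESTS OFF "Whether to build tests")
````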
|
||||
|
||||
### Defining Kokkos Config Macros
|
||||
|
||||
Sometimes you may want to add `#define KOKKOS_X` macros to the config header.
|
||||
This is straightforward with CMake.
|
||||
Suppose you want to define an optional macro `KOKKOS_SUPER_SCIENCE`.
|
||||
Simply go into `KokkosCore_config.h.in` and add
|
||||
````
|
||||
#cmakedefine KOKKOS_SUPER_SCIENCE
|
||||
````
|
||||
I can either add
|
||||
````
|
||||
KOKKOS_OPTION(SUPER_SCIENCE ON "Whether to do some super science")
|
||||
````
|
||||
to directly set the variable as a command-line `-D` option.
|
||||
Alternatively, based on other logic, I could add to a `CMakeLists.txt`
|
||||
````
|
||||
SET(KOKKOS_SUPER_SCIENCE ON)
|
||||
````
|
||||
If not set as a command-line option (cache variable), you must make sure the variable is visible in the top-level scope.
|
||||
If set in a function, you would need:
|
||||
````
|
||||
SET(KOKKOS_SUPER_SCIENCE ON PARENT_SCOPE)
|
||||
````
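
For reference, this is how a `#cmakedefine` line is expanded by CMake's `configure_file`. The sketch assumes the template sits in the current source directory; the real Kokkos build may invoke it through a wrapper.

````
# KokkosCore_config.h.in contains the line:
#   #cmakedefine KOKKOS_SUPER_SCIENCE

set(KOKKOS_SUPER_SCIENCE ON)
configure_file(KokkosCore_config.h.in KokkosCore_config.h)

# The generated KokkosCore_config.h then contains:
#   #define KOKKOS_SUPER_SCIENCE
# With KOKKOS_SUPER_SCIENCE unset or OFF it would instead contain:
#   /* #undef KOKKOS_SUPER_SCIENCE */
````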
|
||||
|
||||
### Third-Party Libraries
|
||||
In much the same way that compiler flags transitively propagate to dependent projects,
|
||||
modern CMake allows us to propagate dependent libraries.
|
||||
If Kokkos depends on, e.g., `hwloc`, the downstream project will also need to link `hwloc`.
|
||||
There are three stages in adding a new third-party library (TPL):
|
||||
* Finding: find the desired library on the system and verify the installation is correct
|
||||
* Importing: create a CMake target, if necessary, that is compatible with `target_link_libraries`. This is mostly relevant for TPLs not installed with CMake.
|
||||
* Exporting: make the desired library visible to downstream projects
|
||||
|
||||
TPLs are somewhat complicated by whether the library was installed with CMake or some other build system.
|
||||
If the TPL was installed with CMake, our lives are greatly simplified. We simply use `find_package` to locate the installed CMake project, then call `target_link_libraries(kokkoscore PUBLIC/PRIVATE TPL)`. For libraries not installed with CMake, the process is a bit more complex.
|
||||
It is up to the Kokkos developers to "convert" the library into a CMake target as if it had been installed as a valid modern CMake target with properties.
|
||||
There are helper functions for simplifying the process of importing TPLs in Kokkos, but we walk through the process in detail to clearly illustrate the steps involved.
|
||||
|
||||
#### TPL Search Order
|
||||
|
||||
There are several options for where CMake could try to find a TPL.
|
||||
If there are multiple installations of the same TPL on the system,
|
||||
the search order is critical for making sure the correct TPL is found.
|
||||
There are 3 possibilities that could be used:
|
||||
|
||||
1. Default system paths like /usr
|
||||
1. User-provided paths through options `<NAME>_ROOT` and `Kokkos_<NAME>_DIR`
|
||||
1. Additional paths not in the CMake default list or provided by the user that Kokkos decides to add. For example, Kokkos may query `nvcc` or `LD_LIBRARY_PATH` for where to find CUDA libraries.
|
||||
|
||||
The following is the search order that Kokkos follows. Note: This differs from the default search order used by CMake `find_library` and `find_path`. CMake prefers default system paths over user-provided paths.
|
||||
For Kokkos (and package managers in general), it is better to prefer user-provided paths since this usually indicates a specific version we want.
|
||||
|
||||
1. `<NAME>_ROOT`
|
||||
1. `Kokkos_<NAME>_DIR`
|
||||
1. Paths added by Kokkos CMake logic
|
||||
1. Default system paths (if allowed)
|
||||
|
||||
Default system paths are allowed in two cases. First, none of the other options are given so the only place to look is system paths. Second, if explicitly given permission, configure will look in system paths.
|
||||
The rationale for this logic is that if you specify a custom location, you usually *only* want to look in that location.
|
||||
If you do not find the TPL where you expect it, you should error out rather than grab another random match.
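
The preference for user-provided paths can be sketched with plain CMake as follows. This is an illustration under assumptions, not the Kokkos implementation: `MYTPL` is a hypothetical TPL name, and the step for paths added by Kokkos logic is omitted.

````
# Collect user-provided prefixes in priority order
set(MYTPL_HINTS)
foreach(PREFIX_VAR MYTPL_ROOT Kokkos_MYTPL_DIR)
  if(${PREFIX_VAR})
    list(APPEND MYTPL_HINTS "${${PREFIX_VAR}}")
  endif()
endforeach()

if(MYTPL_HINTS)
  # A custom location was given: look only there
  find_library(MYTPL_LIBRARY NAMES mytpl
    PATHS ${MYTPL_HINTS}
    PATH_SUFFIXES lib lib64
    NO_DEFAULT_PATH)
else()
  # Nothing was specified: fall back to the default system paths
  find_library(MYTPL_LIBRARY NAMES mytpl)
endif()

if(NOT MYTPL_LIBRARY)
  message(FATAL_ERROR "mytpl was not found in the requested location(s)")
endif()
````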
|
||||
|
||||
|
||||
#### Finding TPLs
|
||||
|
||||
If finding a TPL that is not a modern CMake project, refer to the `FindHWLOC.cmake` file in `cmake/Modules` for an example.
|
||||
You will usually need to verify expected headers with `find_path`:
|
||||
````
|
||||
find_path(TPL_INCLUDE_DIR mytpl.h PATHS "${KOKKOS_MYTPL_DIR}/include")
|
||||
````
|
||||
This ensures that the library header is in the expected include directory and defines the variable `TPL_INCLUDE_DIR` with a valid path if successful.
|
||||
Similarly, you can verify a library
|
||||
````
|
||||
find_library(TPL_LIBRARY mytpl PATHS "${KOKKOS_MYTPL_DIR}/lib")
|
||||
````
|
||||
that then defines the variable `TPL_LIBRARY` with a valid path if successful.
|
||||
CMake provides a utility for checking if the `find_path` and `find_library` calls were successful that emulates the behavior of `find_package` for a CMake target.
|
||||
````
|
||||
include(FindPackageHandleStandardArgs)
|
||||
find_package_handle_standard_args(MYTPL DEFAULT_MSG
|
||||
MYTPL_INCLUDE_DIR MYTPL_LIBRARY)
|
||||
````
|
||||
If the find failed, CMake will print standard error messages explaining the failure.
|
||||
|
||||
#### Importing TPLs
|
||||
|
||||
The installed TPL must be adapted into a CMake target.
|
||||
CMake allows libraries to be added that are built externally as follows:
|
||||
````
|
||||
add_library(Kokkos::mytpl UNKNOWN IMPORTED)
|
||||
````
|
||||
Importantly, we use a `Kokkos::` namespace to avoid name conflicts and identify this specifically as the version imported by Kokkos.
|
||||
Because we are importing a non-CMake target, we must populate all the target properties that would have been automatically populated for a CMake target.
|
||||
````
|
||||
set_target_properties(Kokkos::mytpl PROPERTIES
|
||||
INTERFACE_INCLUDE_DIRECTORIES "${MYTPL_INCLUDE_DIR}"
|
||||
IMPORTED_LOCATION "${MYTPL_LIBRARY}"
|
||||
)
|
||||
````
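
Putting the finding and importing steps together, a find module for a hypothetical `mytpl` might look like the following sketch. The file name `FindMYTPL.cmake` and the `MYTPL_*` variable names are assumptions for illustration; the search paths are left at CMake's defaults for brevity.

````
# FindMYTPL.cmake (hypothetical)
find_path(MYTPL_INCLUDE_DIR mytpl.h)
find_library(MYTPL_LIBRARY mytpl)

# Sets MYTPL_FOUND and prints a standard message on failure
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(MYTPL DEFAULT_MSG
                                  MYTPL_INCLUDE_DIR MYTPL_LIBRARY)

# Wrap the raw paths in a target usable with target_link_libraries
if(MYTPL_FOUND AND NOT TARGET Kokkos::mytpl)
  add_library(Kokkos::mytpl UNKNOWN IMPORTED)
  set_target_properties(Kokkos::mytpl PROPERTIES
    INTERFACE_INCLUDE_DIRECTORIES "${MYTPL_INCLUDE_DIR}"
    IMPORTED_LOCATION "${MYTPL_LIBRARY}"
  )
endif()
````

The `NOT TARGET` guard keeps the module safe to include more than once in the same configure run.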
|
||||
|
||||
#### Exporting TPLs
|
||||
|
||||
Kokkos may now depend on the target `Kokkos::mytpl` as a `PUBLIC` library (remember building and using).
|
||||
This means that downstream projects must also know about `Kokkos::mytpl`, so Kokkos must export it.
|
||||
In the `KokkosConfig.cmake.in` file, we need to add code like the following:
|
||||
````
|
||||
set(MYTPL_LIBRARY @MYTPL_LIBRARY@)
|
||||
set(MYTPL_INCLUDE_DIR @MYTPL_INCLUDE_DIR@)
|
||||
add_library(Kokkos::mytpl UNKNOWN IMPORTED)
|
||||
set_target_properties(Kokkos::mytpl PROPERTIES
|
||||
INTERFACE_INCLUDE_DIRECTORIES "${MYTPL_INCLUDE_DIR}"
|
||||
IMPORTED_LOCATION "${MYTPL_LIBRARY}"
|
||||
)
|
||||
````
|
||||
If this looks familiar, that's because it is exactly the same code as above for importing the TPL.
|
||||
Exporting a TPL really just means importing the TPL when Kokkos is loaded by an external project.
|
||||
We will describe helper functions that simplify this process.
|
||||
|
||||
#### Interface TPLs
|
||||
|
||||
If a TPL is just a library and set of headers, we can make a simple `IMPORTED` target.
|
||||
However, a TPL is actually completely flexible and need not be limited to just headers and libraries.
|
||||
TPLs can configure compiler flags, linker flags, or multiple different libraries.
|
||||
For this, we use a special type of CMake target: `INTERFACE` libraries.
|
||||
These libraries don't build anything.
|
||||
They simply populate properties that will configure flags for dependent targets.
|
||||
We consider the example:
|
||||
````
|
||||
add_library(PTHREAD INTERFACE)
|
||||
target_compile_options(PTHREAD INTERFACE -pthread)
|
||||
````
|
||||
Kokkos uses the compiler flag `-pthread` to define compiler macros for re-entrant functions rather than treating it simply as a library with header `pthread.h` and library `-lpthread`.
|
||||
Any property can be configured, e.g.
|
||||
````
|
||||
target_link_libraries(MYTPL ...)
|
||||
````
|
||||
In contrast to imported TPLs which require direct modification of `KokkosConfig.cmake.in`,
|
||||
we can use CMake's built-in export functions:
|
||||
````
|
||||
INSTALL(
|
||||
TARGETS MYTPL
|
||||
EXPORT KokkosTargets
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
)
|
||||
````
|
||||
These interface targets will be automatically populated in the config file.
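
As a fuller sketch of an interface TPL, several properties can be set on one target. The target name `MYTPL`, the macro `HAVE_MYTPL`, and the `consumer` library are hypothetical. Interface libraries accept only the `INTERFACE` keyword in these commands.

````
add_library(MYTPL INTERFACE)
target_compile_options(MYTPL INTERFACE -pthread)
target_compile_definitions(MYTPL INTERFACE HAVE_MYTPL)
target_link_libraries(MYTPL INTERFACE pthread)

# Any target linking MYTPL inherits the flag, the macro, and -lpthread
add_library(consumer consumer.cpp)
target_link_libraries(consumer PUBLIC MYTPL)
````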
|
||||
|
||||
#### Linking the TPL
|
||||
After finishing the import process, it still remains to link the imported target as needed.
|
||||
For example,
|
||||
````
|
||||
target_link_libraries(kokkoscore PUBLIC Kokkos::HWLOC)
|
||||
````
|
||||
The complexity of which includes, options, and libraries the TPL requires
|
||||
should be encapsulated in the CMake target.
|
||||
|
||||
#### TPL Helper Functions
|
||||
##### KOKKOS_IMPORT_TPL
|
||||
This function can be invoked as, e.g.
|
||||
````
|
||||
KOKKOS_IMPORT_TPL(HWLOC)
|
||||
````
|
||||
This function checks if the TPL was enabled by a `-DKokkos_ENABLE_HWLOC=On` flag.
|
||||
If so, it calls `find_package(TPLHWLOC)`.
|
||||
This invokes the file `FindTPLHWLOC.cmake` which should be contained in the `cmake/Modules` folder.
|
||||
If successful, another function `KOKKOS_EXPORT_CMAKE_TPL` gets invoked.
|
||||
This automatically adds all the necessary import commands to `KokkosConfig.cmake`.
|
||||
|
||||
##### KOKKOS_FIND_IMPORTED
|
||||
Inside a `FindTPLX.cmake` file, the simplest way to import a library is to call, e.g.
|
||||
````
|
||||
KOKKOS_FIND_IMPORTED(HWLOC LIBRARY hwloc HEADER hwloc.h)
|
||||
````
|
||||
This finds the location of the library and header and creates an imported target `Kokkos::HWLOC`
|
||||
that can be linked against.
|
||||
The library/header find can be guided with `-DHWLOC_ROOT=` or `-DKokkos_HWLOC_DIR=` during CMake configure.
|
||||
These both specify the install prefix.
|
||||
|
||||
##### KOKKOS_LINK_TPL
|
||||
This function checks if the TPL has been enabled.
|
||||
If so, it links a given library against the imported (or interface) TPL target.
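
The behavior described can be approximated as follows. This is a hypothetical helper (`my_link_tpl`), not the signature or body of `KOKKOS_LINK_TPL`; it assumes the `KOKKOS_ENABLE_<TPL>` variable and `Kokkos::<TPL>` target naming used elsewhere in this document.

````
function(my_link_tpl TARGET VISIBILITY TPL)
  # Skip silently when the TPL was not enabled
  if(KOKKOS_ENABLE_${TPL})
    target_link_libraries(${TARGET} ${VISIBILITY} Kokkos::${TPL})
  endif()
endfunction()

my_link_tpl(kokkoscore PUBLIC HWLOC)
````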
|
||||
|
||||
##### KOKKOS_CREATE_IMPORTED_TPL
|
||||
This helper function is best understood by reading the actual code.
|
||||
This function takes arguments specifying the properties and creates the actual TPL target.
|
||||
The most important thing to understand for this function is whether you call this function with the optional `INTERFACE` keyword.
|
||||
This tells the project to either create the target as an imported target or interface target, as discussed above.
|
||||
|
||||
##### KOKKOS_EXPORT_CMAKE_TPL
|
||||
Even if the TPL just loads a valid CMake target, we still must "export" it into the config file.
|
||||
When Kokkos is loaded by a downstream project, this TPL must be loaded.
|
||||
Calling this function simply appends text recording the location where the TPL was found
|
||||
and adding a `find_dependency(...)` call that will reload the CMake target.
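
The appended text might resemble the following. This is only a sketch: the exact variable names Kokkos writes are not shown in this document, `MYTPL` is a hypothetical package, and the path is a placeholder.

````
# Fragment of a generated KokkosConfig.cmake (illustrative)
include(CMakeFindDependencyMacro)

# Location recorded when Kokkos itself was configured
set(MYTPL_DIR "/path/where/mytpl/was/found")

# Reloads the TPL's own CMake targets for the downstream project
find_dependency(MYTPL)
````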
|
||||
|
||||
### The Great TriBITS Compromise
|
||||
|
||||
TriBITS was a masterpiece of CMake version 2 before the modern CMake idioms of building and using.
|
||||
TriBITS greatly limited verbosity of CMake files, handled complicated dependency trees between packages, and handled automatically setting up include and linker paths for dependent libraries.
|
||||
|
||||
Kokkos is now used by numerous projects that don't (and won't) depend on TriBITS for their build systems.
|
||||
Kokkos has to work outside of TriBITS and provide a standard CMake 3+ build system.
|
||||
At the same time, Kokkos is used by numerous projects that depend on TriBITS and don't (and won't) switch to a standard CMake 3+ build system.
|
||||
|
||||
Instead of calling functions `TRIBITS_X(...)`, the Kokkos CMake code calls wrapper functions `KOKKOS_X(...)`.
|
||||
If TriBITS is available (as in Trilinos), `KOKKOS_X` will just be a thin wrapper around `TRIBITS_X`.
|
||||
If TriBITS is not available, Kokkos maps `KOKKOS_X` calls to native CMake that complies with CMake 3 idioms.
|
||||
For the time being, this seems the most sensible way to handle the competing requirements of a standalone modern CMake and TriBITS build system.
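
The wrapper pattern can be sketched as below. `my_add_executable` is a hypothetical name chosen for illustration, and the standalone branch is an assumption about what a native equivalent could look like (TriBITS also applies its own naming conventions, which are ignored here).

````
include(CMakeParseArguments)

function(my_add_executable EXE_NAME)
  if(KOKKOS_HAS_TRILINOS)
    # Inside Trilinos: thin wrapper that defers to TriBITS
    tribits_add_executable(${EXE_NAME} ${ARGN})
  else()
    # Standalone: map the same arguments onto native CMake 3 commands
    cmake_parse_arguments(PARSE "" "" "SOURCES" ${ARGN})
    add_executable(${EXE_NAME} ${PARSE_SOURCES})
    target_link_libraries(${EXE_NAME} PRIVATE kokkos)
  endif()
endfunction()

my_add_executable(my_test SOURCES my_test.cpp)
````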
|
||||
|
||||
##### [LICENSE](https://github.com/kokkos/kokkos/blob/devel/LICENSE)
|
||||
|
||||
[License: BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause)
|
||||
|
||||
Under the terms of Contract DE-NA0003525 with NTESS,
|
||||
the U.S. Government retains certain rights in this software.
|
||||
9
lib/kokkos/cmake/compile_tests/clang_omp.cpp
Normal file
@ -0,0 +1,9 @@
|
||||
#include <omp.h>
|
||||
|
||||
int main(int argc, char** argv) {
|
||||
int thr = omp_get_num_threads();
|
||||
if (thr > 0)
|
||||
return thr;
|
||||
else
|
||||
return 0;
|
||||
}
|
||||
10
lib/kokkos/cmake/compile_tests/pthread.cpp
Normal file
@ -0,0 +1,10 @@
|
||||
#include <pthread.h>
|
||||
|
||||
void* kokkos_test(void* args) { return args; }
|
||||
|
||||
int main(void) {
|
||||
pthread_t thread;
|
||||
pthread_create(&thread, NULL, kokkos_test, NULL);
|
||||
pthread_join(thread, NULL);
|
||||
return 0;
|
||||
}
|
||||
9
lib/kokkos/cmake/cray.cmake
Normal file
@ -0,0 +1,9 @@
|
||||
|
||||
|
||||
function(kokkos_set_cray_flags full_standard int_standard)
|
||||
STRING(TOLOWER ${full_standard} FULL_LC_STANDARD)
|
||||
STRING(TOLOWER ${int_standard} INT_LC_STANDARD)
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "-hstd=c++${FULL_LC_STANDARD}" PARENT_SCOPE)
|
||||
SET(KOKKOS_CXX_INTERMDIATE_STANDARD_FLAG "-hstd=c++${INT_LC_STANDARD}" PARENT_SCOPE)
|
||||
endfunction()
|
||||
|
||||
@ -73,7 +73,7 @@ IF(NOT _CUDA_FAILURE)
|
||||
GLOBAL_SET(TPL_CUDA_LIBRARY_DIRS)
|
||||
GLOBAL_SET(TPL_CUDA_INCLUDE_DIRS ${CUDA_TOOLKIT_INCLUDE})
|
||||
GLOBAL_SET(TPL_CUDA_LIBRARIES ${CUDA_CUDART_LIBRARY} ${CUDA_cublas_LIBRARY} ${CUDA_cufft_LIBRARY})
|
||||
TIBITS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
||||
KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
||||
ELSE()
|
||||
SET(TPL_ENABLE_CUDA OFF)
|
||||
ENDIF()
|
||||
|
||||
@ -59,6 +59,6 @@
|
||||
# GLOBAL_SET(TPL_CUSPARSE_LIBRARY_DIRS)
|
||||
# GLOBAL_SET(TPL_CUSPARSE_INCLUDE_DIRS ${TPL_CUDA_INCLUDE_DIRS})
|
||||
# GLOBAL_SET(TPL_CUSPARSE_LIBRARIES ${CUDA_cusparse_LIBRARY})
|
||||
# TIBITS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
||||
# KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
||||
#ENDIF()
|
||||
|
||||
|
||||
@ -64,7 +64,7 @@
|
||||
# Version: 1.3
|
||||
#
|
||||
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( HWLOC
|
||||
KOKKOS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( HWLOC
|
||||
REQUIRED_HEADERS hwloc.h
|
||||
REQUIRED_LIBS_NAMES "hwloc"
|
||||
)
|
||||
|
||||
@ -74,9 +74,9 @@ IF(USE_THREADS)
|
||||
SET(TPL_Pthread_INCLUDE_DIRS "")
|
||||
SET(TPL_Pthread_LIBRARIES "${CMAKE_THREAD_LIBS_INIT}")
|
||||
SET(TPL_Pthread_LIBRARY_DIRS "")
|
||||
TIBITS_CREATE_IMPORTED_TPL_LIBRARY(Pthread)
|
||||
KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(Pthread)
|
||||
ELSE()
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( Pthread
|
||||
KOKKOS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( Pthread
|
||||
REQUIRED_HEADERS pthread.h
|
||||
REQUIRED_LIBS_NAMES pthread
|
||||
)
|
||||
|
||||
@ -1,69 +0,0 @@
|
||||
# @HEADER
|
||||
# ************************************************************************
|
||||
#
|
||||
# Trilinos: An Object-Oriented Solver Framework
|
||||
# Copyright (2001) Sandia Corporation
|
||||
#
|
||||
#
|
||||
# Copyright (2001) Sandia Corporation. Under the terms of Contract
|
||||
# DE-AC04-94AL85000, there is a non-exclusive license for use of this
|
||||
# work by or on behalf of the U.S. Government. Export of this program
|
||||
# may require a license from the United States Government.
|
||||
#
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
#
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
#
|
||||
# 3. Neither the name of the Corporation nor the names of the
|
||||
# contributors may be used to endorse or promote products derived from
|
||||
# this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
|
||||
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# NOTICE: The United States Government is granted for itself and others
|
||||
# acting on its behalf a paid-up, nonexclusive, irrevocable worldwide
|
||||
# license in this data to reproduce, prepare derivative works, and
|
||||
# perform publicly and display publicly. Beginning five (5) years from
|
||||
# July 25, 2001, the United States Government is granted for itself and
|
||||
# others acting on its behalf a paid-up, nonexclusive, irrevocable
|
||||
# worldwide license in this data to reproduce, prepare derivative works,
|
||||
# distribute copies to the public, perform publicly and display
|
||||
# publicly, and to permit others to do so.
|
||||
#
|
||||
# NEITHER THE UNITED STATES GOVERNMENT, NOR THE UNITED STATES DEPARTMENT
|
||||
# OF ENERGY, NOR SANDIA CORPORATION, NOR ANY OF THEIR EMPLOYEES, MAKES
|
||||
# ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL LIABILITY OR
|
||||
# RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF ANY
|
||||
# INFORMATION, APPARATUS, PRODUCT, OR PROCESS DISCLOSED, OR REPRESENTS
|
||||
# THAT ITS USE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS.
|
||||
#
|
||||
# ************************************************************************
|
||||
# @HEADER
|
||||
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality detection and control library.
|
||||
#
|
||||
# Acquisition information:
|
||||
# Date checked: July 2014
|
||||
# Checked by: H. Carter Edwards <hcedwar AT sandia.gov>
|
||||
# Source: https://code.google.com/p/qthreads
|
||||
#
|
||||
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( QTHREADS
|
||||
REQUIRED_HEADERS qthread.h
|
||||
REQUIRED_LIBS_NAMES "qthread"
|
||||
)
|
||||
338
lib/kokkos/cmake/fake_tribits.cmake
Normal file
@ -0,0 +1,338 @@
|
||||
#These are tribits wrappers used by all projects in the Kokkos ecosystem
|
||||
|
||||
INCLUDE(CMakeParseArguments)
|
||||
INCLUDE(CTest)
|
||||
|
||||
cmake_policy(SET CMP0054 NEW)
|
||||
|
||||
FUNCTION(ASSERT_DEFINED VARS)
|
||||
FOREACH(VAR ${VARS})
|
||||
IF(NOT DEFINED ${VAR})
|
||||
MESSAGE(SEND_ERROR "Error, the variable ${VAR} is not defined!")
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(KOKKOS_ADD_OPTION_AND_DEFINE USER_OPTION_NAME MACRO_DEFINE_NAME DOCSTRING DEFAULT_VALUE )
|
||||
SET( ${USER_OPTION_NAME} "${DEFAULT_VALUE}" CACHE BOOL "${DOCSTRING}" )
|
||||
IF(NOT ${MACRO_DEFINE_NAME} STREQUAL "")
|
||||
IF(${USER_OPTION_NAME})
|
||||
GLOBAL_SET(${MACRO_DEFINE_NAME} ON)
|
||||
ELSE()
|
||||
GLOBAL_SET(${MACRO_DEFINE_NAME} OFF)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(GLOBAL_RESET VARNAME)
|
||||
SET(${VARNAME} "" CACHE INTERNAL "" FORCE)
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(GLOBAL_OVERWRITE VARNAME VALUE TYPE)
|
||||
SET(${VARNAME} ${VALUE} CACHE ${TYPE} "" FORCE)
|
||||
ENDMACRO()
|
||||
|
||||
IF (NOT KOKKOS_HAS_TRILINOS)
|
||||
MACRO(APPEND_GLOB VAR)
|
||||
FILE(GLOB LOCAL_TMP_VAR ${ARGN})
|
||||
LIST(APPEND ${VAR} ${LOCAL_TMP_VAR})
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(GLOBAL_SET VARNAME)
|
||||
SET(${VARNAME} ${ARGN} CACHE INTERNAL "" FORCE)
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(VERIFY_EMPTY CONTEXT)
|
||||
if(${ARGN})
|
||||
MESSAGE(FATAL_ERROR "Kokkos does not support all of Tribits. Unhandled arguments in ${CONTEXT}:\n${ARGN}")
|
||||
endif()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(PREPEND_GLOBAL_SET VARNAME)
|
||||
ASSERT_DEFINED(${VARNAME})
|
||||
GLOBAL_SET(${VARNAME} ${ARGN} ${${VARNAME}})
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(PREPEND_TARGET_SET VARNAME TARGET_NAME TYPE)
|
||||
IF(TYPE STREQUAL "REQUIRED")
|
||||
SET(REQUIRED TRUE)
|
||||
ELSE()
|
||||
SET(REQUIRED FALSE)
|
||||
ENDIF()
|
||||
IF(TARGET ${TARGET_NAME})
|
||||
PREPEND_GLOBAL_SET(${VARNAME} ${TARGET_NAME})
|
||||
ELSE()
|
||||
IF(REQUIRED)
|
||||
MESSAGE(FATAL_ERROR "Missing dependency ${TARGET_NAME}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
endif()
|
||||
|
||||
|
||||
FUNCTION(KOKKOS_CONFIGURE_FILE PACKAGE_NAME_CONFIG_FILE)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_CONFIGURE_FILE(${PACKAGE_NAME_CONFIG_FILE})
|
||||
else()
|
||||
# Configure the file
|
||||
CONFIGURE_FILE(
|
||||
${PACKAGE_SOURCE_DIR}/cmake/${PACKAGE_NAME_CONFIG_FILE}.in
|
||||
${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME_CONFIG_FILE}
|
||||
)
|
||||
endif()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(ADD_INTERFACE_LIBRARY LIB_NAME)
|
||||
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/dummy.cpp "")
|
||||
ADD_LIBRARY(${LIB_NAME} STATIC ${CMAKE_CURRENT_BINARY_DIR}/dummy.cpp)
|
||||
SET_TARGET_PROPERTIES(${LIB_NAME} PROPERTIES INTERFACE TRUE)
|
||||
ENDMACRO()
|
||||
|
||||
IF(NOT TARGET check)
|
||||
ADD_CUSTOM_TARGET(check COMMAND ${CMAKE_CTEST_COMMAND} -VV -C ${CMAKE_CFG_INTDIR})
|
||||
ENDIF()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_TEST)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
CMAKE_PARSE_ARGUMENTS(TEST
|
||||
""
|
||||
"EXE;NAME"
|
||||
""
|
||||
${ARGN})
|
||||
IF(TEST_EXE)
|
||||
SET(EXE_ROOT ${TEST_EXE})
|
||||
ELSE()
|
||||
SET(EXE_ROOT ${TEST_NAME})
|
||||
ENDIF()
|
||||
|
||||
TRIBITS_ADD_TEST(
|
||||
${EXE_ROOT}
|
||||
NAME ${TEST_NAME}
|
||||
${ARGN}
|
||||
COMM serial mpi
|
||||
NUM_MPI_PROCS 1
|
||||
${TEST_UNPARSED_ARGUMENTS}
|
||||
)
|
||||
else()
|
||||
CMAKE_PARSE_ARGUMENTS(TEST
|
||||
"WILL_FAIL"
|
||||
"FAIL_REGULAR_EXPRESSION;PASS_REGULAR_EXPRESSION;EXE;NAME"
|
||||
"CATEGORIES;CMD_ARGS"
|
||||
${ARGN})
|
||||
IF(TEST_EXE)
|
||||
SET(EXE ${TEST_EXE})
|
||||
ELSE()
|
||||
SET(EXE ${TEST_NAME})
|
||||
ENDIF()
|
||||
IF(WIN32)
|
||||
ADD_TEST(NAME ${TEST_NAME} WORKING_DIRECTORY ${LIBRARY_OUTPUT_PATH} COMMAND ${EXE}${CMAKE_EXECUTABLE_SUFFIX} ${TEST_CMD_ARGS})
|
||||
ELSE()
|
||||
ADD_TEST(NAME ${TEST_NAME} COMMAND ${EXE} ${TEST_CMD_ARGS})
|
||||
ENDIF()
|
||||
IF(TEST_WILL_FAIL)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES WILL_FAIL ${TEST_WILL_FAIL})
|
||||
ENDIF()
|
||||
IF(TEST_FAIL_REGULAR_EXPRESSION)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES FAIL_REGULAR_EXPRESSION ${TEST_FAIL_REGULAR_EXPRESSION})
|
||||
ENDIF()
|
||||
IF(TEST_PASS_REGULAR_EXPRESSION)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES PASS_REGULAR_EXPRESSION ${TEST_PASS_REGULAR_EXPRESSION})
|
||||
ENDIF()
|
||||
VERIFY_EMPTY(KOKKOS_ADD_TEST ${TEST_UNPARSED_ARGUMENTS})
|
||||
endif()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_ADVANCED_TEST)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_ADVANCED_TEST(${ARGN})
|
||||
else()
|
||||
# TODO Write this
|
||||
endif()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(KOKKOS_CREATE_IMPORTED_TPL_LIBRARY TPL_NAME)
|
||||
ADD_INTERFACE_LIBRARY(TPL_LIB_${TPL_NAME})
|
||||
TARGET_LINK_LIBRARIES(TPL_LIB_${TPL_NAME} LINK_PUBLIC ${TPL_${TPL_NAME}_LIBRARIES})
|
||||
TARGET_INCLUDE_DIRECTORIES(TPL_LIB_${TPL_NAME} INTERFACE ${TPL_${TPL_NAME}_INCLUDE_DIRS})
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(KOKKOS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES TPL_NAME)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES(${TPL_NAME} ${ARGN})
|
||||
else()
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
""
|
||||
""
|
||||
"REQUIRED_HEADERS;REQUIRED_LIBS_NAMES"
|
||||
${ARGN})
|
||||
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS TRUE)
|
||||
IF (PARSE_REQUIRED_LIBS_NAMES)
|
||||
FIND_LIBRARY(TPL_${TPL_NAME}_LIBRARIES NAMES ${PARSE_REQUIRED_LIBS_NAMES})
|
||||
IF(NOT TPL_${TPL_NAME}_LIBRARIES)
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS FALSE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
IF (PARSE_REQUIRED_HEADERS)
|
||||
FIND_PATH(TPL_${TPL_NAME}_INCLUDE_DIRS NAMES ${PARSE_REQUIRED_HEADERS})
|
||||
IF(NOT TPL_${TPL_NAME}_INCLUDE_DIRS)
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS FALSE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
IF (_${TPL_NAME}_ENABLE_SUCCESS)
|
||||
KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(${TPL_NAME})
|
||||
ENDIF()
|
||||
VERIFY_EMPTY(KOKKOS_CREATE_IMPORTED_TPL_LIBRARY ${PARSE_UNPARSED_ARGUMENTS})
|
||||
endif()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(KOKKOS_TARGET_COMPILE_OPTIONS TARGET)
|
||||
if(KOKKOS_HAS_TRILINOS)
|
||||
TARGET_COMPILE_OPTIONS(${TARGET} ${ARGN})
|
||||
else()
|
||||
TARGET_COMPILE_OPTIONS(${TARGET} ${ARGN})
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
|
||||
MACRO(KOKKOS_EXCLUDE_AUTOTOOLS_FILES)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_EXCLUDE_AUTOTOOLS_FILES()
|
||||
else()
|
||||
#do nothing
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(KOKKOS_LIB_TYPE LIB RET)
|
||||
GET_TARGET_PROPERTY(PROP ${LIB} TYPE)
|
||||
IF (${PROP} STREQUAL "INTERFACE_LIBRARY")
|
||||
SET(${RET} "INTERFACE" PARENT_SCOPE)
|
||||
ELSE()
|
||||
SET(${RET} "PUBLIC" PARENT_SCOPE)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_TARGET_INCLUDE_DIRECTORIES TARGET)
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
||||
#don't trust tribits to do this correctly - but need to add package name
|
||||
TARGET_INCLUDE_DIRECTORIES(${TARGET} ${INCTYPE} ${ARGN})
|
||||
ELSEIF(TARGET ${TARGET})
|
||||
#the target actually exists - this means we are doing separate libs
|
||||
#or this is a test library
|
||||
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
||||
TARGET_INCLUDE_DIRECTORIES(${TARGET} ${INCTYPE} ${ARGN})
|
||||
ELSE()
|
||||
GET_PROPERTY(LIBS GLOBAL PROPERTY KOKKOS_LIBRARIES_NAMES)
|
||||
IF (${TARGET} IN_LIST LIBS)
|
||||
SET_PROPERTY(GLOBAL APPEND PROPERTY KOKKOS_LIBRARY_INCLUDES ${ARGN})
|
||||
ELSE()
|
||||
MESSAGE(FATAL_ERROR "Trying to set include directories on unknown target ${TARGET}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_LINK_INTERNAL_LIBRARY TARGET DEPLIB)
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
#do nothing
|
||||
ELSE()
|
||||
SET(options INTERFACE)
|
||||
SET(oneValueArgs)
|
||||
SET(multiValueArgs)
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
"INTERFACE"
|
||||
""
|
||||
""
|
||||
${ARGN})
|
||||
SET(LINK_TYPE)
|
||||
IF(PARSE_INTERFACE)
|
||||
SET(LINK_TYPE INTERFACE)
|
||||
ELSE()
|
||||
SET(LINK_TYPE PUBLIC)
|
||||
ENDIF()
|
||||
TARGET_LINK_LIBRARIES(${TARGET} ${LINK_TYPE} ${DEPLIB})
|
||||
VERIFY_EMPTY(KOKKOS_LINK_INTERNAL_LIBRARY ${PARSE_UNPARSED_ARGUMENTS})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_TEST_LIBRARY NAME)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_LIBRARY(${NAME} ${ARGN} TESTONLY
|
||||
ADDED_LIB_TARGET_NAME_OUT ${NAME}
|
||||
)
|
||||
ELSE()
|
||||
SET(oneValueArgs)
|
||||
SET(multiValueArgs HEADERS SOURCES)
|
||||
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
"STATIC;SHARED"
|
||||
""
|
||||
"HEADERS;SOURCES"
|
||||
${ARGN})
|
||||
|
||||
IF(PARSE_HEADERS)
|
||||
LIST(REMOVE_DUPLICATES PARSE_HEADERS)
|
||||
ENDIF()
|
||||
IF(PARSE_SOURCES)
|
||||
LIST(REMOVE_DUPLICATES PARSE_SOURCES)
|
||||
ENDIF()
|
||||
ADD_LIBRARY(${NAME} ${PARSE_SOURCES})
|
||||
target_link_libraries(
|
||||
${NAME}
|
||||
PUBLIC kokkos
|
||||
)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
|
||||
FUNCTION(KOKKOS_TARGET_COMPILE_DEFINITIONS TARGET)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TARGET_COMPILE_DEFINITIONS(${TARGET} ${ARGN})
|
||||
ELSE()
|
||||
TARGET_COMPILE_DEFINITIONS(${TARGET} ${ARGN})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_INCLUDE_DIRECTORIES)
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_INCLUDE_DIRECTORIES(${ARGN})
|
||||
ELSE()
|
||||
CMAKE_PARSE_ARGUMENTS(
|
||||
INC
|
||||
"REQUIRED_DURING_INSTALLATION_TESTING"
|
||||
""
|
||||
""
|
||||
${ARGN}
|
||||
)
|
||||
INCLUDE_DIRECTORIES(${INC_UNPARSED_ARGUMENTS})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
|
||||
MACRO(KOKKOS_ADD_COMPILE_OPTIONS)
|
||||
ADD_COMPILE_OPTIONS(${ARGN})
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(PRINTALL match)
|
||||
get_cmake_property(_variableNames VARIABLES)
|
||||
list (SORT _variableNames)
|
||||
foreach (_variableName ${_variableNames})
|
||||
if("${_variableName}" MATCHES "${match}")
|
||||
message(STATUS "${_variableName}=${${_variableName}}")
|
||||
endif()
|
||||
endforeach()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(SET_GLOBAL_REPLACE SUBSTR VARNAME)
|
||||
STRING(REPLACE ${SUBSTR} ${${VARNAME}} TEMP)
|
||||
GLOBAL_SET(${VARNAME} ${TEMP})
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(GLOBAL_APPEND VARNAME)
|
||||
#We make this a function since we are setting variables
|
||||
#and want to use scope to avoid overwriting local variables
|
||||
SET(TEMP ${${VARNAME}})
|
||||
LIST(APPEND TEMP ${ARGN})
|
||||
GLOBAL_SET(${VARNAME} ${TEMP})
|
||||
ENDFUNCTION()
|
||||
|
||||
23
lib/kokkos/cmake/gnu.cmake
Normal file
@ -0,0 +1,23 @@
|
||||
|
||||
FUNCTION(kokkos_set_gnu_flags full_standard int_standard)
|
||||
STRING(TOLOWER ${full_standard} FULL_LC_STANDARD)
|
||||
STRING(TOLOWER ${int_standard} INT_LC_STANDARD)
|
||||
# The following three blocks of code were copied from
|
||||
# /Modules/Compiler/Intel-CXX.cmake from CMake 3.7.2 and then modified.
|
||||
IF(CMAKE_CXX_SIMULATE_ID STREQUAL MSVC)
|
||||
SET(_std -Qstd)
|
||||
SET(_ext c++)
|
||||
ELSE()
|
||||
SET(_std -std)
|
||||
SET(_ext gnu++)
|
||||
ENDIF()
|
||||
|
||||
IF (CMAKE_CXX_EXTENSIONS)
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "-std=gnu++${FULL_LC_STANDARD}" PARENT_SCOPE)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG "-std=gnu++${INT_LC_STANDARD}" PARENT_SCOPE)
|
||||
ELSE()
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "-std=c++${FULL_LC_STANDARD}" PARENT_SCOPE)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG "-std=c++${INT_LC_STANDARD}" PARENT_SCOPE)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
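# Usage sketch (illustrative; the argument values below are assumed, not taken
# from a call site in this commit):
#
#   kokkos_set_gnu_flags(14 1Y)
#
# With CMAKE_CXX_EXTENSIONS off this would set, in the caller's scope,
#   KOKKOS_CXX_STANDARD_FLAG              = -std=c++14
#   KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG = -std=c++1y
# and the -std=gnu++14 / -std=gnu++1y equivalents when CMAKE_CXX_EXTENSIONS is on.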
|
||||
|
||||
30
lib/kokkos/cmake/intel.cmake
Normal file
@ -0,0 +1,30 @@
|
||||
|
||||
FUNCTION(kokkos_set_intel_flags full_standard int_standard)
|
||||
STRING(TOLOWER ${full_standard} FULL_LC_STANDARD)
|
||||
STRING(TOLOWER ${int_standard} INT_LC_STANDARD)
|
||||
# The following three blocks of code were copied from
|
||||
# /Modules/Compiler/Intel-CXX.cmake from CMake 3.7.2 and then modified.
|
||||
IF(CMAKE_CXX_SIMULATE_ID STREQUAL MSVC)
|
||||
SET(_std -Qstd)
|
||||
SET(_ext c++)
|
||||
ELSE()
|
||||
SET(_std -std)
|
||||
SET(_ext gnu++)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT KOKKOS_CXX_STANDARD STREQUAL 11 AND NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 15.0.2)
|
||||
#There is no gnu++14 value supported; figure out what to do.
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "${_std}=c++${FULL_LC_STANDARD}" PARENT_SCOPE)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG "${_std}=c++${INT_LC_STANDARD}" PARENT_SCOPE)
|
||||
ELSEIF(KOKKOS_CXX_STANDARD STREQUAL 11 AND NOT CMAKE_CXX_COMPILER_VERSION VERSION_LESS 13.0)
|
||||
IF (CMAKE_CXX_EXTENSIONS)
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "${_std}=${_ext}11" PARENT_SCOPE)
|
||||
ELSE()
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "${_std}=c++11" PARENT_SCOPE)
|
||||
ENDIF()
|
||||
ELSE()
|
||||
MESSAGE(FATAL_ERROR "Intel compiler version too low - need 13.0 for C++11 and 15.0 for C++14")
|
||||
ENDIF()
|
||||
|
||||
ENDFUNCTION()
|
||||
|
||||
438
lib/kokkos/cmake/kokkos_arch.cmake
Normal file
@ -0,0 +1,438 @@
|
||||
|
||||
FUNCTION(KOKKOS_ARCH_OPTION SUFFIX DEV_TYPE DESCRIPTION)
|
||||
#all optimizations off by default
|
||||
KOKKOS_OPTION(ARCH_${SUFFIX} OFF BOOL "Optimize for ${DESCRIPTION} (${DEV_TYPE})")
|
||||
IF (KOKKOS_ARCH_${SUFFIX})
|
||||
LIST(APPEND KOKKOS_ENABLED_ARCH_LIST ${SUFFIX})
|
||||
SET(KOKKOS_ENABLED_ARCH_LIST ${KOKKOS_ENABLED_ARCH_LIST} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
SET(KOKKOS_ARCH_${SUFFIX} ${KOKKOS_ARCH_${SUFFIX}} PARENT_SCOPE)
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(ARCH_FLAGS)
|
||||
SET(COMPILERS NVIDIA PGI XL DEFAULT Cray Intel Clang AppleClang GNU)
|
||||
CMAKE_PARSE_ARGUMENTS(
|
||||
PARSE
|
||||
"LINK_ONLY;COMPILE_ONLY"
|
||||
""
|
||||
"${COMPILERS}"
|
||||
${ARGN})
|
||||
|
||||
SET(COMPILER ${KOKKOS_CXX_COMPILER_ID})
|
||||
|
||||
SET(FLAGS)
|
||||
SET(NEW_COMPILE_OPTIONS)
|
||||
SET(NEW_XCOMPILER_OPTIONS)
|
||||
SET(NEW_LINK_OPTIONS)
|
||||
LIST(APPEND NEW_XCOMPILER_OPTIONS ${KOKKOS_XCOMPILER_OPTIONS})
|
||||
LIST(APPEND NEW_COMPILE_OPTIONS ${KOKKOS_COMPILE_OPTIONS})
|
||||
LIST(APPEND NEW_LINK_OPTIONS ${KOKKOS_LINK_OPTIONS})
|
||||
FOREACH(COMP ${COMPILERS})
|
||||
IF (COMPILER STREQUAL "${COMP}")
|
||||
IF (PARSE_${COMPILER})
|
||||
IF (NOT "${PARSE_${COMPILER}}" STREQUAL "NO-VALUE-SPECIFIED")
|
||||
SET(FLAGS ${PARSE_${COMPILER}})
|
||||
ENDIF()
|
||||
ELSEIF(PARSE_DEFAULT)
|
||||
SET(FLAGS ${PARSE_DEFAULT})
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
|
||||
IF (NOT PARSE_LINK_ONLY)
|
||||
# The funky logic here is for future handling of argument deduplication
|
||||
# If we naively pass multiple -Xcompiler flags to target_compile_options
|
||||
# -Xcompiler will get deduplicated and break the build
|
||||
IF ("-Xcompiler" IN_LIST FLAGS)
|
||||
LIST(REMOVE_ITEM FLAGS "-Xcompiler")
|
||||
GLOBAL_APPEND(KOKKOS_XCOMPILER_OPTIONS ${FLAGS})
|
||||
ELSE()
|
||||
GLOBAL_APPEND(KOKKOS_COMPILE_OPTIONS ${FLAGS})
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF (NOT PARSE_COMPILE_ONLY)
|
||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS ${FLAGS})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
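# Usage sketch (illustrative, not part of the upstream file). Each keyword is a
# compiler ID; the flags after the keyword matching KOKKOS_CXX_COMPILER_ID are
# used, NO-VALUE-SPECIFIED means "add nothing for this compiler", and DEFAULT
# applies to compilers from the COMPILERS list that have no entry of their own:
#
#   ARCH_FLAGS(
#     Intel   -xCORE-AVX2
#     Cray    NO-VALUE-SPECIFIED
#     DEFAULT -march=core-avx2 -mtune=core-avx2
#   )
#
# A compiler ID that is not in the COMPILERS list receives no flags at all,
# not even the DEFAULT ones.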
|
||||
|
||||
# Make sure devices and compiler ID are done
|
||||
KOKKOS_CFG_DEPENDS(ARCH COMPILER_ID)
|
||||
KOKKOS_CFG_DEPENDS(ARCH DEVICES)
|
||||
KOKKOS_CFG_DEPENDS(ARCH OPTIONS)
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
# List of possible host architectures.
|
||||
#-------------------------------------------------------------------------------
|
||||
SET(KOKKOS_ARCH_LIST)
|
||||
|
||||
|
||||
KOKKOS_DEPRECATED_LIST(ARCH ARCH)
|
||||
KOKKOS_ARCH_OPTION(AMDAVX HOST "AMD chip")
|
||||
KOKKOS_ARCH_OPTION(ARMV80 HOST "ARMv8.0 Compatible CPU")
|
||||
KOKKOS_ARCH_OPTION(ARMV81 HOST "ARMv8.1 Compatible CPU")
|
||||
KOKKOS_ARCH_OPTION(ARMV8_THUNDERX HOST "ARMv8 Cavium ThunderX CPU")
|
||||
KOKKOS_ARCH_OPTION(ARMV8_THUNDERX2 HOST "ARMv8 Cavium ThunderX2 CPU")
|
||||
KOKKOS_ARCH_OPTION(WSM HOST "Intel Westmere CPU")
|
||||
KOKKOS_ARCH_OPTION(SNB HOST "Intel Sandy/Ivy Bridge CPUs")
|
||||
KOKKOS_ARCH_OPTION(HSW HOST "Intel Haswell CPUs")
|
||||
KOKKOS_ARCH_OPTION(BDW HOST "Intel Broadwell Xeon E-class CPUs")
|
||||
KOKKOS_ARCH_OPTION(SKX HOST "Intel Sky Lake Xeon E-class HPC CPUs (AVX512)")
|
||||
KOKKOS_ARCH_OPTION(KNC HOST "Intel Knights Corner Xeon Phi")
|
||||
KOKKOS_ARCH_OPTION(KNL HOST "Intel Knights Landing Xeon Phi")
|
||||
KOKKOS_ARCH_OPTION(BGQ HOST "IBM Blue Gene Q")
|
||||
KOKKOS_ARCH_OPTION(POWER7 HOST "IBM POWER7 CPUs")
|
||||
KOKKOS_ARCH_OPTION(POWER8 HOST "IBM POWER8 CPUs")
|
||||
KOKKOS_ARCH_OPTION(POWER9 HOST "IBM POWER9 CPUs")
|
||||
KOKKOS_ARCH_OPTION(KEPLER30 GPU "NVIDIA Kepler generation CC 3.0")
|
||||
KOKKOS_ARCH_OPTION(KEPLER32 GPU "NVIDIA Kepler generation CC 3.2")
|
||||
KOKKOS_ARCH_OPTION(KEPLER35 GPU "NVIDIA Kepler generation CC 3.5")
|
||||
KOKKOS_ARCH_OPTION(KEPLER37 GPU "NVIDIA Kepler generation CC 3.7")
|
||||
KOKKOS_ARCH_OPTION(MAXWELL50 GPU "NVIDIA Maxwell generation CC 5.0")
|
||||
KOKKOS_ARCH_OPTION(MAXWELL52 GPU "NVIDIA Maxwell generation CC 5.2")
|
||||
KOKKOS_ARCH_OPTION(MAXWELL53 GPU "NVIDIA Maxwell generation CC 5.3")
|
||||
KOKKOS_ARCH_OPTION(PASCAL60 GPU "NVIDIA Pascal generation CC 6.0")
|
||||
KOKKOS_ARCH_OPTION(PASCAL61 GPU "NVIDIA Pascal generation CC 6.1")
|
||||
KOKKOS_ARCH_OPTION(VOLTA70 GPU "NVIDIA Volta generation CC 7.0")
|
||||
KOKKOS_ARCH_OPTION(VOLTA72 GPU "NVIDIA Volta generation CC 7.2")
|
||||
KOKKOS_ARCH_OPTION(TURING75 GPU "NVIDIA Turing generation CC 7.5")
|
||||
KOKKOS_ARCH_OPTION(EPYC HOST "AMD Epyc architecture")
|
||||
|
||||
|
||||
IF (KOKKOS_ENABLE_CUDA)
|
||||
#Regardless of version, make sure we define the general architecture name
|
||||
IF (KOKKOS_ARCH_KEPLER30 OR KOKKOS_ARCH_KEPLER32 OR KOKKOS_ARCH_KEPLER35 OR KOKKOS_ARCH_KEPLER37)
|
||||
SET(KOKKOS_ARCH_KEPLER ON)
|
||||
ENDIF()
|
||||
|
||||
#Regardless of version, make sure we define the general architecture name
|
||||
IF (KOKKOS_ARCH_MAXWELL50 OR KOKKOS_ARCH_MAXWELL52 OR KOKKOS_ARCH_MAXWELL53)
|
||||
SET(KOKKOS_ARCH_MAXWELL ON)
|
||||
ENDIF()
|
||||
|
||||
#Regardless of version, make sure we define the general architecture name
|
||||
IF (KOKKOS_ARCH_PASCAL60 OR KOKKOS_ARCH_PASCAL61)
|
||||
SET(KOKKOS_ARCH_PASCAL ON)
|
||||
ENDIF()
|
||||
|
||||
#Regardless of version, make sure we define the general architecture name
|
||||
IF (KOKKOS_ARCH_VOLTA70 OR KOKKOS_ARCH_VOLTA72)
|
||||
SET(KOKKOS_ARCH_VOLTA ON)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
|
||||
|
||||
IF(KOKKOS_ENABLE_COMPILER_WARNINGS)
|
||||
SET(COMMON_WARNINGS
|
||||
"-Wall" "-Wshadow" "-pedantic"
|
||||
"-Wsign-compare" "-Wtype-limits" "-Wuninitialized")
|
||||
|
||||
SET(GNU_WARNINGS "-Wempty-body" "-Wclobbered" "-Wignored-qualifiers"
|
||||
${COMMON_WARNINGS})
|
||||
|
||||
ARCH_FLAGS(
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
GNU ${GNU_WARNINGS}
|
||||
DEFAULT ${COMMON_WARNINGS}
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
|
||||
#------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
|
||||
GLOBAL_RESET(KOKKOS_CUDA_OPTIONS)
|
||||
# Construct the Makefile options
|
||||
IF (KOKKOS_ENABLE_CUDA_LAMBDA)
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "-expt-extended-lambda")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ENABLE_CUDA_CONSTEXPR)
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "-expt-relaxed-constexpr")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
||||
SET(CUDA_ARCH_FLAG "--cuda-gpu-arch")
|
||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS -x cuda)
|
||||
IF (KOKKOS_ENABLE_CUDA)
|
||||
SET(KOKKOS_IMPL_CUDA_CLANG_WORKAROUND ON CACHE BOOL "enable CUDA Clang workarounds" FORCE)
|
||||
ENDIF()
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
SET(CUDA_ARCH_FLAG "-arch")
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
STRING(TOUPPER "${CMAKE_BUILD_TYPE}" _UPPERCASE_CMAKE_BUILD_TYPE)
|
||||
IF (KOKKOS_ENABLE_DEBUG OR _UPPERCASE_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
|
||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS -lineinfo)
|
||||
ENDIF()
|
||||
UNSET(_UPPERCASE_CMAKE_BUILD_TYPE)
|
||||
IF (KOKKOS_CXX_COMPILER_VERSION VERSION_GREATER 9.0 OR KOKKOS_CXX_COMPILER_VERSION VERSION_EQUAL 9.0)
|
||||
GLOBAL_APPEND(KOKKOS_CUDAFE_OPTIONS --diag_suppress=esa_on_defaulted_function_ignored)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF(KOKKOS_ENABLE_OPENMP)
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL AppleClang)
|
||||
MESSAGE(FATAL_ERROR "Apple Clang does not support OpenMP. Use native Clang instead")
|
||||
ENDIF()
|
||||
ARCH_FLAGS(
|
||||
Clang -fopenmp=libomp
|
||||
PGI -mp
|
||||
NVIDIA -Xcompiler -fopenmp
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
XL -qsmp=omp
|
||||
DEFAULT -fopenmp
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_ARMV80)
|
||||
ARCH_FLAGS(
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=armv8-a
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_ARMV81)
|
||||
ARCH_FLAGS(
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=armv8.1-a
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_ARMV8_THUNDERX)
|
||||
SET(KOKKOS_ARCH_ARMV80 ON) #Not a cache variable
|
||||
ARCH_FLAGS(
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=armv8-a -mtune=thunderx
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_ARMV8_THUNDERX2)
|
||||
SET(KOKKOS_ARCH_ARMV81 ON) #Not a cache variable
|
||||
ARCH_FLAGS(
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
DEFAULT -mcpu=thunderx2t99 -mtune=thunderx2t99
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_EPYC)
|
||||
ARCH_FLAGS(
|
||||
Intel -mavx2
|
||||
DEFAULT -march=znver1 -mtune=znver1
|
||||
)
|
||||
SET(KOKKOS_ARCH_AMD_EPYC ON)
|
||||
SET(KOKKOS_ARCH_AMD_AVX2 ON)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_WSM)
|
||||
ARCH_FLAGS(
|
||||
Intel -xSSE4.2
|
||||
PGI -tp=nehalem
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -msse4.2
|
||||
)
|
||||
SET(KOKKOS_ARCH_SSE42 ON)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_SNB OR KOKKOS_ARCH_AMDAVX)
|
||||
SET(KOKKOS_ARCH_AVX ON)
|
||||
ARCH_FLAGS(
|
||||
Intel -mavx
|
||||
PGI -tp=sandybridge
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -mavx
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_HSW)
|
||||
SET(KOKKOS_ARCH_AVX2 ON)
|
||||
ARCH_FLAGS(
|
||||
Intel -xCORE-AVX2
|
||||
PGI -tp=haswell
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=core-avx2 -mtune=core-avx2
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_BDW)
|
||||
SET(KOKKOS_ARCH_AVX2 ON)
|
||||
ARCH_FLAGS(
|
||||
Intel -xCORE-AVX2
|
||||
PGI -tp=haswell
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=core-avx2 -mtune=core-avx2 -mrtm
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_EPYC)
|
||||
SET(KOKKOS_ARCH_AMD_AVX2 ON)
|
||||
ARCH_FLAGS(
|
||||
Intel -mavx2
|
||||
DEFAULT -march=znver1 -mtune=znver1
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_KNL)
|
||||
#avx512-mic
|
||||
SET(KOKKOS_ARCH_AVX512MIC ON) #not a cache variable
|
||||
ARCH_FLAGS(
|
||||
Intel -xMIC-AVX512
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=knl -mtune=knl
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_KNC)
|
||||
SET(KOKKOS_USE_ISA_KNC ON)
|
||||
ARCH_FLAGS(
|
||||
DEFAULT -mmic
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_SKX)
|
||||
#avx512-xeon
|
||||
SET(KOKKOS_ARCH_AVX512XEON ON)
|
||||
ARCH_FLAGS(
|
||||
Intel -xCORE-AVX512
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
Cray NO-VALUE-SPECIFIED
|
||||
DEFAULT -march=skylake-avx512 -mtune=skylake-avx512 -mrtm
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_WSM OR KOKKOS_ARCH_SNB OR KOKKOS_ARCH_HSW OR KOKKOS_ARCH_BDW OR KOKKOS_ARCH_KNL OR KOKKOS_ARCH_SKX OR KOKKOS_ARCH_EPYC)
|
||||
SET(KOKKOS_USE_ISA_X86_64 ON)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_BDW OR KOKKOS_ARCH_SKX)
|
||||
SET(KOKKOS_ENABLE_TM ON) #not a cache variable
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_POWER7)
|
||||
ARCH_FLAGS(
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
DEFAULT -mcpu=power7 -mtune=power7
|
||||
)
|
||||
SET(KOKKOS_USE_ISA_POWERPCBE ON)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_POWER8)
|
||||
ARCH_FLAGS(
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
NVIDIA NO-VALUE-SPECIFIED
|
||||
DEFAULT -mcpu=power8 -mtune=power8
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_POWER9)
|
||||
ARCH_FLAGS(
|
||||
PGI NO-VALUE-SPECIFIED
|
||||
NVIDIA NO-VALUE-SPECIFIED
|
||||
DEFAULT -mcpu=power9 -mtune=power9
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_ARCH_POWER8 OR KOKKOS_ARCH_POWER9)
|
||||
SET(KOKKOS_USE_ISA_POWERPCLE ON)
|
||||
ENDIF()
|
||||
|
||||
IF (Kokkos_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
|
||||
ARCH_FLAGS(
|
||||
Clang -fcuda-rdc
|
||||
NVIDIA --relocatable-device-code=true
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
|
||||
SET(CUDA_ARCH_ALREADY_SPECIFIED "")
|
||||
FUNCTION(CHECK_CUDA_ARCH ARCH FLAG)
|
||||
IF(KOKKOS_ARCH_${ARCH})
|
||||
IF(CUDA_ARCH_ALREADY_SPECIFIED)
|
||||
MESSAGE(FATAL_ERROR "Multiple GPU architectures given! Already have ${CUDA_ARCH_ALREADY_SPECIFIED}, but trying to add ${ARCH}. If you are re-running CMake, try clearing the cache and running again.")
|
||||
ENDIF()
|
||||
SET(CUDA_ARCH_ALREADY_SPECIFIED ${ARCH} PARENT_SCOPE)
|
||||
IF (NOT KOKKOS_ENABLE_CUDA)
|
||||
MESSAGE(WARNING "Given CUDA arch ${ARCH}, but Kokkos_ENABLE_CUDA is OFF. Option will be ignored.")
|
||||
UNSET(KOKKOS_ARCH_${ARCH} PARENT_SCOPE)
|
||||
ELSE()
|
||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
||||
IF(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
|
||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
|
||||
CHECK_CUDA_ARCH(KEPLER30 sm_30)
|
||||
CHECK_CUDA_ARCH(KEPLER32 sm_32)
|
||||
CHECK_CUDA_ARCH(KEPLER35 sm_35)
|
||||
CHECK_CUDA_ARCH(KEPLER37 sm_37)
|
||||
CHECK_CUDA_ARCH(MAXWELL50 sm_50)
|
||||
CHECK_CUDA_ARCH(MAXWELL52 sm_52)
|
||||
CHECK_CUDA_ARCH(MAXWELL53 sm_53)
|
||||
CHECK_CUDA_ARCH(PASCAL60 sm_60)
|
||||
CHECK_CUDA_ARCH(PASCAL61 sm_61)
|
||||
CHECK_CUDA_ARCH(VOLTA70 sm_70)
|
||||
CHECK_CUDA_ARCH(VOLTA72 sm_72)
|
||||
CHECK_CUDA_ARCH(TURING75 sm_75)
|
||||
|
||||
#CMake verbose is kind of pointless
|
||||
#Let's just always print things
|
||||
MESSAGE(STATUS "Execution Spaces:")
|
||||
IF(KOKKOS_ENABLE_CUDA)
|
||||
MESSAGE(STATUS " Device Parallel: CUDA")
|
||||
ELSE()
|
||||
MESSAGE(STATUS " Device Parallel: NONE")
|
||||
ENDIF()
|
||||
|
||||
FOREACH (_BACKEND OPENMP PTHREAD HPX)
|
||||
IF(KOKKOS_ENABLE_${_BACKEND})
|
||||
IF(_HOST_PARALLEL)
|
||||
MESSAGE(FATAL_ERROR "Multiple host parallel execution spaces are not allowed! "
|
||||
"Trying to enable execution space ${_BACKEND}, "
|
||||
"but execution space ${_HOST_PARALLEL} is already enabled. "
|
||||
"Remove the CMakeCache.txt file and re-configure.")
|
||||
ENDIF()
|
||||
SET(_HOST_PARALLEL ${_BACKEND})
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
|
||||
IF(NOT _HOST_PARALLEL AND NOT KOKKOS_ENABLE_SERIAL)
|
||||
MESSAGE(FATAL_ERROR "At least one host execution space must be enabled, "
|
||||
"but no host parallel execution space was requested "
|
||||
"and Kokkos_ENABLE_SERIAL=OFF.")
|
||||
ENDIF()
|
||||
|
||||
IF(NOT _HOST_PARALLEL)
|
||||
SET(_HOST_PARALLEL "NONE")
|
||||
ENDIF()
|
||||
MESSAGE(STATUS " Host Parallel: ${_HOST_PARALLEL}")
|
||||
UNSET(_HOST_PARALLEL)
|
||||
|
||||
IF(KOKKOS_ENABLE_PTHREAD)
|
||||
SET(KOKKOS_ENABLE_THREADS ON)
|
||||
ENDIF()
|
||||
|
||||
IF(KOKKOS_ENABLE_SERIAL)
|
||||
MESSAGE(STATUS " Host Serial: SERIAL")
|
||||
ELSE()
|
||||
MESSAGE(STATUS " Host Serial: NONE")
|
||||
ENDIF()
|
||||
|
||||
MESSAGE(STATUS "")
|
||||
MESSAGE(STATUS "Architectures:")
|
||||
FOREACH(Arch ${KOKKOS_ENABLED_ARCH_LIST})
|
||||
MESSAGE(STATUS " ${Arch}")
|
||||
ENDFOREACH()
|
||||
|
||||
@ -1,261 +0,0 @@
|
||||
############################ Detect if submodule ###############################
|
||||
#
|
||||
# With thanks to StackOverflow:
|
||||
# http://stackoverflow.com/questions/25199677/how-to-detect-if-current-scope-has-a-parent-in-cmake
|
||||
#
|
||||
get_directory_property(HAS_PARENT PARENT_DIRECTORY)
|
||||
if(HAS_PARENT)
|
||||
message(STATUS "Submodule build")
|
||||
SET(KOKKOS_HEADER_DIR "include/kokkos")
|
||||
else()
|
||||
message(STATUS "Standalone build")
|
||||
SET(KOKKOS_HEADER_DIR "include")
|
||||
endif()
|
||||
|
||||
################################ Handle the actual build #######################
|
||||
|
||||
SET(INSTALL_LIB_DIR lib CACHE PATH "Installation directory for libraries")
|
||||
SET(INSTALL_BIN_DIR bin CACHE PATH "Installation directory for executables")
|
||||
SET(INSTALL_INCLUDE_DIR ${KOKKOS_HEADER_DIR} CACHE PATH
|
||||
"Installation directory for header files")
|
||||
IF(WIN32 AND NOT CYGWIN)
|
||||
SET(DEF_INSTALL_CMAKE_DIR CMake)
|
||||
ELSE()
|
||||
SET(DEF_INSTALL_CMAKE_DIR lib/CMake/Kokkos)
|
||||
ENDIF()
|
||||
|
||||
SET(INSTALL_CMAKE_DIR ${DEF_INSTALL_CMAKE_DIR} CACHE PATH
|
||||
"Installation directory for CMake files")
|
||||
|
||||
# Make relative paths absolute (needed later on)
|
||||
FOREACH(p LIB BIN INCLUDE CMAKE)
|
||||
SET(var INSTALL_${p}_DIR)
|
||||
IF(NOT IS_ABSOLUTE "${${var}}")
|
||||
SET(${var} "${CMAKE_INSTALL_PREFIX}/${${var}}")
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
|
||||
# set up include-directories
|
||||
SET (Kokkos_INCLUDE_DIRS
|
||||
${Kokkos_SOURCE_DIR}/core/src
|
||||
${Kokkos_SOURCE_DIR}/containers/src
|
||||
${Kokkos_SOURCE_DIR}/algorithms/src
|
||||
${Kokkos_BINARY_DIR} # to find KokkosCore_config.h
|
||||
${KOKKOS_INCLUDE_DIRS}
|
||||
)
|
||||
|
||||
# pass include dirs back to parent scope
|
||||
if(HAS_PARENT)
|
||||
SET(Kokkos_INCLUDE_DIRS_RET ${Kokkos_INCLUDE_DIRS} PARENT_SCOPE)
|
||||
else()
|
||||
SET(Kokkos_INCLUDE_DIRS_RET ${Kokkos_INCLUDE_DIRS})
|
||||
endif()
|
||||
|
||||
INCLUDE_DIRECTORIES(${Kokkos_INCLUDE_DIRS})
|
||||
|
||||
IF(KOKKOS_SEPARATE_LIBS)
|
||||
# Sources come from makefile-generated kokkos_generated_settings.cmake file
|
||||
# Separate libs need to separate the sources
|
||||
set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})
|
||||
|
||||
# kokkoscore
|
||||
ADD_LIBRARY(
|
||||
kokkoscore
|
||||
${KOKKOS_CORE_SRCS}
|
||||
)
|
||||
|
||||
target_compile_options(
|
||||
kokkoscore
|
||||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
|
||||
)
|
||||
|
||||
target_include_directories(
|
||||
kokkoscore
|
||||
PUBLIC
|
||||
${KOKKOS_TPL_INCLUDE_DIRS}
|
||||
)
|
||||
|
||||
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
|
||||
if (("${lib}" STREQUAL "cuda") AND (NOT "${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang"))
|
||||
set(LIB_cuda "-lcuda")
|
||||
elseif ("${lib}" STREQUAL "hpx")
|
||||
find_package(HPX REQUIRED)
|
||||
if(${HPX_FOUND})
|
||||
target_link_libraries(kokkoscore PUBLIC ${HPX_LIBRARIES})
|
||||
target_link_libraries(kokkoscontainers PUBLIC ${HPX_LIBRARIES})
|
||||
target_link_libraries(kokkosalgorithms PUBLIC ${HPX_LIBRARIES})
|
||||
target_include_directories(kokkoscore PUBLIC ${HPX_INCLUDE_DIRS})
|
||||
target_include_directories(kokkoscontainers PUBLIC ${HPX_INCLUDE_DIRS})
|
||||
target_include_directories(kokkosalgorithms PUBLIC ${HPX_INCLUDE_DIRS})
|
||||
else()
|
||||
message(ERROR "HPX not found. Check the value of HPX_DIR (= ${HPX_DIR}) or CMAKE_PREFIX_PATH (= ${CMAKE_PREFIX_PATH}).")
|
||||
endif()
|
||||
else()
|
||||
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
|
||||
endif()
|
||||
target_link_libraries(kokkoscore PUBLIC ${LIB_${lib}})
|
||||
endforeach()
|
||||
|
||||
target_link_libraries(kokkoscore PUBLIC "${KOKKOS_LINK_FLAGS}")
|
||||
|
||||
# Install the kokkoscore library
|
||||
INSTALL (TARGETS kokkoscore
|
||||
EXPORT KokkosTargets
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
|
||||
)
|
||||
|
||||
# kokkoscontainers
|
||||
if (DEFINED KOKKOS_CONTAINERS_SRCS)
|
||||
ADD_LIBRARY(
|
||||
kokkoscontainers
|
||||
${KOKKOS_CONTAINERS_SRCS}
|
||||
)
|
||||
endif()
|
||||
|
||||
TARGET_LINK_LIBRARIES(
|
||||
kokkoscontainers
|
||||
kokkoscore
|
||||
)
|
||||
|
||||
# Install the kokkocontainers library
|
||||
INSTALL (TARGETS kokkoscontainers
|
||||
EXPORT KokkosTargets
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin)
|
||||
|
||||
# kokkosalgorithms - Build as interface library since no source files.
|
||||
ADD_LIBRARY(
|
||||
kokkosalgorithms
|
||||
INTERFACE
|
||||
)
|
||||
|
||||
target_include_directories(
|
||||
kokkosalgorithms
|
||||
INTERFACE ${Kokkos_SOURCE_DIR}/algorithms/src
|
||||
)
|
||||
|
||||
TARGET_LINK_LIBRARIES(
|
||||
kokkosalgorithms
|
||||
INTERFACE kokkoscore
|
||||
)
|
||||
|
||||
# Install the kokkoalgorithms library
|
||||
INSTALL (TARGETS kokkosalgorithms
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin)
|
||||
|
||||
SET (Kokkos_LIBRARIES_NAMES kokkoscore kokkoscontainers kokkosalgorithms)
|
||||
|
||||
ELSE()
|
||||
# kokkos
|
||||
ADD_LIBRARY(
|
||||
kokkos
|
||||
${KOKKOS_CORE_SRCS}
|
||||
${KOKKOS_CONTAINERS_SRCS}
|
||||
)
|
||||
|
||||
target_compile_options(
|
||||
kokkos
|
||||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
|
||||
)
|
||||
|
||||
target_include_directories(
|
||||
kokkos
|
||||
PUBLIC
|
||||
${KOKKOS_TPL_INCLUDE_DIRS}
|
||||
)
|
||||
|
||||
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
|
||||
if (("${lib}" STREQUAL "cuda") AND (NOT "${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang"))
|
||||
set(LIB_cuda "-lcuda")
|
||||
elseif ("${lib}" STREQUAL "hpx")
|
||||
find_package(HPX REQUIRED)
|
||||
if(${HPX_FOUND})
|
||||
target_link_libraries(kokkos PUBLIC ${HPX_LIBRARIES})
|
||||
target_include_directories(kokkos PUBLIC ${HPX_INCLUDE_DIRS})
|
||||
else()
|
||||
message(ERROR "HPX not found. Check the value of HPX_DIR (= ${HPX_DIR}) or CMAKE_PREFIX_PATH (= ${CMAKE_PREFIX_PATH}).")
|
||||
endif()
|
||||
else()
|
||||
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
|
||||
endif()
|
||||
target_link_libraries(kokkos PUBLIC ${LIB_${lib}})
|
||||
endforeach()
|
||||
|
||||
target_link_libraries(kokkos PUBLIC "${KOKKOS_LINK_FLAGS}")
|
||||
|
||||
# Install the kokkos library
|
||||
INSTALL (TARGETS kokkos
|
||||
EXPORT KokkosTargets
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_PREFIX}/lib
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin)
|
||||
|
||||
|
||||
SET (Kokkos_LIBRARIES_NAMES kokkos)
|
||||
|
||||
endif() # KOKKOS_SEPARATE_LIBS
|
||||
|
||||
# Install the kokkos headers
|
||||
INSTALL (DIRECTORY
|
||||
EXPORT KokkosTargets
|
||||
${Kokkos_SOURCE_DIR}/core/src/
|
||||
DESTINATION ${KOKKOS_HEADER_DIR}
|
||||
FILES_MATCHING PATTERN "*.hpp"
|
||||
)
|
||||
INSTALL (DIRECTORY
|
||||
EXPORT KokkosTargets
|
||||
${Kokkos_SOURCE_DIR}/containers/src/
|
||||
DESTINATION ${KOKKOS_HEADER_DIR}
|
||||
FILES_MATCHING PATTERN "*.hpp"
|
||||
)
|
||||
INSTALL (DIRECTORY
|
||||
EXPORT KokkosTargets
|
||||
${Kokkos_SOURCE_DIR}/algorithms/src/
|
||||
DESTINATION ${KOKKOS_HEADER_DIR}
|
||||
FILES_MATCHING PATTERN "*.hpp"
|
||||
)
|
||||
|
||||
INSTALL (FILES
|
||||
${Kokkos_BINARY_DIR}/KokkosCore_config.h
|
||||
DESTINATION ${KOKKOS_HEADER_DIR}
|
||||
)
|
||||
|
||||
# Add all targets to the build-tree export set
|
||||
export(TARGETS ${Kokkos_LIBRARIES_NAMES}
|
||||
FILE "${Kokkos_BINARY_DIR}/KokkosTargets.cmake")
|
||||
|
||||
# Export the package for use from the build-tree
|
||||
# (this registers the build-tree with a global CMake-registry)
|
||||
export(PACKAGE Kokkos)
|
||||
|
||||
# Create the KokkosConfig.cmake and KokkosConfigVersion files
|
||||
file(RELATIVE_PATH REL_INCLUDE_DIR "${INSTALL_CMAKE_DIR}"
|
||||
"${INSTALL_INCLUDE_DIR}")
|
||||
# ... for the build tree
|
||||
set(CONF_INCLUDE_DIRS "${Kokkos_SOURCE_DIR}" "${Kokkos_BINARY_DIR}")
|
||||
configure_file(${Kokkos_SOURCE_DIR}/cmake/KokkosConfig.cmake.in
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfig.cmake" @ONLY)
|
||||
# ... for the install tree
|
||||
set(CONF_INCLUDE_DIRS "\${Kokkos_CMAKE_DIR}/${REL_INCLUDE_DIR}")
|
||||
configure_file(${Kokkos_SOURCE_DIR}/cmake/KokkosConfig.cmake.in
|
||||
"${Kokkos_BINARY_DIR}${CMAKE_FILES_DIRECTORY}/KokkosConfig.cmake" @ONLY)
|
||||
|
||||
# Install the KokkosConfig.cmake and KokkosConfigVersion.cmake
|
||||
install(FILES
|
||||
"${Kokkos_BINARY_DIR}${CMAKE_FILES_DIRECTORY}/KokkosConfig.cmake"
|
||||
DESTINATION "${INSTALL_CMAKE_DIR}")
|
||||
|
||||
#This seems not to do anything?
|
||||
#message(STATUS "KokkosTargets: " ${KokkosTargets})
|
||||
# Install the export set for use with the install-tree
|
||||
INSTALL(EXPORT KokkosTargets DESTINATION
|
||||
"${INSTALL_CMAKE_DIR}")
|
||||
|
||||
# build and install pkgconfig file
|
||||
CONFIGURE_FILE(core/src/kokkos.pc.in kokkos.pc @ONLY)
|
||||
INSTALL(FILES ${CMAKE_CURRENT_BINARY_DIR}/kokkos.pc DESTINATION lib/pkgconfig)
|
||||
80
lib/kokkos/cmake/kokkos_compiler_id.cmake
Normal file
@ -0,0 +1,80 @@
|
||||
KOKKOS_CFG_DEPENDS(COMPILER_ID NONE)
|
||||
|
||||
SET(KOKKOS_CXX_COMPILER ${CMAKE_CXX_COMPILER})
|
||||
SET(KOKKOS_CXX_COMPILER_ID ${CMAKE_CXX_COMPILER_ID})
|
||||
SET(KOKKOS_CXX_COMPILER_VERSION ${CMAKE_CXX_COMPILER_VERSION})
|
||||
|
||||
# Check if the compiler is nvcc (which really means nvcc_wrapper).
|
||||
EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} --version
|
||||
COMMAND grep nvcc
|
||||
COMMAND wc -l
|
||||
OUTPUT_VARIABLE INTERNAL_HAVE_COMPILER_NVCC
|
||||
OUTPUT_STRIP_TRAILING_WHITESPACE)
|
||||
|
||||
|
||||
STRING(REGEX REPLACE "^ +" ""
|
||||
INTERNAL_HAVE_COMPILER_NVCC ${INTERNAL_HAVE_COMPILER_NVCC})
|
||||
|
||||
|
||||
IF(INTERNAL_HAVE_COMPILER_NVCC)
|
||||
# SET the compiler id to nvcc. We use the value used by CMake 3.8.
|
||||
SET(KOKKOS_CXX_COMPILER_ID NVIDIA CACHE STRING INTERNAL FORCE)
|
||||
|
||||
# SET nvcc's compiler version.
|
||||
EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} --version
|
||||
COMMAND grep release
|
||||
OUTPUT_VARIABLE INTERNAL_CXX_COMPILER_VERSION
|
||||
OUTPUT_STRIP_TRAILING_WHITESPACE)
|
||||
|
||||
STRING(REGEX MATCH "[0-9]+\\.[0-9]+\\.[0-9]+$"
|
||||
TEMP_CXX_COMPILER_VERSION ${INTERNAL_CXX_COMPILER_VERSION})
|
||||
SET(KOKKOS_CXX_COMPILER_VERSION ${TEMP_CXX_COMPILER_VERSION} CACHE STRING INTERNAL FORCE)
|
||||
ENDIF()
|
||||
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Cray)
|
||||
|
||||
# SET Cray's compiler version.
|
||||
EXECUTE_PROCESS(COMMAND ${CMAKE_CXX_COMPILER} --version
|
||||
OUTPUT_VARIABLE INTERNAL_CXX_COMPILER_VERSION
|
||||
OUTPUT_STRIP_TRAILING_WHITESPACE)
|
||||
|
||||
STRING(REGEX MATCH "[0-9]+\\.[0-9]+\\.[0-9]+$"
|
||||
TEMP_CXX_COMPILER_VERSION ${INTERNAL_CXX_COMPILER_VERSION})
|
||||
SET(KOKKOS_CXX_COMPILER_VERSION ${TEMP_CXX_COMPILER_VERSION} CACHE STRING INTERNAL FORCE)
|
||||
ENDIF()
|
||||
|
||||
# Enforce the minimum compilers supported by Kokkos.
|
||||
SET(KOKKOS_MESSAGE_TEXT "Compiler not supported by Kokkos. Required compiler versions:")
|
||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang 3.5.2 or higher")
|
||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n GCC 4.8.4 or higher")
|
||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Intel 15.0.2 or higher")
|
||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n NVCC 9.0.69 or higher")
|
||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n PGI 17.1 or higher\n")
|
||||
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 3.5.2)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||
ENDIF()
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL GNU)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 4.8.4)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||
ENDIF()
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Intel)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 15.0.2)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||
ENDIF()
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 9.0.69)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||
ENDIF()
|
||||
SET(CMAKE_CXX_EXTENSIONS OFF CACHE BOOL "Kokkos turns off CXX extensions" FORCE)
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL PGI)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 17.1)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
STRING(REPLACE "." ";" VERSION_LIST ${KOKKOS_CXX_COMPILER_VERSION})
|
||||
LIST(GET VERSION_LIST 0 KOKKOS_COMPILER_VERSION_MAJOR)
|
||||
LIST(GET VERSION_LIST 1 KOKKOS_COMPILER_VERSION_MINOR)
|
||||
LIST(GET VERSION_LIST 2 KOKKOS_COMPILER_VERSION_PATCH)
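# Illustrative sketch, not part of the commit: LIST(GET ...) raises an error
# when the index is out of range, which would happen for the patch lookup if a
# compiler reported only two version components (for example "17.1").
# Assuming that case can occur, one hedged way to guard the lookup is shown
# below; VERSION_LIST_LENGTH is a hypothetical helper variable.
LIST(LENGTH VERSION_LIST VERSION_LIST_LENGTH)
IF(VERSION_LIST_LENGTH GREATER 2)
  LIST(GET VERSION_LIST 2 KOKKOS_COMPILER_VERSION_PATCH)
ELSE()
  # Fall back to a zero patch level when none is reported.
  SET(KOKKOS_COMPILER_VERSION_PATCH 0)
ENDIF()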
|
||||
35
lib/kokkos/cmake/kokkos_corner_cases.cmake
Normal file
@ -0,0 +1,35 @@
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND KOKKOS_ENABLE_OPENMP)
|
||||
# The clang "version" doesn't actually tell you what runtimes and tools
|
||||
# were built into Clang. We should therefore make sure that libomp
|
||||
# was actually built into Clang. Otherwise the user will get nonsensical
|
||||
# errors when they try to build.
|
||||
|
||||
#Try compile is the height of CMake nonsense
|
||||
#I can't just give it compiler and link flags
|
||||
#I have to hackily pretend that compiler flags are compiler definitions
|
||||
#and that linker flags are libraries
|
||||
#also - this is easier to use than CMakeCheckCXXSourceCompiles
|
||||
TRY_COMPILE(CLANG_HAS_OMP
|
||||
${KOKKOS_TOP_BUILD_DIR}/corner_cases
|
||||
${KOKKOS_SOURCE_DIR}/cmake/compile_tests/clang_omp.cpp
|
||||
COMPILE_DEFINITIONS -fopenmp=libomp
|
||||
LINK_LIBRARIES -fopenmp=libomp
|
||||
)
|
||||
IF (NOT CLANG_HAS_OMP)
|
||||
UNSET(CLANG_HAS_OMP CACHE) #make sure CMake always re-runs this
|
||||
MESSAGE(FATAL_ERROR "Clang failed OpenMP check. You have requested -DKokkos_ENABLE_OPENMP=ON, but the Clang compiler does not appear to have been built with OpenMP support")
|
||||
ENDIF()
|
||||
UNSET(CLANG_HAS_OMP CACHE) #make sure CMake always re-runs this
|
||||
ENDIF()
|
||||
|
||||
|
||||
IF (KOKKOS_CXX_STANDARD STREQUAL 17)
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL GNU AND KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 7)
|
||||
MESSAGE(FATAL_ERROR "You have requested c++17 support for GCC ${KOKKOS_CXX_COMPILER_VERSION}. Although CMake has allowed this and GCC accepts -std=c++1z/c++17, GCC <= 6 does not properly support *this capture. Please reduce the C++ standard to 14 or upgrade the compiler if you do need 17 support")
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
MESSAGE(FATAL_ERROR "You have requested c++17 support for NVCC. Please reduce the C++ standard to 14. No versions of NVCC currently support 17.")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
61
lib/kokkos/cmake/kokkos_enable_devices.cmake
Normal file
@ -0,0 +1,61 @@
|
||||
|
||||
FUNCTION(KOKKOS_DEVICE_OPTION SUFFIX DEFAULT DEV_TYPE DOCSTRING)
|
||||
KOKKOS_OPTION(ENABLE_${SUFFIX} ${DEFAULT} BOOL ${DOCSTRING})
|
||||
STRING(TOUPPER ${SUFFIX} UC_NAME)
|
||||
IF (KOKKOS_ENABLE_${UC_NAME})
|
||||
LIST(APPEND KOKKOS_ENABLED_DEVICES ${SUFFIX})
|
||||
#I hate that CMake makes me do this
|
||||
SET(KOKKOS_ENABLED_DEVICES ${KOKKOS_ENABLED_DEVICES} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
SET(KOKKOS_ENABLE_${UC_NAME} ${KOKKOS_ENABLE_${UC_NAME}} PARENT_SCOPE)
|
||||
IF (KOKKOS_ENABLE_${UC_NAME} AND DEV_TYPE STREQUAL "HOST")
|
||||
SET(KOKKOS_HAS_HOST ON PARENT_SCOPE)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
KOKKOS_CFG_DEPENDS(DEVICES NONE)
|
||||
|
||||
# Put a check in just in case people are using this option
|
||||
KOKKOS_DEPRECATED_LIST(DEVICES ENABLE)
|
||||
|
||||
|
||||
KOKKOS_DEVICE_OPTION(PTHREAD OFF HOST "Whether to build Pthread backend")
|
||||
IF (KOKKOS_ENABLE_PTHREAD)
|
||||
#patch the naming here
|
||||
SET(KOKKOS_ENABLE_THREADS ON)
|
||||
ENDIF()
|
||||
|
||||
IF(Trilinos_ENABLE_Kokkos AND Trilinos_ENABLE_OpenMP)
|
||||
SET(OMP_DEFAULT ON)
|
||||
ELSE()
|
||||
SET(OMP_DEFAULT OFF)
|
||||
ENDIF()
|
||||
KOKKOS_DEVICE_OPTION(OPENMP ${OMP_DEFAULT} HOST "Whether to build OpenMP backend")
|
||||
|
||||
IF(Trilinos_ENABLE_Kokkos AND TPL_ENABLE_CUDA)
|
||||
SET(CUDA_DEFAULT ON)
|
||||
ELSE()
|
||||
SET(CUDA_DEFAULT OFF)
|
||||
ENDIF()
|
||||
KOKKOS_DEVICE_OPTION(CUDA ${CUDA_DEFAULT} DEVICE "Whether to build CUDA backend")
|
||||
|
||||
IF (KOKKOS_ENABLE_CUDA)
|
||||
GLOBAL_SET(KOKKOS_DONT_ALLOW_EXTENSIONS "CUDA enabled")
|
||||
ENDIF()
|
||||
|
||||
# We want this to default to OFF for cache reasons, but if no
|
||||
# host space is given, then activate serial
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
#However, Trilinos always wants Serial ON
|
||||
SET(SERIAL_DEFAULT ON)
|
||||
ELSEIF (KOKKOS_HAS_HOST)
|
||||
SET(SERIAL_DEFAULT OFF)
|
||||
ELSE()
|
||||
SET(SERIAL_DEFAULT ON)
|
||||
IF (NOT DEFINED Kokkos_ENABLE_SERIAL)
|
||||
MESSAGE(STATUS "SERIAL backend is being turned on to ensure there is at least one Host space. To change this, you must enable another host execution space and configure with -DKokkos_ENABLE_SERIAL=OFF or change CMakeCache.txt")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
KOKKOS_DEVICE_OPTION(SERIAL ${SERIAL_DEFAULT} HOST "Whether to build serial backend")
|
||||
|
||||
KOKKOS_DEVICE_OPTION(HPX OFF HOST "Whether to build HPX backend (experimental)")
|
||||
92
lib/kokkos/cmake/kokkos_enable_options.cmake
Normal file
@ -0,0 +1,92 @@
|
||||
########################## NOTES ###############################################
|
||||
# List the options for configuring Kokkos through CMake.
|
||||
# These options then get mapped onto KOKKOS_SETTINGS environment variable by
|
||||
# kokkos_settings.cmake. It is separate to allow other packages to override
|
||||
# these variables (e.g., TriBITS).
|
||||
|
||||
########################## AVAILABLE OPTIONS ###################################
|
||||
# Use lists for documentation, verification, and programming convenience
|
||||
|
||||
|
||||
FUNCTION(KOKKOS_ENABLE_OPTION SUFFIX DEFAULT DOCSTRING)
|
||||
KOKKOS_OPTION(ENABLE_${SUFFIX} ${DEFAULT} BOOL ${DOCSTRING})
|
||||
STRING(TOUPPER ${SUFFIX} UC_NAME)
|
||||
IF (KOKKOS_ENABLE_${UC_NAME})
|
||||
LIST(APPEND KOKKOS_ENABLED_OPTIONS ${UC_NAME})
|
||||
#I hate that CMake makes me do this
|
||||
SET(KOKKOS_ENABLED_OPTIONS ${KOKKOS_ENABLED_OPTIONS} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
SET(KOKKOS_ENABLE_${UC_NAME} ${KOKKOS_ENABLE_${UC_NAME}} PARENT_SCOPE)
|
||||
ENDFUNCTION()
|
||||
|
||||
# Certain defaults will depend on knowing the enabled devices
|
||||
KOKKOS_CFG_DEPENDS(OPTIONS DEVICES)
|
||||
|
||||
# Put a check in just in case people are using this option
|
||||
KOKKOS_DEPRECATED_LIST(OPTIONS ENABLE)
|
||||
|
||||
KOKKOS_ENABLE_OPTION(CUDA_RELOCATABLE_DEVICE_CODE OFF "Whether to enable relocatable device code (RDC) for CUDA")
|
||||
KOKKOS_ENABLE_OPTION(CUDA_UVM OFF "Whether to use unified memory (UM) for CUDA by default")
|
||||
KOKKOS_ENABLE_OPTION(CUDA_LDG_INTRINSIC OFF "Whether to use CUDA LDG intrinsics")
|
||||
KOKKOS_ENABLE_OPTION(HPX_ASYNC_DISPATCH OFF "Whether HPX supports asynchronous dispatch")
|
||||
KOKKOS_ENABLE_OPTION(TESTS OFF "Whether to build the unit tests")
|
||||
STRING(TOUPPER "${CMAKE_BUILD_TYPE}" UPPERCASE_CMAKE_BUILD_TYPE)
|
||||
IF(UPPERCASE_CMAKE_BUILD_TYPE STREQUAL "DEBUG")
|
||||
KOKKOS_ENABLE_OPTION(DEBUG ON "Whether to activate extra debug features - may increase compile times")
|
||||
KOKKOS_ENABLE_OPTION(DEBUG_DUALVIEW_MODIFY_CHECK ON "Debug check on dual views")
|
||||
ELSE()
|
||||
KOKKOS_ENABLE_OPTION(DEBUG OFF "Whether to activate extra debug features - may increase compile times")
|
||||
KOKKOS_ENABLE_OPTION(DEBUG_DUALVIEW_MODIFY_CHECK OFF "Debug check on dual views")
|
||||
ENDIF()
|
||||
UNSET(UPPERCASE_CMAKE_BUILD_TYPE)
|
||||
KOKKOS_ENABLE_OPTION(LARGE_MEM_TESTS OFF "Whether to perform extra large memory tests")
|
||||
KOKKOS_ENABLE_OPTION(DEBUG_BOUNDS_CHECK OFF "Whether to use bounds checking - will increase runtime")
|
||||
KOKKOS_ENABLE_OPTION(COMPILER_WARNINGS OFF "Whether to print all compiler warnings")
|
||||
KOKKOS_ENABLE_OPTION(PROFILING ON "Whether to create bindings for profiling tools")
|
||||
KOKKOS_ENABLE_OPTION(PROFILING_LOAD_PRINT OFF "Whether to print information about which profiling tools got loaded")
|
||||
KOKKOS_ENABLE_OPTION(AGGRESSIVE_VECTORIZATION OFF "Whether to aggressively vectorize loops")
|
||||
KOKKOS_ENABLE_OPTION(DEPRECATED_CODE OFF "Whether to enable deprecated code")
|
||||
|
||||
IF (KOKKOS_ENABLE_CUDA)
|
||||
SET(KOKKOS_COMPILER_CUDA_VERSION "${KOKKOS_COMPILER_VERSION_MAJOR}${KOKKOS_COMPILER_VERSION_MINOR}")
|
||||
ENDIF()
|
||||
|
||||
IF (Trilinos_ENABLE_Kokkos AND TPL_ENABLE_CUDA AND DEFINED KOKKOS_COMPILER_CUDA_VERSION AND KOKKOS_COMPILER_CUDA_VERSION GREATER 70)
|
||||
SET(LAMBDA_DEFAULT ON)
|
||||
ELSE()
|
||||
SET(LAMBDA_DEFAULT OFF)
|
||||
ENDIF()
|
||||
KOKKOS_ENABLE_OPTION(CUDA_LAMBDA ${LAMBDA_DEFAULT} "Whether to activate experimental lambda features")
|
||||
IF (Trilinos_ENABLE_Kokkos)
|
||||
SET(COMPLEX_ALIGN_DEFAULT OFF)
|
||||
ELSE()
|
||||
SET(COMPLEX_ALIGN_DEFAULT ON)
|
||||
ENDIF()
|
||||
KOKKOS_ENABLE_OPTION(COMPLEX_ALIGN ${COMPLEX_ALIGN_DEFAULT} "Whether to align Kokkos::complex to 2*alignof(RealType)")
|
||||
|
||||
KOKKOS_ENABLE_OPTION(CUDA_CONSTEXPR OFF "Whether to activate experimental relaxed constexpr functions")
|
||||
|
||||
FUNCTION(check_device_specific_options)
|
||||
CMAKE_PARSE_ARGUMENTS(SOME "" "DEVICE" "OPTIONS" ${ARGN})
|
||||
IF(NOT KOKKOS_ENABLE_${SOME_DEVICE})
|
||||
FOREACH(OPTION ${SOME_OPTIONS})
|
||||
IF(CMAKE_VERSION VERSION_GREATER_EQUAL 3.14)
|
||||
IF(NOT DEFINED CACHE{Kokkos_ENABLE_${OPTION}} OR NOT DEFINED CACHE{Kokkos_ENABLE_${SOME_DEVICE}})
|
||||
MESSAGE(FATAL_ERROR "Internal logic error: option '${OPTION}' or device '${SOME_DEVICE}' not recognized.")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
IF(KOKKOS_ENABLE_${OPTION})
|
||||
MESSAGE(WARNING "Kokkos_ENABLE_${OPTION} is ON but ${SOME_DEVICE} backend is not enabled. Option will be ignored.")
|
||||
UNSET(KOKKOS_ENABLE_${OPTION} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
CHECK_DEVICE_SPECIFIC_OPTIONS(DEVICE CUDA OPTIONS CUDA_UVM CUDA_RELOCATABLE_DEVICE_CODE CUDA_LAMBDA CUDA_CONSTEXPR CUDA_LDG_INTRINSIC)
|
||||
CHECK_DEVICE_SPECIFIC_OPTIONS(DEVICE HPX OPTIONS HPX_ASYNC_DISPATCH)
|
||||
|
||||
# Needed due to change from deprecated name to new header define name
|
||||
IF (KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION)
|
||||
SET(KOKKOS_OPT_RANGE_AGGRESSIVE_VECTORIZATION ON)
|
||||
ENDIF()
|
||||
42
lib/kokkos/cmake/kokkos_install.cmake
Normal file
@ -0,0 +1,42 @@
|
||||
IF (NOT KOKKOS_HAS_TRILINOS)
|
||||
INCLUDE(GNUInstallDirs)
|
||||
|
||||
#Set all the variables needed for KokkosConfig.cmake
|
||||
GET_PROPERTY(KOKKOS_PROP_LIBS GLOBAL PROPERTY KOKKOS_LIBRARIES_NAMES)
|
||||
SET(KOKKOS_LIBRARIES ${KOKKOS_PROP_LIBS})
|
||||
|
||||
INCLUDE(CMakePackageConfigHelpers)
|
||||
CONFIGURE_PACKAGE_CONFIG_FILE(
|
||||
cmake/KokkosConfig.cmake.in
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfig.cmake"
|
||||
INSTALL_DESTINATION ${CMAKE_INSTALL_FULL_LIBDIR}/cmake)
|
||||
|
||||
INCLUDE(CMakePackageConfigHelpers)
|
||||
CONFIGURE_PACKAGE_CONFIG_FILE(
|
||||
cmake/KokkosConfigCommon.cmake.in
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfigCommon.cmake"
|
||||
INSTALL_DESTINATION ${CMAKE_INSTALL_FULL_LIBDIR}/cmake)
|
||||
|
||||
WRITE_BASIC_PACKAGE_VERSION_FILE("${Kokkos_BINARY_DIR}/KokkosConfigVersion.cmake"
|
||||
VERSION "${Kokkos_VERSION}"
|
||||
COMPATIBILITY SameMajorVersion)
|
||||
|
||||
# Install the KokkosConfig*.cmake files
|
||||
install(FILES
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfig.cmake"
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfigCommon.cmake"
|
||||
"${Kokkos_BINARY_DIR}/KokkosConfigVersion.cmake"
|
||||
DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/Kokkos)
|
||||
install(EXPORT KokkosTargets NAMESPACE Kokkos:: DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/Kokkos)
|
||||
ELSE()
|
||||
CONFIGURE_FILE(cmake/KokkosConfigCommon.cmake.in ${Kokkos_BINARY_DIR}/KokkosConfigCommon.cmake @ONLY)
|
||||
file(READ ${Kokkos_BINARY_DIR}/KokkosConfigCommon.cmake KOKKOS_CONFIG_COMMON)
|
||||
file(APPEND "${CMAKE_CURRENT_BINARY_DIR}/CMakeFiles/KokkosConfig_install.cmake" ${KOKKOS_CONFIG_COMMON})
|
||||
ENDIF()
|
||||
|
||||
# build and install pkgconfig file
|
||||
CONFIGURE_FILE(core/src/kokkos.pc.in kokkos.pc @ONLY)
|
||||
INSTALL(FILES ${CMAKE_CURRENT_BINARY_DIR}/kokkos.pc DESTINATION ${CMAKE_INSTALL_LIBDIR}/pkgconfig)
|
||||
|
||||
INSTALL(FILES ${CMAKE_CURRENT_BINARY_DIR}/KokkosCore_config.h DESTINATION ${KOKKOS_HEADER_DIR})
|
||||
|
||||
@ -1,419 +0,0 @@
|
||||
########################## NOTES ###############################################
|
||||
# List the options for configuring Kokkos through CMake.
|
||||
# These options then get mapped onto KOKKOS_SETTINGS environment variable by
|
||||
# kokkos_settings.cmake. It is separate to allow other packages to override
|
||||
# these variables (e.g., TriBITS).
|
||||
|
||||
########################## AVAILABLE OPTIONS ###################################
|
||||
# Use lists for documentation, verification, and programming convenience
|
||||
|
||||
# All CMake options of the type KOKKOS_ENABLE_*
|
||||
set(KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST)
|
||||
list(APPEND KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST
|
||||
Serial
|
||||
OpenMP
|
||||
Pthread
|
||||
Qthread
|
||||
HPX
|
||||
Cuda
|
||||
ROCm
|
||||
HWLOC
|
||||
MEMKIND
|
||||
LIBRT
|
||||
Cuda_Lambda
|
||||
Cuda_Relocatable_Device_Code
|
||||
Cuda_UVM
|
||||
Cuda_LDG_Intrinsic
|
||||
HPX_ASYNC_DISPATCH
|
||||
Debug
|
||||
Debug_DualView_Modify_Check
|
||||
Debug_Bounds_Check
|
||||
Compiler_Warnings
|
||||
Profiling
|
||||
Profiling_Load_Print
|
||||
Aggressive_Vectorization
|
||||
Deprecated_Code
|
||||
Explicit_Instantiation
|
||||
)
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- Recognize CamelCase Options ---------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
foreach(opt ${KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST})
|
||||
string(TOUPPER ${opt} OPT )
|
||||
IF(DEFINED Kokkos_ENABLE_${opt})
|
||||
IF(DEFINED KOKKOS_ENABLE_${OPT})
|
||||
IF(NOT ("${KOKKOS_ENABLE_${OPT}}" STREQUAL "${Kokkos_ENABLE_${opt}}"))
|
||||
IF(DEFINED KOKKOS_ENABLE_${OPT}_INTERNAL)
|
||||
MESSAGE(WARNING "Defined both Kokkos_ENABLE_${opt}=[${Kokkos_ENABLE_${opt}}] and KOKKOS_ENABLE_${OPT}=[${KOKKOS_ENABLE_${OPT}}] and they differ! Could be caused by old CMakeCache Variable. Run CMake again and warning should disappear. If not you are truly setting both variables.")
|
||||
IF(NOT ("${Kokkos_ENABLE_${opt}}" STREQUAL "${KOKKOS_ENABLE_${OPT}_INTERNAL}"))
|
||||
UNSET(KOKKOS_ENABLE_${OPT} CACHE)
|
||||
SET(KOKKOS_ENABLE_${OPT} ${Kokkos_ENABLE_${opt}})
|
||||
MESSAGE(WARNING "SET BOTH VARIABLES KOKKOS_ENABLE_${OPT}: ${KOKKOS_ENABLE_${OPT}}")
|
||||
ELSE()
|
||||
SET(Kokkos_ENABLE_${opt} ${KOKKOS_ENABLE_${OPT}})
|
||||
ENDIF()
|
||||
ELSE()
|
||||
MESSAGE(FATAL_ERROR "Defined both Kokkos_ENABLE_${opt}=[${Kokkos_ENABLE_${opt}}] and KOKKOS_ENABLE_${OPT}=[${KOKKOS_ENABLE_${OPT}}] and they differ!")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ELSE()
|
||||
SET(KOKKOS_INTERNAL_ENABLE_${OPT}_DEFAULT ${Kokkos_ENABLE_${opt}})
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
endforeach()
|
||||
|
||||
IF(DEFINED Kokkos_ARCH)
|
||||
MESSAGE(FATAL_ERROR "Defined Kokkos_ARCH, use KOKKOS_ARCH instead!")
|
||||
ENDIF()
|
||||
IF(DEFINED Kokkos_Arch)
|
||||
MESSAGE(FATAL_ERROR "Defined Kokkos_Arch, use KOKKOS_ARCH instead!")
|
||||
ENDIF()
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
# List of possible host architectures.
|
||||
#-------------------------------------------------------------------------------
|
||||
set(KOKKOS_ARCH_LIST)
|
||||
list(APPEND KOKKOS_ARCH_LIST
|
||||
None # No architecture optimization
|
||||
AMDAVX # (HOST) AMD chip
|
||||
EPYC # (HOST) AMD EPYC Zen-Core CPU
|
||||
ARMv80 # (HOST) ARMv8.0 Compatible CPU
|
||||
ARMv81 # (HOST) ARMv8.1 Compatible CPU
|
||||
ARMv8-ThunderX # (HOST) ARMv8 Cavium ThunderX CPU
|
||||
ARMv8-TX2 # (HOST) ARMv8 Cavium ThunderX2 CPU
|
||||
WSM # (HOST) Intel Westmere CPU
|
||||
SNB # (HOST) Intel Sandy/Ivy Bridge CPUs
|
||||
HSW # (HOST) Intel Haswell CPUs
|
||||
BDW # (HOST) Intel Broadwell Xeon E-class CPUs
|
||||
SKX # (HOST) Intel Sky Lake Xeon E-class HPC CPUs (AVX512)
|
||||
KNC # (HOST) Intel Knights Corner Xeon Phi
|
||||
KNL # (HOST) Intel Knights Landing Xeon Phi
|
||||
BGQ # (HOST) IBM Blue Gene Q
|
||||
Power7 # (HOST) IBM POWER7 CPUs
|
||||
Power8 # (HOST) IBM POWER8 CPUs
|
||||
Power9 # (HOST) IBM POWER9 CPUs
|
||||
Kepler # (GPU) NVIDIA Kepler default (generation CC 3.5)
|
||||
Kepler30 # (GPU) NVIDIA Kepler generation CC 3.0
|
||||
Kepler32 # (GPU) NVIDIA Kepler generation CC 3.2
|
||||
Kepler35 # (GPU) NVIDIA Kepler generation CC 3.5
|
||||
Kepler37 # (GPU) NVIDIA Kepler generation CC 3.7
|
||||
Maxwell # (GPU) NVIDIA Maxwell default (generation CC 5.0)
|
||||
Maxwell50 # (GPU) NVIDIA Maxwell generation CC 5.0
|
||||
Maxwell52 # (GPU) NVIDIA Maxwell generation CC 5.2
|
||||
Maxwell53 # (GPU) NVIDIA Maxwell generation CC 5.3
|
||||
Pascal60 # (GPU) NVIDIA Pascal generation CC 6.0
|
||||
Pascal61 # (GPU) NVIDIA Pascal generation CC 6.1
|
||||
Volta70 # (GPU) NVIDIA Volta generation CC 7.0
|
||||
Volta72 # (GPU) NVIDIA Volta generation CC 7.2
|
||||
Turing75 # (GPU) NVIDIA Turing generation CC 7.5
|
||||
)
|
||||
|
||||
# List of possible device architectures.
|
||||
# The case and spelling here needs to match Makefile.kokkos
|
||||
set(KOKKOS_DEVICES_LIST)
|
||||
# Options: Cuda,ROCm,OpenMP,Pthread,Qthreads,Serial
|
||||
list(APPEND KOKKOS_DEVICES_LIST
|
||||
Cuda # NVIDIA GPU -- see below
|
||||
OpenMP # OpenMP
|
||||
Pthread # pthread
|
||||
Qthreads # qthreads
|
||||
HPX # HPX
|
||||
Serial # serial
|
||||
ROCm # AMD GPU (ROCm)
|
||||
)
|
||||
|
||||
# List of possible TPLs for Kokkos
|
||||
# From Makefile.kokkos: Options: hwloc,librt,experimental_memkind
|
||||
set(KOKKOS_USE_TPLS_LIST)
|
||||
if(APPLE)
|
||||
list(APPEND KOKKOS_USE_TPLS_LIST
|
||||
HWLOC # hwloc
|
||||
MEMKIND # experimental_memkind
|
||||
)
|
||||
else()
|
||||
list(APPEND KOKKOS_USE_TPLS_LIST
|
||||
HWLOC # hwloc
|
||||
LIBRT # librt
|
||||
MEMKIND # experimental_memkind
|
||||
)
|
||||
endif()
|
||||
# Map of cmake variables to Makefile variables
|
||||
set(KOKKOS_INTERNAL_HWLOC hwloc)
|
||||
set(KOKKOS_INTERNAL_LIBRT librt)
|
||||
set(KOKKOS_INTERNAL_MEMKIND experimental_memkind)
|
||||
|
||||
# List of possible Advanced options
|
||||
set(KOKKOS_OPTIONS_LIST)
|
||||
list(APPEND KOKKOS_OPTIONS_LIST
|
||||
AGGRESSIVE_VECTORIZATION
|
||||
DISABLE_PROFILING
|
||||
DISABLE_DUALVIEW_MODIFY_CHECK
|
||||
ENABLE_PROFILE_LOAD_PRINT
|
||||
)
|
||||
# Map of cmake variables to Makefile variables
|
||||
set(KOKKOS_INTERNAL_LDG_INTRINSIC use_ldg)
|
||||
set(KOKKOS_INTERNAL_UVM force_uvm)
|
||||
set(KOKKOS_INTERNAL_RELOCATABLE_DEVICE_CODE rdc)
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
# List of possible Options for CUDA
|
||||
#-------------------------------------------------------------------------------
|
||||
# From Makefile.kokkos: Options: use_ldg,force_uvm,rdc
|
||||
set(KOKKOS_CUDA_OPTIONS_LIST)
|
||||
list(APPEND KOKKOS_CUDA_OPTIONS_LIST
|
||||
LDG_INTRINSIC # use_ldg
|
||||
UVM # force_uvm
|
||||
RELOCATABLE_DEVICE_CODE # rdc
|
||||
LAMBDA # enable_lambda
|
||||
)
|
||||
|
||||
# Map of cmake variables to Makefile variables
|
||||
set(KOKKOS_INTERNAL_LDG_INTRINSIC use_ldg)
|
||||
set(KOKKOS_INTERNAL_UVM force_uvm)
|
||||
set(KOKKOS_INTERNAL_RELOCATABLE_DEVICE_CODE rdc)
|
||||
set(KOKKOS_INTERNAL_LAMBDA enable_lambda)
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
# List of possible Options for HPX
|
||||
#-------------------------------------------------------------------------------
|
||||
# From Makefile.kokkos: Options: enable_async_dispatch
|
||||
set(KOKKOS_HPX_OPTIONS_LIST)
|
||||
list(APPEND KOKKOS_HPX_OPTIONS_LIST
|
||||
ASYNC_DISPATCH # enable_async_dispatch
|
||||
)
|
||||
|
||||
# Map of cmake variables to Makefile variables
|
||||
set(KOKKOS_INTERNAL_ENABLE_ASYNC_DISPATCH enable_async_dispatch)
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- Create doc strings ----------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
set(tmpr "\n ")
|
||||
string(REPLACE ";" ${tmpr} KOKKOS_INTERNAL_ARCH_DOCSTR "${KOKKOS_ARCH_LIST}")
|
||||
set(KOKKOS_INTERNAL_ARCH_DOCSTR "${tmpr}${KOKKOS_INTERNAL_ARCH_DOCSTR}")
|
||||
# This would be useful, but we use Foo_ENABLE mechanisms
|
||||
#string(REPLACE ";" ${tmpr} KOKKOS_INTERNAL_DEVICES_DOCSTR "${KOKKOS_DEVICES_LIST}")
|
||||
#string(REPLACE ";" ${tmpr} KOKKOS_INTERNAL_USE_TPLS_DOCSTR "${KOKKOS_USE_TPLS_LIST}")
|
||||
#string(REPLACE ";" ${tmpr} KOKKOS_INTERNAL_CUDA_OPTIONS_DOCSTR "${KOKKOS_CUDA_OPTIONS_LIST}")
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- GENERAL OPTIONS -------------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# Setting this variable to a value other than "None" can improve host
|
||||
# performance by turning on architecture specific code.
|
||||
# NOT SET is used to determine if the option is passed in. It is reset to
|
||||
# default "None" down below.
|
||||
set(KOKKOS_ARCH "NOT_SET" CACHE STRING
|
||||
"Optimize for specific host architecture. Options are: ${KOKKOS_INTERNAL_ARCH_DOCSTR}")
|
||||
|
||||
# Whether to build separate libraries or not
|
||||
set(KOKKOS_SEPARATE_LIBS OFF CACHE BOOL "OFF = kokkos. ON = kokkoscore, kokkoscontainers, and kokkosalgorithms.")
|
||||
|
||||
# Qthreads options.
|
||||
set(KOKKOS_QTHREADS_DIR "" CACHE PATH "Location of Qthreads library.")
|
||||
|
||||
# HPX options.
|
||||
set(KOKKOS_HPX_DIR "" CACHE PATH "Location of HPX library.")
|
||||
|
||||
# Whether to build separate unit test targets or not
|
||||
set(KOKKOS_SEPARATE_TESTS OFF CACHE BOOL "Provide unit test targets with finer granularity.")
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- KOKKOS_DEVICES --------------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
# Figure out default settings
|
||||
IF(Trilinos_ENABLE_Kokkos)
|
||||
set_kokkos_default_default(SERIAL ON)
|
||||
set_kokkos_default_default(PTHREAD OFF)
|
||||
IF(TPL_ENABLE_QTHREAD)
|
||||
set_kokkos_default_default(QTHREADS ${TPL_ENABLE_QTHREAD})
|
||||
ELSE()
|
||||
set_kokkos_default_default(QTHREADS OFF)
|
||||
ENDIF()
|
||||
IF(TPL_ENABLE_HPX)
|
||||
set_kokkos_default_default(HPX ON)
|
||||
ELSE()
|
||||
set_kokkos_default_default(HPX OFF)
|
||||
ENDIF()
|
||||
IF(Trilinos_ENABLE_OpenMP)
|
||||
set_kokkos_default_default(OPENMP ${Trilinos_ENABLE_OpenMP})
|
||||
ELSE()
|
||||
set_kokkos_default_default(OPENMP OFF)
|
||||
ENDIF()
|
||||
IF(TPL_ENABLE_CUDA)
|
||||
set_kokkos_default_default(CUDA ${TPL_ENABLE_CUDA})
|
||||
ELSE()
|
||||
set_kokkos_default_default(CUDA OFF)
|
||||
ENDIF()
|
||||
set_kokkos_default_default(ROCM OFF)
|
||||
ELSE()
|
||||
set_kokkos_default_default(SERIAL ON)
|
||||
set_kokkos_default_default(OPENMP OFF)
|
||||
set_kokkos_default_default(PTHREAD OFF)
|
||||
set_kokkos_default_default(QTHREADS OFF)
|
||||
set_kokkos_default_default(HPX OFF)
|
||||
set_kokkos_default_default(CUDA OFF)
|
||||
set_kokkos_default_default(ROCM OFF)
|
||||
ENDIF()
|
||||
|
||||
# Set which Kokkos backend to use.
|
||||
# These are the actual options that define the settings.
|
||||
set(KOKKOS_ENABLE_SERIAL ${KOKKOS_INTERNAL_ENABLE_SERIAL_DEFAULT} CACHE BOOL "Whether to enable the Kokkos::Serial device. This device executes \"parallel\" kernels sequentially on a single CPU thread. It is enabled by default. If you disable this device, please enable at least one other CPU device, such as Kokkos::OpenMP or Kokkos::Threads.")
|
||||
set(KOKKOS_ENABLE_OPENMP ${KOKKOS_INTERNAL_ENABLE_OPENMP_DEFAULT} CACHE BOOL "Enable OpenMP support in Kokkos." FORCE)
|
||||
set(KOKKOS_ENABLE_PTHREAD ${KOKKOS_INTERNAL_ENABLE_PTHREAD_DEFAULT} CACHE BOOL "Enable Pthread support in Kokkos.")
|
||||
set(KOKKOS_ENABLE_QTHREADS ${KOKKOS_INTERNAL_ENABLE_QTHREADS_DEFAULT} CACHE BOOL "Enable Qthreads support in Kokkos.")
|
||||
set(KOKKOS_ENABLE_HPX ${KOKKOS_INTERNAL_ENABLE_HPX_DEFAULT} CACHE BOOL "Enable HPX support in Kokkos.")
|
||||
set(KOKKOS_ENABLE_CUDA ${KOKKOS_INTERNAL_ENABLE_CUDA_DEFAULT} CACHE BOOL "Enable CUDA support in Kokkos.")
|
||||
set(KOKKOS_ENABLE_ROCM ${KOKKOS_INTERNAL_ENABLE_ROCM_DEFAULT} CACHE BOOL "Enable ROCm support in Kokkos.")
|
||||
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- KOKKOS DEBUG and PROFILING --------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# Debug related options enable compiler warnings
|
||||
|
||||
set_kokkos_default_default(DEBUG OFF)
|
||||
set(KOKKOS_ENABLE_DEBUG ${KOKKOS_INTERNAL_ENABLE_DEBUG_DEFAULT} CACHE BOOL "Enable Kokkos Debug.")
|
||||
|
||||
# From Makefile.kokkos: Advanced Options:
|
||||
#compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print
|
||||
set_kokkos_default_default(COMPILER_WARNINGS OFF)
|
||||
set(KOKKOS_ENABLE_COMPILER_WARNINGS ${KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS_DEFAULT} CACHE BOOL "Enable compiler warnings.")
|
||||
|
||||
set_kokkos_default_default(DEBUG_DUALVIEW_MODIFY_CHECK OFF)
|
||||
set(KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK ${KOKKOS_INTERNAL_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK_DEFAULT} CACHE BOOL "Enable dualview modify check.")
|
||||
|
||||
# Enable aggressive vectorization.
|
||||
set_kokkos_default_default(AGGRESSIVE_VECTORIZATION OFF)
|
||||
set(KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION ${KOKKOS_INTERNAL_ENABLE_AGGRESSIVE_VECTORIZATION_DEFAULT} CACHE BOOL "Enable aggressive vectorization.")
|
||||
|
||||
# Enable profiling.
|
||||
set_kokkos_default_default(PROFILING ON)
|
||||
set(KOKKOS_ENABLE_PROFILING ${KOKKOS_INTERNAL_ENABLE_PROFILING_DEFAULT} CACHE BOOL "Enable profiling.")
|
||||
|
||||
set_kokkos_default_default(PROFILING_LOAD_PRINT OFF)
|
||||
set(KOKKOS_ENABLE_PROFILING_LOAD_PRINT ${KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT_DEFAULT} CACHE BOOL "Enable profile load print.")
|
||||
|
||||
set_kokkos_default_default(DEPRECATED_CODE ON)
|
||||
set(KOKKOS_ENABLE_DEPRECATED_CODE ${KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE_DEFAULT} CACHE BOOL "Enable deprecated code.")
|
||||
|
||||
set_kokkos_default_default(EXPLICIT_INSTANTIATION OFF)
|
||||
set(KOKKOS_ENABLE_EXPLICIT_INSTANTIATION ${KOKKOS_INTERNAL_ENABLE_EXPLICIT_INSTANTIATION_DEFAULT} CACHE BOOL "Enable explicit template instantiation.")
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- KOKKOS_USE_TPLS -------------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
# Enable hwloc library.
|
||||
# Figure out default:
|
||||
IF(Trilinos_ENABLE_Kokkos AND TPL_ENABLE_HWLOC)
|
||||
set_kokkos_default_default(HWLOC ON)
|
||||
ELSE()
|
||||
set_kokkos_default_default(HWLOC OFF)
|
||||
ENDIF()
|
||||
set(KOKKOS_ENABLE_HWLOC ${KOKKOS_INTERNAL_ENABLE_HWLOC_DEFAULT} CACHE BOOL "Enable hwloc for better process placement.")
|
||||
set(KOKKOS_HWLOC_DIR "" CACHE PATH "Location of hwloc library. (kokkos tpl)")
|
||||
|
||||
# Enable memkind library.
|
||||
set_kokkos_default_default(MEMKIND OFF)
|
||||
set(KOKKOS_ENABLE_MEMKIND ${KOKKOS_INTERNAL_ENABLE_MEMKIND_DEFAULT} CACHE BOOL "Enable memkind. (kokkos tpl)")
|
||||
set(KOKKOS_MEMKIND_DIR "" CACHE PATH "Location of memkind library. (kokkos tpl)")
|
||||
|
||||
# Enable rt library.
|
||||
IF(Trilinos_ENABLE_Kokkos)
|
||||
IF(DEFINED TPL_ENABLE_LIBRT)
|
||||
set_kokkos_default_default(LIBRT ${TPL_ENABLE_LIBRT})
|
||||
ELSE()
|
||||
set_kokkos_default_default(LIBRT OFF)
|
||||
ENDIF()
|
||||
ELSE()
|
||||
set_kokkos_default_default(LIBRT ON)
|
||||
ENDIF()
|
||||
set(KOKKOS_ENABLE_LIBRT ${KOKKOS_INTERNAL_ENABLE_LIBRT_DEFAULT} CACHE BOOL "Enable librt for more precise timer. (kokkos tpl)")
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# CUDA options.
|
||||
# Set Defaults
|
||||
set_kokkos_default_default(CUDA_LDG_INTRINSIC OFF)
|
||||
set_kokkos_default_default(CUDA_UVM OFF)
|
||||
set_kokkos_default_default(CUDA_RELOCATABLE_DEVICE_CODE OFF)
|
||||
IF(Trilinos_ENABLE_Kokkos)
|
||||
IF(KOKKOS_ENABLE_CUDA)
|
||||
find_package(CUDA)
|
||||
ENDIF()
|
||||
IF (DEFINED CUDA_VERSION)
|
||||
IF (CUDA_VERSION VERSION_GREATER "7.0")
|
||||
set_kokkos_default_default(CUDA_LAMBDA ON)
|
||||
ELSE()
|
||||
set_kokkos_default_default(CUDA_LAMBDA OFF)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ELSE()
|
||||
set_kokkos_default_default(CUDA_LAMBDA OFF)
|
||||
ENDIF()
|
||||
|
||||
# Set actual options
|
||||
set(KOKKOS_CUDA_DIR "" CACHE PATH "Location of CUDA library. Defaults to where nvcc installed.")
|
||||
set(KOKKOS_ENABLE_CUDA_LDG_INTRINSIC ${KOKKOS_INTERNAL_ENABLE_CUDA_LDG_INTRINSIC_DEFAULT} CACHE BOOL "Enable CUDA LDG. (cuda option)")
|
||||
set(KOKKOS_ENABLE_CUDA_UVM ${KOKKOS_INTERNAL_ENABLE_CUDA_UVM_DEFAULT} CACHE BOOL "Enable CUDA unified virtual memory.")
|
||||
set(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE ${KOKKOS_INTERNAL_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE_DEFAULT} CACHE BOOL "Enable relocatable device code for CUDA. (cuda option)")
|
||||
set(KOKKOS_ENABLE_CUDA_LAMBDA ${KOKKOS_INTERNAL_ENABLE_CUDA_LAMBDA_DEFAULT} CACHE BOOL "Enable lambdas for CUDA. (cuda option)")
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- KOKKOS_HPX_OPTIONS ----------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# HPX options.
|
||||
# Set Defaults
|
||||
set_kokkos_default_default(HPX_ASYNC_DISPATCH OFF)
|
||||
|
||||
# Set actual options
|
||||
set(KOKKOS_ENABLE_HPX_ASYNC_DISPATCH ${KOKKOS_INTERNAL_ENABLE_HPX_ASYNC_DISPATCH_DEFAULT} CACHE BOOL "Enable HPX async dispatch.")
|
||||
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#----------------------- HOST ARCH AND LEGACY TRIBITS --------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# This defines the previous legacy TriBITS builds.
|
||||
set(KOKKOS_LEGACY_TRIBITS False)
|
||||
IF ("${KOKKOS_ARCH}" STREQUAL "NOT_SET")
|
||||
set(KOKKOS_ARCH "None")
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
set(KOKKOS_LEGACY_TRIBITS True)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
IF (KOKKOS_LEGACY_TRIBITS)
|
||||
message(STATUS "Using the legacy tribits build because KOKKOS_ARCH not set")
|
||||
ELSE()
|
||||
message(STATUS "NOT using the legacy tribits build because KOKKOS_ARCH *is* set")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#----------------------- Set CamelCase Options if they are not yet set ---------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
foreach(opt ${KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST})
|
||||
string(TOUPPER ${opt} OPT )
|
||||
UNSET(KOKKOS_ENABLE_${OPT}_INTERNAL CACHE)
|
||||
SET(KOKKOS_ENABLE_${OPT}_INTERNAL ${KOKKOS_ENABLE_${OPT}} CACHE BOOL INTERNAL)
|
||||
IF(DEFINED KOKKOS_ENABLE_${OPT})
|
||||
UNSET(Kokkos_ENABLE_${opt} CACHE)
|
||||
SET(Kokkos_ENABLE_${opt} ${KOKKOS_ENABLE_${OPT}} CACHE BOOL "CamelCase Compatibility setting for KOKKOS_ENABLE_${OPT}")
|
||||
ENDIF()
|
||||
endforeach()
|
||||
46
lib/kokkos/cmake/kokkos_pick_cxx_std.cmake
Normal file
@ -0,0 +1,46 @@
|
||||
# From CMake 3.10 documentation
|
||||
|
||||
#This can run at any time
|
||||
KOKKOS_OPTION(CXX_STANDARD "" STRING "The C++ standard for Kokkos to use: 11, 14, 17, or 20. If empty, this will default to CMAKE_CXX_STANDARD. If both CMAKE_CXX_STANDARD and Kokkos_CXX_STANDARD are empty, this will default to 11")
|
||||
|
||||
# Set CXX standard flags
|
||||
SET(KOKKOS_ENABLE_CXX11 OFF)
|
||||
SET(KOKKOS_ENABLE_CXX14 OFF)
|
||||
SET(KOKKOS_ENABLE_CXX17 OFF)
|
||||
SET(KOKKOS_ENABLE_CXX20 OFF)
|
||||
IF (KOKKOS_CXX_STANDARD)
|
||||
IF (${KOKKOS_CXX_STANDARD} STREQUAL "c++98")
|
||||
MESSAGE(FATAL_ERROR "Kokkos no longer supports C++98 - minimum C++11")
|
||||
ELSEIF (${KOKKOS_CXX_STANDARD} STREQUAL "c++11")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++11'. Use '11' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "11")
|
||||
ELSEIF(${KOKKOS_CXX_STANDARD} STREQUAL "c++14")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++14'. Use '14' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "14")
|
||||
ELSEIF(${KOKKOS_CXX_STANDARD} STREQUAL "c++17")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++17'. Use '17' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "17")
|
||||
ELSEIF(${KOKKOS_CXX_STANDARD} STREQUAL "c++1y")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++1y'. Use '1Y' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "1Y")
|
||||
ELSEIF(${KOKKOS_CXX_STANDARD} STREQUAL "c++1z")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++1z'. Use '1Z' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "1Z")
|
||||
ELSEIF(${KOKKOS_CXX_STANDARD} STREQUAL "c++2a")
|
||||
MESSAGE(WARNING "Deprecated Kokkos C++ standard set as 'c++2a'. Use '2A' instead.")
|
||||
SET(KOKKOS_CXX_STANDARD "2A")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF (NOT KOKKOS_CXX_STANDARD AND NOT CMAKE_CXX_STANDARD)
|
||||
MESSAGE(STATUS "Setting default Kokkos CXX standard to 11")
|
||||
SET(KOKKOS_CXX_STANDARD "11")
|
||||
ELSEIF(NOT KOKKOS_CXX_STANDARD)
|
||||
MESSAGE(STATUS "Setting default Kokkos CXX standard to ${CMAKE_CXX_STANDARD}")
|
||||
SET(KOKKOS_CXX_STANDARD ${CMAKE_CXX_STANDARD})
|
||||
ENDIF()
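# Usage sketch (illustrative, not part of the original file). Assuming the
# KOKKOS_OPTION helper exposes this option on the command line as
# Kokkos_CXX_STANDARD, as the docstring above suggests, the logic in this
# file resolves the standard roughly as follows:
#   cmake -DKokkos_CXX_STANDARD=14 ..     -> KOKKOS_CXX_STANDARD is "14"
#   cmake -DCMAKE_CXX_STANDARD=17 ..      -> KOKKOS_CXX_STANDARD is "17"
#   cmake ..                              -> KOKKOS_CXX_STANDARD is "11"
#   cmake -DKokkos_CXX_STANDARD=c++14 ..  -> deprecation warning, then "14"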
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@ -1,259 +0,0 @@
|
||||
########################## NOTES ###############################################
|
||||
# This file's goal is to take CMake options found in kokkos_options.cmake but
|
||||
# possibly set from elsewhere
|
||||
# (see: trilinos/cmake/ProjectCompilerPostConfig.cmake)
|
||||
# using CMake idioms and map them onto the KOKKOS_SETTINGS variable that gets
|
||||
# passed to the kokkos makefile configuration:
|
||||
# make -f ${CMAKE_SOURCE_DIR}/core/src/Makefile ${KOKKOS_SETTINGS} build-makefile-cmake-kokkos
|
||||
# that generates KokkosCore_config.h and kokkos_generated_settings.cmake
|
||||
# To understand how to form KOKKOS_SETTINGS, see
|
||||
# <KOKKOS_PATH>/Makefile.kokkos
|
||||
|
||||
#-------------------------------------------------------------------------------
|
||||
#------------------------------- GENERAL OPTIONS -------------------------------
|
||||
#-------------------------------------------------------------------------------
|
||||
|
||||
# Ensure that KOKKOS_ARCH is in the ARCH_LIST
|
||||
if (KOKKOS_ARCH MATCHES ",")
|
||||
message("-- Detected a comma in: KOKKOS_ARCH=`${KOKKOS_ARCH}`")
|
||||
message("-- Although we prefer KOKKOS_ARCH to be semicolon-delimited, we do allow")
|
||||
message("-- comma-delimited values for compatibility with scripts (see github.com/trilinos/Trilinos/issues/2330)")
|
||||
string(REPLACE "," ";" KOKKOS_ARCH "${KOKKOS_ARCH}")
|
||||
message("-- Commas were changed to semicolons, now KOKKOS_ARCH=`${KOKKOS_ARCH}`")
|
||||
endif()
|
||||
foreach(arch ${KOKKOS_ARCH})
|
||||
list(FIND KOKKOS_ARCH_LIST ${arch} indx)
|
||||
if (indx EQUAL -1)
|
||||
message(FATAL_ERROR "`${arch}` is not an accepted value in KOKKOS_ARCH=`${KOKKOS_ARCH}`."
|
||||
" Please pick from these choices: ${KOKKOS_INTERNAL_ARCH_DOCSTR}")
|
||||
endif ()
|
||||
endforeach()
|
||||
|
||||
# KOKKOS_SETTINGS uses KOKKOS_ARCH
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_ARCH "${KOKKOS_ARCH}")
|
||||
|
||||
# From Makefile.kokkos: Options: yes,no
|
||||
if(${KOKKOS_ENABLE_DEBUG})
|
||||
set(KOKKOS_GMAKE_DEBUG yes)
|
||||
else()
|
||||
set(KOKKOS_GMAKE_DEBUG no)
|
||||
endif()
|
||||
|
||||
#------------------------------- KOKKOS_DEVICES --------------------------------
|
||||
# Can have multiple devices
|
||||
set(KOKKOS_DEVICESl)
|
||||
foreach(devopt ${KOKKOS_DEVICES_LIST})
|
||||
string(TOUPPER ${devopt} devoptuc)
|
||||
if (${KOKKOS_ENABLE_${devoptuc}})
|
||||
list(APPEND KOKKOS_DEVICESl ${devopt})
|
||||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_DEVICES "${KOKKOS_DEVICESl}")
|
||||
|
||||
#------------------------------- KOKKOS_OPTIONS --------------------------------
|
||||
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
|
||||
#compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print
|
||||
|
||||
set(KOKKOS_OPTIONSl)
|
||||
if(${KOKKOS_ENABLE_COMPILER_WARNINGS})
|
||||
list(APPEND KOKKOS_OPTIONSl compiler_warnings)
|
||||
endif()
|
||||
if(${KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION})
|
||||
list(APPEND KOKKOS_OPTIONSl aggressive_vectorization)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_PROFILING})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_profiling)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_DEPRECATED_CODE})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_deprecated_code)
|
||||
endif()
|
||||
if(NOT ${KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK})
|
||||
list(APPEND KOKKOS_OPTIONSl disable_dualview_modify_check)
|
||||
endif()
|
||||
if(${KOKKOS_ENABLE_PROFILING_LOAD_PRINT})
|
||||
list(APPEND KOKKOS_OPTIONSl enable_profile_load_print)
|
||||
endif()
|
||||
if(${KOKKOS_ENABLE_EXPLICIT_INSTANTIATION})
|
||||
list(APPEND KOKKOS_OPTIONSl enable_eti)
|
||||
endif()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_OPTIONS "${KOKKOS_OPTIONSl}")
|
||||
|
||||
|
||||
#------------------------------- KOKKOS_USE_TPLS -------------------------------
|
||||
# Construct the Makefile options
|
||||
set(KOKKOS_USE_TPLSl)
|
||||
foreach(tplopt ${KOKKOS_USE_TPLS_LIST})
|
||||
if (${KOKKOS_ENABLE_${tplopt}})
|
||||
list(APPEND KOKKOS_USE_TPLSl ${KOKKOS_INTERNAL_${tplopt}})
|
||||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_USE_TPLS "${KOKKOS_USE_TPLSl}")
|
||||
|
||||
|
||||
#------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
|
||||
# Construct the Makefile options
|
||||
set(KOKKOS_CUDA_OPTIONSl)
|
||||
foreach(cudaopt ${KOKKOS_CUDA_OPTIONS_LIST})
|
||||
if (${KOKKOS_ENABLE_CUDA_${cudaopt}})
|
||||
list(APPEND KOKKOS_CUDA_OPTIONSl ${KOKKOS_INTERNAL_${cudaopt}})
|
||||
endif ()
|
||||
endforeach()
|
||||
# List needs to be comma-delimited
|
||||
string(REPLACE ";" "," KOKKOS_GMAKE_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
|
||||
|
||||
#------------------------------- PATH VARIABLES --------------------------------
|
||||
# Want makefile to use same executables specified which means modifying
|
||||
# the path so the $(shell ...) commands in the makefile see the right exec
|
||||
# Also, the Makefiles use the FOO_PATH naming scheme for -I/-L construction
|
||||
#TODO: Makefile.kokkos allows this to be overwritten? ROCM_HCC_PATH
|
||||
|
||||
set(KOKKOS_INTERNAL_PATHS)
|
||||
set(addpathl)
|
||||
foreach(kvar IN LISTS KOKKOS_USE_TPLS_LIST ITEMS CUDA QTHREADS)
|
||||
if(${KOKKOS_ENABLE_${kvar}})
|
||||
if(DEFINED KOKKOS_${kvar}_DIR)
|
||||
set(KOKKOS_INTERNAL_PATHS ${KOKKOS_INTERNAL_PATHS} "${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
|
||||
if(IS_DIRECTORY ${KOKKOS_${kvar}_DIR}/bin)
|
||||
list(APPEND addpathl ${KOKKOS_${kvar}_DIR}/bin)
|
||||
endif()
|
||||
endif()
|
||||
endif()
|
||||
endforeach()
|
||||
# The PATH environment variable is colon-delimited
|
||||
string(REPLACE ";" ":" KOKKOS_INTERNAL_ADDTOPATH "${addpathl}")
|
||||
|
||||
|
||||
######################### SET KOKKOS_SETTINGS ##################################
|
||||
# Set the KOKKOS_SETTINGS String -- this is the primary communication with the
|
||||
# makefile configuration. See Makefile.kokkos
|
||||
|
||||
set(KOKKOS_SETTINGS KOKKOS_CMAKE=yes)
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_SRC_PATH=${KOKKOS_SRC_PATH})
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_PATH=${KOKKOS_PATH})
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_INSTALL_PATH=${CMAKE_INSTALL_PREFIX})
|
||||
|
||||
# Form of KOKKOS_foo=$KOKKOS_foo
|
||||
foreach(kvar ARCH;DEVICES;DEBUG;OPTIONS;CUDA_OPTIONS;USE_TPLS)
|
||||
if(DEFINED KOKKOS_GMAKE_${kvar})
|
||||
if (NOT "${KOKKOS_GMAKE_${kvar}}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_${kvar}=${KOKKOS_GMAKE_${kvar}})
|
||||
endif()
|
||||
endif()
|
||||
endforeach()
|
||||
|
||||
# Form of VAR=VAL
|
||||
#TODO: Makefile supports MPICH_CXX, OMPI_CXX as well
|
||||
foreach(ovar CXX;CXXFLAGS;LDFLAGS)
|
||||
if(DEFINED ${ovar})
|
||||
if (NOT "${${ovar}}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${ovar}=${${ovar}})
|
||||
endif()
|
||||
endif()
|
||||
endforeach()
|
||||
|
||||
# Finally, do the paths
|
||||
if (NOT "${KOKKOS_INTERNAL_PATHS}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_INTERNAL_PATHS})
|
||||
endif()
|
||||
if (NOT "${KOKKOS_INTERNAL_ADDTOPATH}" STREQUAL "")
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} "PATH=${KOKKOS_INTERNAL_ADDTOPATH}:$ENV{PATH}")
|
||||
endif()
|
||||
|
||||
if (CMAKE_CXX_STANDARD)
|
||||
if (CMAKE_CXX_STANDARD STREQUAL "98")
|
||||
message(FATAL_ERROR "Kokkos requires C++11 or newer!")
|
||||
endif()
|
||||
set(KOKKOS_CXX_STANDARD "c++${CMAKE_CXX_STANDARD}")
|
||||
if (CMAKE_CXX_EXTENSIONS)
|
||||
if (CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
|
||||
set(KOKKOS_CXX_STANDARD "gnu++${CMAKE_CXX_STANDARD}")
|
||||
endif()
|
||||
endif()
|
||||
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} "KOKKOS_CXX_STANDARD=\"${KOKKOS_CXX_STANDARD}\"")
|
||||
endif()
|
||||
|
||||
# Final form that gets passed to make
|
||||
set(KOKKOS_SETTINGS env ${KOKKOS_SETTINGS})
|
||||
|
||||
|
||||
############################ PRINT CONFIGURE STATUS ############################
|
||||
|
||||
if(KOKKOS_CMAKE_VERBOSE)
|
||||
message(STATUS "")
|
||||
message(STATUS "****************** Kokkos Settings ******************")
|
||||
message(STATUS "Execution Spaces")
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA)
|
||||
message(STATUS " Device Parallel: Cuda")
|
||||
else()
|
||||
message(STATUS " Device Parallel: None")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_OPENMP)
|
||||
message(STATUS " Host Parallel: OpenMP")
|
||||
elseif(KOKKOS_ENABLE_PTHREAD)
|
||||
message(STATUS " Host Parallel: Pthread")
|
||||
elseif(KOKKOS_ENABLE_QTHREADS)
|
||||
message(STATUS " Host Parallel: Qthreads")
|
||||
elseif(KOKKOS_ENABLE_HPX)
|
||||
message(STATUS " Host Parallel: HPX")
|
||||
else()
|
||||
message(STATUS " Host Parallel: None")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_ENABLE_SERIAL)
|
||||
message(STATUS " Host Serial: Serial")
|
||||
else()
|
||||
message(STATUS " Host Serial: None")
|
||||
endif()
|
||||
|
||||
message(STATUS "")
|
||||
message(STATUS "Architectures:")
|
||||
message(STATUS " ${KOKKOS_GMAKE_ARCH}")
|
||||
|
||||
message(STATUS "")
|
||||
message(STATUS "Enabled options")
|
||||
|
||||
if(KOKKOS_SEPARATE_LIBS)
|
||||
message(STATUS " KOKKOS_SEPARATE_LIBS")
|
||||
endif()
|
||||
|
||||
foreach(opt IN LISTS KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST)
|
||||
string(TOUPPER ${opt} OPT)
|
||||
if (KOKKOS_ENABLE_${OPT})
|
||||
message(STATUS " KOKKOS_ENABLE_${OPT}")
|
||||
endif()
|
||||
endforeach()
|
||||
|
||||
if(KOKKOS_ENABLE_CUDA)
|
||||
if(KOKKOS_CUDA_DIR)
|
||||
message(STATUS " KOKKOS_CUDA_DIR: ${KOKKOS_CUDA_DIR}")
|
||||
endif()
|
||||
endif()
|
||||
|
||||
if(KOKKOS_QTHREADS_DIR)
|
||||
message(STATUS " KOKKOS_QTHREADS_DIR: ${KOKKOS_QTHREADS_DIR}")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_HWLOC_DIR)
|
||||
message(STATUS " KOKKOS_HWLOC_DIR: ${KOKKOS_HWLOC_DIR}")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_MEMKIND_DIR)
|
||||
message(STATUS " KOKKOS_MEMKIND_DIR: ${KOKKOS_MEMKIND_DIR}")
|
||||
endif()
|
||||
|
||||
if(KOKKOS_HPX_DIR)
|
||||
message(STATUS " KOKKOS_HPX_DIR: ${KOKKOS_HPX_DIR}")
|
||||
endif()
|
||||
|
||||
message(STATUS "")
|
||||
message(STATUS "Final kokkos settings variable:")
|
||||
message(STATUS " ${KOKKOS_SETTINGS}")
|
||||
|
||||
message(STATUS "*****************************************************")
|
||||
message(STATUS "")
|
||||
endif()
|
||||
144
lib/kokkos/cmake/kokkos_test_cxx_std.cmake
Normal file
@ -0,0 +1,144 @@
|
||||
KOKKOS_CFG_DEPENDS(CXX_STD COMPILER_ID)
|
||||
|
||||
FUNCTION(kokkos_set_cxx_standard_feature standard)
|
||||
SET(EXTENSION_NAME CMAKE_CXX${standard}_EXTENSION_COMPILE_OPTION)
|
||||
SET(STANDARD_NAME CMAKE_CXX${standard}_STANDARD_COMPILE_OPTION)
|
||||
SET(FEATURE_NAME cxx_std_${standard})
|
||||
#CMake's way of telling us that the standard (or extension)
|
||||
#flags are supported is the extension/standard variables
|
||||
IF (NOT DEFINED CMAKE_CXX_EXTENSIONS)
|
||||
IF(KOKKOS_DONT_ALLOW_EXTENSIONS)
|
||||
GLOBAL_SET(KOKKOS_USE_CXX_EXTENSIONS OFF)
|
||||
ELSE()
|
||||
GLOBAL_SET(KOKKOS_USE_CXX_EXTENSIONS ON)
|
||||
ENDIF()
|
||||
ELSEIF(CMAKE_CXX_EXTENSIONS)
|
||||
IF(KOKKOS_DONT_ALLOW_EXTENSIONS)
|
||||
MESSAGE(FATAL_ERROR "The chosen configuration does not support CXX extensions flags: ${KOKKOS_DONT_ALLOW_EXTENSIONS}. Must set CMAKE_CXX_EXTENSIONS=OFF to continue")
|
||||
ELSE()
|
||||
GLOBAL_SET(KOKKOS_USE_CXX_EXTENSIONS ON)
|
||||
ENDIF()
|
||||
ELSE()
|
||||
#For trilinos, we need to make sure downstream projects
|
||||
GLOBAL_SET(KOKKOS_USE_CXX_EXTENSIONS OFF)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_USE_CXX_EXTENSIONS AND ${EXTENSION_NAME})
|
||||
MESSAGE(STATUS "Using ${${EXTENSION_NAME}} for C++${standard} extensions as feature")
|
||||
GLOBAL_SET(KOKKOS_CXX_STANDARD_FEATURE ${FEATURE_NAME})
|
||||
ELSEIF(NOT KOKKOS_USE_CXX_EXTENSIONS AND ${STANDARD_NAME})
|
||||
MESSAGE(STATUS "Using ${${STANDARD_NAME}} for C++${standard} standard as feature")
|
||||
GLOBAL_SET(KOKKOS_CXX_STANDARD_FEATURE ${FEATURE_NAME})
|
||||
ELSE()
|
||||
#nope, we can't do anything here
|
||||
MESSAGE(WARNING "C++${standard} is not supported as a compiler feature. We will choose custom flags for now, but this behavior has been deprecated. Please open an issue at https://github.com/kokkos/kokkos/issues reporting that ${KOKKOS_CXX_COMPILER_ID} ${KOKKOS_CXX_COMPILER_VERSION} failed for ${KOKKOS_CXX_STANDARD}, preferably including your CMake command.")
|
||||
GLOBAL_SET(KOKKOS_CXX_STANDARD_FEATURE "")
|
||||
ENDIF()
|
||||
|
||||
IF(NOT ${FEATURE_NAME} IN_LIST CMAKE_CXX_COMPILE_FEATURES)
|
||||
MESSAGE(FATAL_ERROR "Compiler ${KOKKOS_CXX_COMPILER_ID} should support ${FEATURE_NAME}, but CMake reports feature not supported")
|
||||
ENDIF()
|
||||
ENDFUNCTION()
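# Illustration (a sketch, not part of the original file): for a call such as
#   kokkos_set_cxx_standard_feature(14)
# the function consults these CMake-provided variables:
#   CMAKE_CXX14_EXTENSION_COMPILE_OPTION  (with a recent GCC, typically -std=gnu++14)
#   CMAKE_CXX14_STANDARD_COMPILE_OPTION   (with a recent GCC, typically -std=c++14)
# If the relevant one is set, KOKKOS_CXX_STANDARD_FEATURE becomes cxx_std_14;
# otherwise it is left empty and a warning is issued. In every case cxx_std_14
# must also appear in CMAKE_CXX_COMPILE_FEATURES, or configuration stops with a
# fatal error. The flag values shown are assumptions about a typical GCC
# toolchain and may differ for other compilers or versions.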
|
||||
|
||||
|
||||
IF (KOKKOS_CXX_STANDARD AND CMAKE_CXX_STANDARD)
|
||||
#make sure these are consistent
|
||||
IF (NOT KOKKOS_CXX_STANDARD STREQUAL CMAKE_CXX_STANDARD)
|
||||
MESSAGE(WARNING "Specified both CMAKE_CXX_STANDARD=${CMAKE_CXX_STANDARD} and KOKKOS_CXX_STANDARD=${KOKKOS_CXX_STANDARD}, but they don't match")
|
||||
SET(CMAKE_CXX_STANDARD ${KOKKOS_CXX_STANDARD} CACHE STRING "C++ standard" FORCE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
|
||||
IF (KOKKOS_CXX_STANDARD STREQUAL "11" )
|
||||
kokkos_set_cxx_standard_feature(11)
|
||||
SET(KOKKOS_ENABLE_CXX11 ON)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD "11")
|
||||
ELSEIF(KOKKOS_CXX_STANDARD STREQUAL "14")
|
||||
kokkos_set_cxx_standard_feature(14)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD "1Y")
|
||||
SET(KOKKOS_ENABLE_CXX14 ON)
|
||||
ELSEIF(KOKKOS_CXX_STANDARD STREQUAL "17")
|
||||
kokkos_set_cxx_standard_feature(17)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD "1Z")
|
||||
SET(KOKKOS_ENABLE_CXX17 ON)
|
||||
ELSEIF(KOKKOS_CXX_STANDARD STREQUAL "20")
|
||||
kokkos_set_cxx_standard_feature(20)
|
||||
SET(KOKKOS_CXX_INTERMEDIATE_STANDARD "2A")
|
||||
SET(KOKKOS_ENABLE_CXX20 ON)
|
||||
ELSEIF(KOKKOS_CXX_STANDARD STREQUAL "98")
|
||||
MESSAGE(FATAL_ERROR "Kokkos requires C++11 or newer!")
|
||||
ELSE()
|
||||
MESSAGE(FATAL_ERROR "Unknown C++ standard ${KOKKOS_CXX_STANDARD} - must be 11, 14, 17, or 20")
|
||||
ENDIF()
|
||||
|
||||
|
||||
|
||||
# Enforce that extensions are turned off for nvcc_wrapper.
|
||||
# For compiling CUDA code using nvcc_wrapper, we will use the host compiler's
|
||||
# flags for turning on C++11. Since for compiler ID and versioning purposes
|
||||
# CMake recognizes the host compiler when calling nvcc_wrapper, this just
|
||||
# works. Both NVCC and nvcc_wrapper only recognize '-std=c++11' which means
|
||||
# that we can only use host compilers for CUDA builds that use those flags.
|
||||
# It also means that extensions (gnu++11) can't be turned on for CUDA builds.
|
||||
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
IF(NOT DEFINED CMAKE_CXX_EXTENSIONS)
|
||||
SET(CMAKE_CXX_EXTENSIONS OFF)
|
||||
ELSEIF(CMAKE_CXX_EXTENSIONS)
|
||||
MESSAGE(FATAL_ERROR "NVCC doesn't support C++ extensions. Set -DCMAKE_CXX_EXTENSIONS=OFF")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF(KOKKOS_ENABLE_CUDA)
|
||||
# ENFORCE that the compiler can compile CUDA code.
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 4.0.0)
|
||||
MESSAGE(FATAL_ERROR "Compiling CUDA code directly with Clang requires version 4.0.0 or higher.")
|
||||
ENDIF()
|
||||
IF(NOT DEFINED CMAKE_CXX_EXTENSIONS)
|
||||
SET(CMAKE_CXX_EXTENSIONS OFF)
|
||||
ELSEIF(CMAKE_CXX_EXTENSIONS)
|
||||
MESSAGE(FATAL_ERROR "Compiling CUDA code with clang doesn't support C++ extensions. Set -DCMAKE_CXX_EXTENSIONS=OFF")
|
||||
ENDIF()
|
||||
ELSEIF(NOT KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||
MESSAGE(FATAL_ERROR "Invalid compiler for CUDA. The compiler must be nvcc_wrapper or Clang, but compiler ID was ${KOKKOS_CXX_COMPILER_ID}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
IF (NOT KOKKOS_CXX_STANDARD_FEATURE)
|
||||
#we need to pick the C++ flags ourselves
|
||||
UNSET(CMAKE_CXX_STANDARD)
|
||||
UNSET(CMAKE_CXX_STANDARD CACHE)
|
||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Cray)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/cray.cmake)
|
||||
kokkos_set_cray_flags(${KOKKOS_CXX_STANDARD} ${KOKKOS_CXX_INTERMEDIATE_STANDARD})
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL PGI)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/pgi.cmake)
|
||||
kokkos_set_pgi_flags(${KOKKOS_CXX_STANDARD} ${KOKKOS_CXX_INTERMEDIATE_STANDARD})
|
||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Intel)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/intel.cmake)
|
||||
kokkos_set_intel_flags(${KOKKOS_CXX_STANDARD} ${KOKKOS_CXX_INTERMEDIATE_STANDARD})
|
||||
ELSE()
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/gnu.cmake)
|
||||
kokkos_set_gnu_flags(${KOKKOS_CXX_STANDARD} ${KOKKOS_CXX_INTERMEDIATE_STANDARD})
|
||||
ENDIF()
|
||||
#check that the compiler accepts the C++ standard flag
|
||||
INCLUDE(CheckCXXCompilerFlag)
|
||||
IF (DEFINED CXX_STD_FLAGS_ACCEPTED)
|
||||
UNSET(CXX_STD_FLAGS_ACCEPTED CACHE)
|
||||
ENDIF()
|
||||
CHECK_CXX_COMPILER_FLAG(${KOKKOS_CXX_STANDARD_FLAG} CXX_STD_FLAGS_ACCEPTED)
|
||||
IF (NOT CXX_STD_FLAGS_ACCEPTED)
|
||||
CHECK_CXX_COMPILER_FLAG(${KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG} CXX_INT_STD_FLAGS_ACCEPTED)
|
||||
IF (NOT CXX_INT_STD_FLAGS_ACCEPTED)
|
||||
MESSAGE(FATAL_ERROR "${KOKKOS_CXX_COMPILER_ID} did not accept ${KOKKOS_CXX_STANDARD_FLAG} or ${KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG}. You likely need to reduce the level of the C++ standard from ${KOKKOS_CXX_STANDARD}")
|
||||
ENDIF()
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG ${KOKKOS_CXX_INTERMEDIATE_STANDARD_FLAG})
|
||||
ENDIF()
|
||||
MESSAGE(STATUS "Compiler features not supported, but ${KOKKOS_CXX_COMPILER_ID} accepts ${KOKKOS_CXX_STANDARD_FLAG}")
|
||||
ENDIF()
|
||||
|
||||
|
||||
|
||||
|
||||
47
lib/kokkos/cmake/kokkos_tpls.cmake
Normal file
@ -0,0 +1,47 @@
|
||||
KOKKOS_CFG_DEPENDS(TPLS OPTIONS)
|
||||
KOKKOS_CFG_DEPENDS(TPLS DEVICES)
|
||||
|
||||
FUNCTION(KOKKOS_TPL_OPTION PKG DEFAULT)
|
||||
KOKKOS_ENABLE_OPTION(${PKG} ${DEFAULT} "Whether to enable the ${PKG} library")
|
||||
KOKKOS_OPTION(${PKG}_DIR "" PATH "Location of ${PKG} library")
|
||||
SET(KOKKOS_ENABLE_${PKG} ${KOKKOS_ENABLE_${PKG}} PARENT_SCOPE)
|
||||
SET(KOKKOS_${PKG}_DIR ${KOKKOS_${PKG}_DIR} PARENT_SCOPE)
|
||||
ENDFUNCTION()
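# Usage sketch (illustrative, not part of the original file). Each
# KOKKOS_TPL_OPTION call below defines an enable switch and a search-path
# option for one TPL. Assuming the option helpers expose CamelCase names on
# the command line (Kokkos_ENABLE_<PKG> and Kokkos_<PKG>_DIR), a user could
# enable hwloc from a custom prefix with:
#   cmake -DKokkos_ENABLE_HWLOC=On -DKokkos_HWLOC_DIR=/path/to/hwloc ..
# The path is a placeholder.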
|
||||
|
||||
KOKKOS_TPL_OPTION(HWLOC Off)
|
||||
KOKKOS_TPL_OPTION(LIBNUMA Off)
|
||||
KOKKOS_TPL_OPTION(MEMKIND Off)
|
||||
KOKKOS_TPL_OPTION(CUDA Off)
|
||||
KOKKOS_TPL_OPTION(LIBRT Off)
|
||||
KOKKOS_TPL_OPTION(LIBDL On)
|
||||
|
||||
IF(Trilinos_ENABLE_Kokkos AND TPL_ENABLE_HPX)
|
||||
SET(HPX_DEFAULT ON)
|
||||
ELSE()
|
||||
SET(HPX_DEFAULT OFF)
|
||||
ENDIF()
|
||||
KOKKOS_TPL_OPTION(HPX ${HPX_DEFAULT})
|
||||
|
||||
IF(Trilinos_ENABLE_Kokkos AND TPL_ENABLE_PTHREAD)
|
||||
SET(PTHREAD_DEFAULT ON)
|
||||
ELSE()
|
||||
SET(PTHREAD_DEFAULT OFF)
|
||||
ENDIF()
|
||||
KOKKOS_TPL_OPTION(PTHREAD ${PTHREAD_DEFAULT})
|
||||
|
||||
|
||||
#Make sure we use our local FindKokkosCuda.cmake
|
||||
KOKKOS_IMPORT_TPL(HPX INTERFACE)
|
||||
KOKKOS_IMPORT_TPL(CUDA INTERFACE)
|
||||
KOKKOS_IMPORT_TPL(HWLOC)
|
||||
KOKKOS_IMPORT_TPL(LIBNUMA)
|
||||
KOKKOS_IMPORT_TPL(LIBRT)
|
||||
KOKKOS_IMPORT_TPL(LIBDL)
|
||||
KOKKOS_IMPORT_TPL(MEMKIND)
|
||||
KOKKOS_IMPORT_TPL(PTHREAD INTERFACE)
|
||||
|
||||
#Convert list to newlines (which CMake doesn't always like in cache variables)
|
||||
STRING(REPLACE ";" "\n" KOKKOS_TPL_EXPORT_TEMP "${KOKKOS_TPL_EXPORTS}")
|
||||
#Convert to a regular variable
|
||||
UNSET(KOKKOS_TPL_EXPORTS CACHE)
|
||||
SET(KOKKOS_TPL_EXPORTS ${KOKKOS_TPL_EXPORT_TEMP})
|
||||
392
lib/kokkos/cmake/kokkos_tribits.cmake
Normal file
@ -0,0 +1,392 @@
|
||||
#These are tribits wrappers only ever called by Kokkos itself
|
||||
|
||||
INCLUDE(CMakeParseArguments)
|
||||
INCLUDE(CTest)
|
||||
INCLUDE(GNUInstallDirs)
|
||||
|
||||
MESSAGE(STATUS "The project name is: ${PROJECT_NAME}")
|
||||
|
||||
#Leave this here for now - but only do for tribits
|
||||
#This breaks the standalone CMake
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP)
|
||||
SET(${PROJECT_NAME}_ENABLE_OpenMP OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_HPX)
|
||||
SET(${PROJECT_NAME}_ENABLE_HPX OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_DEBUG)
|
||||
SET(${PROJECT_NAME}_ENABLE_DEBUG OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_CXX11)
|
||||
SET(${PROJECT_NAME}_ENABLE_CXX11 ON)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_TESTS)
|
||||
SET(${PROJECT_NAME}_ENABLE_TESTS OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED TPL_ENABLE_Pthread)
|
||||
SET(TPL_ENABLE_Pthread OFF)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
MACRO(KOKKOS_SUBPACKAGE NAME)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_SUBPACKAGE(${NAME})
|
||||
else()
|
||||
SET(PACKAGE_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
SET(PARENT_PACKAGE_NAME ${PACKAGE_NAME})
|
||||
SET(PACKAGE_NAME ${PACKAGE_NAME}${NAME})
|
||||
STRING(TOUPPER ${PACKAGE_NAME} PACKAGE_NAME_UC)
|
||||
SET(${PACKAGE_NAME}_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
endif()
|
||||
ENDMACRO()
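# Example (a sketch, not part of the original file): in a standalone build,
# once KOKKOS_PACKAGE_DECL() has set PACKAGE_NAME to Kokkos, a call such as
#   KOKKOS_SUBPACKAGE(Core)
# leaves PARENT_PACKAGE_NAME=Kokkos, PACKAGE_NAME=KokkosCore,
# PACKAGE_NAME_UC=KOKKOSCORE, and KokkosCore_SOURCE_DIR pointing at the
# current source directory. The subpackage name Core is used for illustration.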
|
||||
|
||||
MACRO(KOKKOS_SUBPACKAGE_POSTPROCESS)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_SUBPACKAGE_POSTPROCESS()
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(KOKKOS_PACKAGE_DECL)
|
||||
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_PACKAGE_DECL(Kokkos)
|
||||
else()
|
||||
SET(PACKAGE_NAME Kokkos)
|
||||
SET(${PACKAGE_NAME}_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
STRING(TOUPPER ${PACKAGE_NAME} PACKAGE_NAME_UC)
|
||||
endif()
|
||||
|
||||
#SET(TRIBITS_DEPS_DIR "${CMAKE_SOURCE_DIR}/cmake/deps")
|
||||
#FILE(GLOB TPLS_FILES "${TRIBITS_DEPS_DIR}/*.cmake")
|
||||
#FOREACH(TPL_FILE ${TPLS_FILES})
|
||||
# TRIBITS_PROCESS_TPL_DEP_FILE(${TPL_FILE})
|
||||
#ENDFOREACH()
|
||||
|
||||
ENDMACRO()
|
||||
|
||||
|
||||
MACRO(KOKKOS_PROCESS_SUBPACKAGES)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_PROCESS_SUBPACKAGES()
|
||||
else()
|
||||
ADD_SUBDIRECTORY(core)
|
||||
ADD_SUBDIRECTORY(containers)
|
||||
ADD_SUBDIRECTORY(algorithms)
|
||||
ADD_SUBDIRECTORY(example)
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(KOKKOS_PACKAGE_DEF)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_PACKAGE_DEF()
|
||||
else()
|
||||
#do nothing
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(KOKKOS_INTERNAL_ADD_LIBRARY_INSTALL LIBRARY_NAME)
|
||||
KOKKOS_LIB_TYPE(${LIBRARY_NAME} INCTYPE)
|
||||
TARGET_INCLUDE_DIRECTORIES(${LIBRARY_NAME} ${INCTYPE} $<INSTALL_INTERFACE:${KOKKOS_HEADER_DIR}>)
|
||||
|
||||
INSTALL(
|
||||
TARGETS ${LIBRARY_NAME}
|
||||
EXPORT ${PROJECT_NAME}
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
|
||||
INSTALL(
|
||||
TARGETS ${LIBRARY_NAME}
|
||||
EXPORT KokkosTargets
|
||||
RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR}
|
||||
LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
|
||||
)
|
||||
|
||||
VERIFY_EMPTY(KOKKOS_ADD_LIBRARY ${PARSE_UNPARSED_ARGUMENTS})
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_EXECUTABLE EXE_NAME)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_EXECUTABLE(${EXE_NAME} ${ARGN})
|
||||
else()
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
"TESTONLY"
|
||||
""
|
||||
"SOURCES;TESTONLYLIBS"
|
||||
${ARGN})
|
||||
|
||||
ADD_EXECUTABLE(${EXE_NAME} ${PARSE_SOURCES})
|
||||
IF (PARSE_TESTONLYLIBS)
|
||||
TARGET_LINK_LIBRARIES(${EXE_NAME} ${PARSE_TESTONLYLIBS})
|
||||
ENDIF()
|
||||
VERIFY_EMPTY(KOKKOS_ADD_EXECUTABLE ${PARSE_UNPARSED_ARGUMENTS})
|
||||
endif()
|
||||
ENDFUNCTION()
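# Usage sketch (illustrative; target and file names are hypothetical):
#   KOKKOS_ADD_EXECUTABLE(my_example SOURCES main.cpp)
#   KOKKOS_ADD_EXECUTABLE(my_test SOURCES test_main.cpp TESTONLYLIBS kokkos_gtest)
# In the standalone branch, SOURCES feeds ADD_EXECUTABLE and TESTONLYLIBS is
# passed to TARGET_LINK_LIBRARIES. Any unrecognized argument is handed to
# VERIFY_EMPTY, which is defined elsewhere and presumably reports it as an error.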
|
||||
|
||||
IF(NOT TARGET check)
|
||||
ADD_CUSTOM_TARGET(check COMMAND ${CMAKE_CTEST_COMMAND} -VV -C ${CMAKE_CFG_INTDIR})
|
||||
ENDIF()
|
||||
|
||||
|
||||
FUNCTION(KOKKOS_ADD_EXECUTABLE_AND_TEST ROOT_NAME)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_EXECUTABLE_AND_TEST(
|
||||
${ROOT_NAME}
|
||||
TESTONLYLIBS kokkos_gtest
|
||||
${ARGN}
|
||||
NUM_MPI_PROCS 1
|
||||
COMM serial mpi
|
||||
FAIL_REGULAR_EXPRESSION " FAILED "
|
||||
)
|
||||
ELSE()
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
""
|
||||
""
|
||||
"SOURCES;CATEGORIES"
|
||||
${ARGN})
|
||||
VERIFY_EMPTY(KOKKOS_ADD_EXECUTABLE_AND_TEST ${PARSE_UNPARSED_ARGUMENTS})
|
||||
SET(EXE_NAME ${PACKAGE_NAME}_${ROOT_NAME})
|
||||
KOKKOS_ADD_TEST_EXECUTABLE(${EXE_NAME}
|
||||
SOURCES ${PARSE_SOURCES}
|
||||
)
|
||||
KOKKOS_ADD_TEST(NAME ${ROOT_NAME}
|
||||
EXE ${EXE_NAME}
|
||||
FAIL_REGULAR_EXPRESSION " FAILED "
|
||||
)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
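# Usage sketch (illustrative; test and file names are hypothetical):
#   KOKKOS_ADD_EXECUTABLE_AND_TEST(UnitTest_Serial SOURCES UnitTestMain.cpp TestSerial.cpp)
# In the standalone branch this builds an executable named
# ${PACKAGE_NAME}_UnitTest_Serial through KOKKOS_ADD_TEST_EXECUTABLE (which
# links kokkos_gtest) and registers a test named UnitTest_Serial, passing
# FAIL_REGULAR_EXPRESSION " FAILED " on to KOKKOS_ADD_TEST.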
|
||||
|
||||
MACRO(KOKKOS_SETUP_BUILD_ENVIRONMENT)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_compiler_id.cmake)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_enable_devices.cmake)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_enable_options.cmake)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_test_cxx_std.cmake)
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_arch.cmake)
|
||||
IF (NOT KOKKOS_HAS_TRILINOS)
|
||||
SET(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} "${Kokkos_SOURCE_DIR}/cmake/Modules/")
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_tpls.cmake)
|
||||
ENDIF()
|
||||
INCLUDE(${KOKKOS_SRC_PATH}/cmake/kokkos_corner_cases.cmake)
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(KOKKOS_ADD_TEST_EXECUTABLE EXE_NAME)
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
""
|
||||
""
|
||||
"SOURCES"
|
||||
${ARGN})
|
||||
KOKKOS_ADD_EXECUTABLE(${EXE_NAME}
|
||||
SOURCES ${PARSE_SOURCES}
|
||||
${PARSE_UNPARSED_ARGUMENTS}
|
||||
TESTONLYLIBS kokkos_gtest
|
||||
)
|
||||
IF (NOT KOKKOS_HAS_TRILINOS)
|
||||
ADD_DEPENDENCIES(check ${EXE_NAME})
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(KOKKOS_PACKAGE_POSTPROCESS)
|
||||
if (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_PACKAGE_POSTPROCESS()
|
||||
endif()
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(KOKKOS_SET_LIBRARY_PROPERTIES LIBRARY_NAME)
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
"PLAIN_STYLE"
|
||||
""
|
||||
""
|
||||
${ARGN})
|
||||
|
||||
IF(${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.13")
|
||||
#great, this works the "right" way
|
||||
TARGET_LINK_OPTIONS(
|
||||
${LIBRARY_NAME} PUBLIC ${KOKKOS_LINK_OPTIONS}
|
||||
)
|
||||
ELSE()
|
||||
IF (PARSE_PLAIN_STYLE)
|
||||
TARGET_LINK_LIBRARIES(
|
||||
${LIBRARY_NAME} ${KOKKOS_LINK_OPTIONS}
|
||||
)
|
||||
ELSE()
|
||||
#well, have to do it the wrong way for now
|
||||
TARGET_LINK_LIBRARIES(
|
||||
${LIBRARY_NAME} PUBLIC ${KOKKOS_LINK_OPTIONS}
|
||||
)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
TARGET_COMPILE_OPTIONS(
|
||||
${LIBRARY_NAME} PUBLIC
|
||||
$<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_COMPILE_OPTIONS}>
|
||||
)
|
||||
|
||||
IF (KOKKOS_ENABLE_CUDA)
|
||||
TARGET_COMPILE_OPTIONS(
|
||||
${LIBRARY_NAME}
|
||||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CUDA_OPTIONS}>
|
||||
)
|
||||
SET(NODEDUP_CUDAFE_OPTIONS)
|
||||
FOREACH(OPT ${KOKKOS_CUDAFE_OPTIONS})
|
||||
LIST(APPEND NODEDUP_CUDAFE_OPTIONS -Xcudafe ${OPT})
|
||||
ENDFOREACH()
|
||||
TARGET_COMPILE_OPTIONS(
|
||||
${LIBRARY_NAME}
|
||||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${NODEDUP_CUDAFE_OPTIONS}>
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
LIST(LENGTH KOKKOS_XCOMPILER_OPTIONS XOPT_LENGTH)
|
||||
IF (XOPT_LENGTH GREATER 1)
|
||||
MESSAGE(FATAL_ERROR "CMake deduplication does not allow multiple -Xcompiler flags (${KOKKOS_XCOMPILER_OPTIONS}): will require Kokkos to upgrade to minimum CMake 3.12")
|
||||
ENDIF()
|
||||
IF(KOKKOS_XCOMPILER_OPTIONS)
|
||||
SET(NODEDUP_XCOMPILER_OPTIONS)
|
||||
FOREACH(OPT ${KOKKOS_XCOMPILER_OPTIONS})
|
||||
#I have to do this for now because we can't guarantee 3.12 support
|
||||
#I really should do this with the shell option
|
||||
LIST(APPEND NODEDUP_XCOMPILER_OPTIONS -Xcompiler)
|
||||
LIST(APPEND NODEDUP_XCOMPILER_OPTIONS ${OPT})
|
||||
ENDFOREACH()
|
||||
TARGET_COMPILE_OPTIONS(
|
||||
${LIBRARY_NAME}
|
||||
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${NODEDUP_XCOMPILER_OPTIONS}>
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF (KOKKOS_CXX_STANDARD_FEATURE)
|
||||
#GREAT! I can do this the right way
|
||||
TARGET_COMPILE_FEATURES(${LIBRARY_NAME} PUBLIC ${KOKKOS_CXX_STANDARD_FEATURE})
|
||||
IF (NOT KOKKOS_USE_CXX_EXTENSIONS)
|
||||
SET_TARGET_PROPERTIES(${LIBRARY_NAME} PROPERTIES CXX_EXTENSIONS OFF)
|
||||
ENDIF()
|
||||
ELSE()
|
||||
#OH, well, no choice but the wrong way
|
||||
TARGET_COMPILE_OPTIONS(${LIBRARY_NAME} PUBLIC ${KOKKOS_CXX_STANDARD_FLAG})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_INTERNAL_ADD_LIBRARY LIBRARY_NAME)
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
"STATIC;SHARED"
|
||||
""
|
||||
"HEADERS;SOURCES"
|
||||
${ARGN})
|
||||
|
||||
IF(PARSE_HEADERS)
|
||||
LIST(REMOVE_DUPLICATES PARSE_HEADERS)
|
||||
ENDIF()
|
||||
IF(PARSE_SOURCES)
|
||||
LIST(REMOVE_DUPLICATES PARSE_SOURCES)
|
||||
ENDIF()
|
||||
|
||||
ADD_LIBRARY(
|
||||
${LIBRARY_NAME}
|
||||
${PARSE_HEADERS}
|
||||
${PARSE_SOURCES}
|
||||
)
|
||||
|
||||
KOKKOS_INTERNAL_ADD_LIBRARY_INSTALL(${LIBRARY_NAME})
|
||||
|
||||
INSTALL(
|
||||
FILES ${PARSE_HEADERS}
|
||||
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
|
||||
#In case we are building in-tree, add an alias name
|
||||
#that matches the install Kokkos:: name
|
||||
ADD_LIBRARY(Kokkos::${LIBRARY_NAME} ALIAS ${LIBRARY_NAME})
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_LIBRARY LIBRARY_NAME)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_LIBRARY(${LIBRARY_NAME} ${ARGN})
|
||||
#Stolen from Tribits - it can add prefixes
|
||||
SET(TRIBITS_LIBRARY_NAME_PREFIX "${${PROJECT_NAME}_LIBRARY_NAME_PREFIX}")
|
||||
SET(TRIBITS_LIBRARY_NAME ${TRIBITS_LIBRARY_NAME_PREFIX}${LIBRARY_NAME})
|
||||
#Tribits has way too much technical debt and baggage to even
|
||||
#allow PUBLIC target_compile_options to be used. It forces C++ flags on projects
|
||||
#as a giant blob of space-separated strings. We end up with duplicated
|
||||
#flags between the flags implicitly forced on Kokkos-dependent projects and those Kokkos
|
||||
#has in its public INTERFACE_COMPILE_OPTIONS.
|
||||
#These do NOT get deduplicated because Tribits
|
||||
#creates flags as a giant monolithic space-separated string
|
||||
#Do not set any transitive properties and keep everything working as before
|
||||
#KOKKOS_SET_LIBRARY_PROPERTIES(${TRIBITS_LIBRARY_NAME} PLAIN_STYLE)
|
||||
ELSE()
|
||||
KOKKOS_INTERNAL_ADD_LIBRARY(
|
||||
${LIBRARY_NAME} ${ARGN})
|
||||
KOKKOS_SET_LIBRARY_PROPERTIES(${LIBRARY_NAME})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_ADD_INTERFACE_LIBRARY NAME)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_LIBRARY(${NAME} ${ARGN})
|
||||
ELSE()
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE
|
||||
""
|
||||
""
|
||||
"HEADERS;SOURCES"
|
||||
${ARGN}
|
||||
)
|
||||
|
||||
ADD_LIBRARY(${NAME} INTERFACE)
|
||||
KOKKOS_INTERNAL_ADD_LIBRARY_INSTALL(${NAME})
|
||||
|
||||
INSTALL(
|
||||
FILES ${PARSE_HEADERS}
|
||||
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
|
||||
)
|
||||
|
||||
INSTALL(
|
||||
FILES ${PARSE_HEADERS}
|
||||
DESTINATION ${CMAKE_INSTALL_INCLUDEDIR}
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_LIB_INCLUDE_DIRECTORIES TARGET)
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
#ignore the target, tribits doesn't do anything directly with targets
|
||||
TRIBITS_INCLUDE_DIRECTORIES(${ARGN})
|
||||
ELSE() #append to a list for later
|
||||
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
||||
FOREACH(DIR ${ARGN})
|
||||
TARGET_INCLUDE_DIRECTORIES(${TARGET} ${INCTYPE} $<BUILD_INTERFACE:${DIR}>)
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(KOKKOS_LIB_COMPILE_OPTIONS TARGET)
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
#don't trust tribits to do this correctly
|
||||
KOKKOS_TARGET_COMPILE_OPTIONS(${TARGET} ${ARGN})
|
||||
ELSE()
|
||||
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
||||
KOKKOS_TARGET_COMPILE_OPTIONS(${${PROJECT_NAME}_LIBRARY_NAME_PREFIX}${TARGET} ${INCTYPE} ${ARGN})
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(KOKKOS_ADD_TEST_DIRECTORIES)
|
||||
IF (KOKKOS_HAS_TRILINOS)
|
||||
TRIBITS_ADD_TEST_DIRECTORIES(${ARGN})
|
||||
ELSE()
|
||||
IF(KOKKOS_ENABLE_TESTS)
|
||||
FOREACH(TEST_DIR ${ARGN})
|
||||
ADD_SUBDIRECTORY(${TEST_DIR})
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
8
lib/kokkos/cmake/pgi.cmake
Normal file
@ -0,0 +1,8 @@
|
||||
|
||||
function(kokkos_set_pgi_flags full_standard int_standard)
|
||||
STRING(TOLOWER ${full_standard} FULL_LC_STANDARD)
|
||||
STRING(TOLOWER ${int_standard} INT_LC_STANDARD)
|
||||
SET(KOKKOS_CXX_STANDARD_FLAG "--c++${FULL_LC_STANDARD}" PARENT_SCOPE)
|
||||
SET(KOKKOS_CXX_INTERMDIATE_STANDARD_FLAG "--c++${INT_LC_STANDARD}" PARENT_SCOPE)
|
||||
endfunction()
|
||||
|
||||
@ -67,7 +67,7 @@ ELSE()
|
||||
IF(CUDA_cusparse_LIBRARY STREQUAL "CUDA_cusparse_LIBRARY-NOTFOUND")
|
||||
MESSAGE(FATAL_ERROR "\nCUSPARSE: could not find cusparse library.")
|
||||
ENDIF()
|
||||
ENDIF(CMAKE_VERSION VERSION_LESS "2.8.8")
|
||||
ENDIF()
|
||||
GLOBAL_SET(TPL_CUSPARSE_LIBRARY_DIRS)
|
||||
GLOBAL_SET(TPL_CUSPARSE_INCLUDE_DIRS ${TPL_CUDA_INCLUDE_DIRS})
|
||||
GLOBAL_SET(TPL_CUSPARSE_LIBRARIES ${CUDA_cusparse_LIBRARY})
|
||||
|
||||
@ -64,7 +64,7 @@
|
||||
# Version: 1.3
|
||||
#
|
||||
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( HWLOC
|
||||
KOKKOS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( HWLOC
|
||||
REQUIRED_HEADERS hwloc.h
|
||||
REQUIRED_LIBS_NAMES "hwloc"
|
||||
)
|
||||
|
||||
@ -75,7 +75,7 @@ IF(USE_THREADS)
|
||||
SET(TPL_Pthread_LIBRARIES "${CMAKE_THREAD_LIBS_INIT}")
|
||||
SET(TPL_Pthread_LIBRARY_DIRS "")
|
||||
ELSE()
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( Pthread
|
||||
KOKKOS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( Pthread
|
||||
REQUIRED_HEADERS pthread.h
|
||||
REQUIRED_LIBS_NAMES pthread
|
||||
)
|
||||
|
||||
@ -1,69 +0,0 @@
|
||||
# @HEADER
|
||||
# ************************************************************************
|
||||
#
|
||||
# Trilinos: An Object-Oriented Solver Framework
|
||||
# Copyright (2001) Sandia Corporation
|
||||
#
|
||||
#
|
||||
# Copyright (2001) Sandia Corporation. Under the terms of Contract
|
||||
# DE-AC04-94AL85000, there is a non-exclusive license for use of this
|
||||
# work by or on behalf of the U.S. Government. Export of this program
|
||||
# may require a license from the United States Government.
|
||||
#
|
||||
# 1. Redistributions of source code must retain the above copyright
|
||||
# notice, this list of conditions and the following disclaimer.
|
||||
#
|
||||
# 2. Redistributions in binary form must reproduce the above copyright
|
||||
# notice, this list of conditions and the following disclaimer in the
|
||||
# documentation and/or other materials provided with the distribution.
|
||||
#
|
||||
# 3. Neither the name of the Corporation nor the names of the
|
||||
# contributors may be used to endorse or promote products derived from
|
||||
# this software without specific prior written permission.
|
||||
#
|
||||
# THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
|
||||
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
||||
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
|
||||
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
|
||||
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
|
||||
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
|
||||
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
|
||||
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
|
||||
# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
|
||||
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
|
||||
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
||||
#
|
||||
# NOTICE: The United States Government is granted for itself and others
|
||||
# acting on its behalf a paid-up, nonexclusive, irrevocable worldwide
|
||||
# license in this data to reproduce, prepare derivative works, and
|
||||
# perform publicly and display publicly. Beginning five (5) years from
|
||||
# July 25, 2001, the United States Government is granted for itself and
|
||||
# others acting on its behalf a paid-up, nonexclusive, irrevocable
|
||||
# worldwide license in this data to reproduce, prepare derivative works,
|
||||
# distribute copies to the public, perform publicly and display
|
||||
# publicly, and to permit others to do so.
|
||||
#
|
||||
# NEITHER THE UNITED STATES GOVERNMENT, NOR THE UNITED STATES DEPARTMENT
|
||||
# OF ENERGY, NOR SANDIA CORPORATION, NOR ANY OF THEIR EMPLOYEES, MAKES
|
||||
# ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL LIABILITY OR
|
||||
# RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF ANY
|
||||
# INFORMATION, APPARATUS, PRODUCT, OR PROCESS DISCLOSED, OR REPRESENTS
|
||||
# THAT ITS USE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS.
|
||||
#
|
||||
# ************************************************************************
|
||||
# @HEADER
|
||||
|
||||
|
||||
#-----------------------------------------------------------------------------
|
||||
# Hardware locality detection and control library.
|
||||
#
|
||||
# Acquisition information:
|
||||
# Date checked: July 2014
|
||||
# Checked by: H. Carter Edwards <hcedwar AT sandia.gov>
|
||||
# Source: https://code.google.com/p/qthreads
|
||||
#
|
||||
|
||||
TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES( QTHREADS
|
||||
REQUIRED_HEADERS qthread.h
|
||||
REQUIRED_LIBS_NAMES "qthread"
|
||||
)
|
||||
@ -1,531 +0,0 @@
|
||||
INCLUDE(CMakeParseArguments)
|
||||
INCLUDE(CTest)
|
||||
|
||||
cmake_policy(SET CMP0054 NEW)
|
||||
|
||||
MESSAGE(STATUS "The project name is: ${PROJECT_NAME}")
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP)
|
||||
SET(${PROJECT_NAME}_ENABLE_OpenMP OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_HPX)
|
||||
SET(${PROJECT_NAME}_ENABLE_HPX OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_DEBUG)
|
||||
SET(${PROJECT_NAME}_ENABLE_DEBUG OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_CXX11)
|
||||
SET(${PROJECT_NAME}_ENABLE_CXX11 ON)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_TESTS)
|
||||
SET(${PROJECT_NAME}_ENABLE_TESTS OFF)
|
||||
ENDIF()
|
||||
|
||||
IF(NOT DEFINED TPL_ENABLE_Pthread)
|
||||
SET(TPL_ENABLE_Pthread OFF)
|
||||
ENDIF()
|
||||
|
||||
FUNCTION(ASSERT_DEFINED VARS)
|
||||
FOREACH(VAR ${VARS})
|
||||
IF(NOT DEFINED ${VAR})
|
||||
MESSAGE(SEND_ERROR "Error, the variable ${VAR} is not defined!")
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(GLOBAL_SET VARNAME)
|
||||
SET(${VARNAME} ${ARGN} CACHE INTERNAL "")
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(PREPEND_GLOBAL_SET VARNAME)
|
||||
ASSERT_DEFINED(${VARNAME})
|
||||
GLOBAL_SET(${VARNAME} ${ARGN} ${${VARNAME}})
|
||||
ENDMACRO()
|
||||
|
||||
#FUNCTION(REMOVE_GLOBAL_DUPLICATES VARNAME)
|
||||
# ASSERT_DEFINED(${VARNAME})
|
||||
# IF (${VARNAME})
|
||||
# SET(TMP ${${VARNAME}})
|
||||
# LIST(REMOVE_DUPLICATES TMP)
|
||||
# GLOBAL_SET(${VARNAME} ${TMP})
|
||||
# ENDIF()
|
||||
#ENDFUNCTION()
|
||||
|
||||
#MACRO(TRIBITS_ADD_OPTION_AND_DEFINE USER_OPTION_NAME MACRO_DEFINE_NAME DOCSTRING DEFAULT_VALUE)
|
||||
# MESSAGE(STATUS "TRIBITS_ADD_OPTION_AND_DEFINE: '${USER_OPTION_NAME}' '${MACRO_DEFINE_NAME}' '${DEFAULT_VALUE}'")
|
||||
# SET( ${USER_OPTION_NAME} "${DEFAULT_VALUE}" CACHE BOOL "${DOCSTRING}" )
|
||||
# IF(NOT ${MACRO_DEFINE_NAME} STREQUAL "")
|
||||
# IF(${USER_OPTION_NAME})
|
||||
# GLOBAL_SET(${MACRO_DEFINE_NAME} ON)
|
||||
# ELSE()
|
||||
# GLOBAL_SET(${MACRO_DEFINE_NAME} OFF)
|
||||
# ENDIF()
|
||||
# ENDIF()
|
||||
#ENDMACRO()
|
||||
|
||||
FUNCTION(TRIBITS_CONFIGURE_FILE PACKAGE_NAME_CONFIG_FILE)
|
||||
|
||||
# Configure the file
|
||||
CONFIGURE_FILE(
|
||||
${PACKAGE_SOURCE_DIR}/cmake/${PACKAGE_NAME_CONFIG_FILE}.in
|
||||
${CMAKE_CURRENT_BINARY_DIR}/${PACKAGE_NAME_CONFIG_FILE}
|
||||
)
|
||||
|
||||
ENDFUNCTION()
|
||||
|
||||
#MACRO(TRIBITS_ADD_DEBUG_OPTION)
|
||||
# TRIBITS_ADD_OPTION_AND_DEFINE(
|
||||
# ${PROJECT_NAME}_ENABLE_DEBUG
|
||||
# HAVE_${PROJECT_NAME_UC}_DEBUG
|
||||
# "Enable a host of runtime debug checking."
|
||||
# OFF
|
||||
# )
|
||||
#ENDMACRO()
|
||||
|
||||
|
||||
MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
|
||||
IF(${${PROJECT_NAME}_ENABLE_TESTS})
|
||||
FOREACH(TEST_DIR ${ARGN})
|
||||
ADD_SUBDIRECTORY(${TEST_DIR})
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_ADD_EXAMPLE_DIRECTORIES)
|
||||
IF(${PACKAGE_NAME}_ENABLE_EXAMPLES OR ${PARENT_PACKAGE_NAME}_ENABLE_EXAMPLES)
|
||||
FOREACH(EXAMPLE_DIR ${ARGN})
|
||||
ADD_SUBDIRECTORY(${EXAMPLE_DIR})
|
||||
ENDFOREACH()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
|
||||
|
||||
function(INCLUDE_DIRECTORIES)
|
||||
cmake_parse_arguments(INCLUDE_DIRECTORIES "REQUIRED_DURING_INSTALLATION_TESTING" "" "" ${ARGN})
|
||||
_INCLUDE_DIRECTORIES(${INCLUDE_DIRECTORIES_UNPARSED_ARGUMENTS})
|
||||
endfunction()
|
||||
|
||||
|
||||
MACRO(TARGET_TRANSFER_PROPERTY TARGET_NAME PROP_IN PROP_OUT)
|
||||
SET(PROP_VALUES)
|
||||
FOREACH(TARGET_X ${ARGN})
|
||||
LIST(APPEND PROP_VALUES "$<TARGET_PROPERTY:${TARGET_X},${PROP_IN}>")
|
||||
ENDFOREACH()
|
||||
SET_TARGET_PROPERTIES(${TARGET_NAME} PROPERTIES ${PROP_OUT} "${PROP_VALUES}")
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(ADD_INTERFACE_LIBRARY LIB_NAME)
|
||||
FILE(WRITE ${CMAKE_CURRENT_BINARY_DIR}/dummy.cpp "")
|
||||
ADD_LIBRARY(${LIB_NAME} STATIC ${CMAKE_CURRENT_BINARY_DIR}/dummy.cpp)
|
||||
SET_TARGET_PROPERTIES(${LIB_NAME} PROPERTIES INTERFACE TRUE)
|
||||
ENDMACRO()
|
||||
|
||||
# Older versions of CMake do not make include directories transitive
|
||||
MACRO(TARGET_LINK_AND_INCLUDE_LIBRARIES TARGET_NAME)
|
||||
TARGET_LINK_LIBRARIES(${TARGET_NAME} LINK_PUBLIC ${ARGN})
|
||||
FOREACH(DEP_LIB ${ARGN})
|
||||
TARGET_INCLUDE_DIRECTORIES(${TARGET_NAME} PUBLIC $<TARGET_PROPERTY:${DEP_LIB},INTERFACE_INCLUDE_DIRECTORIES>)
|
||||
TARGET_INCLUDE_DIRECTORIES(${TARGET_NAME} PUBLIC $<TARGET_PROPERTY:${DEP_LIB},INCLUDE_DIRECTORIES>)
|
||||
ENDFOREACH()
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(TRIBITS_ADD_LIBRARY LIBRARY_NAME)
|
||||
|
||||
SET(options STATIC SHARED TESTONLY NO_INSTALL_LIB_OR_HEADERS CUDALIBRARY)
|
||||
SET(oneValueArgs)
|
||||
SET(multiValueArgs HEADERS HEADERS_INSTALL_SUBDIR NOINSTALLHEADERS SOURCES DEPLIBS IMPORTEDLIBS DEFINES ADDED_LIB_TARGET_NAME_OUT)
|
||||
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
|
||||
|
||||
IF(PARSE_HEADERS)
|
||||
LIST(REMOVE_DUPLICATES PARSE_HEADERS)
|
||||
ENDIF()
|
||||
IF(PARSE_SOURCES)
|
||||
LIST(REMOVE_DUPLICATES PARSE_SOURCES)
|
||||
ENDIF()
|
||||
|
||||
# Local variable to hold all of the libraries that will be directly linked
|
||||
# to this library.
|
||||
SET(LINK_LIBS ${${PACKAGE_NAME}_DEPS})
|
||||
|
||||
# Add dependent libraries passed directly in
|
||||
|
||||
IF (PARSE_IMPORTEDLIBS)
|
||||
LIST(APPEND LINK_LIBS ${PARSE_IMPORTEDLIBS})
|
||||
ENDIF()
|
||||
|
||||
IF (PARSE_DEPLIBS)
|
||||
LIST(APPEND LINK_LIBS ${PARSE_DEPLIBS})
|
||||
ENDIF()
|
||||
|
||||
# Add the library and all the dependencies
|
||||
|
||||
IF (PARSE_DEFINES)
|
||||
ADD_DEFINITIONS(${PARSE_DEFINES})
|
||||
ENDIF()
|
||||
|
||||
IF (PARSE_STATIC)
|
||||
SET(STATIC_KEYWORD "STATIC")
|
||||
ELSE()
|
||||
SET(STATIC_KEYWORD)
|
||||
ENDIF()
|
||||
|
||||
IF (PARSE_SHARED)
|
||||
SET(SHARED_KEYWORD "SHARED")
|
||||
ELSE()
|
||||
SET(SHARED_KEYWORD)
|
||||
ENDIF()
|
||||
|
||||
IF (PARSE_TESTONLY)
|
||||
SET(EXCLUDE_FROM_ALL_KEYWORD "EXCLUDE_FROM_ALL")
|
||||
ELSE()
|
||||
SET(EXCLUDE_FROM_ALL_KEYWORD)
|
||||
ENDIF()
|
||||
IF (NOT PARSE_CUDALIBRARY)
|
||||
ADD_LIBRARY(
|
||||
${LIBRARY_NAME}
|
||||
${STATIC_KEYWORD}
|
||||
${SHARED_KEYWORD}
|
||||
${EXCLUDE_FROM_ALL_KEYWORD}
|
||||
${PARSE_HEADERS}
|
||||
${PARSE_NOINSTALLHEADERS}
|
||||
${PARSE_SOURCES}
|
||||
)
|
||||
ELSE()
|
||||
CUDA_ADD_LIBRARY(
|
||||
${LIBRARY_NAME}
|
||||
${PARSE_HEADERS}
|
||||
${PARSE_NOINSTALLHEADERS}
|
||||
${PARSE_SOURCES}
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
TARGET_LINK_AND_INCLUDE_LIBRARIES(${LIBRARY_NAME} ${LINK_LIBS})
|
||||
|
||||
IF (NOT PARSE_TESTONLY OR PARSE_NO_INSTALL_LIB_OR_HEADERS)
|
||||
|
||||
INSTALL(
|
||||
TARGETS ${LIBRARY_NAME}
|
||||
EXPORT ${PROJECT_NAME}
|
||||
RUNTIME DESTINATION bin
|
||||
LIBRARY DESTINATION lib
|
||||
ARCHIVE DESTINATION lib
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
|
||||
INSTALL(
|
||||
FILES ${PARSE_HEADERS}
|
||||
EXPORT ${PROJECT_NAME}
|
||||
DESTINATION include
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
|
||||
INSTALL(
|
||||
DIRECTORY ${PARSE_HEADERS_INSTALL_SUBDIR}
|
||||
EXPORT ${PROJECT_NAME}
|
||||
DESTINATION include
|
||||
COMPONENT ${PACKAGE_NAME}
|
||||
)
|
||||
|
||||
ENDIF()
|
||||
|
||||
IF (NOT PARSE_TESTONLY)
|
||||
PREPEND_GLOBAL_SET(${PACKAGE_NAME}_LIBS ${LIBRARY_NAME})
|
||||
REMOVE_GLOBAL_DUPLICATES(${PACKAGE_NAME}_LIBS)
|
||||
ENDIF()
|
||||
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(TRIBITS_ADD_EXECUTABLE EXE_NAME)
|
||||
|
||||
SET(options NOEXEPREFIX NOEXESUFFIX ADD_DIR_TO_NAME INSTALLABLE TESTONLY)
|
||||
SET(oneValueArgs ADDED_EXE_TARGET_NAME_OUT)
|
||||
SET(multiValueArgs SOURCES CATEGORIES HOST XHOST HOSTTYPE XHOSTTYPE DIRECTORY TESTONLYLIBS IMPORTEDLIBS DEPLIBS COMM LINKER_LANGUAGE TARGET_DEFINES DEFINES)
|
||||
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
|
||||
|
||||
IF (PARSE_TARGET_DEFINES)
|
||||
TARGET_COMPILE_DEFINITIONS(${EXE_NAME} PUBLIC ${PARSE_TARGET_DEFINES})
|
||||
ENDIF()
|
||||
|
||||
SET(LINK_LIBS PACKAGE_${PACKAGE_NAME})
|
||||
|
||||
IF (PARSE_TESTONLYLIBS)
|
||||
LIST(APPEND LINK_LIBS ${PARSE_TESTONLYLIBS})
|
||||
ENDIF()
|
||||
|
||||
IF (PARSE_IMPORTEDLIBS)
|
||||
LIST(APPEND LINK_LIBS ${PARSE_IMPORTEDLIBS})
|
||||
ENDIF()
|
||||
|
||||
SET (EXE_SOURCES)
|
||||
IF(PARSE_DIRECTORY)
|
||||
FOREACH( SOURCE_FILE ${PARSE_SOURCES} )
|
||||
IF(IS_ABSOLUTE ${SOURCE_FILE})
|
||||
SET (EXE_SOURCES ${EXE_SOURCES} ${SOURCE_FILE})
|
||||
ELSE()
|
||||
SET (EXE_SOURCES ${EXE_SOURCES} ${PARSE_DIRECTORY}/${SOURCE_FILE})
|
||||
ENDIF()
|
||||
ENDFOREACH( )
|
||||
ELSE()
|
||||
FOREACH( SOURCE_FILE ${PARSE_SOURCES} )
|
||||
SET (EXE_SOURCES ${EXE_SOURCES} ${SOURCE_FILE})
|
||||
ENDFOREACH( )
|
||||
ENDIF()
|
||||
|
||||
SET(EXE_BINARY_NAME ${EXE_NAME})
|
||||
IF(DEFINED PACKAGE_NAME AND NOT PARSE_NOEXEPREFIX)
|
||||
SET(EXE_BINARY_NAME ${PACKAGE_NAME}_${EXE_BINARY_NAME})
|
||||
ENDIF()
|
||||
|
||||
# IF (PARSE_TESTONLY)
|
||||
# SET(EXCLUDE_FROM_ALL_KEYWORD "EXCLUDE_FROM_ALL")
|
||||
# ELSE()
|
||||
# SET(EXCLUDE_FROM_ALL_KEYWORD)
|
||||
# ENDIF()
|
||||
ADD_EXECUTABLE(${EXE_BINARY_NAME} ${EXCLUDE_FROM_ALL_KEYWORD} ${EXE_SOURCES})
|
||||
|
||||
TARGET_LINK_AND_INCLUDE_LIBRARIES(${EXE_BINARY_NAME} ${LINK_LIBS})
|
||||
|
||||
IF(PARSE_ADDED_EXE_TARGET_NAME_OUT)
|
||||
SET(${PARSE_ADDED_EXE_TARGET_NAME_OUT} ${EXE_BINARY_NAME} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
|
||||
IF(PARSE_INSTALLABLE)
|
||||
INSTALL(
|
||||
TARGETS ${EXE_BINARY_NAME}
|
||||
EXPORT ${PROJECT_NAME}
|
||||
DESTINATION bin
|
||||
)
|
||||
ENDIF()
|
||||
ENDFUNCTION()
|
||||
|
||||
IF(NOT TARGET check)
|
||||
ADD_CUSTOM_TARGET(check COMMAND ${CMAKE_CTEST_COMMAND} -VV -C ${CMAKE_CFG_INTDIR})
|
||||
ENDIF()
|
||||
|
||||
FUNCTION(TRIBITS_ADD_TEST)
|
||||
ENDFUNCTION()
|
||||
FUNCTION(TRIBITS_TPL_TENTATIVELY_ENABLE)
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(TRIBITS_ADD_ADVANCED_TEST)
|
||||
# TODO Write this
|
||||
ENDFUNCTION()
|
||||
|
||||
FUNCTION(TRIBITS_ADD_EXECUTABLE_AND_TEST EXE_NAME)
|
||||
|
||||
SET(options STANDARD_PASS_OUTPUT WILL_FAIL)
|
||||
SET(oneValueArgs PASS_REGULAR_EXPRESSION FAIL_REGULAR_EXPRESSION ENVIRONMENT TIMEOUT CATEGORIES ADDED_TESTS_NAMES_OUT ADDED_EXE_TARGET_NAME_OUT)
|
||||
SET(multiValueArgs)
|
||||
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
|
||||
|
||||
TRIBITS_ADD_EXECUTABLE(${EXE_NAME} TESTONLY ADDED_EXE_TARGET_NAME_OUT TEST_NAME ${PARSE_UNPARSED_ARGUMENTS})
|
||||
|
||||
IF(WIN32)
|
||||
ADD_TEST(NAME ${TEST_NAME} WORKING_DIRECTORY ${LIBRARY_OUTPUT_PATH} COMMAND ${TEST_NAME}${CMAKE_EXECUTABLE_SUFFIX})
|
||||
ELSE()
|
||||
ADD_TEST(NAME ${TEST_NAME} COMMAND ${TEST_NAME})
|
||||
ENDIF()
|
||||
ADD_DEPENDENCIES(check ${TEST_NAME})
|
||||
|
||||
IF(PARSE_FAIL_REGULAR_EXPRESSION)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES FAIL_REGULAR_EXPRESSION ${PARSE_FAIL_REGULAR_EXPRESSION})
|
||||
ENDIF()
|
||||
|
||||
IF(PARSE_PASS_REGULAR_EXPRESSION)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES PASS_REGULAR_EXPRESSION ${PARSE_PASS_REGULAR_EXPRESSION})
|
||||
ENDIF()
|
||||
|
||||
IF(PARSE_WILL_FAIL)
|
||||
SET_TESTS_PROPERTIES(${TEST_NAME} PROPERTIES WILL_FAIL ${PARSE_WILL_FAIL})
|
||||
ENDIF()
|
||||
|
||||
IF(PARSE_ADDED_TESTS_NAMES_OUT)
|
||||
SET(${PARSE_ADDED_TESTS_NAMES_OUT} ${TEST_NAME} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
|
||||
IF(PARSE_ADDED_EXE_TARGET_NAME_OUT)
|
||||
SET(${PARSE_ADDED_EXE_TARGET_NAME_OUT} ${TEST_NAME} PARENT_SCOPE)
|
||||
ENDIF()
|
||||
|
||||
ENDFUNCTION()
|
||||
|
||||
MACRO(TIBITS_CREATE_IMPORTED_TPL_LIBRARY TPL_NAME)
|
||||
ADD_INTERFACE_LIBRARY(TPL_LIB_${TPL_NAME})
|
||||
TARGET_LINK_LIBRARIES(TPL_LIB_${TPL_NAME} LINK_PUBLIC ${TPL_${TPL_NAME}_LIBRARIES})
|
||||
TARGET_INCLUDE_DIRECTORIES(TPL_LIB_${TPL_NAME} INTERFACE ${TPL_${TPL_NAME}_INCLUDE_DIRS})
|
||||
ENDMACRO()
|
||||
|
||||
FUNCTION(TRIBITS_TPL_FIND_INCLUDE_DIRS_AND_LIBRARIES TPL_NAME)
|
||||
|
||||
SET(options MUST_FIND_ALL_LIBS MUST_FIND_ALL_HEADERS NO_PRINT_ENABLE_SUCCESS_FAIL)
|
||||
SET(oneValueArgs)
|
||||
SET(multiValueArgs REQUIRED_HEADERS REQUIRED_LIBS_NAMES)
|
||||
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
|
||||
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS TRUE)
|
||||
IF (PARSE_REQUIRED_LIBS_NAMES)
|
||||
FIND_LIBRARY(TPL_${TPL_NAME}_LIBRARIES NAMES ${PARSE_REQUIRED_LIBS_NAMES})
|
||||
IF(NOT TPL_${TPL_NAME}_LIBRARIES)
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS FALSE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
IF (PARSE_REQUIRED_HEADERS)
|
||||
FIND_PATH(TPL_${TPL_NAME}_INCLUDE_DIRS NAMES ${PARSE_REQUIRED_HEADERS})
|
||||
IF(NOT TPL_${TPL_NAME}_INCLUDE_DIRS)
|
||||
SET(_${TPL_NAME}_ENABLE_SUCCESS FALSE)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
|
||||
|
||||
IF (_${TPL_NAME}_ENABLE_SUCCESS)
|
||||
TIBITS_CREATE_IMPORTED_TPL_LIBRARY(${TPL_NAME})
|
||||
ENDIF()
|
||||
|
||||
ENDFUNCTION()
|
||||
|
||||
#MACRO(TRIBITS_PROCESS_TPL_DEP_FILE TPL_FILE)
|
||||
# GET_FILENAME_COMPONENT(TPL_NAME ${TPL_FILE} NAME_WE)
|
||||
# INCLUDE("${TPL_FILE}")
|
||||
# IF(TARGET TPL_LIB_${TPL_NAME})
|
||||
# MESSAGE(STATUS "Found tpl library: ${TPL_NAME}")
|
||||
# SET(TPL_ENABLE_${TPL_NAME} TRUE)
|
||||
# ELSE()
|
||||
# MESSAGE(STATUS "Tpl library not found: ${TPL_NAME}")
|
||||
# SET(TPL_ENABLE_${TPL_NAME} FALSE)
|
||||
# ENDIF()
|
||||
#ENDMACRO()
|
||||
|
||||
MACRO(PREPEND_TARGET_SET VARNAME TARGET_NAME TYPE)
|
||||
IF(TYPE STREQUAL "REQUIRED")
|
||||
SET(REQUIRED TRUE)
|
||||
ELSE()
|
||||
SET(REQUIRED FALSE)
|
||||
ENDIF()
|
||||
IF(TARGET ${TARGET_NAME})
|
||||
PREPEND_GLOBAL_SET(${VARNAME} ${TARGET_NAME})
|
||||
ELSE()
|
||||
IF(REQUIRED)
|
||||
MESSAGE(FATAL_ERROR "Missing dependency ${TARGET_NAME}")
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_APPEND_PACKAGE_DEPS DEP_LIST TYPE)
|
||||
FOREACH(DEP ${ARGN})
|
||||
PREPEND_GLOBAL_SET(${DEP_LIST} PACKAGE_${DEP})
|
||||
ENDFOREACH()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_APPEND_TPLS_DEPS DEP_LIST TYPE)
|
||||
FOREACH(DEP ${ARGN})
|
||||
PREPEND_TARGET_SET(${DEP_LIST} TPL_LIB_${DEP} ${TYPE})
|
||||
ENDFOREACH()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_ENABLE_TPLS)
|
||||
FOREACH(TPL ${ARGN})
|
||||
IF(TARGET ${TPL})
|
||||
GLOBAL_SET(${PACKAGE_NAME}_ENABLE_${TPL} TRUE)
|
||||
ELSE()
|
||||
GLOBAL_SET(${PACKAGE_NAME}_ENABLE_${TPL} FALSE)
|
||||
ENDIF()
|
||||
ENDFOREACH()
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_PACKAGE_DEFINE_DEPENDENCIES)
|
||||
|
||||
SET(options)
|
||||
SET(oneValueArgs)
|
||||
SET(multiValueArgs
|
||||
LIB_REQUIRED_PACKAGES
|
||||
LIB_OPTIONAL_PACKAGES
|
||||
TEST_REQUIRED_PACKAGES
|
||||
TEST_OPTIONAL_PACKAGES
|
||||
LIB_REQUIRED_TPLS
|
||||
LIB_OPTIONAL_TPLS
|
||||
TEST_REQUIRED_TPLS
|
||||
TEST_OPTIONAL_TPLS
|
||||
REGRESSION_EMAIL_LIST
|
||||
SUBPACKAGES_DIRS_CLASSIFICATIONS_OPTREQS
|
||||
)
|
||||
CMAKE_PARSE_ARGUMENTS(PARSE "${options}" "${oneValueArgs}" "${multiValueArgs}" ${ARGN})
|
||||
|
||||
GLOBAL_SET(${PACKAGE_NAME}_DEPS "")
|
||||
TRIBITS_APPEND_PACKAGE_DEPS(${PACKAGE_NAME}_DEPS REQUIRED ${PARSE_LIB_REQUIRED_PACKAGES})
|
||||
TRIBITS_APPEND_PACKAGE_DEPS(${PACKAGE_NAME}_DEPS OPTIONAL ${PARSE_LIB_OPTIONAL_PACKAGES})
|
||||
TRIBITS_APPEND_TPLS_DEPS(${PACKAGE_NAME}_DEPS REQUIRED ${PARSE_LIB_REQUIRED_TPLS})
|
||||
TRIBITS_APPEND_TPLS_DEPS(${PACKAGE_NAME}_DEPS OPTIONAL ${PARSE_LIB_OPTIONAL_TPLS})
|
||||
|
||||
GLOBAL_SET(${PACKAGE_NAME}_TEST_DEPS "")
|
||||
TRIBITS_APPEND_PACKAGE_DEPS(${PACKAGE_NAME}_TEST_DEPS REQUIRED ${PARSE_TEST_REQUIRED_PACKAGES})
|
||||
TRIBITS_APPEND_PACKAGE_DEPS(${PACKAGE_NAME}_TEST_DEPS OPTIONAL ${PARSE_TEST_OPTIONAL_PACKAGES})
|
||||
TRIBITS_APPEND_TPLS_DEPS(${PACKAGE_NAME}_TEST_DEPS REQUIRED ${PARSE_TEST_REQUIRED_TPLS})
|
||||
TRIBITS_APPEND_TPLS_DEPS(${PACKAGE_NAME}_TEST_DEPS OPTIONAL ${PARSE_TEST_OPTIONAL_TPLS})
|
||||
|
||||
TRIBITS_ENABLE_TPLS(${PARSE_LIB_REQUIRED_TPLS} ${PARSE_LIB_OPTIONAL_TPLS} ${PARSE_TEST_REQUIRED_TPLS} ${PARSE_TEST_OPTIONAL_TPLS})
|
||||
|
||||
ENDMACRO()
|
||||
|
||||
MACRO(TRIBITS_SUBPACKAGE NAME)
|
||||
SET(PACKAGE_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
SET(PARENT_PACKAGE_NAME ${PACKAGE_NAME})
|
||||
SET(PACKAGE_NAME ${PACKAGE_NAME}${NAME})
|
||||
STRING(TOUPPER ${PACKAGE_NAME} PACKAGE_NAME_UC)
|
||||
SET(${PACKAGE_NAME}_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
|
||||
ADD_INTERFACE_LIBRARY(PACKAGE_${PACKAGE_NAME})
|
||||
|
||||
GLOBAL_SET(${PACKAGE_NAME}_LIBS "")
|
||||
|
||||
INCLUDE(${PACKAGE_SOURCE_DIR}/cmake/Dependencies.cmake)
|
||||
|
||||
ENDMACRO(TRIBITS_SUBPACKAGE)
|
||||
|
||||
MACRO(TRIBITS_SUBPACKAGE_POSTPROCESS)
|
||||
TARGET_LINK_AND_INCLUDE_LIBRARIES(PACKAGE_${PACKAGE_NAME} ${${PACKAGE_NAME}_LIBS})
|
||||
ENDMACRO(TRIBITS_SUBPACKAGE_POSTPROCESS)
|
||||
|
||||
MACRO(TRIBITS_PACKAGE_DECL NAME)
|
||||
|
||||
SET(PACKAGE_NAME ${NAME})
|
||||
SET(${PACKAGE_NAME}_SOURCE_DIR ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
STRING(TOUPPER ${PACKAGE_NAME} PACKAGE_NAME_UC)
|
||||
|
||||
#SET(TRIBITS_DEPS_DIR "${CMAKE_SOURCE_DIR}/cmake/deps")
|
||||
#FILE(GLOB TPLS_FILES "${TRIBITS_DEPS_DIR}/*.cmake")
|
||||
#FOREACH(TPL_FILE ${TPLS_FILES})
|
||||
# TRIBITS_PROCESS_TPL_DEP_FILE(${TPL_FILE})
|
||||
#ENDFOREACH()
|
||||
|
||||
ENDMACRO()
|
||||
|
||||
|
||||
MACRO(TRIBITS_PROCESS_SUBPACKAGES)
|
||||
FILE(GLOB SUBPACKAGES RELATIVE ${CMAKE_SOURCE_DIR} */cmake/Dependencies.cmake)
|
||||
FOREACH(SUBPACKAGE ${SUBPACKAGES})
|
||||
GET_FILENAME_COMPONENT(SUBPACKAGE_CMAKE ${SUBPACKAGE} DIRECTORY)
|
||||
GET_FILENAME_COMPONENT(SUBPACKAGE_DIR ${SUBPACKAGE_CMAKE} DIRECTORY)
|
||||
ADD_SUBDIRECTORY(${CMAKE_BINARY_DIR}/../${SUBPACKAGE_DIR})
|
||||
ENDFOREACH()
|
||||
ENDMACRO(TRIBITS_PROCESS_SUBPACKAGES)
|
||||
|
||||
MACRO(TRIBITS_PACKAGE_DEF)
|
||||
ENDMACRO(TRIBITS_PACKAGE_DEF)
|
||||
|
||||
MACRO(TRIBITS_EXCLUDE_AUTOTOOLS_FILES)
|
||||
ENDMACRO(TRIBITS_EXCLUDE_AUTOTOOLS_FILES)
|
||||
|
||||
MACRO(TRIBITS_EXCLUDE_FILES)
|
||||
ENDMACRO(TRIBITS_EXCLUDE_FILES)
|
||||
|
||||
MACRO(TRIBITS_PACKAGE_POSTPROCESS)
|
||||
ENDMACRO(TRIBITS_PACKAGE_POSTPROCESS)
|
||||
|
||||
@ -1,13 +1,10 @@
|
||||
|
||||
|
||||
TRIBITS_SUBPACKAGE(Containers)
|
||||
KOKKOS_SUBPACKAGE(Containers)
|
||||
|
||||
|
||||
IF(KOKKOS_HAS_TRILINOS)
|
||||
ADD_SUBDIRECTORY(src)
|
||||
ENDIF()
|
||||
|
||||
TRIBITS_ADD_TEST_DIRECTORIES(unit_tests)
|
||||
TRIBITS_ADD_TEST_DIRECTORIES(performance_tests)
|
||||
KOKKOS_ADD_TEST_DIRECTORIES(unit_tests)
|
||||
KOKKOS_ADD_TEST_DIRECTORIES(performance_tests)
|
||||
|
||||
TRIBITS_SUBPACKAGE_POSTPROCESS()
|
||||
KOKKOS_SUBPACKAGE_POSTPROCESS()
|
||||
|
||||
@ -1,49 +1,62 @@
|
||||
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
INCLUDE_DIRECTORIES(REQUIRED_DURING_INSTALLATION_TESTING ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}/../src )
|
||||
|
||||
IF(NOT KOKKOS_HAS_TRILINOS)
|
||||
IF(KOKKOS_SEPARATE_LIBS)
|
||||
set(TEST_LINK_TARGETS kokkoscore)
|
||||
ELSE()
|
||||
set(TEST_LINK_TARGETS kokkos)
|
||||
ENDIF()
|
||||
ENDIF()
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(REQUIRED_DURING_INSTALLATION_TESTING ${CMAKE_CURRENT_SOURCE_DIR})
|
||||
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR}/../src )
|
||||
|
||||
IF(Kokkos_ENABLE_CUDA)
|
||||
SET(SOURCES
|
||||
TestMain.cpp
|
||||
TestCuda.cpp
|
||||
)
|
||||
|
||||
IF(Kokkos_ENABLE_Pthread)
|
||||
LIST( APPEND SOURCES TestThreads.cpp)
|
||||
KOKKOS_ADD_TEST_EXECUTABLE( PerfTestExec_Cuda
|
||||
SOURCES ${SOURCES}
|
||||
)
|
||||
|
||||
KOKKOS_ADD_TEST( NAME PerformanceTest_Cuda
|
||||
EXE PerfTestExec_Cuda
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF(Kokkos_ENABLE_OpenMP)
|
||||
LIST( APPEND SOURCES TestOpenMP.cpp)
|
||||
IF(Kokkos_ENABLE_PTHREAD)
|
||||
SET(SOURCES
|
||||
TestMain.cpp
|
||||
TestThreads.cpp
|
||||
)
|
||||
KOKKOS_ADD_TEST_EXECUTABLE( PerfTestExec_Threads
|
||||
SOURCES ${SOURCES}
|
||||
)
|
||||
|
||||
KOKKOS_ADD_TEST( NAME PerformanceTest_Threads
|
||||
EXE PerfTestExec_Threads
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF(Kokkos_ENABLE_OPENMP)
|
||||
SET(SOURCES
|
||||
TestMain.cpp
|
||||
TestOpenMP.cpp
|
||||
)
|
||||
KOKKOS_ADD_TEST_EXECUTABLE( PerfTestExec_OpenMP
|
||||
SOURCES ${SOURCES}
|
||||
)
|
||||
|
||||
KOKKOS_ADD_TEST( NAME PerformanceTest_OpenMP
|
||||
EXE PerfTestExec_OpenMP
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
IF(Kokkos_ENABLE_HPX)
|
||||
LIST( APPEND SOURCES TestHPX.cpp)
|
||||
SET(SOURCES
|
||||
TestMain.cpp
|
||||
TestHPX.cpp
|
||||
)
|
||||
KOKKOS_ADD_TEST_EXECUTABLE( PerfTestExec_HPX
|
||||
SOURCES ${SOURCES}
|
||||
)
|
||||
|
||||
KOKKOS_ADD_TEST( NAME PerformanceTest_HPX
|
||||
EXE PerfTestExec_HPX
|
||||
)
|
||||
ENDIF()
|
||||
|
||||
# Per #374, we always want to build this test, but we only want to run
|
||||
# it as a PERFORMANCE test. That's why we separate building the test
|
||||
# from running the test.
|
||||
|
||||
TRIBITS_ADD_EXECUTABLE(
|
||||
PerfTestExec
|
||||
SOURCES ${SOURCES}
|
||||
COMM serial mpi
|
||||
TESTONLYLIBS kokkos_gtest ${TEST_LINK_TARGETS}
|
||||
)
|
||||
|
||||
TRIBITS_ADD_TEST(
|
||||
PerformanceTest
|
||||
NAME PerfTestExec
|
||||
COMM serial mpi
|
||||
NUM_MPI_PROCS 1
|
||||
CATEGORIES PERFORMANCE
|
||||
FAIL_REGULAR_EXPRESSION " FAILED "
|
||||
)
|
||||
|
||||
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -67,44 +68,37 @@ namespace Performance {

class cuda : public ::testing::Test {
protected:
static void SetUpTestCase()
{
static void SetUpTestCase() {
std::cout << std::setprecision(5) << std::scientific;
Kokkos::InitArguments args(-1, -1, 0);
Kokkos::initialize(args);
}
static void TearDownTestCase()
{
Kokkos::finalize();
}
static void TearDownTestCase() { Kokkos::finalize(); }
};

TEST_F( cuda, dynrankview_perf )
{
TEST_F(cuda, dynrankview_perf) {
std::cout << "Cuda" << std::endl;
std::cout << " DynRankView vs View: Initialization Only " << std::endl;
test_dynrankview_op_perf<Kokkos::Cuda>(40960);
}

TEST_F( cuda, global_2_local)
{
TEST_F(cuda, global_2_local) {
std::cout << "Cuda" << std::endl;
std::cout << "size, create, generate, fill, find" << std::endl;
for (unsigned i=Performance::begin_id_size; i<=Performance::end_id_size; i *= Performance::id_step)
for (unsigned i = Performance::begin_id_size; i <= Performance::end_id_size;
i *= Performance::id_step)
test_global_to_local_ids<Kokkos::Cuda>(i);
}

TEST_F( cuda, unordered_map_performance_near)
{
TEST_F(cuda, unordered_map_performance_near) {
Perf::run_performance_tests<Kokkos::Cuda, true>("cuda-near");
}

TEST_F( cuda, unordered_map_performance_far)
{
TEST_F(cuda, unordered_map_performance_far) {
Perf::run_performance_tests<Kokkos::Cuda, false>("cuda-far");
}

}
} // namespace Performance
#else
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTCUDA_PREVENT_EMPTY_LINK_ERROR() {}
#endif /* #if defined( KOKKOS_ENABLE_CUDA ) */
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -49,7 +50,8 @@

#include <impl/Kokkos_Timer.hpp>

// Compare performance of DynRankView to View, specific focus on the parenthesis operators
// Compare performance of DynRankView to View, specific focus on the parenthesis
// operators

namespace Performance {

@ -59,8 +61,7 @@ struct InitViewFunctor {
typedef Kokkos::View<double ***, DeviceType> inviewtype;
inviewtype _inview;

InitViewFunctor( inviewtype &inview_ ) : _inview(inview_)
{}
InitViewFunctor(inviewtype &inview_) : _inview(inview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -71,8 +72,7 @@ struct InitViewFunctor {
}
}

struct SumComputationTest
{
struct SumComputationTest {
typedef Kokkos::View<double ***, DeviceType> inviewtype;
inviewtype _inview;

@ -80,7 +80,8 @@ struct InitViewFunctor {
outviewtype _outview;

KOKKOS_INLINE_FUNCTION
SumComputationTest(inviewtype &inview_ , outviewtype &outview_) : _inview(inview_), _outview(outview_) {}
SumComputationTest(inviewtype &inview_, outviewtype &outview_)
: _inview(inview_), _outview(outview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -91,7 +92,6 @@ struct InitViewFunctor {
}
}
};

};

template <typename DeviceType>
@ -99,8 +99,7 @@ struct InitStrideViewFunctor {
typedef Kokkos::View<double ***, Kokkos::LayoutStride, DeviceType> inviewtype;
inviewtype _inview;

InitStrideViewFunctor( inviewtype &inview_ ) : _inview(inview_)
{}
InitStrideViewFunctor(inviewtype &inview_) : _inview(inview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -110,7 +109,6 @@ struct InitStrideViewFunctor {
}
}
}

};

template <typename DeviceType>
@ -118,8 +116,7 @@ struct InitViewRank7Functor {
typedef Kokkos::View<double *******, DeviceType> inviewtype;
inviewtype _inview;

InitViewRank7Functor( inviewtype &inview_ ) : _inview(inview_)
{}
InitViewRank7Functor(inviewtype &inview_) : _inview(inview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -129,7 +126,6 @@ struct InitViewRank7Functor {
}
}
}

};

// DynRankView functor
@ -138,8 +134,7 @@ struct InitDynRankViewFunctor {
typedef Kokkos::DynRankView<double, DeviceType> inviewtype;
inviewtype _inview;

InitDynRankViewFunctor( inviewtype &inview_ ) : _inview(inview_)
{}
InitDynRankViewFunctor(inviewtype &inview_) : _inview(inview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -150,8 +145,7 @@ struct InitDynRankViewFunctor {
}
}

struct SumComputationTest
{
struct SumComputationTest {
typedef Kokkos::DynRankView<double, DeviceType> inviewtype;
inviewtype _inview;

@ -159,7 +153,8 @@ struct InitDynRankViewFunctor {
outviewtype _outview;

KOKKOS_INLINE_FUNCTION
SumComputationTest(inviewtype &inview_ , outviewtype &outview_) : _inview(inview_), _outview(outview_) {}
SumComputationTest(inviewtype &inview_, outviewtype &outview_)
: _inview(inview_), _outview(outview_) {}

KOKKOS_INLINE_FUNCTION
void operator()(const int i) const {
@ -170,14 +165,10 @@ struct InitDynRankViewFunctor {
}
}
};

};


template <typename DeviceType>
void test_dynrankview_op_perf( const int par_size )
{

void test_dynrankview_op_perf(const int par_size) {
typedef DeviceType execution_space;
typedef typename execution_space::size_type size_type;
const size_type dim_2 = 90;
@ -191,7 +182,8 @@ void test_dynrankview_op_perf( const int par_size )
double elapsed_time_compdrview = 0;
Kokkos::Timer timer;
{
Kokkos::View<double***,DeviceType> testview("testview",par_size,dim_2,dim_3);
Kokkos::View<double ***, DeviceType> testview("testview", par_size, dim_2,
dim_3);
typedef InitViewFunctor<DeviceType> FunctorType;

timer.reset();
@ -201,26 +193,29 @@ void test_dynrankview_op_perf( const int par_size )
elapsed_time_view = timer.seconds();
std::cout << " View time (init only): " << elapsed_time_view << std::endl;


timer.reset();
Kokkos::View<double *, DeviceType> sumview("sumview", par_size);
Kokkos::parallel_for( policy , typename FunctorType::SumComputationTest(testview, sumview) );
Kokkos::parallel_for(
policy, typename FunctorType::SumComputationTest(testview, sumview));
DeviceType().fence();
elapsed_time_compview = timer.seconds();
std::cout << " View sum computation time: " << elapsed_time_view << std::endl;
std::cout << " View sum computation time: " << elapsed_time_view
<< std::endl;


Kokkos::View<double***,Kokkos::LayoutStride, DeviceType> teststrideview = Kokkos::subview(testview, Kokkos::ALL, Kokkos::ALL,Kokkos::ALL);
Kokkos::View<double ***, Kokkos::LayoutStride, DeviceType> teststrideview =
Kokkos::subview(testview, Kokkos::ALL, Kokkos::ALL, Kokkos::ALL);
typedef InitStrideViewFunctor<DeviceType> FunctorStrideType;

timer.reset();
Kokkos::parallel_for(policy, FunctorStrideType(teststrideview));
DeviceType().fence();
elapsed_time_strideview = timer.seconds();
std::cout << " Strided View time (init only): " << elapsed_time_strideview << std::endl;
std::cout << " Strided View time (init only): " << elapsed_time_strideview
<< std::endl;
}
{
Kokkos::View<double*******,DeviceType> testview("testview",par_size,dim_2,dim_3,1,1,1,1);
Kokkos::View<double *******, DeviceType> testview("testview", par_size,
dim_2, dim_3, 1, 1, 1, 1);
typedef InitViewRank7Functor<DeviceType> FunctorType;

timer.reset();
@ -228,10 +223,12 @@ void test_dynrankview_op_perf( const int par_size )
Kokkos::parallel_for(policy, FunctorType(testview));
DeviceType().fence();
elapsed_time_view_rank7 = timer.seconds();
std::cout << " View Rank7 time (init only): " << elapsed_time_view_rank7 << std::endl;
std::cout << " View Rank7 time (init only): " << elapsed_time_view_rank7
<< std::endl;
}
{
Kokkos::DynRankView<double,DeviceType> testdrview("testdrview",par_size,dim_2,dim_3);
Kokkos::DynRankView<double, DeviceType> testdrview("testdrview", par_size,
dim_2, dim_3);
typedef InitDynRankViewFunctor<DeviceType> FunctorType;

timer.reset();
@ -239,28 +236,38 @@ void test_dynrankview_op_perf( const int par_size )
Kokkos::parallel_for(policy, FunctorType(testdrview));
DeviceType().fence();
elapsed_time_drview = timer.seconds();
std::cout << " DynRankView time (init only): " << elapsed_time_drview << std::endl;
std::cout << " DynRankView time (init only): " << elapsed_time_drview
<< std::endl;

timer.reset();
Kokkos::DynRankView<double, DeviceType> sumview("sumview", par_size);
Kokkos::parallel_for( policy , typename FunctorType::SumComputationTest(testdrview, sumview) );
Kokkos::parallel_for(
policy, typename FunctorType::SumComputationTest(testdrview, sumview));
DeviceType().fence();
elapsed_time_compdrview = timer.seconds();
std::cout << " DynRankView sum computation time: " << elapsed_time_compdrview << std::endl;

std::cout << " DynRankView sum computation time: "
<< elapsed_time_compdrview << std::endl;
}

std::cout << " Ratio of View to DynRankView time: " << elapsed_time_view / elapsed_time_drview << std::endl; //expect < 1
std::cout << " Ratio of View to DynRankView sum computation time: " << elapsed_time_compview / elapsed_time_compdrview << std::endl; //expect < 1
std::cout << " Ratio of View to View Rank7 time: " << elapsed_time_view / elapsed_time_view_rank7 << std::endl; //expect < 1
std::cout << " Ratio of StrideView to DynRankView time: " << elapsed_time_strideview / elapsed_time_drview << std::endl; //expect < 1
std::cout << " Ratio of DynRankView to View Rank7 time: " << elapsed_time_drview / elapsed_time_view_rank7 << std::endl; //expect ?
std::cout << " Ratio of View to DynRankView time: "
<< elapsed_time_view / elapsed_time_drview
<< std::endl; // expect < 1
std::cout << " Ratio of View to DynRankView sum computation time: "
<< elapsed_time_compview / elapsed_time_compdrview
<< std::endl; // expect < 1
std::cout << " Ratio of View to View Rank7 time: "
<< elapsed_time_view / elapsed_time_view_rank7
<< std::endl; // expect < 1
std::cout << " Ratio of StrideView to DynRankView time: "
<< elapsed_time_strideview / elapsed_time_drview
<< std::endl; // expect < 1
std::cout << " Ratio of DynRankView to View Rank7 time: "
<< elapsed_time_drview / elapsed_time_view_rank7
<< std::endl; // expect ?

timer.reset();

} // end test_dynrankview


} //end Performance
} // namespace Performance
#endif
@ -1,10 +1,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -22,10 +23,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -57,33 +58,25 @@ static const unsigned begin_id_size = 256u;
static const unsigned end_id_size = 1u << 22;
static const unsigned id_step = 2u;

union helper
{
union helper {
uint32_t word;
uint8_t byte[4];
};


template <typename Device>
struct generate_ids
{
struct generate_ids {
typedef Device execution_space;
typedef typename execution_space::size_type size_type;
typedef Kokkos::View<uint32_t*, execution_space> local_id_view;

local_id_view local_2_global;

generate_ids( local_id_view & ids)
: local_2_global(ids)
{
generate_ids(local_id_view& ids) : local_2_global(ids) {
Kokkos::parallel_for(local_2_global.extent(0), *this);
}


KOKKOS_INLINE_FUNCTION
void operator()(size_type i) const
{

void operator()(size_type i) const {
helper x = {static_cast<uint32_t>(i)};

// shuffle the bytes of i to create a unique, semi-random global_id
@ -99,41 +92,41 @@ struct generate_ids

local_2_global[i] = x.word;
}

};

template <typename Device>
struct fill_map
{
struct fill_map {
typedef Device execution_space;
typedef typename execution_space::size_type size_type;
typedef Kokkos::View<const uint32_t*,execution_space, Kokkos::MemoryRandomAccess> local_id_view;
typedef Kokkos::UnorderedMap<uint32_t,size_type,execution_space> global_id_view;
typedef Kokkos::View<const uint32_t*, execution_space,
Kokkos::MemoryRandomAccess>
local_id_view;
typedef Kokkos::UnorderedMap<uint32_t, size_type, execution_space>
global_id_view;

global_id_view global_2_local;
local_id_view local_2_global;

fill_map(global_id_view gIds, local_id_view lIds)
: global_2_local(gIds) , local_2_global(lIds)
{
: global_2_local(gIds), local_2_global(lIds) {
Kokkos::parallel_for(local_2_global.extent(0), *this);
}

KOKKOS_INLINE_FUNCTION
void operator()(size_type i) const
{
void operator()(size_type i) const {
global_2_local.insert(local_2_global[i], i);
}

};

template <typename Device>
struct find_test
{
struct find_test {
typedef Device execution_space;
typedef typename execution_space::size_type size_type;
typedef Kokkos::View<const uint32_t*,execution_space, Kokkos::MemoryRandomAccess> local_id_view;
typedef Kokkos::UnorderedMap<const uint32_t, const size_type,execution_space> global_id_view;
typedef Kokkos::View<const uint32_t*, execution_space,
Kokkos::MemoryRandomAccess>
local_id_view;
typedef Kokkos::UnorderedMap<const uint32_t, const size_type, execution_space>
global_id_view;

global_id_view global_2_local;
local_id_view local_2_global;
@ -141,38 +134,34 @@ struct find_test
typedef size_t value_type;

find_test(global_id_view gIds, local_id_view lIds, value_type& num_errors)
: global_2_local(gIds) , local_2_global(lIds)
{
: global_2_local(gIds), local_2_global(lIds) {
Kokkos::parallel_reduce(local_2_global.extent(0), *this, num_errors);
}

KOKKOS_INLINE_FUNCTION
void init(value_type & v) const
{ v = 0; }
void init(value_type& v) const { v = 0; }

KOKKOS_INLINE_FUNCTION
void join(volatile value_type & dst, volatile value_type const & src) const
{ dst += src; }
void join(volatile value_type& dst, volatile value_type const& src) const {
dst += src;
}

KOKKOS_INLINE_FUNCTION
void operator()(size_type i, value_type & num_errors) const
{
void operator()(size_type i, value_type& num_errors) const {
uint32_t index = global_2_local.find(local_2_global[i]);

if (global_2_local.value_at(index) != i) ++num_errors;
}

};

template <typename Device>
void test_global_to_local_ids(unsigned num_ids)
{

void test_global_to_local_ids(unsigned num_ids) {
typedef Device execution_space;
typedef typename execution_space::size_type size_type;

typedef Kokkos::View<uint32_t*, execution_space> local_id_view;
typedef Kokkos::UnorderedMap<uint32_t,size_type,execution_space> global_id_view;
typedef Kokkos::UnorderedMap<uint32_t, size_type, execution_space>
global_id_view;

// size
std::cout << num_ids << ", ";
@ -189,18 +178,14 @@ void test_global_to_local_ids(unsigned num_ids)
timer.reset();

// generate unique ids
{
generate_ids<Device> gen(local_2_global);
}
{ generate_ids<Device> gen(local_2_global); }
Device().fence();
// generate
elasped_time = timer.seconds();
std::cout << elasped_time << ", ";
timer.reset();

{
fill_map<Device> fill(global_2_local, local_2_global);
}
{ fill_map<Device> fill(global_2_local, local_2_global); }
Device().fence();

// fill
@ -208,10 +193,8 @@ void test_global_to_local_ids(unsigned num_ids)
std::cout << elasped_time << ", ";
timer.reset();


size_t num_errors = 0;
for (int i=0; i<100; ++i)
{
for (int i = 0; i < 100; ++i) {
find_test<Device> find(global_2_local, local_2_global, num_errors);
}
Device().fence();
@ -223,9 +206,6 @@ void test_global_to_local_ids(unsigned num_ids)
ASSERT_EQ(num_errors, 0u);
}


} // namespace Performance


#endif // KOKKOS_TEST_GLOBAL_TO_LOCAL_IDS_HPP
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -61,70 +62,63 @@
#include <string>
#include <fstream>


namespace Performance {

class hpx : public ::testing::Test {
protected:
static void SetUpTestCase()
{
static void SetUpTestCase() {
std::cout << std::setprecision(5) << std::scientific;

Kokkos::initialize();
Kokkos::print_configuration(std::cout);
}

static void TearDownTestCase()
{
Kokkos::finalize();
}
static void TearDownTestCase() { Kokkos::finalize(); }
};

TEST_F( hpx, dynrankview_perf )
{
TEST_F(hpx, dynrankview_perf) {
std::cout << "HPX" << std::endl;
std::cout << " DynRankView vs View: Initialization Only " << std::endl;
test_dynrankview_op_perf<Kokkos::Experimental::HPX>(8192);
}

TEST_F( hpx, global_2_local)
{
TEST_F(hpx, global_2_local) {
std::cout << "HPX" << std::endl;
std::cout << "size, create, generate, fill, find" << std::endl;
for (unsigned i=Performance::begin_id_size; i<=Performance::end_id_size; i *= Performance::id_step)
for (unsigned i = Performance::begin_id_size; i <= Performance::end_id_size;
i *= Performance::id_step)
test_global_to_local_ids<Kokkos::Experimental::HPX>(i);
}

TEST_F( hpx, unordered_map_performance_near)
{
TEST_F(hpx, unordered_map_performance_near) {
unsigned num_hpx = 4;
std::ostringstream base_file_name;
base_file_name << "hpx-" << num_hpx << "-near";
Perf::run_performance_tests<Kokkos::Experimental::HPX,true>(base_file_name.str());
Perf::run_performance_tests<Kokkos::Experimental::HPX, true>(
base_file_name.str());
}

TEST_F( hpx, unordered_map_performance_far)
{
TEST_F(hpx, unordered_map_performance_far) {
unsigned num_hpx = 4;
std::ostringstream base_file_name;
base_file_name << "hpx-" << num_hpx << "-far";
Perf::run_performance_tests<Kokkos::Experimental::HPX,false>(base_file_name.str());
Perf::run_performance_tests<Kokkos::Experimental::HPX, false>(
base_file_name.str());
}

TEST_F( hpx, scatter_view)
{
TEST_F(hpx, scatter_view) {
std::cout << "ScatterView data-duplicated test:\n";
Perf::test_scatter_view<Kokkos::Experimental::HPX, Kokkos::LayoutRight,
Kokkos::Experimental::ScatterDuplicated,
Kokkos::Experimental::ScatterNonAtomic>(10, 1000 * 1000);
Kokkos::Experimental::ScatterNonAtomic>(10,
1000 * 1000);
// std::cout << "ScatterView atomics test:\n";
// Perf::test_scatter_view<Kokkos::Experimental::HPX, Kokkos::LayoutRight,
// Kokkos::Experimental::ScatterNonDuplicated,
// Kokkos::Experimental::ScatterAtomic>(10, 1000 * 1000);
}

} // namespace test
} // namespace Performance
#else
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTHPX_PREVENT_EMPTY_LINK_ERROR() {}
#endif
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -50,4 +51,3 @@ int main(int argc, char *argv[]) {
::testing::InitGoogleTest(&argc, argv);
return RUN_ALL_TESTS();
}
@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@ -61,82 +62,72 @@
#include <string>
#include <fstream>


namespace Performance {

class openmp : public ::testing::Test {
protected:
static void SetUpTestCase()
{
static void SetUpTestCase() {
std::cout << std::setprecision(5) << std::scientific;

Kokkos::initialize();
Kokkos::OpenMP::print_configuration(std::cout);
}

static void TearDownTestCase()
{
Kokkos::finalize();
}
static void TearDownTestCase() { Kokkos::finalize(); }
};

TEST_F( openmp, dynrankview_perf )
{
TEST_F(openmp, dynrankview_perf) {
std::cout << "OpenMP" << std::endl;
std::cout << " DynRankView vs View: Initialization Only " << std::endl;
test_dynrankview_op_perf<Kokkos::OpenMP>(8192);
}

TEST_F( openmp, global_2_local)
{
TEST_F(openmp, global_2_local) {
std::cout << "OpenMP" << std::endl;
std::cout << "size, create, generate, fill, find" << std::endl;
for (unsigned i=Performance::begin_id_size; i<=Performance::end_id_size; i *= Performance::id_step)
for (unsigned i = Performance::begin_id_size; i <= Performance::end_id_size;
i *= Performance::id_step)
test_global_to_local_ids<Kokkos::OpenMP>(i);
}

TEST_F( openmp, unordered_map_performance_near)
{
TEST_F(openmp, unordered_map_performance_near) {
unsigned num_openmp = 4;
if (Kokkos::hwloc::available()) {
num_openmp = Kokkos::hwloc::get_available_numa_count() *
Kokkos::hwloc::get_available_cores_per_numa() *
Kokkos::hwloc::get_available_threads_per_core();

}
std::ostringstream base_file_name;
base_file_name << "openmp-" << num_openmp << "-near";
Perf::run_performance_tests<Kokkos::OpenMP, true>(base_file_name.str());
}

TEST_F( openmp, unordered_map_performance_far)
{
TEST_F(openmp, unordered_map_performance_far) {
unsigned num_openmp = 4;
if (Kokkos::hwloc::available()) {
num_openmp = Kokkos::hwloc::get_available_numa_count() *
Kokkos::hwloc::get_available_cores_per_numa() *
Kokkos::hwloc::get_available_threads_per_core();

}
std::ostringstream base_file_name;
base_file_name << "openmp-" << num_openmp << "-far";
Perf::run_performance_tests<Kokkos::OpenMP, false>(base_file_name.str());
}

TEST_F( openmp, scatter_view)
{
TEST_F(openmp, scatter_view) {
std::cout << "ScatterView data-duplicated test:\n";
Perf::test_scatter_view<Kokkos::OpenMP, Kokkos::LayoutRight,
Kokkos::Experimental::ScatterDuplicated,
Kokkos::Experimental::ScatterNonAtomic>(10, 1000 * 1000);
Kokkos::Experimental::ScatterNonAtomic>(10,
1000 * 1000);
// std::cout << "ScatterView atomics test:\n";
// Perf::test_scatter_view<Kokkos::OpenMP, Kokkos::LayoutRight,
// Kokkos::Experimental::ScatterNonDuplicated,
// Kokkos::Experimental::ScatterAtomic>(10, 1000 * 1000);
}

} // namespace test
} // namespace Performance
#else
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTOPENMP_PREVENT_EMPTY_LINK_ERROR() {}
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTOPENMP_PREVENT_EMPTY_LINK_ERROR() {
}
#endif
@@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@@ -67,14 +68,13 @@ namespace Performance {

class rocm : public ::testing::Test {
protected:
static void SetUpTestCase()
{
static void SetUpTestCase() {
std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Experimental::ROCm::initialize( Kokkos::Experimental::ROCm::SelectDevice(0) );
Kokkos::Experimental::ROCm::initialize(
Kokkos::Experimental::ROCm::SelectDevice(0));
}
static void TearDownTestCase()
{
static void TearDownTestCase() {
Kokkos::Experimental::ROCm::finalize();
Kokkos::HostSpace::execution_space::finalize();
}
@@ -97,17 +97,15 @@ TEST_F( rocm, global_2_local)
}

#endif
TEST_F( rocm, unordered_map_performance_near)
{
TEST_F(rocm, unordered_map_performance_near) {
Perf::run_performance_tests<Kokkos::Experimental::ROCm, true>("rocm-near");
}

TEST_F( rocm, unordered_map_performance_far)
{
TEST_F(rocm, unordered_map_performance_far) {
Perf::run_performance_tests<Kokkos::Experimental::ROCm, false>("rocm-far");
}

}
} // namespace Performance
#else
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTROCM_PREVENT_EMPTY_LINK_ERROR() {}
#endif /* #if defined( KOKKOS_ENABLE_ROCM ) */

@@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@@ -49,16 +50,15 @@

namespace Perf {

template <typename ExecSpace, typename Layout, int duplication, int contribution>
void test_scatter_view(int m, int n)
template <typename ExecSpace, typename Layout, int duplication,
int contribution>
void test_scatter_view(int m, int n) {
Kokkos::View<double * [3], Layout, ExecSpace> original_view("original_view",
n);
{
Kokkos::View<double *[3], Layout, ExecSpace> original_view("original_view", n);
{
auto scatter_view = Kokkos::Experimental::create_scatter_view
< Kokkos::Experimental::ScatterSum
, duplication
, contribution
> (original_view);
auto scatter_view = Kokkos::Experimental::create_scatter_view<
Kokkos::Experimental::ScatterSum, duplication, contribution>(
original_view);
Kokkos::Experimental::UniqueToken<
ExecSpace, Kokkos::Experimental::UniqueTokenScope::Global>
unique_token{ExecSpace()};
@@ -68,7 +68,8 @@ void test_scatter_view(int m, int n)
{
auto num_threads = unique_token.size();
std::cout << "num_threads " << num_threads << '\n';
Kokkos::View<double **[3], Layout, ExecSpace> hand_coded_duplicate_view("hand_coded_duplicate", num_threads, n);
Kokkos::View<double* * [3], Layout, ExecSpace>
hand_coded_duplicate_view("hand_coded_duplicate", num_threads, n);
auto f2 = KOKKOS_LAMBDA(int i) {
auto thread_id = unique_token.acquire();
for (int j = 0; j < 10; ++j) {
@@ -81,7 +82,8 @@ void test_scatter_view(int m, int n)
Kokkos::Timer timer;
timer.reset();
for (int k = 0; k < m; ++k) {
Kokkos::parallel_for(policy, f2, "hand_coded_duplicate_scatter_view_test");
Kokkos::parallel_for(policy, f2,
"hand_coded_duplicate_scatter_view_test");
}
Kokkos::fence();
auto t = timer.seconds();
@@ -110,6 +112,6 @@ void test_scatter_view(int m, int n)
}
}

}
} // namespace Perf

#endif

@@ -2,10 +2,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@@ -23,10 +24,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@@ -66,8 +67,7 @@ namespace Performance {

class threads : public ::testing::Test {
protected:
static void SetUpTestCase()
{
static void SetUpTestCase() {
std::cout << std::setprecision(5) << std::scientific;

unsigned num_threads = 4;
@@ -76,7 +76,6 @@ protected:
num_threads = Kokkos::hwloc::get_available_numa_count() *
Kokkos::hwloc::get_available_cores_per_numa() *
Kokkos::hwloc::get_available_threads_per_core();

}

std::cout << "Threads: " << num_threads << std::endl;
@@ -84,49 +83,41 @@ protected:
Kokkos::initialize(Kokkos::InitArguments(num_threads));
}

static void TearDownTestCase()
{
Kokkos::finalize();
}
static void TearDownTestCase() { Kokkos::finalize(); }
};

TEST_F( threads, dynrankview_perf )
{
TEST_F(threads, dynrankview_perf) {
std::cout << "Threads" << std::endl;
std::cout << " DynRankView vs View: Initialization Only " << std::endl;
test_dynrankview_op_perf<Kokkos::Threads>(8192);
}

TEST_F( threads, global_2_local)
{
TEST_F(threads, global_2_local) {
std::cout << "Threads" << std::endl;
std::cout << "size, create, generate, fill, find" << std::endl;
for (unsigned i=Performance::begin_id_size; i<=Performance::end_id_size; i *= Performance::id_step)
for (unsigned i = Performance::begin_id_size; i <= Performance::end_id_size;
i *= Performance::id_step)
test_global_to_local_ids<Kokkos::Threads>(i);
}

TEST_F( threads, unordered_map_performance_near)
{
TEST_F(threads, unordered_map_performance_near) {
unsigned num_threads = 4;
if (Kokkos::hwloc::available()) {
num_threads = Kokkos::hwloc::get_available_numa_count() *
Kokkos::hwloc::get_available_cores_per_numa() *
Kokkos::hwloc::get_available_threads_per_core();

}
std::ostringstream base_file_name;
base_file_name << "threads-" << num_threads << "-near";
Perf::run_performance_tests<Kokkos::Threads, true>(base_file_name.str());
}

TEST_F( threads, unordered_map_performance_far)
{
TEST_F(threads, unordered_map_performance_far) {
unsigned num_threads = 4;
if (Kokkos::hwloc::available()) {
num_threads = Kokkos::hwloc::get_available_numa_count() *
Kokkos::hwloc::get_available_cores_per_numa() *
Kokkos::hwloc::get_available_threads_per_core();

}
std::ostringstream base_file_name;
base_file_name << "threads-" << num_threads << "-far";
@@ -136,6 +127,6 @@ TEST_F( threads, unordered_map_performance_far)
} // namespace Performance

#else
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTTHREADS_PREVENT_EMPTY_LINK_ERROR() {}
void KOKKOS_CONTAINERS_PERFORMANCE_TESTS_TESTTHREADS_PREVENT_EMPTY_LINK_ERROR() {
}
#endif

@@ -1,10 +1,11 @@
//@HEADER
// ************************************************************************
//
// Kokkos v. 2.0
// Copyright (2014) Sandia Corporation
// Kokkos v. 3.0
// Copyright (2020) National Technology & Engineering
// Solutions of Sandia, LLC (NTESS).
//
// Under the terms of Contract DE-AC04-94AL85000 with Sandia Corporation,
// Under the terms of Contract DE-NA0003525 with NTESS,
// the U.S. Government retains certain rights in this software.
//
// Redistribution and use in source and binary forms, with or without
@@ -22,10 +23,10 @@
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY SANDIA CORPORATION "AS IS" AND ANY
// THIS SOFTWARE IS PROVIDED BY NTESS "AS IS" AND ANY
// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL SANDIA CORPORATION OR THE
// PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NTESS OR THE
// CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
// EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
// PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
@@ -50,12 +51,10 @@
#include <string>
#include <sstream>


namespace Perf {

template <typename Device, bool Near>
struct UnorderedMapTest
{
struct UnorderedMapTest {
typedef Device execution_space;
typedef Kokkos::UnorderedMap<uint32_t, uint32_t, execution_space> map_type;
typedef typename map_type::histogram_type histogram_type;
@@ -72,14 +71,14 @@ struct UnorderedMapTest
map_type map;
histogram_type histogram;

UnorderedMapTest( uint32_t arg_capacity, uint32_t arg_inserts, uint32_t arg_collisions)
: capacity(arg_capacity)
, inserts(arg_inserts)
, collisions(arg_collisions)
, seconds(0)
, map(capacity)
, histogram(map.get_histogram())
{
UnorderedMapTest(uint32_t arg_capacity, uint32_t arg_inserts,
uint32_t arg_collisions)
: capacity(arg_capacity),
inserts(arg_inserts),
collisions(arg_collisions),
seconds(0),
map(capacity),
histogram(map.get_histogram()) {
Kokkos::Timer wall_clock;
wall_clock.reset();

@@ -92,27 +91,29 @@ struct UnorderedMapTest
Kokkos::parallel_reduce(inserts, *this, v);

if (v.failed_count > 0u) {
const uint32_t new_capacity = map.capacity() + ((map.capacity()*3ull)/20u) + v.failed_count/collisions ;
const uint32_t new_capacity = map.capacity() +
((map.capacity() * 3ull) / 20u) +
v.failed_count / collisions;
map.rehash(new_capacity);
}
} while (v.failed_count > 0u);

seconds = wall_clock.seconds();

switch (loop_count)
{
switch (loop_count) {
case 1u: std::cout << " \033[0;32m" << loop_count << "\033[0m "; break;
case 2u: std::cout << " \033[1;31m" << loop_count << "\033[0m "; break;
default: std::cout << " \033[0;31m" << loop_count << "\033[0m "; break;
}
std::cout << std::setprecision(2) << std::fixed << std::setw(5) << (1e9*(seconds/(inserts))) << "; " << std::flush;
std::cout << std::setprecision(2) << std::fixed << std::setw(5)
<< (1e9 * (seconds / (inserts))) << "; " << std::flush;

histogram.calculate();
Device().fence();
}

void print(std::ostream & metrics_out, std::ostream & length_out, std::ostream & distance_out, std::ostream & block_distance_out)
{
void print(std::ostream& metrics_out, std::ostream& length_out,
std::ostream& distance_out, std::ostream& block_distance_out) {
metrics_out << map.capacity() << " , ";
metrics_out << inserts / collisions << " , ";
metrics_out << (100.0 * inserts / collisions) / map.capacity() << " , ";
@@ -133,40 +134,36 @@ struct UnorderedMapTest
histogram.print_distance(distance_out);

block_distance_out << map.capacity() << " , ";
block_distance_out << ((100.0 *inserts/collisions) / map.capacity()) << " , ";
block_distance_out << ((100.0 * inserts / collisions) / map.capacity())
<< " , ";
block_distance_out << collisions << " , ";
histogram.print_block_distance(block_distance_out);
}


KOKKOS_INLINE_FUNCTION
void init( value_type & v ) const
{
void init(value_type& v) const {
v.failed_count = 0;
v.max_list = 0;
}

KOKKOS_INLINE_FUNCTION
void join( volatile value_type & dst, const volatile value_type & src ) const
{
void join(volatile value_type& dst, const volatile value_type& src) const {
dst.failed_count += src.failed_count;
dst.max_list = src.max_list < dst.max_list ? dst.max_list : src.max_list;
}

KOKKOS_INLINE_FUNCTION
void operator()(uint32_t i, value_type & v) const
{
void operator()(uint32_t i, value_type& v) const {
const uint32_t key = Near ? i / collisions : i % (inserts / collisions);
typename map_type::insert_result result = map.insert(key, i);
v.failed_count += !result.failed() ? 0 : 1;
v.max_list = result.list_position() < v.max_list ? v.max_list : result.list_position();
v.max_list = result.list_position() < v.max_list ? v.max_list
: result.list_position();
}

};

template <typename Device, bool Near>
void run_performance_tests(std::string const & base_file_name)
{
void run_performance_tests(std::string const& base_file_name) {
#if 0
std::string metrics_file_name = base_file_name + std::string("-metrics.csv");
std::string length_file_name = base_file_name + std::string("-length.csv");
@@ -254,7 +251,6 @@ void run_performance_tests(std::string const & base_file_name)
#endif
}


} // namespace Perf

#endif // KOKKOS_TEST_UNORDERED_MAP_PERFORMANCE_HPP
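
Reviewer note: the two arithmetic expressions in `UnorderedMapTest` that this hunk only reformats can be checked in isolation. The sketch below is a standalone illustration, not part of Kokkos; the helper names `make_key` and `grown_capacity` are hypothetical, and only the arithmetic is taken from `operator()` and from the `new_capacity` computation in the constructor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: key chosen for insert index i, mirroring
//   Near ? i / collisions : i % (inserts / collisions)
// "Near" places colliding inserts at adjacent indices; "far" spreads them
// a stride of (inserts / collisions) apart.
inline uint32_t make_key(bool is_near, uint32_t i, uint32_t inserts,
                         uint32_t collisions) {
  assert(collisions > 0 && inserts >= collisions);
  return is_near ? i / collisions : i % (inserts / collisions);
}

// Hypothetical helper: capacity requested after a pass with failed inserts,
// mirroring
//   capacity + (capacity * 3ull) / 20u + failed_count / collisions
// i.e. roughly 15% growth plus room for the distinct keys that failed.
inline uint32_t grown_capacity(uint32_t capacity, uint32_t failed_count,
                               uint32_t collisions) {
  assert(collisions > 0);
  return static_cast<uint32_t>(capacity + (capacity * 3ull) / 20u +
                               failed_count / collisions);
}
```

With `inserts = 8` and `collisions = 2`, the near pattern maps indices 0 and 1 to key 0, while the far pattern maps indices 0 and 4 to key 0. The `3ull` factor keeps the 15% term from overflowing 32 bits before the division.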

@@ -1,47 +1,34 @@

TRIBITS_CONFIGURE_FILE(${PACKAGE_NAME}_config.h)
KOKKOS_CONFIGURE_FILE(${PACKAGE_NAME}_config.h)

INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR})
#need these here for now
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_BINARY_DIR})
KOKKOS_INCLUDE_DIRECTORIES(${CMAKE_CURRENT_SOURCE_DIR})

#-----------------------------------------------------------------------------

SET(TRILINOS_INCDIR ${CMAKE_INSTALL_PREFIX}/${${PROJECT_NAME}_INSTALL_INCLUDE_DIR})

if(KOKKOS_LEGACY_TRIBITS)

SET(HEADERS "")
SET(SOURCES "")

SET(HEADERS_IMPL "")

FILE(GLOB HEADERS *.hpp)
FILE(GLOB HEADERS_IMPL impl/*.hpp)
FILE(GLOB SOURCES impl/*.cpp)

INSTALL(FILES ${HEADERS_IMPL} DESTINATION ${TRILINOS_INCDIR}/impl/)

TRIBITS_ADD_LIBRARY(
kokkoscontainers
HEADERS ${HEADERS}
NOINSTALLHEADERS ${HEADERS_IMPL}
SOURCES ${SOURCES}
DEPLIBS
)

else()
SET(KOKKOS_CONTAINERS_SRCS)
APPEND_GLOB(KOKKOS_CONTAINERS_SRCS ${CMAKE_CURRENT_SOURCE_DIR}/impl/*.cpp)

INSTALL (
DIRECTORY "${CMAKE_CURRENT_SOURCE_DIR}/"
DESTINATION ${TRILINOS_INCDIR}
DESTINATION ${KOKKOS_HEADER_DIR}
FILES_MATCHING PATTERN "*.hpp"
)

TRIBITS_ADD_LIBRARY(
KOKKOS_ADD_LIBRARY(
kokkoscontainers
SOURCES ${KOKKOS_CONTAINERS_SRCS}
DEPLIBS
)

endif()
SET_TARGET_PROPERTIES(kokkoscontainers PROPERTIES VERSION ${Kokkos_VERSION})

KOKKOS_LIB_INCLUDE_DIRECTORIES(kokkoscontainers
${KOKKOS_TOP_BUILD_DIR}
${CMAKE_CURRENT_BINARY_DIR}
${CMAKE_CURRENT_SOURCE_DIR}
)
KOKKOS_LINK_INTERNAL_LIBRARY(kokkoscontainers kokkoscore)

#-----------------------------------------------------------------------------

Some files were not shown because too many files have changed in this diff.