Update Kokkos library to r2.6.00

Stan Moore
2018-03-08 10:57:08 -07:00
parent 0c4c002f34
commit 39786b1740
694 changed files with 12261 additions and 6745 deletions

@@ -1,5 +1,49 @@
 # Change Log
+## [2.6.00](https://github.com/kokkos/kokkos/tree/2.6.00) (2018-03-07)
+[Full Changelog](https://github.com/kokkos/kokkos/compare/2.5.00...2.6.00)
+**Part of the Kokkos C++ Performance Portability Programming EcoSystem 2.6**
+**Implemented enhancements:**
+- Support NVIDIA Volta microarchitecture [\#1466](https://github.com/kokkos/kokkos/issues/1466)
+- Kokkos - Define empty functions when profiling disabled [\#1424](https://github.com/kokkos/kokkos/issues/1424)
+- Don't use \_\_constant\_\_ cache for lock arrays, enable once per run update instead of once per call [\#1385](https://github.com/kokkos/kokkos/issues/1385)
+- task dag enhancement. [\#1354](https://github.com/kokkos/kokkos/issues/1354)
+- Cuda task team collectives and stack size [\#1353](https://github.com/kokkos/kokkos/issues/1353)
+- Replace View operator acceptance of more than rank integers with 'access' function [\#1333](https://github.com/kokkos/kokkos/issues/1333)
+- Interoperability: Do not shut down backend execution space runtimes upon calling finalize. [\#1305](https://github.com/kokkos/kokkos/issues/1305)
+- shmem\_size for LayoutStride [\#1291](https://github.com/kokkos/kokkos/issues/1291)
+- Kokkos::resize performs poorly on 1D Views [\#1270](https://github.com/kokkos/kokkos/issues/1270)
+- stride\(\) is inconsistent with dimension\(\), extent\(\), etc. [\#1214](https://github.com/kokkos/kokkos/issues/1214)
+- Kokkos::sort defaults to std::sort on host [\#1208](https://github.com/kokkos/kokkos/issues/1208)
+- DynamicView with host size grow [\#1206](https://github.com/kokkos/kokkos/issues/1206)
+- Unmanaged View with Anonymous Memory Space [\#1175](https://github.com/kokkos/kokkos/issues/1175)
+- Sort subset of Kokkos::DynamicView [\#1160](https://github.com/kokkos/kokkos/issues/1160)
+- MDRange policy doesn't support lambda reductions [\#1054](https://github.com/kokkos/kokkos/issues/1054)
+- Add ability to set hook on Kokkos::finalize [\#714](https://github.com/kokkos/kokkos/issues/714)
+- Atomics with Serial Backend - Default should be Disable? [\#549](https://github.com/kokkos/kokkos/issues/549)
+- KOKKOS\_ENABLE\_DEPRECATED\_CODE [\#1359](https://github.com/kokkos/kokkos/issues/1359)
+**Fixed bugs:**
+- cuda\_internal\_maximum\_warp\_count returns 8, but I believe it should return 16 for P100 [\#1269](https://github.com/kokkos/kokkos/issues/1269)
+- Cuda: level 1 scratch memory bug \(reported by Stan Moore\) [\#1434](https://github.com/kokkos/kokkos/issues/1434)
+- MDRangePolicy Reduction requires value\_type typedef in Functor [\#1379](https://github.com/kokkos/kokkos/issues/1379)
+- Kokkos DeepCopy between empty views fails [\#1369](https://github.com/kokkos/kokkos/issues/1369)
+- Several issues with new CMake build infrastructure \(reported by Eric Phipps\) [\#1365](https://github.com/kokkos/kokkos/issues/1365)
+- deep\_copy between rank-1 host/device views of differing layouts without UVM no longer works \(reported by Eric Phipps\) [\#1363](https://github.com/kokkos/kokkos/issues/1363)
+- Profiling can't be disabled in CMake, and a parallel\_for is missing for tasks \(reported by Kyungjoo Kim\) [\#1349](https://github.com/kokkos/kokkos/issues/1349)
+- get\_work\_partition int overflow \(reported by berryj5\) [\#1327](https://github.com/kokkos/kokkos/issues/1327)
+- Kokkos::deep\_copy must fence even if the two views are the same [\#1303](https://github.com/kokkos/kokkos/issues/1303)
+- CudaUVMSpace::allocate/deallocate must fence [\#1302](https://github.com/kokkos/kokkos/issues/1302)
+- ViewResize on CUDA fails in Debug because of too many resources requested [\#1299](https://github.com/kokkos/kokkos/issues/1299)
+- Cuda 9 and intrepid2 calls from Panzer. [\#1183](https://github.com/kokkos/kokkos/issues/1183)
+- Slowdown due to tracking\_enabled\(\) in 2.04.00 \(found by Albany app\) [\#1016](https://github.com/kokkos/kokkos/issues/1016)
+- Bounds checking fails with zero-span Views \(reported by Stan Moore\) [\#1411](https://github.com/kokkos/kokkos/issues/1411)
 ## [2.5.00](https://github.com/kokkos/kokkos/tree/2.5.00) (2017-12-15)
 [Full Changelog](https://github.com/kokkos/kokkos/compare/2.04.11...2.5.00)

@@ -7,7 +7,7 @@ ELSE()
 ENDIF()
 IF(NOT KOKKOS_HAS_TRILINOS)
-  cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
+  cmake_minimum_required(VERSION 3.3 FATAL_ERROR)
   # Define Project Name if this is a standalone build
   IF(NOT DEFINED ${PROJECT_NAME})
@@ -37,9 +37,19 @@ IF(NOT KOKKOS_HAS_TRILINOS)
     COMMAND ${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings
     WORKING_DIRECTORY "${Kokkos_BINARY_DIR}"
     OUTPUT_FILE ${Kokkos_BINARY_DIR}/core_src_make.out
-    RESULT_VARIABLE res
+    RESULT_VARIABLE GEN_SETTINGS_RESULT
   )
+  if (GEN_SETTINGS_RESULT)
+    message(FATAL_ERROR "Kokkos settings generation failed:\n"
+        "${KOKKOS_SETTINGS} make -f ${KOKKOS_SRC_PATH}/cmake/Makefile.generate_cmake_settings CXX=${CMAKE_CXX_COMPILER} generate_build_settings")
+  endif()
   include(${Kokkos_BINARY_DIR}/kokkos_generated_settings.cmake)
+  string(REPLACE " " ";" KOKKOS_TPL_INCLUDE_DIRS "${KOKKOS_GMAKE_TPL_INCLUDE_DIRS}")
+  string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_DIRS "${KOKKOS_GMAKE_TPL_LIBRARY_DIRS}")
+  string(REPLACE " " ";" KOKKOS_TPL_LIBRARY_NAMES "${KOKKOS_GMAKE_TPL_LIBRARY_NAMES}")
+  list(REMOVE_ITEM KOKKOS_TPL_INCLUDE_DIRS "")
+  list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_DIRS "")
+  list(REMOVE_ITEM KOKKOS_TPL_LIBRARY_NAMES "")
   set_kokkos_srcs(KOKKOS_SRC ${KOKKOS_SRC})
   #------------ NOW BUILD ------------------------------------------------------
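
The `string(REPLACE " " ";" ...)` and `list(REMOVE_ITEM ... "")` pairs added in the hunk above appear to turn the space-separated `KOKKOS_GMAKE_TPL_*` strings into CMake lists and then drop empty entries. The following is a rough Python model of that transformation, for illustration only. The function name `gmake_string_to_list` and the sample paths are hypothetical and not part of Kokkos.

```python
def gmake_string_to_list(value):
    """Model string(REPLACE " " ";" ...) followed by list(REMOVE_ITEM ... "")."""
    # Replacing each space with ';' is equivalent to splitting on single
    # spaces: leading, trailing, or doubled spaces yield empty items.
    items = value.split(" ")
    # list(REMOVE_ITEM <list> "") then removes every empty item.
    return [item for item in items if item != ""]


if __name__ == "__main__":
    # Hypothetical input with stray spaces, as a Makefile "+=" chain may produce.
    print(gmake_string_to_list(" /opt/hwloc/include  /usr/local/cuda/include "))
```

One caveat of this model: a path that itself contains a space or a semicolon would be split as well, which matches what the CMake commands would do with such input.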

@@ -34,7 +34,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

@@ -19,7 +19,7 @@ snapshot Kokkos from github.com/kokkos to Trilinos.
 3) Snapshot the current commit in the Kokkos clone into the Trilinos clone.
 This overwrites ${TRILINOS}/packages/kokkos with the content of ${KOKKOS}:
-${KOKKOS}/config/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
+${KOKKOS}/scripts/snapshot.py --verbose ${KOKKOS} ${TRILINOS}/packages
 4) Verify the snapshot commit happened as expected
 cd ${TRILINOS}/packages/kokkos

@@ -36,7 +36,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

@@ -9,8 +9,8 @@ KOKKOS_DEVICES ?= "OpenMP"
 #KOKKOS_DEVICES ?= "Pthreads"
 # Options:
 # Intel: KNC,KNL,SNB,HSW,BDW,SKX
-# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61
-# ARM: ARMv80,ARMv81,ARMv8-ThunderX
+# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72
+# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2
 # IBM: BGQ,Power7,Power8,Power9
 # AMD-GPUS: Kaveri,Carrizo,Fiji,Vega
 # AMD-CPUS: AMDAVX,Ryzen,Epyc
@@ -21,7 +21,7 @@ KOKKOS_DEBUG ?= "no"
 KOKKOS_USE_TPLS ?= ""
 # Options: c++11,c++1z
 KOKKOS_CXX_STANDARD ?= "c++11"
-# Options: aggressive_vectorization,disable_profiling
+# Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
 KOKKOS_OPTIONS ?= ""
 # Default settings specific options.
@@ -48,6 +48,7 @@ KOKKOS_INTERNAL_USE_MEMKIND := $(call kokkos_has_string,$(KOKKOS_USE_TPLS),exper
 KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS := $(call kokkos_has_string,$(KOKKOS_OPTIONS),compiler_warnings)
 KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION := $(call kokkos_has_string,$(KOKKOS_OPTIONS),aggressive_vectorization)
 KOKKOS_INTERNAL_DISABLE_PROFILING := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_profiling)
+KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_deprecated_code)
 KOKKOS_INTERNAL_DISABLE_DUALVIEW_MODIFY_CHECK := $(call kokkos_has_string,$(KOKKOS_OPTIONS),disable_dualview_modify_check)
 KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT := $(call kokkos_has_string,$(KOKKOS_OPTIONS),enable_profile_load_print)
 KOKKOS_INTERNAL_CUDA_USE_LDG := $(call kokkos_has_string,$(KOKKOS_CUDA_OPTIONS),use_ldg)
@@ -93,7 +94,7 @@ KOKKOS_INTERNAL_COMPILER_INTEL := $(call kokkos_has_string,$(KOKKOS_CXX_VE
 KOKKOS_INTERNAL_COMPILER_PGI := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),PGI)
 KOKKOS_INTERNAL_COMPILER_XL := $(strip $(shell $(CXX) -qversion 2>&1 | grep XL | wc -l))
 KOKKOS_INTERNAL_COMPILER_CRAY := $(strip $(shell $(CXX) -craype-verbose 2>&1 | grep "CC-" | wc -l))
-KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
+KOKKOS_INTERNAL_COMPILER_NVCC := $(strip $(shell export OMPI_CXX=$(OMPI_CXX); export MPICH_CXX=$(MPICH_CXX); $(CXX) --version 2>&1 | grep nvcc | wc -l))
 KOKKOS_INTERNAL_COMPILER_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),clang)
 KOKKOS_INTERNAL_COMPILER_APPLE_CLANG := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),apple-darwin)
 KOKKOS_INTERNAL_COMPILER_HCC := $(call kokkos_has_string,$(KOKKOS_CXX_VERSION),HCC)
@@ -229,12 +230,16 @@ KOKKOS_INTERNAL_USE_ARCH_MAXWELL52 := $(call kokkos_has_string,$(KOKKOS_ARCH),Ma
 KOKKOS_INTERNAL_USE_ARCH_MAXWELL53 := $(call kokkos_has_string,$(KOKKOS_ARCH),Maxwell53)
 KOKKOS_INTERNAL_USE_ARCH_PASCAL61 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal61)
 KOKKOS_INTERNAL_USE_ARCH_PASCAL60 := $(call kokkos_has_string,$(KOKKOS_ARCH),Pascal60)
+KOKKOS_INTERNAL_USE_ARCH_VOLTA70 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta70)
+KOKKOS_INTERNAL_USE_ARCH_VOLTA72 := $(call kokkos_has_string,$(KOKKOS_ARCH),Volta72)
 KOKKOS_INTERNAL_USE_ARCH_NVIDIA := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_KEPLER30) \
     + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER32) \
     + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER35) \
     + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@@ -249,6 +254,8 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
     + $(KOKKOS_INTERNAL_USE_ARCH_KEPLER37) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL61) \
     + $(KOKKOS_INTERNAL_USE_ARCH_PASCAL60) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA70) \
+    + $(KOKKOS_INTERNAL_USE_ARCH_VOLTA72) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL50) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL52) \
     + $(KOKKOS_INTERNAL_USE_ARCH_MAXWELL53))
@@ -267,7 +274,8 @@ endif
 KOKKOS_INTERNAL_USE_ARCH_ARMV80 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv80)
 KOKKOS_INTERNAL_USE_ARCH_ARMV81 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv81)
 KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-ThunderX)
-KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX) | bc))
+KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-TX2)
+KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2) | bc))
 # IBM based.
 KOKKOS_INTERNAL_USE_ARCH_BGQ := $(call kokkos_has_string,$(KOKKOS_ARCH),BGQ)
@@ -316,6 +324,9 @@ endif
 # Generating the list of Flags.
 KOKKOS_CPPFLAGS = -I./ -I$(KOKKOS_PATH)/core/src -I$(KOKKOS_PATH)/containers/src -I$(KOKKOS_PATH)/algorithms/src
+KOKKOS_TPL_INCLUDE_DIRS =
+KOKKOS_TPL_LIBRARY_DIRS =
+KOKKOS_TPL_LIBRARY_NAMES =
 KOKKOS_CXXFLAGS =
 ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
@@ -323,7 +334,9 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_COMPILER_WARNINGS), 1)
 endif
 KOKKOS_LIBS = -ldl
+KOKKOS_TPL_LIBRARY_NAMES += dl
 KOKKOS_LDFLAGS = -L$(shell pwd)
+KOKKOS_LINK_FLAGS =
 KOKKOS_SRC =
 KOKKOS_HEADERS =
@@ -437,21 +450,32 @@ ifeq ($(KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT), 1)
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_HWLOC), 1)
-  KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
-  KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
+  ifneq ($(HWLOC_PATH),)
+    KOKKOS_CPPFLAGS += -I$(HWLOC_PATH)/include
+    KOKKOS_LDFLAGS += -L$(HWLOC_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(HWLOC_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(HWLOC_PATH)/lib
+  endif
   KOKKOS_LIBS += -lhwloc
+  KOKKOS_TPL_LIBRARY_NAMES += hwloc
   tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HWLOC")
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_LIBRT), 1)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_USE_LIBRT")
   KOKKOS_LIBS += -lrt
+  KOKKOS_TPL_LIBRARY_NAMES += rt
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
-  KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
-  KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
+  ifneq ($(MEMKIND_PATH),)
+    KOKKOS_CPPFLAGS += -I$(MEMKIND_PATH)/include
+    KOKKOS_LDFLAGS += -L$(MEMKIND_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(MEMKIND_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(MEMKIND_PATH)/lib
+  endif
   KOKKOS_LIBS += -lmemkind -lnuma
+  KOKKOS_TPL_LIBRARY_NAMES += memkind numa
   tmp := $(call kokkos_append_header,"\#define KOKKOS_HAVE_HBWSPACE")
 endif
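
The hunk above follows one pattern for each third-party library: include and library directories are recorded only when the corresponding `*_PATH` variable is non-empty, while the library names are always recorded. The Python sketch below models that pattern under the assumption that each variable simply accumulates values. The function `tpl_settings` and its dictionary keys are hypothetical names chosen for illustration, not Kokkos identifiers.

```python
def tpl_settings(path, lib_names, lib_subdir="lib"):
    """Model the per-TPL bookkeeping: guarded directories, unconditional names."""
    settings = {
        "cppflags": [],       # stands in for KOKKOS_CPPFLAGS
        "ldflags": [],        # stands in for KOKKOS_LDFLAGS
        "include_dirs": [],   # stands in for KOKKOS_TPL_INCLUDE_DIRS
        "library_dirs": [],   # stands in for KOKKOS_TPL_LIBRARY_DIRS
        "libs": [],           # stands in for KOKKOS_LIBS
        "library_names": [],  # stands in for KOKKOS_TPL_LIBRARY_NAMES
    }
    if path:  # mirrors "ifneq ($(HWLOC_PATH),)": skip directories for an empty path
        settings["cppflags"].append("-I" + path + "/include")
        settings["ldflags"].append("-L" + path + "/" + lib_subdir)
        settings["include_dirs"].append(path + "/include")
        settings["library_dirs"].append(path + "/" + lib_subdir)
    for name in lib_names:  # library names are added whether or not a path is set
        settings["libs"].append("-l" + name)
        settings["library_names"].append(name)
    return settings


if __name__ == "__main__":
    # With no path set, only the link names are emitted, so the compiler's
    # default search paths are used instead of "-I/include" and "-L/lib".
    print(tpl_settings("", ["hwloc"]))
    print(tpl_settings("/opt/memkind", ["memkind", "numa"]))
```

The guard matters because an unset path would otherwise expand to `-I/include` and `-L/lib`, which point at the filesystem root rather than at the library.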
@@ -459,6 +483,10 @@ ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 0)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_PROFILING")
 endif
+ifeq ($(KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE), 0)
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ENABLE_DEPRECATED_CODE")
+endif
 tmp := $(call kokkos_append_header,"/* Optimization Settings */")
 ifeq ($(KOKKOS_INTERNAL_OPT_RANGE_AGGRESSIVE_VECTORIZATION), 1)
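
Taken together with the earlier `KOKKOS_INTERNAL_DISABLE_DEPRECATED_CODE` assignment, this hunk makes `KOKKOS_ENABLE_DEPRECATED_CODE` opt-out: the macro is written to the generated header unless `disable_deprecated_code` appears in `KOKKOS_OPTIONS`, the same way profiling is handled. The sketch below models that logic in Python. It assumes `kokkos_has_string` returns 1 when the option name occurs in the variable and 0 otherwise, which the call sites suggest but this diff does not show. `has_string` and `feature_defines` are hypothetical helper names.

```python
def has_string(haystack, needle):
    """Assumed behaviour of kokkos_has_string: 1 on a substring match, else 0."""
    return 1 if needle in haystack else 0


def feature_defines(kokkos_options):
    """Return the feature macros the Makefile would append to the config header."""
    defines = []
    # Both features are on by default and switched off by a "disable_*" option.
    if has_string(kokkos_options, "disable_profiling") == 0:
        defines.append("KOKKOS_ENABLE_PROFILING")
    if has_string(kokkos_options, "disable_deprecated_code") == 0:
        defines.append("KOKKOS_ENABLE_DEPRECATED_CODE")
    return defines


if __name__ == "__main__":
    print(feature_defines(""))
    print(feature_defines("aggressive_vectorization,disable_deprecated_code"))
```

With an empty `KOKKOS_OPTIONS`, which is the default shown earlier in the file, both macros are defined, so existing builds keep the deprecated interfaces unless they opt out.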
@@ -560,6 +588,24 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX), 1)
   endif
 endif
+ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2), 1)
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV81")
+  tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_ARMV8_THUNDERX2")
+  ifeq ($(KOKKOS_INTERNAL_COMPILER_CRAY), 1)
+    KOKKOS_CXXFLAGS +=
+    KOKKOS_LDFLAGS +=
+  else
+    ifeq ($(KOKKOS_INTERNAL_COMPILER_PGI), 1)
+      KOKKOS_CXXFLAGS +=
+      KOKKOS_LDFLAGS +=
+    else
+      KOKKOS_CXXFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
+      KOKKOS_LDFLAGS += -mtune=thunderx2t99 -mcpu=thunderx2t99
+    endif
+  endif
+endif
 ifeq ($(KOKKOS_INTERNAL_USE_ARCH_SSE42), 1)
   tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_SSE42")
@@ -754,10 +800,11 @@ endif
 ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
   ifeq ($(KOKKOS_INTERNAL_COMPILER_NVCC), 1)
     KOKKOS_INTERNAL_CUDA_ARCH_FLAG=-arch
-  endif
-  ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
+  else ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
     KOKKOS_INTERNAL_CUDA_ARCH_FLAG=--cuda-gpu-arch
     KOKKOS_CXXFLAGS += -x cuda
+  else
+    $(error Makefile.kokkos: CUDA is enabled but the compiler is neither NVCC nor Clang)
   endif
   ifeq ($(KOKKOS_INTERNAL_USE_ARCH_KEPLER30), 1)
@@ -805,6 +852,16 @@ ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
     tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_PASCAL61")
     KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_61
   endif
+  ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA70), 1)
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA70")
+    KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_70
+  endif
+  ifeq ($(KOKKOS_INTERNAL_USE_ARCH_VOLTA72), 1)
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA")
+    tmp := $(call kokkos_append_header,"\#define KOKKOS_ARCH_VOLTA72")
+    KOKKOS_INTERNAL_CUDA_ARCH_FLAG := $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)=sm_72
+  endif
   ifneq ($(KOKKOS_INTERNAL_USE_ARCH_NVIDIA), 0)
     KOKKOS_CXXFLAGS += $(KOKKOS_INTERNAL_CUDA_ARCH_FLAG)
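
The CUDA architecture flag is assembled in two steps: the compiler selects the prefix (`-arch` for NVCC, `--cuda-gpu-arch` for Clang, and an error for anything else after this commit), and the selected architecture appends `=sm_NN`. A simplified Python model of that composition follows. It takes a single architecture name, whereas the Makefile matches names inside `KOKKOS_ARCH`, and it covers only the three mappings visible in this hunk. `ARCH_TO_SM` and `cuda_arch_flag` are hypothetical names.

```python
# Only the mappings visible in this hunk; the Makefile handles more architectures.
ARCH_TO_SM = {
    "Pascal61": "sm_61",
    "Volta70": "sm_70",
    "Volta72": "sm_72",
}


def cuda_arch_flag(compiler, arch):
    """Compose the architecture flag from a compiler prefix and an sm_NN suffix."""
    if compiler == "nvcc":
        prefix = "-arch"
    elif compiler == "clang":
        prefix = "--cuda-gpu-arch"
    else:
        # Mirrors the new $(error ...) branch for unsupported CUDA compilers.
        raise ValueError("CUDA is enabled but the compiler is neither NVCC nor Clang")
    return prefix + "=" + ARCH_TO_SM[arch]


if __name__ == "__main__":
    print(cuda_arch_flag("nvcc", "Volta70"))
    print(cuda_arch_flag("clang", "Volta72"))
```

Failing early for an unknown compiler avoids the previous behaviour, where the prefix stayed empty and the suffix alone would have been passed on as a malformed flag.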
@@ -850,6 +907,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_ROCM), 1)
   KOKKOS_CXXFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --cxxflags)
   KOKKOS_LDFLAGS += $(shell $(ROCM_HCC_PATH)/bin/hcc-config --ldflags) -lhc_am -lm
+  KOKKOS_TPL_LIBRARY_NAMES += hc_am m
   KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_ROCM_ARCH_FLAG)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/ROCm/*.cpp)
@@ -880,13 +938,17 @@ KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/containers/src/impl/*.cpp)
 ifeq ($(KOKKOS_INTERNAL_USE_CUDA), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Cuda/*.hpp)
-  KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
-  KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
-  KOKKOS_LIBS += -lcudart -lcuda
-  ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
-    KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
+  ifneq ($(CUDA_PATH),)
+    KOKKOS_CPPFLAGS += -I$(CUDA_PATH)/include
+    KOKKOS_LDFLAGS += -L$(CUDA_PATH)/lib64
+    KOKKOS_TPL_INCLUDE_DIRS += $(CUDA_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(CUDA_PATH)/lib64
+    ifeq ($(KOKKOS_INTERNAL_COMPILER_CLANG), 1)
+      KOKKOS_CXXFLAGS += --cuda-path=$(CUDA_PATH)
+    endif
   endif
+  KOKKOS_LIBS += -lcudart -lcuda
+  KOKKOS_TPL_LIBRARY_NAMES += cudart cuda
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_OPENMPTARGET), 1)
@@ -911,20 +973,27 @@ ifeq ($(KOKKOS_INTERNAL_USE_OPENMP), 1)
   endif
   KOKKOS_LDFLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
+  KOKKOS_LINK_FLAGS += $(KOKKOS_INTERNAL_OPENMP_FLAG)
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_PTHREADS), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Threads/*.hpp)
   KOKKOS_LIBS += -lpthread
+  KOKKOS_TPL_LIBRARY_NAMES += pthread
 endif
 ifeq ($(KOKKOS_INTERNAL_USE_QTHREADS), 1)
   KOKKOS_SRC += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.cpp)
   KOKKOS_HEADERS += $(wildcard $(KOKKOS_PATH)/core/src/Qthreads/*.hpp)
-  KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
-  KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
+  ifneq ($(QTHREADS_PATH),)
+    KOKKOS_CPPFLAGS += -I$(QTHREADS_PATH)/include
+    KOKKOS_LDFLAGS += -L$(QTHREADS_PATH)/lib
+    KOKKOS_TPL_INCLUDE_DIRS += $(QTHREADS_PATH)/include
+    KOKKOS_TPL_LIBRARY_DIRS += $(QTHREADS_PATH)/lib64
+  endif
   KOKKOS_LIBS += -lqthread
+  KOKKOS_TPL_LIBRARY_NAMES += qthread
 endif
 # Explicitly set the GCC Toolchain for Clang.
@@ -940,11 +1009,6 @@ ifneq ($(KOKKOS_INTERNAL_USE_MEMKIND), 1)
   KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_HBWSpace.cpp,$(KOKKOS_SRC))
 endif
-# Don't include Kokkos_Profiling_Interface.cpp if not using profiling to avoid a link warning.
-ifeq ($(KOKKOS_INTERNAL_DISABLE_PROFILING), 1)
-  KOKKOS_SRC := $(filter-out $(KOKKOS_PATH)/core/src/impl/Kokkos_Profiling_Interface.cpp,$(KOKKOS_SRC))
-endif
 # Don't include Kokkos_Serial.cpp or Kokkos_Serial_Task.cpp if not using Serial
 # device to avoid a link warning.
 ifneq ($(KOKKOS_INTERNAL_USE_SERIAL), 1)

@@ -1,87 +1,101 @@
-Kokkos implements a programming model in C++ for writing performance portable
+Kokkos Core implements a programming model in C++ for writing performance portable
 applications targeting all major HPC platforms. For that purpose it provides
 abstractions for both parallel execution of code and data management.
 Kokkos is designed to target complex node architectures with N-level memory
 hierarchies and multiple types of execution resources. It currently can use
 OpenMP, Pthreads and CUDA as backend programming models.
-Kokkos is licensed under standard 3-clause BSD terms of use. For specifics
-see the LICENSE file contained in the repository or distribution.
-The core developers of Kokkos are Carter Edwards and Christian Trott
-at the Computer Science Research Institute of the Sandia National
-Laboratories.
-The KokkosP interface and associated tools are developed by the Application
-Performance Team and Kokkos core developers at Sandia National Laboratories.
-To learn more about Kokkos consider watching one of our presentations:
-GTC 2015:
-http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
-http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
-A programming guide can be found under doc/Kokkos_PG.pdf. This is an initial version
-and feedback is greatly appreciated.
+Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem,
+which also provides math kernels (https://github.com/kokkos/kokkos-kernels), as well as
+profiling and debugging tools (https://github.com/kokkos/kokkos-tools).
+# Learning about Kokkos
+A programming guide can be found on the Wiki, the API reference is under development.
+For questions find us on Slack: https://kokkosteam.slack.com or open a github issue.
+For non-public questions send an email to
+crtrott(at)sandia.gov
 A separate repository with extensive tutorial material can be found under
 https://github.com/kokkos/kokkos-tutorials.
-If you have a patch to contribute please feel free to issue a pull request against
-the develop branch. For major contributions it is better to contact us first
-for guidance.
-For questions please send an email to
-kokkos-users@software.sandia.gov
-For non-public questions send an email to
-hcedwar(at)sandia.gov and crtrott(at)sandia.gov
-============================================================================
-====Requirements============================================================
-============================================================================
-Primary tested compilers on X86 are:
-GCC 4.8.4
-GCC 4.9.3
-GCC 5.1.0
-GCC 5.3.0
-GCC 6.1.0
-Intel 15.0.2
-Intel 16.0.1
-Intel 17.1.043
-Intel 17.4.196
-Intel 18.0.128
-Clang 3.5.2
-Clang 3.6.1
-Clang 3.7.1
-Clang 3.8.1
-Clang 3.9.0
-Clang 4.0.0
-Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
-PGI 17.10
-NVCC 7.0 for CUDA (with gcc 4.8.4)
-NVCC 7.5 for CUDA (with gcc 4.8.4)
-NVCC 8.0.44 for CUDA (with gcc 5.3.0)
-Primary tested compilers on Power 8 are:
-GCC 5.4.0 (OpenMP,Serial)
-IBM XL 13.1.5 (OpenMP, Serial) (There is a workaround in place to avoid a compiler bug)
-NVCC 8.0.44 for CUDA (with gcc 5.4.0)
-NVCC 9.0.103 for CUDA (with gcc 6.3.0)
-Primary tested compilers on Intel KNL are:
-GCC 6.2.0
-Intel 16.4.258 (with gcc 4.7.2)
-Intel 17.2.174 (with gcc 4.9.3)
-Intel 18.0.128 (with gcc 4.9.3)
-Other compilers working:
-X86:
-Cygwin 2.1.0 64bit with gcc 4.9.3
-Known non-working combinations:
-Power8:
-Pthreads backend
+Furthermore, the 'example/tutorial' directory provides step by step tutorial
+examples which explain many of the features of Kokkos. They work with
+simple Makefiles. To build with g++ and OpenMP simply type 'make'
+in the 'example/tutorial' directory. This will build all examples in the
+subfolders. To change the build options refer to the Programming Guide
+in the compilation section.
+To learn more about Kokkos consider watching one of our presentations:
+* GTC 2015:
+  - http://on-demand.gputechconf.com/gtc/2015/video/S5166.html
+  - http://on-demand.gputechconf.com/gtc/2015/presentation/S5166-H-Carter-Edwards.pdf
+# Contributing to Kokkos
+We are open and try to encourage contributions from external developers.
+To do so please first open an issue describing the contribution and then issue
+a pull request against the develop branch. For larger features it may be good
+to get guidance from the core development team first through the github issue.
+Note that Kokkos Core is licensed under standard 3-clause BSD terms of use.
+Which means contributing to Kokkos allows anyone else to use your contributions
+not just for public purposes but also for closed source commercial projects.
+For specifics see the LICENSE file contained in the repository or distribution.
+# Requirements
+### Primary tested compilers on X86 are:
+* GCC 4.8.4
+* GCC 4.9.3
+* GCC 5.1.0
+* GCC 5.3.0
+* GCC 6.1.0
+* Intel 15.0.2
+* Intel 16.0.1
+* Intel 17.1.043
+* Intel 17.4.196
+* Intel 18.0.128
+* Clang 3.6.1
+* Clang 3.7.1
+* Clang 3.8.1
+* Clang 3.9.0
+* Clang 4.0.0
+* Clang 4.0.0 for CUDA (CUDA Toolkit 8.0.44)
+* Clang 6.0.0 for CUDA (CUDA Toolkit 9.1)
+* PGI 17.10
+* NVCC 7.0 for CUDA (with gcc 4.8.4)
+* NVCC 7.5 for CUDA (with gcc 4.8.4)
+* NVCC 8.0.44 for CUDA (with gcc 5.3.0)
+* NVCC 9.1 for CUDA (with gcc 6.1.0)
+### Primary tested compilers on Power 8 are:
+* GCC 5.4.0 (OpenMP,Serial)
+* IBM XL 13.1.6 (OpenMP, Serial)
+* NVCC 8.0.44 for CUDA (with gcc 5.4.0)
+* NVCC 9.0.103 for CUDA (with gcc 6.3.0 and XL 13.1.6)
+### Primary tested compilers on Intel KNL are:
+* GCC 6.2.0
+* Intel 16.4.258 (with gcc 4.7.2)
+* Intel 17.2.174 (with gcc 4.9.3)
+* Intel 18.0.128 (with gcc 4.9.3)
+### Primary tested compilers on ARM
+* GCC 6.1.0
+### Other compilers working:
+* X86:
+  - Cygwin 2.1.0 64bit with gcc 4.9.3
+### Known non-working combinations:
+* Power8:
+  - Pthreads backend
+* ARM
+  - Pthreads backend
 Primary tested compiler are passing in release mode
@@ -97,20 +111,7 @@ NVCC: -Wall -Wshadow -pedantic -Werror -Wsign-compare -Wtype-limits -Wuninitiali
 Other compilers are tested occasionally, in particular when pushing from develop to
 master branch, without -Werror and only for a select set of backends.
-============================================================================
-====Getting started=========================================================
-============================================================================
-In the 'example/tutorial' directory you will find step by step tutorial
-examples which explain many of the features of Kokkos. They work with
-simple Makefiles. To build with g++ and OpenMP simply type 'make'
-in the 'example/tutorial' directory. This will build all examples in the
-subfolders. To change the build options refer to the Programming Guide
-in the compilation section.
-============================================================================
-====Running Unit Tests======================================================
-============================================================================
+# Running Unit Tests
 To run the unit tests create a build directory and run the following commands
@@ -121,30 +122,35 @@ make test
 Run KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
 changing the device type for which to build.
-============================================================================
-====Install the library=====================================================
-============================================================================
+# Installing the library
 To install Kokkos as a library create a build directory and run the following
 KOKKOS_PATH/generate_makefile.bash --prefix=INSTALL_PATH
-make lib
+make kokkoslib
 make install
 KOKKOS_PATH/generate_makefile.bash --help for more detailed options such as
 changing the device type for which to build.
-============================================================================
-====CMakeFiles==============================================================
-============================================================================
+Note that in many cases it is preferable to build Kokkos inline with an
+application. The main reason is that you may otherwise need many different
+configurations of Kokkos installed depending on the required compile time
+features an application needs. For example there is only one default
+execution space, which means you need different installations to have OpenMP
+or Pthreads as the default space. Also for the CUDA backend there are certain
+choices, such as allowing relocatable device code, which must be made at
+installation time. Building Kokkos inline uses largely the same process
+as compiling an application against an installed Kokkos library. See for
+example benchmarks/bytes_and_flops/Makefile which can be used with an installed
+library and for an inline build.
-The CMake files contained in this repository require Tribits and are used
-for integration with Trilinos. They do not currently support a standalone
-CMake build.
+### CMake
-===========================================================================
-====Kokkos and CUDA UVM====================================================
-===========================================================================
+Kokkos supports being build as part of a CMake applications. An example can
+be found in example/cmake_build.
+# Kokkos and CUDA UVM
 Kokkos does support UVM as a specific memory space called CudaUVMSpace.
 Allocations made with that space are accessible from host and device.
@@ -154,25 +160,16 @@ In either case UVM comes with a number of restrictions:
 running. This will lead to segfaults. To avoid that you either need to
 call Kokkos::Cuda::fence() (or just Kokkos::fence()), after kernels, or
 you can set the environment variable CUDA_LAUNCH_BLOCKING=1.
-Furthermore in multi socket multi GPU machines, UVM defaults to using
-zero copy allocations for technical reasons related to using multiple
+Furthermore in multi socket multi GPU machines without NVLINK, UVM defaults
+to using zero copy allocations for technical reasons related to using multiple
 GPUs from the same process. If an executable doesn't do that (e.g. each
 MPI rank of an application uses a single GPU [can be the same GPU for
 multiple MPI ranks]) you can set CUDA_MANAGED_FORCE_DEVICE_ALLOC=1.
 This will enforce proper UVM allocations, but can lead to errors if
 more than a single GPU is used by a single process.
-===========================================================================
-====Contributing===========================================================
-===========================================================================
-Contributions to Kokkos are welcome. In order to do so, please open an issue
-where a feature request or bug can be discussed. Then issue a pull request
-with your contribution. Pull requests must be issued against the develop branch.
-===========================================================================
-====Citing Kokkos==========================================================
-===========================================================================
+# Citing Kokkos
 If you publish work which mentions Kokkos, please cite the following paper:

View File

@@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@@ -1530,7 +1530,7 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,1,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0()))
+if(idx<static_cast<IndexType>(a.extent(0)))
 a(idx) = Rand::draw(gen,range);
 }
 rand_pool.free_state(gen);
@@ -1555,8 +1555,8 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,2,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
 a(idx,k) = Rand::draw(gen,range);
 }
 }
@@ -1583,9 +1583,9 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,3,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
 a(idx,k,l) = Rand::draw(gen,range);
 }
 }
@@ -1611,10 +1611,10 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,4, IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
 a(idx,k,l,m) = Rand::draw(gen,range);
 }
 }
@@ -1640,11 +1640,11 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,5,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
 a(idx,k,l,m,n) = Rand::draw(gen,range);
 }
 }
@@ -1670,12 +1670,12 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,6,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
 a(idx,k,l,m,n,o) = Rand::draw(gen,range);
 }
 }
@@ -1701,13 +1701,13 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,7,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
-for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
+for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
 a(idx,k,l,m,n,o,p) = Rand::draw(gen,range);
 }
 }
@@ -1733,14 +1733,14 @@ struct fill_random_functor_range<ViewType,RandomPool,loops,8,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
-for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
-for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
+for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
+for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
 a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,range);
 }
 }
@@ -1765,7 +1765,7 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,1,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0()))
+if(idx<static_cast<IndexType>(a.extent(0)))
 a(idx) = Rand::draw(gen,begin,end);
 }
 rand_pool.free_state(gen);
@@ -1790,8 +1790,8 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,2,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
 a(idx,k) = Rand::draw(gen,begin,end);
 }
 }
@@ -1818,9 +1818,9 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,3,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
 a(idx,k,l) = Rand::draw(gen,begin,end);
 }
 }
@@ -1846,10 +1846,10 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,4,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
 a(idx,k,l,m) = Rand::draw(gen,begin,end);
 }
 }
@@ -1875,11 +1875,11 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,5,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())){
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_1());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_2());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_3());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_4());o++)
+if(idx<static_cast<IndexType>(a.extent(0))){
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(1));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(2));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(3));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(4));o++)
 a(idx,l,m,n,o) = Rand::draw(gen,begin,end);
 }
 }
@@ -1905,12 +1905,12 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,6,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
 a(idx,k,l,m,n,o) = Rand::draw(gen,begin,end);
 }
 }
@@ -1937,13 +1937,13 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,7,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
-for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
+for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
 a(idx,k,l,m,n,o,p) = Rand::draw(gen,begin,end);
 }
 }
@@ -1969,14 +1969,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
 typename RandomPool::generator_type gen = rand_pool.get_state();
 for(IndexType j=0;j<loops;j++) {
 const IndexType idx = i*loops+j;
-if(idx<static_cast<IndexType>(a.dimension_0())) {
-for(IndexType k=0;k<static_cast<IndexType>(a.dimension_1());k++)
-for(IndexType l=0;l<static_cast<IndexType>(a.dimension_2());l++)
-for(IndexType m=0;m<static_cast<IndexType>(a.dimension_3());m++)
-for(IndexType n=0;n<static_cast<IndexType>(a.dimension_4());n++)
-for(IndexType o=0;o<static_cast<IndexType>(a.dimension_5());o++)
-for(IndexType p=0;p<static_cast<IndexType>(a.dimension_6());p++)
-for(IndexType q=0;q<static_cast<IndexType>(a.dimension_7());q++)
+if(idx<static_cast<IndexType>(a.extent(0))) {
+for(IndexType k=0;k<static_cast<IndexType>(a.extent(1));k++)
+for(IndexType l=0;l<static_cast<IndexType>(a.extent(2));l++)
+for(IndexType m=0;m<static_cast<IndexType>(a.extent(3));m++)
+for(IndexType n=0;n<static_cast<IndexType>(a.extent(4));n++)
+for(IndexType o=0;o<static_cast<IndexType>(a.extent(5));o++)
+for(IndexType p=0;p<static_cast<IndexType>(a.extent(6));p++)
+for(IndexType q=0;q<static_cast<IndexType>(a.extent(7));q++)
 a(idx,k,l,m,n,o,p,q) = Rand::draw(gen,begin,end);
 }
 }
@@ -1988,14 +1988,14 @@ struct fill_random_functor_begin_end<ViewType,RandomPool,loops,8,IndexType>{
 template<class ViewType, class RandomPool, class IndexType = int64_t>
 void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type range) {
-int64_t LDA = a.dimension_0();
+int64_t LDA = a.extent(0);
 if(LDA>0)
 parallel_for((LDA+127)/128,Impl::fill_random_functor_range<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,range));
 }
 template<class ViewType, class RandomPool, class IndexType = int64_t>
 void fill_random(ViewType a, RandomPool g, typename ViewType::const_value_type begin,typename ViewType::const_value_type end ) {
-int64_t LDA = a.dimension_0();
+int64_t LDA = a.extent(0);
 if(LDA>0)
 parallel_for((LDA+127)/128,Impl::fill_random_functor_begin_end<ViewType,RandomPool,128,ViewType::Rank,IndexType>(a,g,begin,end));
 }
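
The `fill_random` overloads in this file launch `(LDA+127)/128` work items, and every functor specialization walks `loops` consecutive first-dimension indices per work item, skipping any index at or past the first extent. The following is a minimal serial sketch of that indexing pattern only. The names `num_chunks` and `for_each_chunked` are hypothetical and not part of Kokkos, and the outer loop merely stands in for the parallel dispatch.

```cpp
#include <cassert>
#include <cstdint>

// Number of work items needed to cover n indices in chunks of `chunk`
// (integer ceiling division, the same form as (LDA+127)/128).
constexpr std::int64_t num_chunks(std::int64_t n, std::int64_t chunk) {
  return (n + chunk - 1) / chunk;
}

// Hypothetical serial stand-in for the parallel dispatch: the outer loop
// plays the role of the work-item index i, the inner loop is the per-item
// `loops` iteration, and the guard skips indices past the end.
template <class Body>
void for_each_chunked(std::int64_t n, std::int64_t chunk, Body body) {
  assert(chunk > 0);
  const std::int64_t items = num_chunks(n, chunk);
  for (std::int64_t i = 0; i < items; ++i) {
    for (std::int64_t j = 0; j < chunk; ++j) {
      const std::int64_t idx = i * chunk + j;
      if (idx < n) body(idx);  // the last chunk may be only partly filled
    }
  }
}
```

With `n = 300` and `chunk = 128` this yields three work items, and the guard keeps the third one from touching indices 300 through 383.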

View File

@@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@@ -120,7 +120,6 @@ public:
 KOKKOS_INLINE_FUNCTION
 void operator() (const int& i) const {
-// printf("copy: dst(%i) src(%i)\n",i+dst_offset,i);
 copy_op::copy(dst_values,i+dst_offset,src_values,i);
 }
 };
@@ -151,20 +150,22 @@ public:
 DstViewType dst_values ;
 perm_view_type sort_order ;
 src_view_type src_values ;
+int src_offset ;
 copy_permute_functor( DstViewType const & dst_values_
 , PermuteViewType const & sort_order_
 , SrcViewType const & src_values_
+, int const & src_offset_
 )
 : dst_values( dst_values_ )
 , sort_order( sort_order_ )
 , src_values( src_values_ )
+, src_offset( src_offset_ )
 {}
 KOKKOS_INLINE_FUNCTION
 void operator() (const int& i) const {
-// printf("copy_permute: dst(%i) src(%i)\n",i,sort_order(i));
-copy_op::copy(dst_values,i,src_values,sort_order(i));
+copy_op::copy(dst_values,i,src_values,src_offset+sort_order(i));
 }
 };
@@ -259,19 +260,21 @@ public:
 // Create the permutation vector, the bin_offset array and the bin_count array. Can be called again if keys changed
 void create_permute_vector() {
 const size_t len = range_end - range_begin ;
-Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
-Kokkos::parallel_scan(Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
+Kokkos::parallel_for ("Kokkos::Sort::BinCount",Kokkos::RangePolicy<execution_space,bin_count_tag> (0,len),*this);
+Kokkos::parallel_scan("Kokkos::Sort::BinOffset",Kokkos::RangePolicy<execution_space,bin_offset_tag> (0,bin_op.max_bins()) ,*this);
 Kokkos::deep_copy(bin_count_atomic,0);
-Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
+Kokkos::parallel_for ("Kokkos::Sort::BinBinning",Kokkos::RangePolicy<execution_space,bin_binning_tag> (0,len),*this);
 if(sort_within_bins)
-Kokkos::parallel_for (Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
+Kokkos::parallel_for ("Kokkos::Sort::BinSort",Kokkos::RangePolicy<execution_space,bin_sort_bins_tag>(0,bin_op.max_bins()) ,*this);
 }
-// Sort a view with respect ot the first dimension using the permutation array
+// Sort a subset of a view with respect to the first dimension using the permutation array
 template<class ValuesViewType>
-void sort( ValuesViewType const & values)
+void sort( ValuesViewType const & values
+, int values_range_begin
+, int values_range_end) const
 {
 typedef
 Kokkos::View< typename ValuesViewType::data_type,
@@ -280,6 +283,10 @@ public:
 scratch_view_type ;
 const size_t len = range_end - range_begin ;
+const size_t values_len = values_range_end - values_range_begin ;
+if (len != values_len) {
+Kokkos::abort("BinSort::sort: values range length != permutation vector length");
+}
 scratch_view_type
 sorted_values("Scratch",
@@ -297,19 +304,25 @@ public:
 , offset_type /* PermuteViewType */
 , ValuesViewType /* SrcViewType */
 >
-functor( sorted_values , sort_order , values );
-parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
+functor( sorted_values , sort_order , values, values_range_begin - range_begin );
+parallel_for("Kokkos::Sort::CopyPermute", Kokkos::RangePolicy<execution_space>(0,len),functor);
 }
 {
 copy_functor< ValuesViewType , scratch_view_type >
 functor( values , range_begin , sorted_values );
-parallel_for( Kokkos::RangePolicy<execution_space>(0,len),functor);
+parallel_for("Kokkos::Sort::Copy", Kokkos::RangePolicy<execution_space>(0,len),functor);
 }
 }
+template<class ValuesViewType>
+void sort( ValuesViewType const & values ) const
+{
+this->sort( values, 0, /*values.extent(0)*/ range_end - range_begin );
+}
 // Get the permutation vector
 KOKKOS_INLINE_FUNCTION
 offset_type get_permute_vector() const { return sort_order;}
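
The change to `copy_permute_functor` adds a source offset to each permutation entry before the gather, and the result is then copied back from a scratch buffer. The sketch below shows that gather-then-copy-back idea on a plain `std::vector`. It is a serial illustration under stated assumptions, not the Kokkos implementation: `permute_subrange`, `dst_begin`, and the use of `std::vector` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical serial analogue of a gather with a source offset followed by
// a copy back into the original container. perm[i] names the element
// (relative to src_offset) that should end up at position dst_begin + i.
template <class T>
void permute_subrange(std::vector<T>& values,
                      const std::vector<std::size_t>& perm,
                      std::ptrdiff_t src_offset,
                      std::size_t dst_begin) {
  const std::size_t len = perm.size();
  assert(dst_begin + len <= values.size());

  // Gather into scratch first so reads never see already-overwritten slots.
  std::vector<T> scratch(len);
  for (std::size_t i = 0; i < len; ++i) {
    const std::ptrdiff_t src =
        src_offset + static_cast<std::ptrdiff_t>(perm[i]);
    assert(src >= 0 && static_cast<std::size_t>(src) < values.size());
    scratch[i] = values[static_cast<std::size_t>(src)];
  }

  // Copy the reordered elements back over the destination subrange.
  for (std::size_t i = 0; i < len; ++i) values[dst_begin + i] = scratch[i];
}
```

For example, applying the permutation `{1, 2, 0}` with `src_offset = 1` and `dst_begin = 1` to `{9, 30, 10, 20, 7}` reorders only the middle three elements and leaves the first and last untouched.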
@ -327,7 +340,7 @@ public:
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator() (const bin_count_tag& tag, const int& i) const { void operator() (const bin_count_tag& tag, const int& i) const {
const int j = range_begin + i ; const int j = range_begin + i ;
bin_count_atomic(bin_op.bin(keys,j))++; bin_count_atomic(bin_op.bin(keys, j))++;
} }
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
@ -512,7 +525,7 @@ void sort( ViewType const & view , bool const always_use_kokkos_sort = false)
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result; Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result); Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
parallel_reduce(Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)), parallel_reduce("Kokkos::Sort::FindExtent",Kokkos::RangePolicy<typename ViewType::execution_space>(0,view.extent(0)),
Impl::min_max_functor<ViewType>(view),reducer); Impl::min_max_functor<ViewType>(view),reducer);
if(result.min_val == result.max_val) return; if(result.min_val == result.max_val) return;
BinSort<ViewType, CompType> bin_sort(view,CompType(view.extent(0)/2,result.min_val,result.max_val),true); BinSort<ViewType, CompType> bin_sort(view,CompType(view.extent(0)/2,result.min_val,result.max_val),true);
@ -532,7 +545,7 @@ void sort( ViewType view
Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result; Kokkos::Experimental::MinMaxScalar<typename ViewType::non_const_value_type> result;
Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result); Kokkos::Experimental::MinMax<typename ViewType::non_const_value_type> reducer(result);
parallel_reduce( range_policy( begin , end ) parallel_reduce("Kokkos::Sort::FindExtent", range_policy( begin , end )
, Impl::min_max_functor<ViewType>(view),reducer ); , Impl::min_max_functor<ViewType>(view),reducer );
if(result.min_val == result.max_val) return; if(result.min_val == result.max_val) return;
@ -541,8 +554,9 @@ void sort( ViewType view
bin_sort(view,begin,end,CompType((end-begin)/2,result.min_val,result.max_val),true); bin_sort(view,begin,end,CompType((end-begin)/2,result.min_val,result.max_val),true);
bin_sort.create_permute_vector(); bin_sort.create_permute_vector();
bin_sort.sort(view); bin_sort.sort(view,begin,end);
} }
} }
#endif #endif
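
Review note: both `sort` overloads touched above follow the same control flow. They reduce over the keys to find the minimum and maximum, return early when the two are equal, and otherwise bin the keys into roughly half as many bins as there are elements. The ranged overload now also passes `begin` and `end` to `bin_sort.sort`. A minimal standard-library sketch of that flow follows. It is not the Kokkos implementation: `bin_sort_range` is a hypothetical name, the binning formula is an assumption, and the keys are assumed to be finite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of a range-limited bin sort on values[begin, end).
// Hypothetical helper; elements outside the range are left untouched.
inline void bin_sort_range(std::vector<double>& values,
                           std::size_t begin, std::size_t end) {
  assert(begin <= end && end <= values.size());
  if (end - begin < 2) return;
  const auto first = values.begin() + begin;
  const auto last = values.begin() + end;

  // Step 1: find the extent of the keys (the role of the min/max reduction).
  const auto mm = std::minmax_element(first, last);
  const double lo = *mm.first;
  const double hi = *mm.second;
  if (lo == hi) return;  // all keys equal, nothing to reorder

  // Step 2: count keys per bin, using roughly n/2 bins.
  const std::size_t n = end - begin;
  const std::size_t nbins = std::max<std::size_t>(n / 2, 1);
  auto bin_of = [&](double v) {
    const std::size_t b =
        static_cast<std::size_t>((v - lo) / (hi - lo) * nbins);
    return std::min(b, nbins - 1);  // the maximum key lands in the last bin
  };
  std::vector<std::size_t> offset(nbins + 1, 0);
  for (auto it = first; it != last; ++it) ++offset[bin_of(*it) + 1];

  // Step 3: a prefix sum turns the counts into bin start offsets.
  for (std::size_t b = 0; b < nbins; ++b) offset[b + 1] += offset[b];

  // Step 4: scatter the keys into their bins, then order each bin locally.
  std::vector<double> sorted(n);
  std::vector<std::size_t> cursor(offset.begin(), offset.end() - 1);
  for (auto it = first; it != last; ++it) sorted[cursor[bin_of(*it)]++] = *it;
  for (std::size_t b = 0; b < nbins; ++b)
    std::sort(sorted.begin() + offset[b], sorted.begin() + offset[b + 1]);

  std::copy(sorted.begin(), sorted.end(), first);
}
```

Because the bin index never decreases as the key grows, concatenating the locally sorted bins yields a sorted range.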

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -61,14 +61,9 @@ class cuda : public ::testing::Test {
protected: protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Cuda::initialize( Kokkos::Cuda::SelectDevice(0) );
} }
static void TearDownTestCase() static void TearDownTestCase()
{ {
Kokkos::Cuda::finalize();
Kokkos::HostSpace::execution_space::finalize();
} }
}; };

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -60,25 +60,10 @@ protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision(5) << std::scientific; std::cout << std::setprecision(5) << std::scientific;
int threads_count = 0;
#pragma omp parallel
{
#pragma omp atomic
++threads_count;
}
if (threads_count > 3) {
threads_count /= 2;
}
Kokkos::OpenMP::initialize( threads_count );
Kokkos::OpenMP::print_configuration( std::cout );
} }
static void TearDownTestCase() static void TearDownTestCase()
{ {
Kokkos::OpenMP::finalize();
} }
}; };

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -62,13 +62,9 @@ protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision(5) << std::scientific; std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Experimental::ROCm::initialize( Kokkos::Experimental::ROCm::SelectDevice(0) );
} }
static void TearDownTestCase() static void TearDownTestCase()
{ {
Kokkos::Experimental::ROCm::finalize();
Kokkos::HostSpace::execution_space::finalize();
} }
}; };

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -62,13 +62,10 @@ class serial : public ::testing::Test {
protected: protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision (5) << std::scientific;
Kokkos::Serial::initialize ();
} }
static void TearDownTestCase () static void TearDownTestCase ()
{ {
Kokkos::Serial::finalize ();
} }
}; };

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -171,10 +171,10 @@ void test_3D_sort(unsigned int n) {
double sum_after = 0.0; double sum_after = 0.0;
unsigned int sort_fails = 0; unsigned int sort_fails = 0;
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_before); Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_before);
int bin_1d = 1; int bin_1d = 1;
while( bin_1d*bin_1d*bin_1d*4< (int) keys.dimension_0() ) bin_1d*=2; while( bin_1d*bin_1d*bin_1d*4< (int) keys.extent(0) ) bin_1d*=2;
int bin_max[3] = {bin_1d,bin_1d,bin_1d}; int bin_max[3] = {bin_1d,bin_1d,bin_1d};
typename KeyViewType::value_type min[3] = {0,0,0}; typename KeyViewType::value_type min[3] = {0,0,0};
typename KeyViewType::value_type max[3] = {100,100,100}; typename KeyViewType::value_type max[3] = {100,100,100};
@ -186,8 +186,8 @@ void test_3D_sort(unsigned int n) {
Sorter.create_permute_vector(); Sorter.create_permute_vector();
Sorter.template sort< KeyViewType >(keys); Sorter.template sort< KeyViewType >(keys);
Kokkos::parallel_reduce(keys.dimension_0(),sum3D<ExecutionSpace, KeyType>(keys),sum_after); Kokkos::parallel_reduce(keys.extent(0),sum3D<ExecutionSpace, KeyType>(keys),sum_after);
Kokkos::parallel_reduce(keys.dimension_0()-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails); Kokkos::parallel_reduce(keys.extent(0)-1,bin3d_is_sorted_struct<ExecutionSpace, KeyType>(keys,bin_1d,min[0],max[0]),sort_fails);
double ratio = sum_before/sum_after; double ratio = sum_before/sum_after;
double epsilon = 1e-10; double epsilon = 1e-10;
@ -205,24 +205,13 @@ void test_3D_sort(unsigned int n) {
template<class ExecutionSpace, typename KeyType> template<class ExecutionSpace, typename KeyType>
void test_dynamic_view_sort(unsigned int n ) void test_dynamic_view_sort(unsigned int n )
{ {
typedef typename ExecutionSpace::memory_space memory_space ;
typedef Kokkos::Experimental::DynamicView<KeyType*,ExecutionSpace> KeyDynamicViewType; typedef Kokkos::Experimental::DynamicView<KeyType*,ExecutionSpace> KeyDynamicViewType;
typedef Kokkos::View<KeyType*,ExecutionSpace> KeyViewType; typedef Kokkos::View<KeyType*,ExecutionSpace> KeyViewType;
const size_t upper_bound = 2 * n ; const size_t upper_bound = 2 * n ;
const size_t min_chunk_size = 1024;
const size_t total_alloc_size = n * sizeof(KeyType) * 1.2 ; KeyDynamicViewType keys("Keys", min_chunk_size, upper_bound);
const size_t superblock_size = std::min(total_alloc_size, size_t(1000000));
typename KeyDynamicViewType::memory_pool
pool( memory_space()
, n * sizeof(KeyType) * 1.2
, 500 /* min block size in bytes */
, 30000 /* max block size in bytes */
, superblock_size
);
KeyDynamicViewType keys("Keys",pool,upper_bound);
keys.resize_serial(n); keys.resize_serial(n);
@ -230,13 +219,15 @@ void test_dynamic_view_sort(unsigned int n )
// Test sorting array with all numbers equal // Test sorting array with all numbers equal
Kokkos::deep_copy(keys_view,KeyType(1)); Kokkos::deep_copy(keys_view,KeyType(1));
Kokkos::Experimental::deep_copy(keys,keys_view); Kokkos::deep_copy(keys,keys_view);
Kokkos::sort(keys, 0 /* begin */ , n /* end */ ); Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931); Kokkos::Random_XorShift64_Pool<ExecutionSpace> g(1931);
Kokkos::fill_random(keys_view,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND); Kokkos::fill_random(keys_view,g,Kokkos::Random_XorShift64_Pool<ExecutionSpace>::generator_type::MAX_URAND);
Kokkos::Experimental::deep_copy(keys,keys_view); ExecutionSpace::fence();
Kokkos::deep_copy(keys,keys_view);
//ExecutionSpace::fence();
double sum_before = 0.0; double sum_before = 0.0;
double sum_after = 0.0; double sum_after = 0.0;
@ -246,7 +237,9 @@ void test_dynamic_view_sort(unsigned int n )
Kokkos::sort(keys, 0 /* begin */ , n /* end */ ); Kokkos::sort(keys, 0 /* begin */ , n /* end */ );
Kokkos::Experimental::deep_copy( keys_view , keys ); ExecutionSpace::fence(); // Need this fence to prevent BusError with Cuda
Kokkos::deep_copy( keys_view , keys );
//ExecutionSpace::fence();
Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_after); Kokkos::parallel_reduce(n,sum<ExecutionSpace, KeyType>(keys_view),sum_after);
Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys_view),sort_fails); Kokkos::parallel_reduce(n-1,is_sorted_struct<ExecutionSpace, KeyType>(keys_view),sort_fails);
@ -269,6 +262,74 @@ void test_dynamic_view_sort(unsigned int n )
//---------------------------------------------------------------------------- //----------------------------------------------------------------------------
template<class ExecutionSpace>
void test_issue_1160()
{
Kokkos::View<int*, ExecutionSpace> element_("element", 10);
Kokkos::View<double*, ExecutionSpace> x_("x", 10);
Kokkos::View<double*, ExecutionSpace> v_("y", 10);
auto h_element = Kokkos::create_mirror_view(element_);
auto h_x = Kokkos::create_mirror_view(x_);
auto h_v = Kokkos::create_mirror_view(v_);
h_element(0) = 9;
h_element(1) = 8;
h_element(2) = 7;
h_element(3) = 6;
h_element(4) = 5;
h_element(5) = 4;
h_element(6) = 3;
h_element(7) = 2;
h_element(8) = 1;
h_element(9) = 0;
for (int i = 0; i < 10; ++i) {
h_v.access(i, 0) = h_x.access(i, 0) = double(h_element(i));
}
Kokkos::deep_copy(element_, h_element);
Kokkos::deep_copy(x_, h_x);
Kokkos::deep_copy(v_, h_v);
typedef decltype(element_) KeyViewType;
typedef Kokkos::BinOp1D< KeyViewType > BinOp;
int begin = 3;
int end = 8;
auto max = h_element(begin);
auto min = h_element(end - 1);
BinOp binner(end - begin, min, max);
Kokkos::BinSort<KeyViewType , BinOp > Sorter(element_,begin,end,binner,false);
Sorter.create_permute_vector();
Sorter.sort(element_,begin,end);
Sorter.sort(x_,begin,end);
Sorter.sort(v_,begin,end);
Kokkos::deep_copy(h_element, element_);
Kokkos::deep_copy(h_x, x_);
Kokkos::deep_copy(h_v, v_);
ASSERT_EQ(h_element(0), 9);
ASSERT_EQ(h_element(1), 8);
ASSERT_EQ(h_element(2), 7);
ASSERT_EQ(h_element(3), 2);
ASSERT_EQ(h_element(4), 3);
ASSERT_EQ(h_element(5), 4);
ASSERT_EQ(h_element(6), 5);
ASSERT_EQ(h_element(7), 6);
ASSERT_EQ(h_element(8), 1);
ASSERT_EQ(h_element(9), 0);
for (int i = 0; i < 10; ++i) {
ASSERT_EQ(h_element(i), int(h_x.access(i, 0)));
ASSERT_EQ(h_element(i), int(h_v.access(i, 0)));
}
}
//----------------------------------------------------------------------------
template<class ExecutionSpace, typename KeyType> template<class ExecutionSpace, typename KeyType>
void test_sort(unsigned int N) void test_sort(unsigned int N)
{ {
@ -278,6 +339,7 @@ void test_sort(unsigned int N)
test_3D_sort<ExecutionSpace,KeyType>(N); test_3D_sort<ExecutionSpace,KeyType>(N);
test_dynamic_view_sort<ExecutionSpace,KeyType>(N*N); test_dynamic_view_sort<ExecutionSpace,KeyType>(N*N);
#endif #endif
test_issue_1160<ExecutionSpace>();
} }
} }
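
Review note: `test_issue_1160` builds one permutation from the keys in the sub-range [3, 8) and then applies it to the keys and to two companion arrays, leaving everything outside the range in place. A minimal standard-library sketch of that idea follows. It is not Kokkos code: `make_subrange_permutation` and `apply_subrange_permutation` are hypothetical names, and `std::stable_sort` stands in for the bin sort.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns perm such that keys[begin + perm[k]] is the
// k-th smallest key in [begin, end). std::stable_sort stands in for BinSort.
inline std::vector<std::size_t> make_subrange_permutation(
    const std::vector<int>& keys, std::size_t begin, std::size_t end) {
  assert(begin <= end && end <= keys.size());
  std::vector<std::size_t> perm(end - begin);
  std::iota(perm.begin(), perm.end(), std::size_t{0});
  std::stable_sort(perm.begin(), perm.end(),
                   [&](std::size_t a, std::size_t b) {
                     return keys[begin + a] < keys[begin + b];
                   });
  return perm;
}

// Hypothetical helper: reorders values[begin, begin + perm.size()) according
// to perm and leaves all other elements untouched.
template <class T>
void apply_subrange_permutation(std::vector<T>& values,
                                const std::vector<std::size_t>& perm,
                                std::size_t begin) {
  assert(begin + perm.size() <= values.size());
  std::vector<T> tmp(perm.size());
  for (std::size_t k = 0; k < perm.size(); ++k)
    tmp[k] = values[begin + perm[k]];
  std::copy(tmp.begin(), tmp.end(), values.begin() + begin);
}
```

The permutation is computed once, before any array is reordered, so the order in which the arrays are permuted does not matter.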

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -63,25 +63,10 @@ protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision(5) << std::scientific; std::cout << std::setprecision(5) << std::scientific;
unsigned num_threads = 4;
if (Kokkos::hwloc::available()) {
num_threads = Kokkos::hwloc::get_available_numa_count()
* Kokkos::hwloc::get_available_cores_per_numa()
// * Kokkos::hwloc::get_available_threads_per_core()
;
}
std::cout << "Threads: " << num_threads << std::endl;
Kokkos::Threads::initialize( num_threads );
} }
static void TearDownTestCase() static void TearDownTestCase()
{ {
Kokkos::Threads::finalize();
} }
}; };

@ -35,16 +35,20 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
*/ */
#include <gtest/gtest.h> #include <gtest/gtest.h>
#include <Kokkos_Core.hpp>
int main(int argc, char *argv[]) { int main(int argc, char *argv[]) {
Kokkos::initialize(argc,argv);
::testing::InitGoogleTest(&argc,argv); ::testing::InitGoogleTest(&argc,argv);
return RUN_ALL_TESTS(); int result = RUN_ALL_TESTS();
Kokkos::finalize();
return result;
} }
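
Review note: the new `main` stores the result of `RUN_ALL_TESTS()` instead of returning it directly, so that `Kokkos::finalize()` can still run before the process exits. This is also why the per-backend `SetUpTestCase`/`TearDownTestCase` bodies earlier in this diff no longer call initialize or finalize themselves. The ordering can be sketched without Kokkos or GoogleTest as below. `runtime_initialize`, `runtime_finalize`, `call_log`, and `run_with_runtime` are hypothetical stand-ins that only record the call sequence.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for a runtime with global setup and teardown.
// They only record the call order so the sequence can be inspected.
inline std::vector<std::string>& call_log() {
  static std::vector<std::string> log;
  return log;
}
inline void runtime_initialize() { call_log().push_back("initialize"); }
inline void runtime_finalize() { call_log().push_back("finalize"); }

// Runs `tests` between initialize and finalize and returns its status.
// The status is stored first so that finalize runs before the return.
inline int run_with_runtime(const std::function<int()>& tests) {
  runtime_initialize();
  const int result = tests();
  runtime_finalize();
  return result;
}
```

A caller would pass the test runner as the callable and return the stored status from `main`.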

@ -10,7 +10,7 @@ default: build
ifneq (,$(findstring Cuda,$(KOKKOS_DEVICES))) ifneq (,$(findstring Cuda,$(KOKKOS_DEVICES)))
CXX = ${KOKKOS_PATH}/config/nvcc_wrapper CXX = ${KOKKOS_PATH}/bin/nvcc_wrapper
EXE = ${EXE_NAME}.cuda EXE = ${EXE_NAME}.cuda
KOKKOS_CUDA_OPTIONS = "enable_lambda" KOKKOS_CUDA_OPTIONS = "enable_lambda"
else else

@ -3,7 +3,7 @@
# BytesAndFlops # BytesAndFlops
cd build/bytes_and_flops cd build/bytes_and_flops
USE_CUDA=`grep "_CUDA 1" KokkosCore_config.h | wc -l` USE_CUDA=`grep "_CUDA" KokkosCore_config.h | wc -l`
if [[ ${USE_CUDA} > 0 ]]; then if [[ ${USE_CUDA} > 0 ]]; then
BAF_EXE=bytes_and_flops.cuda BAF_EXE=bytes_and_flops.cuda

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

@ -2,7 +2,7 @@
# FindHWLOC # FindHWLOC
# ---------- # ----------
# #
# Try to find HWLOC. # Try to find HWLOC, based on KOKKOS_HWLOC_DIR
# #
# The following variables are defined: # The following variables are defined:
# #
@ -10,8 +10,8 @@
# HWLOC_INCLUDE_DIR - HWLOC include directory # HWLOC_INCLUDE_DIR - HWLOC include directory
# HWLOC_LIBRARIES - Libraries needed to use HWLOC # HWLOC_LIBRARIES - Libraries needed to use HWLOC
find_path(HWLOC_INCLUDE_DIR hwloc.h) find_path(HWLOC_INCLUDE_DIR hwloc.h PATHS "${KOKKOS_HWLOC_DIR}/include")
find_library(HWLOC_LIBRARIES hwloc) find_library(HWLOC_LIBRARIES hwloc PATHS "${KOKKOS_HWLOC_DIR}/lib")
include(FindPackageHandleStandardArgs) include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(HWLOC DEFAULT_MSG find_package_handle_standard_args(HWLOC DEFAULT_MSG

@ -1,7 +1,3 @@
# kokkos_generated_settings.cmake includes the kokkos library itself in KOKKOS_LIBS
# which we do not want to use for the cmake builds so clean this up
string(REGEX REPLACE "-lkokkos" "" KOKKOS_LIBS ${KOKKOS_LIBS})
############################ Detect if submodule ############################### ############################ Detect if submodule ###############################
# #
# With thanks to StackOverflow: # With thanks to StackOverflow:
@ -73,6 +69,19 @@ IF(KOKKOS_SEPARATE_LIBS)
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}> PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
) )
target_include_directories(
kokkoscore
PUBLIC
${KOKKOS_TPL_INCLUDE_DIRS}
)
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
target_link_libraries(kokkoscore PUBLIC ${LIB_${lib}})
endforeach()
target_link_libraries(kokkoscore PUBLIC "${KOKKOS_LINK_FLAGS}")
# Install the kokkoscore library # Install the kokkoscore library
INSTALL (TARGETS kokkoscore INSTALL (TARGETS kokkoscore
EXPORT KokkosTargets EXPORT KokkosTargets
@ -81,12 +90,6 @@ IF(KOKKOS_SEPARATE_LIBS)
RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin RUNTIME DESTINATION ${CMAKE_INSTALL_PREFIX}/bin
) )
TARGET_LINK_LIBRARIES(
kokkoscore
${KOKKOS_LD_FLAGS}
${KOKKOS_EXTRA_LIBS_LIST}
)
# kokkoscontainers # kokkoscontainers
if (DEFINED KOKKOS_CONTAINERS_SRCS) if (DEFINED KOKKOS_CONTAINERS_SRCS)
ADD_LIBRARY( ADD_LIBRARY(
@ -144,12 +147,19 @@ ELSE()
PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}> PUBLIC $<$<COMPILE_LANGUAGE:CXX>:${KOKKOS_CXX_FLAGS}>
) )
TARGET_LINK_LIBRARIES( target_include_directories(
kokkos kokkos
${KOKKOS_LD_FLAGS} PUBLIC
${KOKKOS_EXTRA_LIBS_LIST} ${KOKKOS_TPL_INCLUDE_DIRS}
) )
foreach(lib IN LISTS KOKKOS_TPL_LIBRARY_NAMES)
find_library(LIB_${lib} ${lib} PATHS ${KOKKOS_TPL_LIBRARY_DIRS})
target_link_libraries(kokkos PUBLIC ${LIB_${lib}})
endforeach()
target_link_libraries(kokkos PUBLIC "${KOKKOS_LINK_FLAGS}")
# Install the kokkos library # Install the kokkos library
INSTALL (TARGETS kokkos INSTALL (TARGETS kokkos
EXPORT KokkosTargets EXPORT KokkosTargets

@ -25,11 +25,12 @@ list(APPEND KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST
Cuda_LDG_Intrinsic Cuda_LDG_Intrinsic
Debug Debug
Debug_DualView_Modify_Check Debug_DualView_Modify_Check
Debug_Bounds_Checkt Debug_Bounds_Check
Compiler_Warnings Compiler_Warnings
Profiling Profiling
Profiling_Load_Print Profiling_Load_Print
Aggressive_Vectorization Aggressive_Vectorization
Deprecated_Code
) )
#------------------------------------------------------------------------------- #-------------------------------------------------------------------------------
@ -263,7 +264,8 @@ set(KOKKOS_ENABLE_PROFILING ${KOKKOS_INTERNAL_ENABLE_PROFILING_DEFAULT} CACHE BO
set_kokkos_default_default(PROFILING_LOAD_PRINT OFF) set_kokkos_default_default(PROFILING_LOAD_PRINT OFF)
set(KOKKOS_ENABLE_PROFILING_LOAD_PRINT ${KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT_DEFAULT} CACHE BOOL "Enable profile load print.") set(KOKKOS_ENABLE_PROFILING_LOAD_PRINT ${KOKKOS_INTERNAL_ENABLE_PROFILING_LOAD_PRINT_DEFAULT} CACHE BOOL "Enable profile load print.")
set_kokkos_default_default(DEPRECATED_CODE ON)
set(KOKKOS_ENABLE_DEPRECATED_CODE ${KOKKOS_INTERNAL_ENABLE_DEPRECATED_CODE_DEFAULT} CACHE BOOL "Enable deprecated code.")
#------------------------------------------------------------------------------- #-------------------------------------------------------------------------------

@ -14,6 +14,13 @@
#------------------------------------------------------------------------------- #-------------------------------------------------------------------------------
# Ensure that KOKKOS_ARCH is in the ARCH_LIST # Ensure that KOKKOS_ARCH is in the ARCH_LIST
if (KOKKOS_ARCH MATCHES ",")
message("-- Detected a comma in: KOKKOS_ARCH=${KOKKOS_ARCH}")
message("-- Although we prefer KOKKOS_ARCH to be semicolon-delimited, we do allow")
message("-- comma-delimited values for compatibility with scripts (see github.com/trilinos/Trilinos/issues/2330)")
string(REPLACE "," ";" KOKKOS_ARCH "${KOKKOS_ARCH}")
message("-- Commas were changed to semicolons, now KOKKOS_ARCH=${KOKKOS_ARCH}")
endif()
foreach(arch ${KOKKOS_ARCH}) foreach(arch ${KOKKOS_ARCH})
list(FIND KOKKOS_ARCH_LIST ${arch} indx) list(FIND KOKKOS_ARCH_LIST ${arch} indx)
if (indx EQUAL -1) if (indx EQUAL -1)
@ -23,14 +30,13 @@ foreach(arch ${KOKKOS_ARCH})
endforeach() endforeach()
# KOKKOS_SETTINGS uses KOKKOS_ARCH # KOKKOS_SETTINGS uses KOKKOS_ARCH
string(REPLACE ";" "," KOKKOS_ARCH "${KOKKOS_ARCH}") string(REPLACE ";" "," KOKKOS_GMAKE_ARCH "${KOKKOS_ARCH}")
set(KOKKOS_ARCH ${KOKKOS_ARCH})
# From Makefile.kokkos: Options: yes,no # From Makefile.kokkos: Options: yes,no
if(${KOKKOS_ENABLE_DEBUG}) if(${KOKKOS_ENABLE_DEBUG})
set(KOKKOS_DEBUG yes) set(KOKKOS_GMAKE_DEBUG yes)
else() else()
set(KOKKOS_DEBUG no) set(KOKKOS_GMAKE_DEBUG no)
endif() endif()
#------------------------------- KOKKOS_DEVICES -------------------------------- #------------------------------- KOKKOS_DEVICES --------------------------------
@ -43,10 +49,10 @@ foreach(devopt ${KOKKOS_DEVICES_LIST})
endif () endif ()
endforeach() endforeach()
# List needs to be comma-delmitted # List needs to be comma-delmitted
string(REPLACE ";" "," KOKKOS_DEVICES "${KOKKOS_DEVICESl}") string(REPLACE ";" "," KOKKOS_GMAKE_DEVICES "${KOKKOS_DEVICESl}")
#------------------------------- KOKKOS_OPTIONS -------------------------------- #------------------------------- KOKKOS_OPTIONS --------------------------------
# From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling # From Makefile.kokkos: Options: aggressive_vectorization,disable_profiling,disable_deprecated_code
#compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print #compiler_warnings, aggressive_vectorization, disable_profiling, disable_dualview_modify_check, enable_profile_load_print
set(KOKKOS_OPTIONSl) set(KOKKOS_OPTIONSl)
@ -57,7 +63,10 @@ if(${KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION})
list(APPEND KOKKOS_OPTIONSl aggressive_vectorization) list(APPEND KOKKOS_OPTIONSl aggressive_vectorization)
endif() endif()
if(NOT ${KOKKOS_ENABLE_PROFILING}) if(NOT ${KOKKOS_ENABLE_PROFILING})
list(APPEND KOKKOS_OPTIONSl disable_vectorization) list(APPEND KOKKOS_OPTIONSl disable_profiling)
endif()
if(NOT ${KOKKOS_ENABLE_DEPRECATED_CODE})
list(APPEND KOKKOS_OPTIONSl disable_deprecated_code)
endif() endif()
if(NOT ${KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK}) if(NOT ${KOKKOS_ENABLE_DEBUG_DUALVIEW_MODIFY_CHECK})
list(APPEND KOKKOS_OPTIONSl disable_dualview_modify_check) list(APPEND KOKKOS_OPTIONSl disable_dualview_modify_check)
@ -66,7 +75,7 @@ if(${KOKKOS_ENABLE_PROFILING_LOAD_PRINT})
list(APPEND KOKKOS_OPTIONSl enable_profile_load_print) list(APPEND KOKKOS_OPTIONSl enable_profile_load_print)
endif() endif()
# List needs to be comma-delimitted # List needs to be comma-delimitted
string(REPLACE ";" "," KOKKOS_OPTIONS "${KOKKOS_OPTIONSl}") string(REPLACE ";" "," KOKKOS_GMAKE_OPTIONS "${KOKKOS_OPTIONSl}")
#------------------------------- KOKKOS_USE_TPLS ------------------------------- #------------------------------- KOKKOS_USE_TPLS -------------------------------
@ -78,19 +87,19 @@ foreach(tplopt ${KOKKOS_USE_TPLS_LIST})
endif () endif ()
endforeach() endforeach()
# List needs to be comma-delimitted # List needs to be comma-delimitted
string(REPLACE ";" "," KOKKOS_USE_TPLS "${KOKKOS_USE_TPLSl}") string(REPLACE ";" "," KOKKOS_GMAKE_USE_TPLS "${KOKKOS_USE_TPLSl}")
#------------------------------- KOKKOS_CUDA_OPTIONS --------------------------- #------------------------------- KOKKOS_CUDA_OPTIONS ---------------------------
# Construct the Makefile options # Construct the Makefile options
set(KOKKOS_CUDA_OPTIONS) set(KOKKOS_CUDA_OPTIONSl)
foreach(cudaopt ${KOKKOS_CUDA_OPTIONS_LIST}) foreach(cudaopt ${KOKKOS_CUDA_OPTIONS_LIST})
if (${KOKKOS_ENABLE_CUDA_${cudaopt}}) if (${KOKKOS_ENABLE_CUDA_${cudaopt}})
list(APPEND KOKKOS_CUDA_OPTIONSl ${KOKKOS_INTERNAL_${cudaopt}}) list(APPEND KOKKOS_CUDA_OPTIONSl ${KOKKOS_INTERNAL_${cudaopt}})
endif () endif ()
endforeach() endforeach()
# List needs to be comma-delmitted # List needs to be comma-delmitted
string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}") string(REPLACE ";" "," KOKKOS_GMAKE_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
#------------------------------- PATH VARIABLES -------------------------------- #------------------------------- PATH VARIABLES --------------------------------
# Want makefile to use same executables specified which means modifying # Want makefile to use same executables specified which means modifying
@ -100,10 +109,10 @@ string(REPLACE ";" "," KOKKOS_CUDA_OPTIONS "${KOKKOS_CUDA_OPTIONSl}")
set(KOKKOS_INTERNAL_PATHS) set(KOKKOS_INTERNAL_PATHS)
set(addpathl) set(addpathl)
foreach(kvar "CUDA;QTHREADS;${KOKKOS_USE_TPLS_LIST}") foreach(kvar IN LISTS KOKKOS_USE_TPLS_LIST ITEMS CUDA QTHREADS)
if(${KOKKOS_ENABLE_${kvar}}) if(${KOKKOS_ENABLE_${kvar}})
if(DEFINED KOKKOS_${kvar}_DIR) if(DEFINED KOKKOS_${kvar}_DIR)
set(KOKKOS_INTERNAL_PATHS "${KOKKOS_INTERNAL_PATHS} ${kvar}_PATH=${KOKKOS_${kvar}_DIR}") set(KOKKOS_INTERNAL_PATHS ${KOKKOS_INTERNAL_PATHS} "${kvar}_PATH=${KOKKOS_${kvar}_DIR}")
if(IS_DIRECTORY ${KOKKOS_${kvar}_DIR}/bin) if(IS_DIRECTORY ${KOKKOS_${kvar}_DIR}/bin)
list(APPEND addpathl ${KOKKOS_${kvar}_DIR}/bin) list(APPEND addpathl ${KOKKOS_${kvar}_DIR}/bin)
endif() endif()
@ -124,10 +133,9 @@ set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_INSTALL_PATH=${CMAKE_INSTALL_PREFI
# Form of KOKKOS_foo=$KOKKOS_foo # Form of KOKKOS_foo=$KOKKOS_foo
foreach(kvar ARCH;DEVICES;DEBUG;OPTIONS;CUDA_OPTIONS;USE_TPLS) foreach(kvar ARCH;DEVICES;DEBUG;OPTIONS;CUDA_OPTIONS;USE_TPLS)
set(KOKKOS_VAR KOKKOS_${kvar}) if(DEFINED KOKKOS_GMAKE_${kvar})
if(DEFINED KOKKOS_${kvar}) if (NOT "${KOKKOS_GMAKE_${kvar}}" STREQUAL "")
if (NOT "${${KOKKOS_VAR}}" STREQUAL "") set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} KOKKOS_${kvar}=${KOKKOS_GMAKE_${kvar}})
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_VAR}=${${KOKKOS_VAR}})
endif() endif()
endif() endif()
endforeach() endforeach()
@ -147,7 +155,7 @@ if (NOT "${KOKKOS_INTERNAL_PATHS}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_INTERNAL_PATHS}) set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} ${KOKKOS_INTERNAL_PATHS})
endif() endif()
if (NOT "${KOKKOS_INTERNAL_ADDTOPATH}" STREQUAL "") if (NOT "${KOKKOS_INTERNAL_ADDTOPATH}" STREQUAL "")
set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} PATH=${KOKKOS_INTERNAL_ADDTOPATH}:\${PATH}) set(KOKKOS_SETTINGS ${KOKKOS_SETTINGS} "PATH=\"${KOKKOS_INTERNAL_ADDTOPATH}:$ENV{PATH}\"")
endif() endif()
# Final form that gets passed to make # Final form that gets passed to make
@ -185,7 +193,7 @@ if(KOKKOS_CMAKE_VERBOSE)
message(STATUS "") message(STATUS "")
message(STATUS "Architectures:") message(STATUS "Architectures:")
message(STATUS " ${KOKKOS_ARCH}") message(STATUS " ${KOKKOS_GMAKE_ARCH}")
message(STATUS "") message(STATUS "")
message(STATUS "Enabled options") message(STATUS "Enabled options")
@ -194,43 +202,14 @@ if(KOKKOS_CMAKE_VERBOSE)
message(STATUS " KOKKOS_SEPARATE_LIBS") message(STATUS " KOKKOS_SEPARATE_LIBS")
endif() endif()
if(KOKKOS_ENABLE_HWLOC) foreach(opt IN LISTS KOKKOS_INTERNAL_ENABLE_OPTIONS_LIST)
message(STATUS " KOKKOS_ENABLE_HWLOC") string(TOUPPER ${opt} OPT)
endif() if (KOKKOS_ENABLE_${OPT})
message(STATUS " KOKKOS_ENABLE_${OPT}")
if(KOKKOS_ENABLE_MEMKIND) endif()
message(STATUS " KOKKOS_ENABLE_MEMKIND") endforeach()
endif()
if(KOKKOS_ENABLE_DEBUG)
message(STATUS " KOKKOS_ENABLE_DEBUG")
endif()
if(KOKKOS_ENABLE_PROFILING)
message(STATUS " KOKKOS_ENABLE_PROFILING")
endif()
if(KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION)
message(STATUS " KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION")
endif()
if(KOKKOS_ENABLE_CUDA) if(KOKKOS_ENABLE_CUDA)
if(KOKKOS_ENABLE_CUDA_LDG_INTRINSIC)
message(STATUS " KOKKOS_ENABLE_CUDA_LDG_INTRINSIC")
endif()
if(KOKKOS_ENABLE_CUDA_UVM)
message(STATUS " KOKKOS_ENABLE_CUDA_UVM")
endif()
if(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
message(STATUS " KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE")
endif()
if(KOKKOS_ENABLE_CUDA_LAMBDA)
message(STATUS " KOKKOS_ENABLE_CUDA_LAMBDA")
endif()
if(KOKKOS_CUDA_DIR) if(KOKKOS_CUDA_DIR)
message(STATUS " KOKKOS_CUDA_DIR: ${KOKKOS_CUDA_DIR}") message(STATUS " KOKKOS_CUDA_DIR: ${KOKKOS_CUDA_DIR}")
endif() endif()


@ -3,7 +3,7 @@ INCLUDE(CTest)
cmake_policy(SET CMP0054 NEW) cmake_policy(SET CMP0054 NEW)
MESSAGE(WARNING "The project name is: ${PROJECT_NAME}") MESSAGE(STATUS "The project name is: ${PROJECT_NAME}")
IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP) IF(NOT DEFINED ${PROJECT_NAME}_ENABLE_OpenMP)
SET(${PROJECT_NAME}_ENABLE_OpenMP OFF) SET(${PROJECT_NAME}_ENABLE_OpenMP OFF)
@ -84,9 +84,6 @@ ENDFUNCTION()
MACRO(TRIBITS_ADD_TEST_DIRECTORIES) MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
message(STATUS "ProjectName: " ${PROJECT_NAME})
message(STATUS "Tests: " ${${PROJECT_NAME}_ENABLE_TESTS})
IF(${${PROJECT_NAME}_ENABLE_TESTS}) IF(${${PROJECT_NAME}_ENABLE_TESTS})
FOREACH(TEST_DIR ${ARGN}) FOREACH(TEST_DIR ${ARGN})
ADD_SUBDIRECTORY(${TEST_DIR}) ADD_SUBDIRECTORY(${TEST_DIR})
@ -95,13 +92,11 @@ MACRO(TRIBITS_ADD_TEST_DIRECTORIES)
ENDMACRO() ENDMACRO()
MACRO(TRIBITS_ADD_EXAMPLE_DIRECTORIES) MACRO(TRIBITS_ADD_EXAMPLE_DIRECTORIES)
IF(${PACKAGE_NAME}_ENABLE_EXAMPLES OR ${PARENT_PACKAGE_NAME}_ENABLE_EXAMPLES) IF(${PACKAGE_NAME}_ENABLE_EXAMPLES OR ${PARENT_PACKAGE_NAME}_ENABLE_EXAMPLES)
FOREACH(EXAMPLE_DIR ${ARGN}) FOREACH(EXAMPLE_DIR ${ARGN})
ADD_SUBDIRECTORY(${EXAMPLE_DIR}) ADD_SUBDIRECTORY(${EXAMPLE_DIR})
ENDFOREACH() ENDFOREACH()
ENDIF() ENDIF()
ENDMACRO() ENDMACRO()


@ -1,190 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/host/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
CUDA_ARCH=""
# CUDA_ARCH="20"
# CUDA_ARCH="30"
# CUDA_ARCH="35"
# Build with Intel compiler
INTEL=ON
# Build for MIC architecture:
# INTEL_XEON_PHI=ON
# Build with HWLOC at location:
HWLOC_BASE_DIR="/home/projects/libraries/host/hwloc/1.6.2"
# Location for MPI to use in examples:
MPI_BASE_DIR=""
#-----------------------------------------------------------------------------
# MPI configuation only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP configuation:
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------
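
Note on the deleted script above: its header says that additional command-line arguments are passed directly to CMake, but the final `cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}` line never references `"$@"`. The following is a minimal sketch, not part of the commit, of how the same accumulate-then-expand pattern could forward them. The function name `configure_cmdline`, the `/src/Trilinos` path, and the `EXTRA_FLAG` option are hypothetical, and the command line is echoed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: prints the cmake command line instead of running it.
# $1 is the source directory; any remaining arguments are forwarded as-is.
configure_cmdline() {
  project_dir="$1"
  shift

  # Accumulate options in one string, as the deleted script does.
  CMAKE_CONFIGURE=""
  CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
  CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"

  # ${CMAKE_CONFIGURE} is left unquoted on purpose so the shell splits it
  # into separate words; "$@" keeps each forwarded argument intact.
  echo cmake ${CMAKE_CONFIGURE} "$@" "${project_dir}"
}

configure_cmdline /src/Trilinos -D EXTRA_FLAG:BOOL=ON
```

Relying on unquoted expansion is what makes the string-accumulation approach work, and it also means no single option value may contain whitespace.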


@ -1,186 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/mic/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
CUDA_ARCH=""
# CUDA_ARCH="20"
# CUDA_ARCH="30"
# CUDA_ARCH="35"
# Build for MIC architecture:
INTEL_XEON_PHI=ON
# Build with HWLOC at location:
HWLOC_BASE_DIR="/home/projects/libraries/mic/hwloc/1.6.2"
# Location for MPI to use in examples:
MPI_BASE_DIR=""
#-----------------------------------------------------------------------------
# MPI configuation only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP configuation:
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------
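
Note on the deleted script above: its comments state that options in `CUDA_NVCC_FLAGS` must be semicolon delimited, unlike the usual `CMAKE_CXX_FLAGS` syntax. As a rough sketch, not part of the commit, the flag construction can be isolated into a function. The name `nvcc_flags` and its two positional parameters are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper mirroring the flag construction in the deleted script.
# $1: CUDA architecture number (for example 35); $2: build type.
nvcc_flags() {
  arch="$1"
  build_type="$2"

  # Items are joined with ';' rather than spaces.
  flags="-gencode;arch=compute_${arch},code=sm_${arch}"
  flags="${flags};-Xcompiler;-Wall,-ansi"

  if [ "${build_type}" = "DEBUG" ] ; then
    flags="${flags};-g"
  else
    flags="${flags};-O3"
  fi

  echo "${flags}"
}

nvcc_flags 35 RELEASE
```

Because the semicolons arrive through a variable expansion, the shell does not treat them as command separators when the value is later placed on the `cmake` command line.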


@ -1,293 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
#-----------------------------------------------------------------------------
USE_CUDA_ARCH=
USE_THREAD=
USE_OPENMP=
USE_INTEL=
USE_XEON_PHI=
HWLOC_BASE_DIR=
MPI_BASE_DIR=
BLAS_LIB_DIR=
LAPACK_LIB_DIR=
if [ 1 ] ; then
# Platform 'kokkos-dev' with Cuda, OpenMP, hwloc, mpi, gnu
USE_CUDA_ARCH="35"
USE_OPENMP=ON
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
BLAS_LIB_DIR="/home/projects/blas/host/gnu/lib"
LAPACK_LIB_DIR="/home/projects/lapack/host/gnu/lib"
elif [ ] ; then
# Platform 'kokkos-dev' with Cuda, Threads, hwloc, mpi, gnu
USE_CUDA_ARCH="35"
USE_THREAD=ON
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
BLAS_LIB_DIR="/home/projects/blas/host/gnu/lib"
LAPACK_LIB_DIR="/home/projects/lapack/host/gnu/lib"
elif [ ] ; then
# Platform 'kokkos-dev' with Xeon Phi and hwloc
USE_OPENMP=ON
USE_INTEL=ON
USE_XEON_PHI=ON
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/mic/intel/13.SP1.1.106"
elif [ ] ; then
# Platform 'kokkos-nvidia' with Cuda, OpenMP, hwloc, mpi, gnu
USE_CUDA_ARCH="20"
USE_OPENMP=ON
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
MPI_BASE_DIR="/home/sems/common/openmpi/current"
elif [ ] ; then
# Platform 'kokkos-nvidia' with Cuda, Threads, hwloc, mpi, gnu
USE_CUDA_ARCH="20"
USE_THREAD=ON
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
MPI_BASE_DIR="/home/sems/common/openmpi/current"
fi
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure command line options:
CMAKE_CONFIGURE=""
CMAKE_CXX_FLAGS=""
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
if [ 1 ] ; then
# Configure for Tpetra/Kokkos:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${BLAS_LIB_DIR}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_DIRS:FILEPATH=${LAPACK_LIB_DIR}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Tpetra:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Kokkos:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraClassic:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TeuchosKokkosCompat:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TeuchosKokkosComm:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Tpetra_ENABLE_Kokkos_Refactor:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D KokkosClassic_DefaultNode:STRING=Kokkos::Compat::KokkosOpenMPWrapperNode"
CMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS}-DKOKKOS_FAST_COMPILE"
if [ -n "${USE_CUDA_ARCH}" ] ; then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Cuda:BOOL=ON"
fi
fi
if [ 1 ] ; then
# Configure for Stokhos:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Sacado:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Stokhos:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Stokhos_ENABLE_Belos:BOOL=ON"
fi
if [ 1 ] ; then
# Configure for TrilinosCouplings:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TrilinosCouplings:BOOL=ON"
fi
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=ON"
if [ "${CMAKE_BUILD_TYPE}" == "DEBUG" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
fi
#-----------------------------------------------------------------------------
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# MPI configuation only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Kokkos use pthread configuation:
if [ "${USE_THREAD}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Kokkos use OpenMP configuation:
if [ "${USE_OPENMP}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Hardware locality configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${USE_CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-DKOKKOS_HAVE_CUDA_ARCH=${USE_CUDA_ARCH}0;-gencode;arch=compute_${USE_CUDA_ARCH},code=sm_${USE_CUDA_ARCH}"
if [ "${USE_OPENMP}" = "ON" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
fi
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${USE_INTEL}" = "ON" -o "${USE_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
# Cross-compile for Intel Xeon Phi:
if [ "${USE_XEON_PHI}" = "ON" ] ;
then
CMAKE_CXX_FLAGS="${CMAKE_CXX_FLAGS} -mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
if [ -n "${CMAKE_CXX_FLAGS}" ] ; then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING='${CMAKE_CXX_FLAGS}'"
fi
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#
echo "cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}"
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------
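
Note on the deleted script above: it selects a platform by hand-editing bare `[ 1 ]` and `[ ]` tests. This appears to rely on the POSIX `test` rules that zero operands evaluate to false and a single non-empty operand evaluates to true. A minimal sketch of the idiom follows. The function name `select_platform` and the echoed labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of the "[ 1 ]" / "[ ]" switch idiom.
select_platform() {
  if [ ] ; then
    # Zero operands: false, so this branch is skipped.
    echo "cuda-threads"
  elif [ 1 ] ; then
    # One non-empty operand: true, so this branch is taken.
    echo "cuda-openmp"
  elif [ 1 ] ; then
    # Never reached: an earlier branch already matched.
    echo "xeon-phi"
  fi
}

select_platform
```

Only the first true branch runs, so moving the `1` between brackets switches platforms. Note that `[ 0 ]` is also true, since the single-operand form checks only that the string is non-empty.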


@ -1,88 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
# to build:
# build on bgq-b[1-12]
# module load sierra-devel
# run this configure file
# make
# to run:
# ssh bgq-login
# cd /scratch/username/...
# export OMP_PROC_BIND and XLSMPOPTS environment variables
# run with srun
# Note: hwloc does not work to get or set cpubindings on bgq.
# Use the openmp backend and the openmp environment variables.
#
# Only the mpi wrappers seem to be setup for cross-compile,
# so it is important that this configure enables MPI and uses mpigcc wrappers.
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="../Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=../TrilinosInstall/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=mpigcc-4.7.2"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=mpig++-4.7.2"
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------


@ -1,216 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake MakeFile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${HOME}/TrilinosInstall/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
#CMAKE_BUILD_TYPE=DEBUG
#CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
#CUDA_ARCH=""
#CUDA_ARCH="20"
#CUDA_ARCH="30"
CUDA_ARCH="35"
# Build with OpenMP
OPENMP=ON
PTHREADS=ON
# Build host code with Intel compiler:
INTEL=OFF
# Build for MIC architecture:
INTEL_XEON_PHI=OFF
# Build with HWLOC at location:
#HWLOC_BASE_DIR=""
#HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.4.7"
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"
# Location for MPI to use in examples:
#MPI_BASE_DIR=""
#MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.4.7"
MPI_BASE_DIR="/home/projects/mvapich/2.0.0b/gnu/4.7.3"
#MPI_BASE_DIR="/home/projects/openmpi/1.7.3/llvm/2013-12-02/"
#-----------------------------------------------------------------------------
# MPI configuration only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuration:
if [ "${PTHREADS}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# OpenMP configuration:
if [ "${OPENMP}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
if [ "${OPENMP}" = "ON" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
fi
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------
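The CUDA section of the removed script above assembles `CUDA_NVCC_FLAGS` as a semicolon-delimited list rather than a space-separated string. A minimal sketch of that composition follows. It is wrapped in a hypothetical helper, `build_nvcc_flags` (not part of the original script), so the result can be inspected without running CMake:

```sh
#!/bin/sh
# Sketch only: build_nvcc_flags is a hypothetical helper that mirrors the
# flag composition in the script above.
# Usage: build_nvcc_flags ARCH OPENMP(ON|OFF) BUILD_TYPE(DEBUG|RELEASE)
build_nvcc_flags() {
  arch="$1"
  openmp="$2"
  build_type="$3"

  # Entries are joined with ';' because CUDA_NVCC_FLAGS is a CMake list.
  flags="-gencode;arch=compute_${arch},code=sm_${arch}"

  # Host-compiler options follow -Xcompiler as one comma-separated entry.
  if [ "${openmp}" = "ON" ]; then
    flags="${flags};-Xcompiler;-Wall,-ansi,-fopenmp"
  else
    flags="${flags};-Xcompiler;-Wall,-ansi"
  fi

  # Debug builds get -g; every other build type gets -O3.
  if [ "${build_type}" = "DEBUG" ]; then
    flags="${flags};-g"
  else
    flags="${flags};-O3"
  fi

  printf '%s\n' "${flags}"
}
```

The result contains no spaces, which is why the script can carry it through the unquoted `${CMAKE_CONFIGURE}` expansion as a single word.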


@ -1,204 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/sems/common/kokkos/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
# CUDA_ARCH=""
CUDA_ARCH="20"
# CUDA_ARCH="30"
# CUDA_ARCH="35"
# Build with OpenMP
OPENMP=ON
# Build host code with Intel compiler:
# INTEL=ON
# Build for MIC architecture:
# INTEL_XEON_PHI=ON
# Build with HWLOC at location:
HWLOC_BASE_DIR="/home/sems/common/hwloc/current"
# Location for MPI to use in examples:
MPI_BASE_DIR="/home/sems/common/openmpi/current"
#-----------------------------------------------------------------------------
# MPI configuration only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuration:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP configuration:
if [ "${OPENMP}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
if [ "${OPENMP}" = "ON" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
fi
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------
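These scripts collect every option in one string and rely on word splitting when `cmake ${CMAKE_CONFIGURE}` expands it. That works only while no value contains whitespace. Quotes stored inside the string (as in the `BLAS_LIBRARY_NAMES='...'` line) also reach CMake as literal characters. One possible alternative in POSIX sh is to accumulate options as positional parameters. The sketch below uses a hypothetical `configure_args` function that is not part of the original scripts:

```sh
#!/bin/sh
# Sketch only: configure_args is a hypothetical helper, not part of the
# original scripts.
configure_args() {
  # Inside a function, "set --" replaces only the function's own
  # positional parameters, so the caller's "$@" is left untouched.
  set -- -D "CMAKE_BUILD_TYPE:STRING=RELEASE"
  set -- "$@" -D "CMAKE_CXX_FLAGS:STRING=-g -Wall"
  set -- "$@" -D "BLAS_LIBRARY_NAMES=mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m"

  # Print one argument per line for inspection. A real script would
  # instead run: cmake "$@" "${CMAKE_PROJECT_DIR}"
  printf '%s\n' "$@"
}

configure_args
```

Because `"$@"` expands to one word per stored argument, the value `-g -Wall` stays a single argument and no quote characters are passed through to CMake.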


@ -1,190 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
# Additional command-line arguments given to this script will be
# passed directly to CMake.
#
#
# Force CMake to re-evaluate build options.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#-----------------------------------------------------------------------------
# Incrementally construct cmake configure options:
CMAKE_CONFIGURE=""
#-----------------------------------------------------------------------------
# Location of Trilinos source tree:
CMAKE_PROJECT_DIR="${HOME}/Trilinos"
# Location for installation:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=/home/projects/kokkos/`date +%F`"
#-----------------------------------------------------------------------------
# General build options.
# Use a variable so options can be propagated to CUDA compiler.
CMAKE_VERBOSE_MAKEFILE=OFF
CMAKE_BUILD_TYPE=RELEASE
# CMAKE_BUILD_TYPE=DEBUG
#-----------------------------------------------------------------------------
# Build for CUDA architecture:
# CUDA_ARCH=""
# CUDA_ARCH="20"
# CUDA_ARCH="30"
CUDA_ARCH="35"
# Build host code with Intel compiler:
INTEL=ON
# Build for MIC architecture:
# INTEL_XEON_PHI=ON
# Build with HWLOC at location:
HWLOC_BASE_DIR="/home/projects/hwloc/1.6.2"
# Location for MPI to use in examples:
MPI_BASE_DIR=""
#-----------------------------------------------------------------------------
# MPI configuration only used for examples:
#
# Must have the MPI_BASE_DIR so that the
# include path can be passed to the Cuda compiler
if [ -n "${MPI_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D MPI_BASE_DIR:PATH=${MPI_BASE_DIR}"
else
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_MPI:BOOL=OFF"
fi
#-----------------------------------------------------------------------------
# Pthread configuration:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP configuration:
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=OFF"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Configure packages for kokkos-only:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Hardware locality cmake configuration:
if [ -n "${HWLOC_BASE_DIR}" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
fi
#-----------------------------------------------------------------------------
# Cuda cmake configuration:
if [ -n "${CUDA_ARCH}" ] ;
then
# Options to CUDA_NVCC_FLAGS must be semi-colon delimited,
# this is different than the standard CMAKE_CXX_FLAGS syntax.
CUDA_NVCC_FLAGS="-gencode;arch=compute_${CUDA_ARCH},code=sm_${CUDA_ARCH}"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi"
if [ "${CMAKE_BUILD_TYPE}" = "DEBUG" ] ;
then
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-g"
else
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3"
fi
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_VERBOSE_BUILD:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS}"
fi
#-----------------------------------------------------------------------------
if [ "${INTEL}" = "ON" -o "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
fi
#-----------------------------------------------------------------------------
# Cross-compile for Intel Xeon Phi:
if [ "${INTEL_XEON_PHI}" = "ON" ] ;
then
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_SYSTEM_NAME=Linux"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_FLAGS:STRING=-mmic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_Fortran_COMPILER:FILEPATH=ifort"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_DIRS:FILEPATH=${MKLROOT}/lib/mic"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BLAS_LIBRARY_NAMES='mkl_intel_lp64;mkl_sequential;mkl_core;pthread;m'"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D BUILD_SHARED_LIBS:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D DART_TESTING_TIMEOUT:STRING=600"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_LIBRARY_NAMES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_LAPACK_LIBRARIES=''"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_BinUtils=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_Pthread_LIBRARIES=pthread"
# Cannot cross-compile fortran compatibility checks on the MIC:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
# Tell cmake the answers to compile-and-execute tests
# to prevent cmake from executing a cross-compiled program.
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_GCC_ABI_DEMANGLE_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HAVE_TEUCHOS_BLASFLOAT_EXITCODE=0"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D LAPACK_SLAPY2_WORKS_EXITCODE=0"
fi
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=${CMAKE_BUILD_TYPE}"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_VERBOSE_MAKEFILE:BOOL=${CMAKE_VERBOSE_MAKEFILE}"
#-----------------------------------------------------------------------------
echo "cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}"
cmake ${CMAKE_CONFIGURE} ${CMAKE_PROJECT_DIR}
#-----------------------------------------------------------------------------
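Each of these scripts installs into a directory named after the current date by appending `` `date +%F` `` to the install prefix. On GNU and BSD `date`, `%F` prints the ISO 8601 form `YYYY-MM-DD`. A small sketch of the same idea uses the more widely supported `%Y-%m-%d` spelling and `$(...)` instead of backticks. The helper name `dated_prefix` is hypothetical:

```sh
#!/bin/sh
# Sketch only: dated_prefix is a hypothetical helper.
# Usage: dated_prefix BASE_DIR
# Prints BASE_DIR/YYYY-MM-DD for today's date.
dated_prefix() {
  base="$1"
  printf '%s/%s\n' "${base}" "$(date +%Y-%m-%d)"
}
```

A script could then write `CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=$(dated_prefix "${HOME}/TrilinosInstall")"`, which keeps one install tree per day as the originals do.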


@ -1,140 +0,0 @@
#!/bin/bash
#
# This script uses CUDA, OpenMP, and MPI.
#
# Before invoking this script, set the OMPI_CXX environment variable
# to point to nvcc_wrapper, wherever it happens to live. (If you use
# an MPI implementation other than OpenMPI, set the corresponding
# environment variable instead.)
#
rm -f CMakeCache.txt;
rm -rf CMakeFiles
EXTRA_ARGS=$@
MPI_PATH="/opt/mpi/openmpi/1.8.2/nvcc-gcc/4.8.3-6.5"
CUDA_PATH="/opt/nvidia/cuda/6.5.14"
#
# As long as there are any .cu files in Trilinos, we'll need to set
# CUDA_NVCC_FLAGS. If Trilinos gets rid of all of its .cu files and
# lets nvcc_wrapper handle them as .cpp files, then we won't need to
# set CUDA_NVCC_FLAGS. As it is, given that we need to set
# CUDA_NVCC_FLAGS, we must make sure that they are the same flags as
# nvcc_wrapper passes to nvcc.
#
CUDA_NVCC_FLAGS="-gencode;arch=compute_35,code=sm_35;-I${MPI_PATH}/include"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-Xcompiler;-Wall,-ansi,-fopenmp"
CUDA_NVCC_FLAGS="${CUDA_NVCC_FLAGS};-O3;-DKOKKOS_USE_CUDA_UVM"
cmake \
-D CMAKE_INSTALL_PREFIX:PATH="$PWD/../install/" \
-D CMAKE_BUILD_TYPE:STRING=DEBUG \
-D CMAKE_CXX_FLAGS:STRING="-g -Wall" \
-D CMAKE_C_FLAGS:STRING="-g -Wall" \
-D CMAKE_FORTRAN_FLAGS:STRING="" \
-D CMAKE_SHARED_LIBRARY_LINK_CXX_FLAGS="" \
-D Trilinos_ENABLE_Triutils=OFF \
-D Trilinos_ENABLE_INSTALL_CMAKE_CONFIG_FILES:BOOL=OFF \
-D Trilinos_ENABLE_DEBUG:BOOL=OFF \
-D Trilinos_ENABLE_CHECKED_STL:BOOL=OFF \
-D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=OFF \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
-D Trilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=OFF \
-D BUILD_SHARED_LIBS:BOOL=OFF \
-D DART_TESTING_TIMEOUT:STRING=600 \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
\
\
-D CMAKE_CXX_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicxx" \
-D CMAKE_C_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicc" \
-D MPI_CXX_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicxx" \
-D MPI_C_COMPILER:FILEPATH="${MPI_PATH}/bin/mpicc" \
-D CMAKE_Fortran_COMPILER:FILEPATH="${MPI_PATH}/bin/mpif77" \
-D MPI_EXEC:FILEPATH="${MPI_PATH}/bin/mpirun" \
-D MPI_EXEC_POST_NUMPROCS_FLAGS:STRING="-bind-to;socket;--map-by;socket;env;CUDA_MANAGED_FORCE_DEVICE_ALLOC=1;CUDA_LAUNCH_BLOCKING=1;OMP_NUM_THREADS=2" \
\
\
-D Trilinos_ENABLE_CXX11:BOOL=OFF \
-D TPL_ENABLE_MPI:BOOL=ON \
-D Trilinos_ENABLE_OpenMP:BOOL=ON \
-D Trilinos_ENABLE_ThreadPool:BOOL=ON \
\
\
-D TPL_ENABLE_CUDA:BOOL=ON \
-D CUDA_TOOLKIT_ROOT_DIR:FILEPATH="${CUDA_PATH}" \
-D CUDA_PROPAGATE_HOST_FLAGS:BOOL=OFF \
-D TPL_ENABLE_Thrust:BOOL=OFF \
-D Thrust_INCLUDE_DIRS:FILEPATH="${CUDA_PATH}/include" \
-D TPL_ENABLE_CUSPARSE:BOOL=OFF \
-D TPL_ENABLE_Cusp:BOOL=OFF \
-D Cusp_INCLUDE_DIRS="/home/crtrott/Software/cusp" \
-D CUDA_VERBOSE_BUILD:BOOL=OFF \
-D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS} \
\
\
-D TPL_ENABLE_HWLOC=OFF \
-D HWLOC_INCLUDE_DIRS="/usr/local/software/hwloc/current/include" \
-D HWLOC_LIBRARY_DIRS="/usr/local/software/hwloc/current/lib" \
-D TPL_ENABLE_BinUtils=OFF \
-D TPL_ENABLE_BLAS:STRING=ON \
-D TPL_ENABLE_LAPACK:STRING=ON \
-D TPL_ENABLE_MKL:STRING=OFF \
-D TPL_ENABLE_HWLOC:STRING=OFF \
-D TPL_ENABLE_GTEST:STRING=ON \
-D TPL_ENABLE_SuperLU=ON \
-D TPL_ENABLE_BLAS=ON \
-D TPL_ENABLE_LAPACK=ON \
-D TPL_SuperLU_LIBRARIES="/home/crtrott/Software/SuperLU_4.3/lib/libsuperlu_4.3.a" \
-D TPL_SuperLU_INCLUDE_DIRS="/home/crtrott/Software/SuperLU_4.3/SRC" \
\
\
-D Trilinos_Enable_Kokkos:BOOL=ON \
-D Trilinos_ENABLE_KokkosCore:BOOL=ON \
-D Trilinos_ENABLE_TeuchosKokkosCompat:BOOL=ON \
-D Trilinos_ENABLE_KokkosContainers:BOOL=ON \
-D Trilinos_ENABLE_TpetraKernels:BOOL=ON \
-D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON \
-D Trilinos_ENABLE_TeuchosKokkosComm:BOOL=ON \
-D Trilinos_ENABLE_KokkosExample:BOOL=ON \
-D Kokkos_ENABLE_EXAMPLES:BOOL=ON \
-D Kokkos_ENABLE_TESTS:BOOL=OFF \
-D KokkosClassic_DefaultNode:STRING="Kokkos::Compat::KokkosCudaWrapperNode" \
-D TpetraClassic_ENABLE_OpenMPNode=OFF \
-D TpetraClassic_ENABLE_TPINode=OFF \
-D TpetraClassic_ENABLE_MKL=OFF \
-D Kokkos_ENABLE_Cuda_UVM=ON \
\
\
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
-D Teuchos_ENABLE_COMPLEX:BOOL=OFF \
\
\
-D Trilinos_ENABLE_Tpetra:BOOL=ON \
-D Tpetra_ENABLE_KokkosCore=ON \
-D Tpetra_ENABLE_Kokkos_DistObject=OFF \
-D Tpetra_ENABLE_Kokkos_Refactor=ON \
-D Tpetra_ENABLE_TESTS=ON \
-D Tpetra_ENABLE_EXAMPLES=ON \
-D Tpetra_ENABLE_MPI_CUDA_RDMA:BOOL=ON \
\
\
-D Trilinos_ENABLE_Belos=OFF \
-D Trilinos_ENABLE_Amesos=OFF \
-D Trilinos_ENABLE_Amesos2=OFF \
-D Trilinos_ENABLE_Ifpack=OFF \
-D Trilinos_ENABLE_Ifpack2=OFF \
-D Trilinos_ENABLE_Epetra=OFF \
-D Trilinos_ENABLE_EpetraExt=OFF \
-D Trilinos_ENABLE_Zoltan=OFF \
-D Trilinos_ENABLE_Zoltan2=OFF \
-D Trilinos_ENABLE_MueLu=OFF \
-D Belos_ENABLE_TESTS=ON \
-D Belos_ENABLE_EXAMPLES=ON \
-D MueLu_ENABLE_TESTS=ON \
-D MueLu_ENABLE_EXAMPLES=ON \
-D Ifpack2_ENABLE_TESTS=ON \
-D Ifpack2_ENABLE_EXAMPLES=ON \
$EXTRA_ARGS \
${HOME}/Trilinos
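The header comment of the script above asks the user to set `OMPI_CXX` to the location of nvcc_wrapper before running it, but the script itself never checks this. A pre-flight check could look like the sketch below. The function name `check_ompi_cxx` and its messages are assumptions, not part of the original script:

```sh
#!/bin/sh
# Sketch only: check_ompi_cxx is a hypothetical pre-flight check.
# Returns 1 if OMPI_CXX is unset or empty. Otherwise returns 0 and warns
# when the value does not look like a path to nvcc_wrapper.
check_ompi_cxx() {
  if [ -z "${OMPI_CXX:-}" ]; then
    echo "OMPI_CXX is not set; export it to the path of nvcc_wrapper" >&2
    return 1
  fi
  case "${OMPI_CXX}" in
    *nvcc_wrapper)
      return 0 ;;
    *)
      echo "warning: OMPI_CXX=${OMPI_CXX} does not end in nvcc_wrapper" >&2
      return 0 ;;
  esac
}
```

Calling `check_ompi_cxx || exit 1` before the `cmake` invocation would stop a misconfigured build early. With another MPI implementation, the corresponding variable would need to be checked instead, as the header comment notes.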


@ -1,148 +0,0 @@
// -------------------------------------------------------------------------------- //
The following steps are for workstations/servers with the SEMS environment installed.
// -------------------------------------------------------------------------------- //
Summary:
- Step 1: Rigorous testing of Kokkos' develop branch for each backend (Serial, OpenMP, Threads, Cuda) with all supported compilers.
- Step 2: Snapshot Kokkos' develop branch into current Trilinos develop branch.
- Step 3: Build and test Trilinos with combinations of compilers, types, backends.
- Step 4: Promote Kokkos develop branch to master if the snapshot does not cause any new tests to fail; else track/fix causes of new failures.
- Step 5: Snapshot Kokkos tagged master branch into Trilinos and push Trilinos.
// -------------------------------------------------------------------------------- //
// -------------------------------------------------------------------------------- //
Step 1:
1.1. Update kokkos develop branch (NOT a fork)
(From kokkos directory):
git fetch --all
git checkout develop
git reset --hard origin/develop
1.2. Create a testing directory - here the directory is created within the kokkos directory
mkdir testing
cd testing
1.3. Run the test_all_sandia script; various compiler and build-list options can be specified
../config/test_all_sandia
1.4 Clean repository of untracked files
cd ../
git clean -df
// -------------------------------------------------------------------------------- //
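The fetch, checkout, and hard-reset sequence from Step 1.1 recurs in later steps for other branches and repositories. As a sketch, it could be wrapped in one helper with a dry-run mode. The name `sync_to_origin` and the dry-run convention (passing `echo` as the second argument) are assumptions made for illustration:

```sh
#!/bin/sh
# Sketch only: sync_to_origin is a hypothetical helper.
# Usage: sync_to_origin BRANCH [echo]
# With "echo" as the second argument, the git commands are printed
# instead of executed. Without it they run for real, and the hard reset
# DISCARDS local commits and modifications on BRANCH.
sync_to_origin() {
  branch="$1"
  run="${2:-}"
  # ${run} is deliberately unquoted: when empty it expands to nothing.
  ${run} git fetch --all &&
  ${run} git checkout "${branch}" &&
  ${run} git reset --hard "origin/${branch}"
}
```

Running `sync_to_origin develop echo` prints the three commands from Step 1.1 in order, which is a cheap way to confirm the target branch before dropping the dry-run argument.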
Step 2:
2.1 Update Trilinos develop branch
(From Trilinos directory):
git checkout develop
git fetch --all
git reset --hard origin/develop
git clean -df
2.2 Snapshot Kokkos into Trilinos. This requires python/2.7.9, and both the Trilinos and Kokkos repositories must be clean (no untracked or modified files)
module load python/2.7.9
python KOKKOS_PATH/config/snapshot.py KOKKOS_PATH TRILINOS_PATH/packages
// -------------------------------------------------------------------------------- //
Step 3:
3.1. Build and test Trilinos with 4 different configurations. Run scripts for white and shepard are provided in kokkos/config/trilinos-integration
It is usually a good idea to run those scripts via nohup.
You can run all four at the same time; use a separate directory for each.
3.2. Compare the failed test output between the pristine and the updated runs; investigate and fix problems if new tests fail after the Kokkos snapshot
// -------------------------------------------------------------------------------- //
Step 4: Once all Trilinos tests pass, promote the Kokkos develop branch to master on GitHub
4.1. Generate Changelog (you need a GitHub API token)
Close all open issues with the "InDevelop" tag on GitHub
(Not from kokkos directory)
github_changelog_generator kokkos/kokkos --token TOKEN --no-pull-requests --include-labels 'InDevelop' --enhancement-labels 'enhancement,Feature Request' --future-release 'NEWTAG' --between-tags 'NEWTAG,OLDTAG'
(Copy the new section from the generated CHANGELOG.md to the kokkos/CHANGELOG.md)
(Make desired changes to CHANGELOG.md to enhance clarity)
(Commit and push the CHANGELOG to develop)
4.2 Merge develop into Master
- DO NOT fast-forward the merge!!!!
(From kokkos directory):
git checkout master
git fetch --all
# Ensure we are on the current origin/master
git reset --hard origin/master
git merge --no-ff origin/develop
4.3. Update the tag in kokkos/config/master_history.txt
Tag description: MajorNumber.MinorNumber.WeeksSinceMinorNumberUpdate
Tag format: #.#.##
# Prepend master_history.txt with
# tag: #.#.##
# date: mm/dd/yyyy
# master: sha1
# develop: sha1
# -----------------------
git commit --amend -a
git tag -a #.#.##
tag: #.#.##
date: mm/dd/yyyy
master: sha1
develop: sha1
4.4. Do NOT push yet
// -------------------------------------------------------------------------------- //
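Step 4.3 describes a tag format and a block of four fields to prepend to master_history.txt. The sketch below follows that template. It assumes that `#.#.##` means one digit, one digit, then two digits, and that entries are separated by the dashed line shown in the step. The helper names `valid_tag` and `prepend_history` are hypothetical, and the real file layout may differ:

```sh
#!/bin/sh
# Sketch only: valid_tag and prepend_history are hypothetical helpers.

# Succeeds only if the tag matches the assumed "#.#.##" shape.
valid_tag() {
  case "$1" in
    [0-9].[0-9].[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: prepend_history FILE TAG DATE MASTER_SHA DEVELOP_SHA
# Writes the new entry first, then the previous contents of FILE.
prepend_history() {
  file="$1"
  valid_tag "$2" || { echo "bad tag: $2" >&2; return 1; }
  tmp="${file}.tmp"
  {
    printf 'tag: %s\n' "$2"
    printf 'date: %s\n' "$3"
    printf 'master: %s\n' "$4"
    printf 'develop: %s\n' "$5"
    printf '%s\n' '-----------------------'
    if [ -f "${file}" ]; then cat "${file}"; fi
  } > "${tmp}" && mv "${tmp}" "${file}"
}
```

Writing to a temporary file and renaming it means an interrupted run leaves the existing history untouched rather than half-written.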
Step 5:
5.1. Make sure Trilinos is up-to-date. Chances are other changes have been committed since the integration testing process began. If a substantial change has occurred that may be affected by the snapshot, the testing procedure may need to be repeated
(From Trilinos directory):
git checkout develop
git fetch --all
git reset --hard origin/develop
git clean -df
5.2. Snapshot Kokkos master branch into Trilinos
(From kokkos directory):
git fetch --all
git checkout tags/#.#.##
git clean -df
python KOKKOS_PATH/config/snapshot.py KOKKOS_PATH TRILINOS_PATH/packages
5.3. Run checkin-test to push to Trilinos using the CI build modules (gcc/4.9.3)
The modules are listed in kokkos/config/trilinos-integration/checkin-test
Run checkin-test; forward dependencies and optional dependencies must be enabled
If the push failed because someone else clearly broke something, push manually.
If the push failed for unclear reasons, investigate, fix, and potentially start over from Step 2 after resetting your local kokkos/master branch
// -------------------------------------------------------------------------------- //
Step 6: Push Kokkos to master
git push --follow-tags origin master
// -------------------------------------------------------------------------------- //


@ -1,110 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Cuda, OpenMP, Threads, Qthreads, hwloc
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
#
# The 'nvcc-wrapper' module should load a script that matches
# kokkos/bin/nvcc_wrapper
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Hardware locality configuration:
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
#-----------------------------------------------------------------------------
# Pthread
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
# Qthreads
QTHREADS_BASE_DIR="/home/projects/qthreads/2014-07-08/host/gnu/4.7.3"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_QTHREADS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D QTHREADS_INCLUDE_DIRS:FILEPATH=${QTHREADS_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D QTHREADS_LIBRARY_DIRS:FILEPATH=${QTHREADS_BASE_DIR}/lib"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}

@ -1,104 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Cuda, OpenMP, hwloc
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
#
# The 'nvcc-wrapper' module should load a script that matches
# kokkos/bin/nvcc_wrapper
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Hardware locality configuration:
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
#-----------------------------------------------------------------------------
# Pthread explicitly OFF so tribits doesn't automatically turn it on
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------
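
A note on the configure line above: the script accumulates one long string and relies on word splitting when `${CMAKE_CONFIGURE}` is expanded, which breaks as soon as any value contains a space. The following is a minimal sketch of an alternative, assuming a POSIX shell, that keeps the options as positional parameters. The install path with a space in it is a hypothetical value chosen to show the difference, and `printf` stands in for the real `cmake` call.

```sh
#!/bin/sh
# Sketch only: collect CMake options as positional parameters instead of
# one long string, so values that contain spaces survive intact.
# Assumption: both paths below are hypothetical placeholders.
TRILINOS_SOURCE_DIR="${HOME}/Trilinos"
TRILINOS_INSTALL_DIR="${HOME}/Trilinos Install/$(date +%F)"

set --    # start with an empty argument list
set -- "$@" -D "CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
set -- "$@" -D "CMAKE_BUILD_TYPE:STRING=RELEASE"
set -- "$@" -D "Kokkos_ENABLE_OpenMP:BOOL=ON"
set -- "$@" "${TRILINOS_SOURCE_DIR}"

# Record what was collected; the install prefix is still a single argument.
arg_count=$#
install_arg=$2

# Print one argument per line. In a real script this would be: cmake "$@"
printf '%s\n' "$@"
```

Quoting `"$@"` passes every option to the command exactly as it was appended, which the unquoted string expansion cannot guarantee.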

@ -1,88 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Cuda
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3 cuda/6.5.14 nvcc-wrapper/gnu
#
# The 'nvcc-wrapper' module should load a script that matches
# kokkos/bin/nvcc_wrapper
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
#-----------------------------------------------------------------------------
# Cuda using GNU, use the nvcc_wrapper to build CUDA source
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
# Pthread explicitly OFF, otherwise tribits will automatically turn it on
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------

@ -1,84 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# C++11, OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Pthread explicitly OFF so tribits doesn't automatically activate it
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------
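
The cleanup step in these scripts runs `rm -rf` with broad globs (`CMake*`, `Trilinos*`), which is only safe because the header asks for the script to be copied outside the Trilinos source directory. Below is a minimal sketch of a guard, assuming that a top-level `CMakeLists.txt` is a good enough marker for a source tree. The function names `is_safe_build_dir` and `clean_build_dir` are hypothetical and not part of the original scripts.

```sh
#!/bin/sh
# Sketch only: refuse to clean a directory that looks like a source tree.
# Assumption: a top-level CMakeLists.txt marks a source tree; build
# directories contain CMakeCache.txt instead.
is_safe_build_dir() {
  # $1: directory to check
  [ -d "$1" ] && [ ! -f "$1/CMakeLists.txt" ]
}

clean_build_dir() {
  # $1: directory to clean
  if is_safe_build_dir "$1"; then
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$1" && rm -rf CMake* Trilinos* packages Dart* Testing \
        cmake_install.cmake Makefile* )
  else
    echo "refusing to clean '$1': looks like a source tree" >&2
    return 1
  fi
}
```

A script could call `clean_build_dir "$PWD"` before invoking `cmake`. The check does not cover every hazard: a build directory that contains the source checkout as a subdirectory named `Trilinos` would still lose it to the `Trilinos*` glob.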

@ -1,78 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# <none>
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Kokkos Pthread explicitly OFF, TPL Pthread ON for gtest
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------

@ -1,89 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Intel, OpenMP, Cuda
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 cuda/7.0.4 intel/2015.0.090 nvcc-wrapper/intel
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=nvcc_wrapper"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUDA:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_CUSPARSE:BOOL=ON"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Pthread explicitly OFF
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------
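
This configuration sets `icc` as the C compiler but passes `nvcc_wrapper` as the C++ compiler, so the host C++ compiler is chosen inside the wrapper. The `nvcc_wrapper` script removed in this same commit reads its default from `NVCC_WRAPPER_DEFAULT_COMPILER` and falls back to `g++`. The sketch below isolates that `${VAR:-default}` expansion. The helper name `pick_host_compiler` is hypothetical and used only for illustration.

```sh
#!/bin/sh
# Sketch only: how the ${VAR:-default} expansion selects the host compiler.
pick_host_compiler() {
  # Falls back to g++ when NVCC_WRAPPER_DEFAULT_COMPILER is unset or empty.
  printf '%s\n' "${NVCC_WRAPPER_DEFAULT_COMPILER:-g++}"
}

# Without the variable, the default applies.
unset NVCC_WRAPPER_DEFAULT_COMPILER
default_choice=$(pick_host_compiler)

# With the variable set, its value wins.
NVCC_WRAPPER_DEFAULT_COMPILER=icpc
override_choice=$(pick_host_compiler)
unset NVCC_WRAPPER_DEFAULT_COMPILER

echo "default:  ${default_choice}"
echo "override: ${override_choice}"
```

For an Intel build like this one, exporting `NVCC_WRAPPER_DEFAULT_COMPILER=icpc` before configuring would presumably have the same effect as passing `-ccbin icpc` on every compile line, assuming the wrapper in use honors that variable.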

@ -1,84 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Intel, OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 intel/13.SP1.1.106
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=icc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=icpc"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Pthread explicitly OFF
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------

@ -1,77 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# OpenMP
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# OpenMP
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_OpenMP:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_OpenMP:BOOL=ON"
# Pthread explicitly OFF, otherwise tribits will automatically turn it on
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=OFF"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------

@ -1,87 +0,0 @@
#!/bin/sh
#
# Copy this script, put it outside the Trilinos source directory, and
# build there.
#
#-----------------------------------------------------------------------------
# Building on 'kokkos-dev.sandia.gov' with enabled capabilities:
#
# Threads, hwloc
#
# module loaded on 'kokkos-dev.sandia.gov' for this build
#
# module load cmake/2.8.11.2 gcc/4.8.3
#
#-----------------------------------------------------------------------------
# Source and installation directories:
TRILINOS_SOURCE_DIR=${HOME}/Trilinos
TRILINOS_INSTALL_DIR=${HOME}/TrilinosInstall/`date +%F`
CMAKE_CONFIGURE=""
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_INSTALL_PREFIX=${TRILINOS_INSTALL_DIR}"
#-----------------------------------------------------------------------------
# Debug/optimized
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=DEBUG"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_BOUNDS_CHECK:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_BUILD_TYPE:STRING=RELEASE"
#-----------------------------------------------------------------------------
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_FLAGS:STRING=-Wall"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_C_COMPILER=gcc"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D CMAKE_CXX_COMPILER=g++"
#-----------------------------------------------------------------------------
# Configure for Kokkos subpackages and tests:
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_Fortran:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_EXAMPLES:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TESTS:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosCore:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosContainers:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_TpetraKernels:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_KokkosExample:BOOL=ON"
#-----------------------------------------------------------------------------
# Hardware locality configuration:
HWLOC_BASE_DIR="/home/projects/hwloc/1.7.1/host/gnu/4.7.3"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_HWLOC:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_INCLUDE_DIRS:FILEPATH=${HWLOC_BASE_DIR}/include"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D HWLOC_LIBRARY_DIRS:FILEPATH=${HWLOC_BASE_DIR}/lib"
#-----------------------------------------------------------------------------
# Pthread
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D TPL_ENABLE_Pthread:BOOL=ON"
CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_Pthread:BOOL=ON"
#-----------------------------------------------------------------------------
# C++11
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Trilinos_ENABLE_CXX11:BOOL=ON"
# CMAKE_CONFIGURE="${CMAKE_CONFIGURE} -D Kokkos_ENABLE_CXX11:BOOL=ON"
#-----------------------------------------------------------------------------
#
# Remove CMake output files to force reconfigure from scratch.
#
rm -rf CMake* Trilinos* packages Dart* Testing cmake_install.cmake Makefile*
#
echo cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
cmake ${CMAKE_CONFIGURE} ${TRILINOS_SOURCE_DIR}
#-----------------------------------------------------------------------------

@ -1,340 +0,0 @@
#!/bin/bash
#
# This shell script (nvcc_wrapper) wraps both the host compiler and
# NVCC, if you are building legacy C or C++ code with CUDA enabled.
# The script remedies some differences between the interface of NVCC
# and that of the host compiler, in particular for linking.
# It also means that a legacy code doesn't need separate .cu files;
# it can just use .cpp files.
#
# Default settings: change these according to your machine. For
# example, you may have two different wrappers with either icpc
# or g++ as their back-end compiler. The defaults can be overridden
# by using the usual arguments (e.g., -arch=sm_30 -ccbin icpc).
default_arch="sm_35"
#default_arch="sm_50"
#
# The default C++ compiler.
#
host_compiler=${NVCC_WRAPPER_DEFAULT_COMPILER:-"g++"}
#host_compiler="icpc"
#host_compiler="/usr/local/gcc/4.8.3/bin/g++"
#host_compiler="/usr/local/gcc/4.9.1/bin/g++"
#
# Internal variables
#
# C++ files
cpp_files=""
# Host compiler arguments
xcompiler_args=""
# Cuda (NVCC) only arguments
cuda_args=""
# Arguments for both NVCC and Host compiler
shared_args=""
# Argument -c
compile_arg=""
# Argument -o <obj>
output_arg=""
# Linker arguments
xlinker_args=""
# Object files passable to NVCC
object_files=""
# Link objects for the host linker only
object_files_xlinker=""
# Shared libraries with version numbers are not handled correctly by NVCC
shared_versioned_libraries_host=""
shared_versioned_libraries=""
# Whether the user set the architecture
arch_set=0
# Whether the user overrode the host compiler
ccbin_set=0
# Error code of compilation
error_code=0
# Do a dry run without actually compiling
dry_run=0
# Skip NVCC compilation and use host compiler directly
host_only=0
host_only_args=""
# Enable workaround for CUDA 6.5 for pragma ident
replace_pragma_ident=0
# Mark first host compiler argument
first_xcompiler_arg=1
temp_dir=${TMPDIR:-/tmp}
# Check if we have an optimization argument already
optimization_applied=0
# Check if we have -std=c++X or --std=c++X already
stdcxx_applied=0
# Run nvcc a second time to generate dependencies if needed
depfile_separate=0
depfile_output_arg=""
depfile_target_arg=""
#echo "Arguments: $# $@"
while [ $# -gt 0 ]
do
case $1 in
#show the executed command
--show|--nvcc-wrapper-show)
dry_run=1
;;
#run host compilation only
--host-only)
host_only=1
;;
#replace '#pragma ident' with '#ident'; this is needed to compile OpenMPI because of a configure script bug and the non-standardized behaviour of pragma with macros
--replace-pragma-ident)
replace_pragma_ident=1
;;
#handle source files to be compiled as cuda files
*.cpp|*.cxx|*.cc|*.C|*.c++|*.cu)
cpp_files="$cpp_files $1"
;;
# Ensure we only have one optimization flag because NVCC doesn't allow multiple
-O*)
if [ $optimization_applied -eq 1 ]; then
echo "nvcc_wrapper - *warning* you have set multiple optimization flags (-O*), only the first is used because nvcc can only accept a single optimization setting."
else
shared_args="$shared_args $1"
optimization_applied=1
fi
;;
#Handle shared args (valid for both nvcc and the host compiler)
-D*|-I*|-L*|-l*|-g|--help|--version|-E|-M|-shared)
shared_args="$shared_args $1"
;;
#Handle compilation argument
-c)
compile_arg="$1"
;;
#Handle output argument
-o)
output_arg="$output_arg $1 $2"
shift
;;
# Handle depfile arguments. We map them to a separate call to nvcc.
-MD|-MMD)
depfile_separate=1
host_only_args="$host_only_args $1"
;;
-MF)
depfile_output_arg="-o $2"
host_only_args="$host_only_args $1 $2"
shift
;;
-MT)
depfile_target_arg="$1 $2"
host_only_args="$host_only_args $1 $2"
shift
;;
#Handle known nvcc args
-gencode*|--dryrun|--verbose|--keep|--keep-dir*|-G|--relocatable-device-code*|-lineinfo|-expt-extended-lambda|--resource-usage|-Xptxas*)
cuda_args="$cuda_args $1"
;;
#Handle more known nvcc args
--expt-extended-lambda|--expt-relaxed-constexpr)
cuda_args="$cuda_args $1"
;;
#Handle known nvcc args that have an argument
-rdc|-maxrregcount|--default-stream)
cuda_args="$cuda_args $1 $2"
shift
;;
#Handle the C++ standard flag (c++11, c++14, c++1z)
--std=c++11|-std=c++11|--std=c++14|-std=c++14|--std=c++1z|-std=c++1z)
if [ $stdcxx_applied -eq 1 ]; then
echo "nvcc_wrapper - *warning* you have set multiple standard flags (-std=c++1* or --std=c++1*), only the first is used because nvcc can only accept a single std setting"
else
shared_args="$shared_args $1"
stdcxx_applied=1
fi
;;
#strip off -std=c++98 because of nvcc warnings; Tribits will place both -std=c++11 and -std=c++98
-std=c++98|--std=c++98)
;;
#strip off pedantic because it produces endless warnings about #LINE added by the preprocessor
-pedantic|-Wpedantic|-ansi)
;;
#strip off -Woverloaded-virtual to avoid "cc1: warning: command line option -Woverloaded-virtual is valid for C++/ObjC++ but not for C"
-Woverloaded-virtual)
;;
#strip -Xcompiler because we add it
-Xcompiler)
if [ $first_xcompiler_arg -eq 1 ]; then
xcompiler_args="$2"
first_xcompiler_arg=0
else
xcompiler_args="$xcompiler_args,$2"
fi
shift
;;
#strip off "-x cu" because we add that
-x)
if [[ $2 != "cu" ]]; then
if [ $first_xcompiler_arg -eq 1 ]; then
xcompiler_args="-x,$2"
first_xcompiler_arg=0
else
xcompiler_args="$xcompiler_args,-x,$2"
fi
fi
shift
;;
#Handle -ccbin (if it's not set we can set it to a default value)
-ccbin)
cuda_args="$cuda_args $1 $2"
ccbin_set=1
host_compiler=$2
shift
;;
#Handle -arch argument (if it's not set use a default)
-arch*)
cuda_args="$cuda_args $1"
arch_set=1
;;
#Handle -Xcudafe argument
-Xcudafe)
cuda_args="$cuda_args -Xcudafe $2"
shift
;;
#Handle args that should be sent to the linker
-Wl*)
xlinker_args="$xlinker_args -Xlinker ${1:4:${#1}}"
host_linker_args="$host_linker_args ${1:4:${#1}}"
;;
#Handle object files: -x cu applies to all input files, so give them to linker, except if only linking
*.a|*.so|*.o|*.obj)
object_files="$object_files $1"
object_files_xlinker="$object_files_xlinker -Xlinker $1"
;;
#Handle object files which always need to use "-Xlinker": -x cu applies to all input files, so give them to linker, except if only linking
@*|*.dylib)
object_files="$object_files -Xlinker $1"
object_files_xlinker="$object_files_xlinker -Xlinker $1"
;;
#Handle shared libraries with *.so.* names, which nvcc can't handle.
*.so.*)
shared_versioned_libraries_host="$shared_versioned_libraries_host $1"
shared_versioned_libraries="$shared_versioned_libraries -Xlinker $1"
;;
#All other args are sent to the host compiler
*)
if [ $first_xcompiler_arg -eq 1 ]; then
xcompiler_args=$1
first_xcompiler_arg=0
else
xcompiler_args="$xcompiler_args,$1"
fi
;;
esac
shift
done
#Add default host compiler if necessary
if [ $ccbin_set -ne 1 ]; then
cuda_args="$cuda_args -ccbin $host_compiler"
fi
#Add architecture command
if [ $arch_set -ne 1 ]; then
cuda_args="$cuda_args -arch=$default_arch"
fi
#Compose compilation command
nvcc_command="nvcc $cuda_args $shared_args $xlinker_args $shared_versioned_libraries"
if [ $first_xcompiler_arg -eq 0 ]; then
nvcc_command="$nvcc_command -Xcompiler $xcompiler_args"
fi
#Compose host only command
host_command="$host_compiler $shared_args $host_only_args $compile_arg $output_arg $xcompiler_args $host_linker_args $shared_versioned_libraries_host"
#nvcc does not accept '#pragma ident SOME_MACRO_STRING' but it does accept '#ident SOME_MACRO_STRING'
if [ $replace_pragma_ident -eq 1 ]; then
cpp_files2=""
for file in $cpp_files
do
var=`grep pragma ${file} | grep ident | grep "#"`
if [ "${#var}" -gt 0 ]
then
sed 's/#[\ \t]*pragma[\ \t]*ident/#ident/g' $file > $temp_dir/nvcc_wrapper_tmp_$file
cpp_files2="$cpp_files2 $temp_dir/nvcc_wrapper_tmp_$file"
else
cpp_files2="$cpp_files2 $file"
fi
done
cpp_files=$cpp_files2
#echo $cpp_files
fi
if [ "$cpp_files" ]; then
nvcc_command="$nvcc_command $object_files_xlinker -x cu $cpp_files"
else
nvcc_command="$nvcc_command $object_files"
fi
if [ "$cpp_files" ]; then
host_command="$host_command $object_files $cpp_files"
else
host_command="$host_command $object_files"
fi
if [ $depfile_separate -eq 1 ]; then
# run nvcc a second time to generate dependencies (without compiling)
nvcc_depfile_command="$nvcc_command -M $depfile_target_arg $depfile_output_arg"
else
nvcc_depfile_command=""
fi
nvcc_command="$nvcc_command $compile_arg $output_arg"
#Print command for dryrun
if [ $dry_run -eq 1 ]; then
if [ $host_only -eq 1 ]; then
echo $host_command
elif [ -n "$nvcc_depfile_command" ]; then
echo $nvcc_command "&&" $nvcc_depfile_command
else
echo $nvcc_command
fi
exit 0
fi
#Run compilation command
if [ $host_only -eq 1 ]; then
$host_command
elif [ -n "$nvcc_depfile_command" ]; then
$nvcc_command && $nvcc_depfile_command
else
$nvcc_command
fi
error_code=$?
#Report error code
exit $error_code

View File

@ -14,25 +14,52 @@ PROCESSOR=`uname -p`
if [[ "$HOSTNAME" =~ (white|ride).* ]]; then if [[ "$HOSTNAME" =~ (white|ride).* ]]; then
MACHINE=white MACHINE=white
elif [[ "$HOSTNAME" =~ .*bowman.* ]]; then module load git
fi
if [[ "$HOSTNAME" =~ .*bowman.* ]]; then
MACHINE=bowman MACHINE=bowman
elif [[ "$HOSTNAME" =~ n.* ]]; then # Warning: very generic name module load git
fi
if [[ "$HOSTNAME" =~ n.* ]]; then # Warning: very generic name
if [[ "$PROCESSOR" = "aarch64" ]]; then if [[ "$PROCESSOR" = "aarch64" ]]; then
MACHINE=sullivan MACHINE=sullivan
module load git
fi fi
elif [[ "$HOSTNAME" =~ node.* ]]; then # Warning: very generic name fi
if [[ "$HOSTNAME" =~ node.* ]]; then # Warning: very generic name
if [[ "$MACHINE" = "" ]]; then
MACHINE=shepard MACHINE=shepard
elif [[ "$HOSTNAME" =~ apollo ]]; then module load git
fi
fi
if [[ "$HOSTNAME" =~ apollo ]]; then
MACHINE=apollo MACHINE=apollo
elif [[ "$HOSTNAME" =~ sullivan ]]; then module load git
fi
if [[ "$HOSTNAME" =~ sullivan ]]; then
MACHINE=sullivan MACHINE=sullivan
elif [ ! -z "$SEMS_MODULEFILES_ROOT" ]; then module load git
MACHINE=sems fi
else
if [ ! -z "$SEMS_MODULEFILES_ROOT" ]; then
if [[ "$MACHINE" = "" ]]; then
MACHINE=sems
module load sems-git
fi
fi
if [[ "$MACHINE" = "" ]]; then
echo "Unrecognized machine" >&2 echo "Unrecognized machine" >&2
exit 1 exit 1
fi fi
echo "Running on machine: $MACHINE"
GCC_BUILD_LIST="OpenMP,Pthread,Serial,OpenMP_Serial,Pthread_Serial" GCC_BUILD_LIST="OpenMP,Pthread,Serial,OpenMP_Serial,Pthread_Serial"
IBM_BUILD_LIST="OpenMP,Serial,OpenMP_Serial" IBM_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
ARM_GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial" ARM_GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
@ -45,7 +72,8 @@ GCC_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits
IBM_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized" IBM_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
CLANG_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized" CLANG_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
INTEL_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized" INTEL_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized" #CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Werror,-Wsign-compare,-Wtype-limits,-Wuninitialized"
CUDA_WARNING_FLAGS="-Wall,-Wshadow,-pedantic,-Wsign-compare,-Wtype-limits,-Wuninitialized"
PGI_WARNING_FLAGS="" PGI_WARNING_FLAGS=""
# Default. Machine specific can override. # Default. Machine specific can override.
@ -142,6 +170,18 @@ else
KOKKOS_PATH=$( cd $KOKKOS_PATH && pwd ) KOKKOS_PATH=$( cd $KOKKOS_PATH && pwd )
fi fi
UNCOMMITTED=`cd ${KOKKOS_PATH}; git status --porcelain 2>/dev/null`
if ! [ -z "$UNCOMMITTED" ]; then
echo "WARNING!! THE FOLLOWING CHANGES ARE UNCOMMITTED!! :"
echo "$UNCOMMITTED"
echo ""
fi
GITSTATUS=`cd ${KOKKOS_PATH}; git log -n 1 --format=oneline`
echo "Repository Status: " ${GITSTATUS}
echo ""
echo ""
# #
# Machine specific config. # Machine specific config.
# #
@ -149,7 +189,7 @@ fi
if [ "$MACHINE" = "sems" ]; then if [ "$MACHINE" = "sems" ]; then
source /projects/sems/modulefiles/utils/sems-modules-init.sh source /projects/sems/modulefiles/utils/sems-modules-init.sh
BASE_MODULE_LIST="sems-env,kokkos-env,sems-<COMPILER_NAME>/<COMPILER_VERSION>,kokkos-hwloc/1.10.1/base" BASE_MODULE_LIST="sems-env,kokkos-env,kokkos-hwloc/1.10.1/base,sems-<COMPILER_NAME>/<COMPILER_VERSION>"
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base" CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base" CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
@ -178,9 +218,9 @@ if [ "$MACHINE" = "sems" ]; then
"clang/3.7.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS" "clang/3.7.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
"clang/3.8.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS" "clang/3.8.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
"clang/3.9.0 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS" "clang/3.9.0 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
"cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/8.0.44 $CUDA8_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/config/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/8.0.44 $CUDA8_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
) )
fi fi
elif [ "$MACHINE" = "white" ]; then elif [ "$MACHINE" = "white" ]; then
@ -191,14 +231,14 @@ elif [ "$MACHINE" = "white" ]; then
BASE_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>" BASE_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>"
IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>" IBM_MODULE_LIST="<COMPILER_NAME>/xl/<COMPILER_VERSION>"
CUDA_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/5.4.0" CUDA_MODULE_LIST="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/5.4.0"
CUDA_MODULE_LIST2="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/6.3.0,ibm/xl/13.1.6-BETA" CUDA_MODULE_LIST2="<COMPILER_NAME>/<COMPILER_VERSION>,gcc/6.3.0,ibm/xl/13.1.6"
# Don't do pthread on white. # Don't do pthread on white.
GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial" GCC_BUILD_LIST="OpenMP,Serial,OpenMP_Serial"
# Format: (compiler module-list build-list exe-name warning-flag) # Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("gcc/5.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS" COMPILERS=("gcc/5.4.0 $BASE_MODULE_LIST $IBM_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"ibm/13.1.3 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS" "ibm/13.1.6 $IBM_MODULE_LIST $IBM_BUILD_LIST xlC $IBM_WARNING_FLAGS"
"cuda/8.0.44 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/8.0.44 $CUDA_MODULE_LIST $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/9.0.103 $CUDA_MODULE_LIST2 $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/9.0.103 $CUDA_MODULE_LIST2 $CUDA_IBM_BUILD_LIST ${KOKKOS_PATH}/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
) )
@ -281,7 +321,7 @@ elif [ "$MACHINE" = "apollo" ]; then
CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base" CUDA_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/4.8.4,kokkos-hwloc/1.10.1/base"
CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base" CUDA8_MODULE_LIST="sems-env,kokkos-env,kokkos-<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0,kokkos-hwloc/1.10.1/base"
CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/8.0.44" CLANG_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,cuda/9.0.69"
NVCC_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0" NVCC_MODULE_LIST="sems-env,kokkos-env,sems-git,sems-cmake/3.5.2,<COMPILER_NAME>/<COMPILER_VERSION>,sems-gcc/5.3.0"
BUILD_LIST_CUDA_NVCC="Cuda_Serial,Cuda_OpenMP" BUILD_LIST_CUDA_NVCC="Cuda_Serial,Cuda_OpenMP"
@ -294,13 +334,13 @@ elif [ "$MACHINE" = "apollo" ]; then
"gcc/5.1.0 $BASE_MODULE_LIST "Serial" g++ $GCC_WARNING_FLAGS" "gcc/5.1.0 $BASE_MODULE_LIST "Serial" g++ $GCC_WARNING_FLAGS"
"intel/16.0.1 $BASE_MODULE_LIST "OpenMP" icpc $INTEL_WARNING_FLAGS" "intel/16.0.1 $BASE_MODULE_LIST "OpenMP" icpc $INTEL_WARNING_FLAGS"
"clang/3.9.0 $BASE_MODULE_LIST "Pthread_Serial" clang++ $CLANG_WARNING_FLAGS" "clang/3.9.0 $BASE_MODULE_LIST "Pthread_Serial" clang++ $CLANG_WARNING_FLAGS"
"clang/4.0.0 $CLANG_MODULE_LIST "Cuda_Pthread" clang++ $CUDA_WARNING_FLAGS" "clang/6.0 $CLANG_MODULE_LIST "Cuda_Pthread" clang++ $CUDA_WARNING_FLAGS"
"cuda/8.0.44 $CUDA_MODULE_LIST "Cuda_OpenMP" $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS" "cuda/9.1 $CUDA_MODULE_LIST "Cuda_OpenMP" $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
) )
else else
# Format: (compiler module-list build-list exe-name warning-flag) # Format: (compiler module-list build-list exe-name warning-flag)
COMPILERS=("cuda/8.0.44 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS" COMPILERS=("cuda/9.1 $CUDA8_MODULE_LIST $BUILD_LIST_CUDA_NVCC $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"clang/4.0.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS" "clang/6.0 $CLANG_MODULE_LIST $BUILD_LIST_CUDA_CLANG clang++ $CUDA_WARNING_FLAGS"
"clang/3.9.0 $CLANG_MODULE_LIST $BUILD_LIST_CLANG clang++ $CLANG_WARNING_FLAGS" "clang/3.9.0 $CLANG_MODULE_LIST $BUILD_LIST_CLANG clang++ $CLANG_WARNING_FLAGS"
"gcc/4.8.4 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS" "gcc/4.8.4 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS"
"gcc/4.9.3 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS" "gcc/4.9.3 $BASE_MODULE_LIST $GCC_BUILD_LIST g++ $GCC_WARNING_FLAGS"
@ -311,13 +351,11 @@ elif [ "$MACHINE" = "apollo" ]; then
"intel/17.0.1 $BASE_MODULE_LIST $INTEL_BUILD_LIST icpc $INTEL_WARNING_FLAGS" "intel/17.0.1 $BASE_MODULE_LIST $INTEL_BUILD_LIST icpc $INTEL_WARNING_FLAGS"
"clang/3.5.2 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS" "clang/3.5.2 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
"clang/3.6.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS" "clang/3.6.1 $BASE_MODULE_LIST $CLANG_BUILD_LIST clang++ $CLANG_WARNING_FLAGS"
"cuda/7.0.28 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
"cuda/7.5.18 $CUDA_MODULE_LIST $CUDA_BUILD_LIST $KOKKOS_PATH/bin/nvcc_wrapper $CUDA_WARNING_FLAGS"
) )
fi fi
if [ -z "$ARCH_FLAG" ]; then if [ -z "$ARCH_FLAG" ]; then
ARCH_FLAG="--arch=SNB,Kepler35" ARCH_FLAG="--arch=SNB,Volta70"
fi fi
NUM_JOBS_TO_RUN_IN_PARALLEL=2 NUM_JOBS_TO_RUN_IN_PARALLEL=2
@ -700,17 +738,19 @@ wait_summarize_and_exit() {
echo $passed_test $(cat $PASSED_DIR/$passed_test) echo $passed_test $(cat $PASSED_DIR/$passed_test)
done done
echo "#######################################################"
echo "FAILED TESTS"
echo "#######################################################"
local failed_test
local -i rv=0 local -i rv=0
for failed_test in $(\ls -1 $FAILED_DIR | sort) if [ "$(ls -A $FAILED_DIR)" ]; then
do echo "#######################################################"
echo $failed_test "("$(cat $FAILED_DIR/$failed_test)" failed)" echo "FAILED TESTS"
rv=$rv+1 echo "#######################################################"
done
local failed_test
for failed_test in $(\ls -1 $FAILED_DIR | sort)
do
echo $failed_test "("$(cat $FAILED_DIR/$failed_test)" failed)"
rv=$rv+1
done
fi
exit $rv exit $rv
} }

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -64,8 +64,8 @@ struct InitViewFunctor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_inview(i,j,k) = i/2 -j*j + k/3; _inview(i,j,k) = i/2 -j*j + k/3;
} }
} }
@ -84,8 +84,8 @@ struct InitViewFunctor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_outview(i) += _inview(i,j,k) ; _outview(i) += _inview(i,j,k) ;
} }
} }
@ -104,8 +104,8 @@ struct InitStrideViewFunctor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_inview(i,j,k) = i/2 -j*j + k/3; _inview(i,j,k) = i/2 -j*j + k/3;
} }
} }
@ -123,8 +123,8 @@ struct InitViewRank7Functor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_inview(i,j,k,0,0,0,0) = i/2 -j*j + k/3; _inview(i,j,k,0,0,0,0) = i/2 -j*j + k/3;
} }
} }
@ -143,8 +143,8 @@ struct InitDynRankViewFunctor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_inview(i,j,k) = i/2 -j*j + k/3; _inview(i,j,k) = i/2 -j*j + k/3;
} }
} }
@ -163,8 +163,8 @@ struct InitDynRankViewFunctor {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()(const int i) const { void operator()(const int i) const {
for (unsigned j = 0; j < _inview.dimension(1); ++j) { for (unsigned j = 0; j < _inview.extent(1); ++j) {
for (unsigned k = 0; k < _inview.dimension(2); ++k) { for (unsigned k = 0; k < _inview.extent(2); ++k) {
_outview(i) += _inview(i,j,k) ; _outview(i) += _inview(i,j,k) ;
} }
} }

View File

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -76,7 +76,7 @@ struct generate_ids
generate_ids( local_id_view & ids) generate_ids( local_id_view & ids)
: local_2_global(ids) : local_2_global(ids)
{ {
Kokkos::parallel_for(local_2_global.dimension_0(), *this); Kokkos::parallel_for(local_2_global.extent(0), *this);
} }
@ -116,7 +116,7 @@ struct fill_map
fill_map( global_id_view gIds, local_id_view lIds) fill_map( global_id_view gIds, local_id_view lIds)
: global_2_local(gIds) , local_2_global(lIds) : global_2_local(gIds) , local_2_global(lIds)
{ {
Kokkos::parallel_for(local_2_global.dimension_0(), *this); Kokkos::parallel_for(local_2_global.extent(0), *this);
} }
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
@ -143,7 +143,7 @@ struct find_test
find_test( global_id_view gIds, local_id_view lIds, value_type & num_errors) find_test( global_id_view gIds, local_id_view lIds, value_type & num_errors)
: global_2_local(gIds) , local_2_global(lIds) : global_2_local(gIds) , local_2_global(lIds)
{ {
Kokkos::parallel_reduce(local_2_global.dimension_0(), *this, num_errors); Kokkos::parallel_reduce(local_2_global.extent(0), *this, num_errors);
} }
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER

View File

@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -147,7 +147,7 @@ public:
if (m_last_block_mask) { if (m_last_block_mask) {
//clear the unused bits in the last block //clear the unused bits in the last block
typedef Kokkos::Impl::DeepCopy< typename execution_space::memory_space, Kokkos::HostSpace > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename execution_space::memory_space, Kokkos::HostSpace > raw_deep_copy;
raw_deep_copy( m_blocks.ptr_on_device() + (m_blocks.dimension_0() -1u), &m_last_block_mask, sizeof(unsigned)); raw_deep_copy( m_blocks.data() + (m_blocks.extent(0) -1u), &m_last_block_mask, sizeof(unsigned));
} }
} }
@ -212,7 +212,7 @@ public:
KOKKOS_FORCEINLINE_FUNCTION KOKKOS_FORCEINLINE_FUNCTION
unsigned max_hint() const unsigned max_hint() const
{ {
return m_blocks.dimension_0(); return m_blocks.extent(0);
} }
/// find a bit set to 1 near the hint /// find a bit set to 1 near the hint
@ -221,10 +221,10 @@ public:
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
Kokkos::pair<bool, unsigned> find_any_set_near( unsigned hint , unsigned scan_direction = BIT_SCAN_FORWARD_MOVE_HINT_FORWARD ) const Kokkos::pair<bool, unsigned> find_any_set_near( unsigned hint , unsigned scan_direction = BIT_SCAN_FORWARD_MOVE_HINT_FORWARD ) const
{ {
const unsigned block_idx = (hint >> block_shift) < m_blocks.dimension_0() ? (hint >> block_shift) : 0; const unsigned block_idx = (hint >> block_shift) < m_blocks.extent(0) ? (hint >> block_shift) : 0;
const unsigned offset = hint & block_mask; const unsigned offset = hint & block_mask;
unsigned block = volatile_load(&m_blocks[ block_idx ]); unsigned block = volatile_load(&m_blocks[ block_idx ]);
block = !m_last_block_mask || (block_idx < (m_blocks.dimension_0()-1)) ? block : block & m_last_block_mask ; block = !m_last_block_mask || (block_idx < (m_blocks.extent(0)-1)) ? block : block & m_last_block_mask ;
return find_any_helper(block_idx, offset, block, scan_direction); return find_any_helper(block_idx, offset, block, scan_direction);
} }
@ -238,7 +238,7 @@ public:
const unsigned block_idx = hint >> block_shift; const unsigned block_idx = hint >> block_shift;
const unsigned offset = hint & block_mask; const unsigned offset = hint & block_mask;
unsigned block = volatile_load(&m_blocks[ block_idx ]); unsigned block = volatile_load(&m_blocks[ block_idx ]);
block = !m_last_block_mask || (block_idx < (m_blocks.dimension_0()-1) ) ? ~block : ~block & m_last_block_mask ; block = !m_last_block_mask || (block_idx < (m_blocks.extent(0)-1) ) ? ~block : ~block & m_last_block_mask ;
return find_any_helper(block_idx, offset, block, scan_direction); return find_any_helper(block_idx, offset, block, scan_direction);
} }
@ -281,8 +281,8 @@ private:
unsigned update_hint( long long block_idx, unsigned offset, unsigned scan_direction ) const unsigned update_hint( long long block_idx, unsigned offset, unsigned scan_direction ) const
{ {
block_idx += scan_direction & MOVE_HINT_BACKWARD ? -1 : 1; block_idx += scan_direction & MOVE_HINT_BACKWARD ? -1 : 1;
block_idx = block_idx >= 0 ? block_idx : m_blocks.dimension_0() - 1; block_idx = block_idx >= 0 ? block_idx : m_blocks.extent(0) - 1;
block_idx = block_idx < static_cast<long long>(m_blocks.dimension_0()) ? block_idx : 0; block_idx = block_idx < static_cast<long long>(m_blocks.extent(0)) ? block_idx : 0;
return static_cast<unsigned>(block_idx)*block_size + offset; return static_cast<unsigned>(block_idx)*block_size + offset;
} }
@ -407,7 +407,7 @@ void deep_copy( Bitset<DstDevice> & dst, Bitset<SrcDevice> const& src)
} }
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0()); raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
} }
template <typename DstDevice, typename SrcDevice> template <typename DstDevice, typename SrcDevice>
@ -418,7 +418,7 @@ void deep_copy( Bitset<DstDevice> & dst, ConstBitset<SrcDevice> const& src)
} }
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0()); raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
} }
template <typename DstDevice, typename SrcDevice> template <typename DstDevice, typename SrcDevice>
@ -429,7 +429,7 @@ void deep_copy( ConstBitset<DstDevice> & dst, ConstBitset<SrcDevice> const& src)
} }
typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename DstDevice::memory_space, typename SrcDevice::memory_space > raw_deep_copy;
raw_deep_copy(dst.m_blocks.ptr_on_device(), src.m_blocks.ptr_on_device(), sizeof(unsigned)*src.m_blocks.dimension_0()); raw_deep_copy(dst.m_blocks.data(), src.m_blocks.data(), sizeof(unsigned)*src.m_blocks.extent(0));
} }
} // namespace Kokkos } // namespace Kokkos
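
The three `deep_copy` hunks above replace `ptr_on_device()` with `data()` and `dimension_0()` with `extent(0)`, while the copy itself is still sized as `sizeof(unsigned) * extent(0)` bytes. A minimal sketch of that sizing pattern follows. `MockBlocks` and `copy_blocks` are hypothetical stand-ins written for illustration, not Kokkos types, and the size check and plain `std::memcpy` are assumptions of this sketch rather than what `Kokkos::Impl::DeepCopy` does across memory spaces.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the bitset's block storage (not a Kokkos type).
// It exposes the two accessors the diff migrates to: data() and extent(0).
struct MockBlocks {
  std::vector<unsigned> storage;

  unsigned* data() { return storage.data(); }
  const unsigned* data() const { return storage.data(); }

  // Rank-1 storage: extent(0) is the block count, any other rank reports 1
  // (an assumption of this sketch).
  std::size_t extent(int r) const { return r == 0 ? storage.size() : 1; }
};

// Copies src's blocks into dst as raw bytes, sized from extent(0).
// Returns false (and copies nothing) when the block counts differ.
inline bool copy_blocks(MockBlocks& dst, const MockBlocks& src) {
  if (dst.extent(0) != src.extent(0)) return false;
  if (src.extent(0) == 0) return true;  // nothing to copy
  std::memcpy(dst.data(), src.data(), sizeof(unsigned) * src.extent(0));
  return true;
}
```

Calling `copy_blocks(b, a)` on two `MockBlocks` of equal length leaves `b.storage` element-wise equal to `a.storage`; with different lengths it returns `false`.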


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -262,14 +262,14 @@ public:
modified_host (View<unsigned int,LayoutLeft,typename t_host::execution_space> ("DualView::modified_host")) modified_host (View<unsigned int,LayoutLeft,typename t_host::execution_space> ("DualView::modified_host"))
{ {
if ( int(d_view.rank) != int(h_view.rank) || if ( int(d_view.rank) != int(h_view.rank) ||
d_view.dimension_0() != h_view.dimension_0() || d_view.extent(0) != h_view.extent(0) ||
d_view.dimension_1() != h_view.dimension_1() || d_view.extent(1) != h_view.extent(1) ||
d_view.dimension_2() != h_view.dimension_2() || d_view.extent(2) != h_view.extent(2) ||
d_view.dimension_3() != h_view.dimension_3() || d_view.extent(3) != h_view.extent(3) ||
d_view.dimension_4() != h_view.dimension_4() || d_view.extent(4) != h_view.extent(4) ||
d_view.dimension_5() != h_view.dimension_5() || d_view.extent(5) != h_view.extent(5) ||
d_view.dimension_6() != h_view.dimension_6() || d_view.extent(6) != h_view.extent(6) ||
d_view.dimension_7() != h_view.dimension_7() || d_view.extent(7) != h_view.extent(7) ||
d_view.stride_0() != h_view.stride_0() || d_view.stride_0() != h_view.stride_0() ||
d_view.stride_1() != h_view.stride_1() || d_view.stride_1() != h_view.stride_1() ||
d_view.stride_2() != h_view.stride_2() || d_view.stride_2() != h_view.stride_2() ||
@ -503,6 +503,18 @@ public:
/* Realloc on Device */ /* Realloc on Device */
::Kokkos::realloc(d_view,n0,n1,n2,n3,n4,n5,n6,n7); ::Kokkos::realloc(d_view,n0,n1,n2,n3,n4,n5,n6,n7);
const bool sizeMismatch = ( h_view.extent(0) != n0 ) ||
( h_view.extent(1) != n1 ) ||
( h_view.extent(2) != n2 ) ||
( h_view.extent(3) != n3 ) ||
( h_view.extent(4) != n4 ) ||
( h_view.extent(5) != n5 ) ||
( h_view.extent(6) != n6 ) ||
( h_view.extent(7) != n7 );
if ( sizeMismatch )
::Kokkos::resize(h_view,n0,n1,n2,n3,n4,n5,n6,n7);
t_host temp_view = create_mirror_view( d_view ); t_host temp_view = create_mirror_view( d_view );
/* Remap on Host */ /* Remap on Host */
@ -510,6 +522,8 @@ public:
h_view = temp_view; h_view = temp_view;
d_view = create_mirror_view( typename t_dev::execution_space(), h_view );
/* Mark Host copy as modified */ /* Mark Host copy as modified */
modified_host() = modified_host()+1; modified_host() = modified_host()+1;
} }
@ -530,22 +544,34 @@ public:
d_view.stride(stride_); d_view.stride(stride_);
} }
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , size_t >::type
extent( const iType & r ) const
{ return d_view.extent(r); }
template< typename iType >
KOKKOS_INLINE_FUNCTION constexpr
typename std::enable_if< std::is_integral<iType>::value , int >::type
extent_int( const iType & r ) const
{ return static_cast<int>(d_view.extent(r)); }
/* \brief return size of dimension 0 */ /* \brief return size of dimension 0 */
size_t dimension_0() const {return d_view.dimension_0();} size_t dimension_0() const {return d_view.extent(0);}
/* \brief return size of dimension 1 */ /* \brief return size of dimension 1 */
size_t dimension_1() const {return d_view.dimension_1();} size_t dimension_1() const {return d_view.extent(1);}
/* \brief return size of dimension 2 */ /* \brief return size of dimension 2 */
size_t dimension_2() const {return d_view.dimension_2();} size_t dimension_2() const {return d_view.extent(2);}
/* \brief return size of dimension 3 */ /* \brief return size of dimension 3 */
size_t dimension_3() const {return d_view.dimension_3();} size_t dimension_3() const {return d_view.extent(3);}
/* \brief return size of dimension 4 */ /* \brief return size of dimension 4 */
size_t dimension_4() const {return d_view.dimension_4();} size_t dimension_4() const {return d_view.extent(4);}
/* \brief return size of dimension 5 */ /* \brief return size of dimension 5 */
size_t dimension_5() const {return d_view.dimension_5();} size_t dimension_5() const {return d_view.extent(5);}
/* \brief return size of dimension 6 */ /* \brief return size of dimension 6 */
size_t dimension_6() const {return d_view.dimension_6();} size_t dimension_6() const {return d_view.extent(6);}
/* \brief return size of dimension 7 */ /* \brief return size of dimension 7 */
size_t dimension_7() const {return d_view.dimension_7();} size_t dimension_7() const {return d_view.extent(7);}
//@} //@}
}; };
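
The hunk above adds `extent(r)` and `extent_int(r)` to `DualView`, both guarded by `std::enable_if< std::is_integral<iType>::value, ... >`, and reimplements the old `dimension_N()` accessors on top of `extent(N)`. The same pattern can be sketched without Kokkos. `MockView` below is a hypothetical stand-in, not the Kokkos class, and its choice to report `1` for ranks past the end is an assumption of this sketch.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a fixed-rank view (not the Kokkos API).
template <std::size_t Rank>
class MockView {
public:
  explicit MockView(const std::array<std::size_t, Rank>& dims) : m_dims(dims) {}

  // Only participates in overload resolution for integral index types,
  // mirroring the enable_if guard in the diff.
  template <typename iType>
  typename std::enable_if<std::is_integral<iType>::value, std::size_t>::type
  extent(const iType& r) const {
    const std::size_t idx = static_cast<std::size_t>(r);
    // Ranks beyond Rank report an extent of 1 (assumption of this sketch).
    return idx < Rank ? m_dims[idx] : 1;
  }

  // Same query, narrowed to int for callers that index with signed loops.
  template <typename iType>
  typename std::enable_if<std::is_integral<iType>::value, int>::type
  extent_int(const iType& r) const {
    return static_cast<int>(extent(r));
  }

  // Legacy-style accessor kept as a thin wrapper, as the diff does
  // for dimension_0() through dimension_7().
  std::size_t dimension_0() const { return extent(0); }

private:
  std::array<std::size_t, Rank> m_dims;
};
```

With a `MockView<2>` built from `{3, 5}`, `extent(0)` is `3`, `extent_int(1)` is `5` as an `int`, and `dimension_0()` agrees with `extent(0)`. Passing a non-integral argument such as `extent(1.5)` fails to compile because the guard removes the overload.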


@ -35,16 +35,16 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
*/ */
/// \file Kokkos_DynRankView.hpp /// \file Kokkos_DynRankView.hpp
/// \brief Declaration and definition of Kokkos::Experimental::DynRankView. /// \brief Declaration and definition of Kokkos::DynRankView.
/// ///
/// This header file declares and defines Kokkos::Experimental::DynRankView and its /// This header file declares and defines Kokkos::DynRankView and its
/// related nonmember functions. /// related nonmember functions.
#ifndef KOKKOS_DYNRANKVIEW_HPP #ifndef KOKKOS_DYNRANKVIEW_HPP
@ -55,7 +55,6 @@
#include <type_traits> #include <type_traits>
namespace Kokkos { namespace Kokkos {
namespace Experimental {
template< typename DataType , class ... Properties > template< typename DataType , class ... Properties >
class DynRankView; //forward declare class DynRankView; //forward declare
@ -156,7 +155,7 @@ struct DynRankDimTraits {
// Extra overload to match that for specialize types // Extra overload to match that for specialize types
template <typename Traits, typename ... P> template <typename Traits, typename ... P>
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
static typename std::enable_if< (std::is_same<typename Traits::array_layout , Kokkos::LayoutRight>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutLeft>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutStride>::value) , typename Traits::array_layout >::type createLayout( const ViewCtorProp<P...>& prop, const typename Traits::array_layout& layout ) static typename std::enable_if< (std::is_same<typename Traits::array_layout , Kokkos::LayoutRight>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutLeft>::value || std::is_same<typename Traits::array_layout , Kokkos::LayoutStride>::value) , typename Traits::array_layout >::type createLayout( const Kokkos::Impl::ViewCtorProp<P...>& prop, const typename Traits::array_layout& layout )
{ {
return createLayout( layout ); return createLayout( layout );
} }
@ -318,7 +317,6 @@ void dyn_rank_view_verify_operator_bounds
struct ViewToDynRankViewTag {}; struct ViewToDynRankViewTag {};
} // namespace Impl } // namespace Impl
} // namespace Experimental
namespace Impl { namespace Impl {
@ -348,7 +346,7 @@ class ViewMapping< DstTraits , SrcTraits ,
) )
) )
) )
) , Kokkos::Experimental::Impl::ViewToDynRankViewTag >::type > ) , Kokkos::Impl::ViewToDynRankViewTag >::type >
{ {
private: private:
@ -375,7 +373,7 @@ public:
template < typename DT , typename ... DP , typename ST , typename ... SP > template < typename DT , typename ... DP , typename ST , typename ... SP >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
static void assign( Kokkos::Experimental::DynRankView< DT , DP...> & dst , const Kokkos::View< ST , SP... > & src ) static void assign( Kokkos::DynRankView< DT , DP...> & dst , const Kokkos::View< ST , SP... > & src )
{ {
static_assert( is_assignable_value_type static_assert( is_assignable_value_type
, "View assignment must have same value type or const = non-const" ); , "View assignment must have same value type or const = non-const" );
@ -395,8 +393,6 @@ public:
} //end Impl } //end Impl
namespace Experimental {
/* \class DynRankView /* \class DynRankView
* \brief Container that creates a Kokkos view with rank determined at runtime. * \brief Container that creates a Kokkos view with rank determined at runtime.
* Essentially this is a rank 7 view * Essentially this is a rank 7 view
@ -415,7 +411,7 @@ namespace Experimental {
template< class > struct is_dyn_rank_view : public std::false_type {}; template< class > struct is_dyn_rank_view : public std::false_type {};
template< class D, class ... P > template< class D, class ... P >
struct is_dyn_rank_view< Kokkos::Experimental::DynRankView<D,P...> > : public std::true_type {}; struct is_dyn_rank_view< Kokkos::DynRankView<D,P...> > : public std::true_type {};
template< typename DataType , class ... Properties > template< typename DataType , class ... Properties >
@ -425,7 +421,7 @@ class DynRankView : public ViewTraits< DataType , Properties ... >
private: private:
template < class , class ... > friend class DynRankView ; template < class , class ... > friend class DynRankView ;
template < class , class ... > friend class Impl::ViewMapping ; template < class , class ... > friend class Kokkos::Impl::ViewMapping ;
public: public:
typedef ViewTraits< DataType , Properties ... > drvtraits ; typedef ViewTraits< DataType , Properties ... > drvtraits ;
@ -437,7 +433,7 @@ public:
private: private:
typedef Kokkos::Impl::ViewMapping< traits , void > map_type ; typedef Kokkos::Impl::ViewMapping< traits , void > map_type ;
typedef Kokkos::Experimental::Impl::SharedAllocationTracker track_type ; typedef Kokkos::Impl::SharedAllocationTracker track_type ;
track_type m_track ; track_type m_track ;
map_type m_map ; map_type m_map ;
@ -601,7 +597,7 @@ private:
// rank of the calling operator - included as first argument in ARG // rank of the calling operator - included as first argument in ARG
#define KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( ARG ) \ #define KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( ARG ) \
DynRankView::template verify_space< Kokkos::Impl::ActiveExecutionMemorySpace >::check(); \ DynRankView::template verify_space< Kokkos::Impl::ActiveExecutionMemorySpace >::check(); \
Kokkos::Experimental::Impl::dyn_rank_view_verify_operator_bounds< typename traits::memory_space > ARG ; Kokkos::Impl::dyn_rank_view_verify_operator_bounds< typename traits::memory_space > ARG ;
#else #else
@ -778,6 +774,140 @@ public:
return m_map.reference(i0,i1,i2,i3,i4,i5,i6); return m_map.reference(i0,i1,i2,i3,i4,i5,i6);
} }
// Rank 0
KOKKOS_INLINE_FUNCTION
reference_type access() const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (0 , this->rank(), m_track, m_map) )
return implementation_map().reference();
//return m_map.reference(0,0,0,0,0,0,0);
}
// Rank 1
// Rank 1 parenthesis
template< typename iType >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType>::value), reference_type>::type
access(const iType & i0 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (1 , this->rank(), m_track, m_map, i0) )
return m_map.reference(i0);
}
template< typename iType >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename traits::specialize , void>::value && std::is_integral<iType>::value), reference_type>::type
access(const iType & i0 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (1 , this->rank(), m_track, m_map, i0) )
return m_map.reference(i0,0,0,0,0,0,0);
}
// Rank 2
template< typename iType0 , typename iType1 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (2 , this->rank(), m_track, m_map, i0, i1) )
return m_map.reference(i0,i1);
}
template< typename iType0 , typename iType1 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (2 , this->rank(), m_track, m_map, i0, i1) )
return m_map.reference(i0,i1,0,0,0,0,0);
}
// Rank 3
template< typename iType0 , typename iType1 , typename iType2 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (3 , this->rank(), m_track, m_map, i0, i1, i2) )
return m_map.reference(i0,i1,i2);
}
template< typename iType0 , typename iType1 , typename iType2 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (3 , this->rank(), m_track, m_map, i0, i1, i2) )
return m_map.reference(i0,i1,i2,0,0,0,0);
}
// Rank 4
template< typename iType0 , typename iType1 , typename iType2 , typename iType3 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (4 , this->rank(), m_track, m_map, i0, i1, i2, i3) )
return m_map.reference(i0,i1,i2,i3);
}
template< typename iType0 , typename iType1 , typename iType2 , typename iType3 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (4 , this->rank(), m_track, m_map, i0, i1, i2, i3) )
return m_map.reference(i0,i1,i2,i3,0,0,0);
}
// Rank 5
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (5 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4) )
return m_map.reference(i0,i1,i2,i3,i4);
}
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (5 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4) )
return m_map.reference(i0,i1,i2,i3,i4,0,0);
}
// Rank 6
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_same<typename traits::specialize , void>::value && std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value && std::is_integral<iType5>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (6 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5) )
return m_map.reference(i0,i1,i2,i3,i4,i5);
}
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< !(std::is_same<typename drvtraits::specialize , void>::value && std::is_integral<iType0>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (6 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5) )
return m_map.reference(i0,i1,i2,i3,i4,i5,0);
}
// Rank 7
template< typename iType0 , typename iType1 , typename iType2 , typename iType3, typename iType4 , typename iType5 , typename iType6 >
KOKKOS_INLINE_FUNCTION
typename std::enable_if< (std::is_integral<iType0>::value && std::is_integral<iType1>::value && std::is_integral<iType2>::value && std::is_integral<iType3>::value && std::is_integral<iType4>::value && std::is_integral<iType5>::value && std::is_integral<iType6>::value), reference_type>::type
access(const iType0 & i0 , const iType1 & i1 , const iType2 & i2 , const iType3 & i3 , const iType4 & i4 , const iType5 & i5 , const iType6 & i6 ) const
{
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY( (7 , this->rank(), m_track, m_map, i0, i1, i2, i3, i4, i5, i6) )
return m_map.reference(i0,i1,i2,i3,i4,i5,i6);
}
#undef KOKKOS_IMPL_VIEW_OPERATOR_VERIFY #undef KOKKOS_IMPL_VIEW_OPERATOR_VERIFY
//---------------------------------------- //----------------------------------------
@ -830,7 +960,6 @@ public:
return *this; return *this;
} }
// Experimental
// Copy/Assign View to DynRankView // Copy/Assign View to DynRankView
template< class RT , class ... RP > template< class RT , class ... RP >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
@ -840,7 +969,7 @@ public:
, m_rank( rhs.Rank ) , m_rank( rhs.Rank )
{ {
typedef typename View<RT,RP...>::traits SrcTraits ; typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Experimental::Impl::ViewToDynRankViewTag > Mapping ; typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" ); static_assert( Mapping::is_assignable , "Incompatible DynRankView copy construction" );
Mapping::assign( *this , rhs ); Mapping::assign( *this , rhs );
} }
@ -850,7 +979,7 @@ public:
DynRankView & operator = ( const View<RT,RP...> & rhs ) DynRankView & operator = ( const View<RT,RP...> & rhs )
{ {
typedef typename View<RT,RP...>::traits SrcTraits ; typedef typename View<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Experimental::Impl::ViewToDynRankViewTag > Mapping ; typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , Kokkos::Impl::ViewToDynRankViewTag > Mapping ;
static_assert( Mapping::is_assignable , "Incompatible View to DynRankView copy assignment" ); static_assert( Mapping::is_assignable , "Incompatible View to DynRankView copy assignment" );
Mapping::assign( *this , rhs ); Mapping::assign( *this , rhs );
return *this ; return *this ;
@ -872,8 +1001,8 @@ public:
// unused arg_layout dimensions must be set to ~size_t(0) so that rank deduction can properly take place // unused arg_layout dimensions must be set to ~size_t(0) so that rank deduction can properly take place
template< class ... P > template< class ... P >
explicit inline explicit inline
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
, typename std::enable_if< ! Impl::ViewCtorProp< P... >::has_pointer , typename std::enable_if< ! Kokkos::Impl::ViewCtorProp< P... >::has_pointer
, typename traits::array_layout , typename traits::array_layout
>::type const & arg_layout >::type const & arg_layout
) )
@ -882,11 +1011,11 @@ public:
, m_rank( Impl::DynRankDimTraits<typename traits::specialize>::template computeRank< typename traits::array_layout, P...>(arg_prop, arg_layout) ) , m_rank( Impl::DynRankDimTraits<typename traits::specialize>::template computeRank< typename traits::array_layout, P...>(arg_prop, arg_layout) )
{ {
// Append layout and spaces if not input // Append layout and spaces if not input
typedef Impl::ViewCtorProp< P ... > alloc_prop_input ; typedef Kokkos::Impl::ViewCtorProp< P ... > alloc_prop_input ;
// use 'std::integral_constant<unsigned,I>' for non-types // use 'std::integral_constant<unsigned,I>' for non-types
// to avoid duplicate class error. // to avoid duplicate class error.
typedef Impl::ViewCtorProp typedef Kokkos::Impl::ViewCtorProp
< P ... < P ...
, typename std::conditional , typename std::conditional
< alloc_prop_input::has_label < alloc_prop_input::has_label
@ -931,7 +1060,7 @@ public:
#endif #endif
//------------------------------------------------------------ //------------------------------------------------------------
Kokkos::Experimental::Impl::SharedAllocationRecord<> * Kokkos::Impl::SharedAllocationRecord<> *
record = m_map.allocate_shared( prop , Impl::DynRankDimTraits<typename traits::specialize>::template createLayout<traits, P...>(arg_prop, arg_layout) ); record = m_map.allocate_shared( prop , Impl::DynRankDimTraits<typename traits::specialize>::template createLayout<traits, P...>(arg_prop, arg_layout) );
//------------------------------------------------------------ //------------------------------------------------------------
@ -950,8 +1079,8 @@ public:
// Wrappers // Wrappers
template< class ... P > template< class ... P >
explicit KOKKOS_INLINE_FUNCTION explicit KOKKOS_INLINE_FUNCTION
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
, typename std::enable_if< Impl::ViewCtorProp< P... >::has_pointer , typename std::enable_if< Kokkos::Impl::ViewCtorProp< P... >::has_pointer
, typename traits::array_layout , typename traits::array_layout
>::type const & arg_layout >::type const & arg_layout
) )
@ -972,8 +1101,8 @@ public:
// Simple dimension-only layout // Simple dimension-only layout
template< class ... P > template< class ... P >
explicit inline explicit inline
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
, typename std::enable_if< ! Impl::ViewCtorProp< P... >::has_pointer , typename std::enable_if< ! Kokkos::Impl::ViewCtorProp< P... >::has_pointer
, size_t , size_t
>::type const arg_N0 = ~size_t(0) >::type const arg_N0 = ~size_t(0)
, const size_t arg_N1 = ~size_t(0) , const size_t arg_N1 = ~size_t(0)
@ -992,8 +1121,8 @@ public:
template< class ... P > template< class ... P >
explicit KOKKOS_INLINE_FUNCTION explicit KOKKOS_INLINE_FUNCTION
DynRankView( const Impl::ViewCtorProp< P ... > & arg_prop DynRankView( const Kokkos::Impl::ViewCtorProp< P ... > & arg_prop
, typename std::enable_if< Impl::ViewCtorProp< P... >::has_pointer , typename std::enable_if< Kokkos::Impl::ViewCtorProp< P... >::has_pointer
, size_t , size_t
>::type const arg_N0 = ~size_t(0) >::type const arg_N0 = ~size_t(0)
, const size_t arg_N1 = ~size_t(0) , const size_t arg_N1 = ~size_t(0)
@ -1015,10 +1144,10 @@ public:
explicit inline explicit inline
DynRankView( const Label & arg_label DynRankView( const Label & arg_label
, typename std::enable_if< , typename std::enable_if<
Kokkos::Experimental::Impl::is_view_label<Label>::value , Kokkos::Impl::is_view_label<Label>::value ,
typename traits::array_layout >::type const & arg_layout typename traits::array_layout >::type const & arg_layout
) )
: DynRankView( Impl::ViewCtorProp< std::string >( arg_label ) , arg_layout ) : DynRankView( Kokkos::Impl::ViewCtorProp< std::string >( arg_label ) , arg_layout )
{} {}
// Allocate label and layout, must disambiguate from subview constructor // Allocate label and layout, must disambiguate from subview constructor
@ -1026,7 +1155,7 @@ public:
explicit inline explicit inline
DynRankView( const Label & arg_label DynRankView( const Label & arg_label
, typename std::enable_if< , typename std::enable_if<
Kokkos::Experimental::Impl::is_view_label<Label>::value , Kokkos::Impl::is_view_label<Label>::value ,
const size_t >::type arg_N0 = ~size_t(0) const size_t >::type arg_N0 = ~size_t(0)
, const size_t arg_N1 = ~size_t(0) , const size_t arg_N1 = ~size_t(0)
, const size_t arg_N2 = ~size_t(0) , const size_t arg_N2 = ~size_t(0)
@ -1036,7 +1165,7 @@ public:
, const size_t arg_N6 = ~size_t(0) , const size_t arg_N6 = ~size_t(0)
, const size_t arg_N7 = ~size_t(0) , const size_t arg_N7 = ~size_t(0)
) )
: DynRankView( Impl::ViewCtorProp< std::string >( arg_label ) : DynRankView( Kokkos::Impl::ViewCtorProp< std::string >( arg_label )
, typename traits::array_layout , typename traits::array_layout
( arg_N0 , arg_N1 , arg_N2 , arg_N3 , arg_N4 , arg_N5 , arg_N6 , arg_N7 ) ( arg_N0 , arg_N1 , arg_N2 , arg_N3 , arg_N4 , arg_N5 , arg_N6 , arg_N7 )
) )
@ -1048,7 +1177,8 @@ public:
DynRankView( const ViewAllocateWithoutInitializing & arg_prop DynRankView( const ViewAllocateWithoutInitializing & arg_prop
, const typename traits::array_layout & arg_layout , const typename traits::array_layout & arg_layout
) )
: DynRankView( Impl::ViewCtorProp< std::string , Kokkos::Experimental::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::Experimental::WithoutInitializing ) : DynRankView( Kokkos::Impl::ViewCtorProp< std::string , Kokkos::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::WithoutInitializing )
, Impl::DynRankDimTraits<typename traits::specialize>::createLayout(arg_layout) , Impl::DynRankDimTraits<typename traits::specialize>::createLayout(arg_layout)
) )
{} {}
@ -1064,7 +1194,7 @@ public:
, const size_t arg_N6 = ~size_t(0) , const size_t arg_N6 = ~size_t(0)
, const size_t arg_N7 = ~size_t(0) , const size_t arg_N7 = ~size_t(0)
) )
: DynRankView(Impl::ViewCtorProp< std::string , Kokkos::Experimental::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::Experimental::WithoutInitializing ), arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 ) : DynRankView(Kokkos::Impl::ViewCtorProp< std::string , Kokkos::Impl::WithoutInitializing_t >( arg_prop.label , Kokkos::WithoutInitializing ), arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
{} {}
//---------------------------------------- //----------------------------------------
@ -1097,14 +1227,14 @@ public:
, const size_t arg_N6 = ~size_t(0) , const size_t arg_N6 = ~size_t(0)
, const size_t arg_N7 = ~size_t(0) , const size_t arg_N7 = ~size_t(0)
) )
: DynRankView( Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 ) : DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_N0, arg_N1, arg_N2, arg_N3, arg_N4, arg_N5, arg_N6, arg_N7 )
{} {}
explicit KOKKOS_INLINE_FUNCTION explicit KOKKOS_INLINE_FUNCTION
DynRankView( pointer_type arg_ptr DynRankView( pointer_type arg_ptr
, typename traits::array_layout & arg_layout , typename traits::array_layout & arg_layout
) )
: DynRankView( Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_layout ) : DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(arg_ptr) , arg_layout )
{} {}
@ -1140,7 +1270,7 @@ public:
explicit KOKKOS_INLINE_FUNCTION explicit KOKKOS_INLINE_FUNCTION
DynRankView( const typename traits::execution_space::scratch_memory_space & arg_space DynRankView( const typename traits::execution_space::scratch_memory_space & arg_space
, const typename traits::array_layout & arg_layout ) , const typename traits::array_layout & arg_layout )
: DynRankView( Impl::ViewCtorProp<pointer_type>( : DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(
reinterpret_cast<pointer_type>( reinterpret_cast<pointer_type>(
arg_space.get_shmem( map_type::memory_span( arg_space.get_shmem( map_type::memory_span(
Impl::DynRankDimTraits<typename traits::specialize>::createLayout( arg_layout ) //is this correct? Impl::DynRankDimTraits<typename traits::specialize>::createLayout( arg_layout ) //is this correct?
@ -1159,7 +1289,7 @@ public:
, const size_t arg_N6 = ~size_t(0) , const size_t arg_N6 = ~size_t(0)
, const size_t arg_N7 = ~size_t(0) ) , const size_t arg_N7 = ~size_t(0) )
: DynRankView( Impl::ViewCtorProp<pointer_type>( : DynRankView( Kokkos::Impl::ViewCtorProp<pointer_type>(
reinterpret_cast<pointer_type>( reinterpret_cast<pointer_type>(
arg_space.get_shmem( arg_space.get_shmem(
map_type::memory_span( map_type::memory_span(
@ -1190,7 +1320,6 @@ namespace Impl {
struct DynRankSubviewTag {}; struct DynRankSubviewTag {};
} // namespace Impl } // namespace Impl
} // namespace Experimental
namespace Impl { namespace Impl {
@ -1207,7 +1336,7 @@ struct ViewMapping
std::is_same< typename SrcTraits::array_layout std::is_same< typename SrcTraits::array_layout
, Kokkos::LayoutStride >::value , Kokkos::LayoutStride >::value
) )
), Kokkos::Experimental::Impl::DynRankSubviewTag >::type ), Kokkos::Impl::DynRankSubviewTag >::type
, SrcTraits , SrcTraits
, Args ... > , Args ... >
{ {
@ -1279,11 +1408,11 @@ public:
}; };
typedef Kokkos::Experimental::DynRankView< value_type , array_layout , typename SrcTraits::device_type , typename SrcTraits::memory_traits > ret_type; typedef Kokkos::DynRankView< value_type , array_layout , typename SrcTraits::device_type , typename SrcTraits::memory_traits > ret_type;
template < typename T , class ... P > template < typename T , class ... P >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
static ret_type subview( const unsigned src_rank , Kokkos::Experimental::DynRankView< T , P...> const & src static ret_type subview( const unsigned src_rank , Kokkos::DynRankView< T , P...> const & src
, Args ... args ) , Args ... args )
{ {
@ -1351,20 +1480,19 @@ public:
} // end Impl } // end Impl
namespace Experimental {
template< class V , class ... Args > template< class V , class ... Args >
using Subdynrankview = typename Kokkos::Impl::ViewMapping< Kokkos::Experimental::Impl::DynRankSubviewTag , V , Args... >::ret_type ; using Subdynrankview = typename Kokkos::Impl::ViewMapping< Kokkos::Impl::DynRankSubviewTag , V , Args... >::ret_type ;
template< class D , class ... P , class ...Args > template< class D , class ... P , class ...Args >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
Subdynrankview< ViewTraits<D******* , P...> , Args... > Subdynrankview< ViewTraits<D******* , P...> , Args... >
subdynrankview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args...args) subdynrankview( const Kokkos::DynRankView< D , P... > &src , Args...args)
{ {
if ( src.rank() > sizeof...(Args) ) //allow sizeof...(Args) >= src.rank(), ignore the remaining args if ( src.rank() > sizeof...(Args) ) //allow sizeof...(Args) >= src.rank(), ignore the remaining args
{ Kokkos::abort("subdynrankview: num of args must be >= rank of the source DynRankView"); } { Kokkos::abort("subdynrankview: num of args must be >= rank of the source DynRankView"); }
typedef Kokkos::Impl::ViewMapping< Kokkos::Experimental::Impl::DynRankSubviewTag , Kokkos::ViewTraits< D*******, P... > , Args... > metafcn ; typedef Kokkos::Impl::ViewMapping< Kokkos::Impl::DynRankSubviewTag , Kokkos::ViewTraits< D*******, P... > , Args... > metafcn ;
return metafcn::subview( src.rank() , src , args... ); return metafcn::subview( src.rank() , src , args... );
} }
@ -1373,16 +1501,14 @@ subdynrankview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args.
template< class D , class ... P , class ...Args > template< class D , class ... P , class ...Args >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
Subdynrankview< ViewTraits<D******* , P...> , Args... > Subdynrankview< ViewTraits<D******* , P...> , Args... >
subview( const Kokkos::Experimental::DynRankView< D , P... > &src , Args...args) subview( const Kokkos::DynRankView< D , P... > &src , Args...args)
{ {
return subdynrankview( src , args... ); return subdynrankview( src , args... );
} }
} // namespace Experimental
} // namespace Kokkos } // namespace Kokkos
namespace Kokkos { namespace Kokkos {
namespace Experimental {
// overload == and != // overload == and !=
template< class LT , class ... LP , class RT , class ... RP > template< class LT , class ... LP , class RT , class ... RP >
@ -1422,13 +1548,11 @@ bool operator != ( const DynRankView<LT,LP...> & lhs ,
return ! ( operator==(lhs,rhs) ); return ! ( operator==(lhs,rhs) );
} }
} //end Experimental
} //end Kokkos } //end Kokkos
//---------------------------------------------------------------------------- //----------------------------------------------------------------------------
//---------------------------------------------------------------------------- //----------------------------------------------------------------------------
namespace Kokkos { namespace Kokkos {
namespace Experimental {
namespace Impl { namespace Impl {
template< class OutputView , typename Enable = void > template< class OutputView , typename Enable = void >
@ -1455,7 +1579,7 @@ struct DynRankViewFill {
for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) { for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) {
for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) { for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) {
for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) { for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) {
output(i0,i1,i2,i3,i4,i5,i6) = input ; output.access(i0,i1,i2,i3,i4,i5,i6) = input ;
}}}}}} }}}}}}
} }
@ -1498,14 +1622,14 @@ struct DynRankViewRemap {
DynRankViewRemap( const OutputView & arg_out , const InputView & arg_in ) DynRankViewRemap( const OutputView & arg_out , const InputView & arg_in )
: output( arg_out ), input( arg_in ) : output( arg_out ), input( arg_in )
, n0( std::min( (size_t)arg_out.dimension_0() , (size_t)arg_in.dimension_0() ) ) , n0( std::min( (size_t)arg_out.extent(0) , (size_t)arg_in.extent(0) ) )
, n1( std::min( (size_t)arg_out.dimension_1() , (size_t)arg_in.dimension_1() ) ) , n1( std::min( (size_t)arg_out.extent(1) , (size_t)arg_in.extent(1) ) )
, n2( std::min( (size_t)arg_out.dimension_2() , (size_t)arg_in.dimension_2() ) ) , n2( std::min( (size_t)arg_out.extent(2) , (size_t)arg_in.extent(2) ) )
, n3( std::min( (size_t)arg_out.dimension_3() , (size_t)arg_in.dimension_3() ) ) , n3( std::min( (size_t)arg_out.extent(3) , (size_t)arg_in.extent(3) ) )
, n4( std::min( (size_t)arg_out.dimension_4() , (size_t)arg_in.dimension_4() ) ) , n4( std::min( (size_t)arg_out.extent(4) , (size_t)arg_in.extent(4) ) )
, n5( std::min( (size_t)arg_out.dimension_5() , (size_t)arg_in.dimension_5() ) ) , n5( std::min( (size_t)arg_out.extent(5) , (size_t)arg_in.extent(5) ) )
, n6( std::min( (size_t)arg_out.dimension_6() , (size_t)arg_in.dimension_6() ) ) , n6( std::min( (size_t)arg_out.extent(6) , (size_t)arg_in.extent(6) ) )
, n7( std::min( (size_t)arg_out.dimension_7() , (size_t)arg_in.dimension_7() ) ) , n7( std::min( (size_t)arg_out.extent(7) , (size_t)arg_in.extent(7) ) )
{ {
typedef Kokkos::RangePolicy< ExecSpace > Policy ; typedef Kokkos::RangePolicy< ExecSpace > Policy ;
const Kokkos::Impl::ParallelFor< DynRankViewRemap , Policy > closure( *this , Policy( 0 , n0 ) ); const Kokkos::Impl::ParallelFor< DynRankViewRemap , Policy > closure( *this , Policy( 0 , n0 ) );
@ -1521,18 +1645,16 @@ struct DynRankViewRemap {
for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) { for ( size_t i4 = 0 ; i4 < n4 ; ++i4 ) {
for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) { for ( size_t i5 = 0 ; i5 < n5 ; ++i5 ) {
for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) { for ( size_t i6 = 0 ; i6 < n6 ; ++i6 ) {
output(i0,i1,i2,i3,i4,i5,i6) = input(i0,i1,i2,i3,i4,i5,i6); output.access(i0,i1,i2,i3,i4,i5,i6) = input.access(i0,i1,i2,i3,i4,i5,i6);
}}}}}} }}}}}}
} }
}; };
} /* namespace Impl */ } /* namespace Impl */
} /* namespace Experimental */
} /* namespace Kokkos */ } /* namespace Kokkos */
namespace Kokkos { namespace Kokkos {
namespace Experimental {
/** \brief Deep copy a value from Host memory into a view. */ /** \brief Deep copy a value from Host memory into a view. */
template< class DT , class ... DP > template< class DT , class ... DP >
@ -1549,7 +1671,7 @@ void deep_copy
typename ViewTraits<DT,DP...>::value_type >::value typename ViewTraits<DT,DP...>::value_type >::value
, "deep_copy requires non-const type" ); , "deep_copy requires non-const type" );
Kokkos::Experimental::Impl::DynRankViewFill< DynRankView<DT,DP...> >( dst , value ); Kokkos::Impl::DynRankViewFill< DynRankView<DT,DP...> >( dst , value );
} }
/** \brief Deep copy into a value in Host memory from a view. */ /** \brief Deep copy into a value in Host memory from a view. */
@ -1585,7 +1707,7 @@ void deep_copy
std::is_same< typename DstType::traits::specialize , void >::value && std::is_same< typename DstType::traits::specialize , void >::value &&
std::is_same< typename SrcType::traits::specialize , void >::value std::is_same< typename SrcType::traits::specialize , void >::value
&& &&
( Kokkos::Experimental::is_dyn_rank_view<DstType>::value || Kokkos::Experimental::is_dyn_rank_view<SrcType>::value) ( Kokkos::is_dyn_rank_view<DstType>::value || Kokkos::is_dyn_rank_view<SrcType>::value)
)>::type * = 0 ) )>::type * = 0 )
{ {
static_assert( static_assert(
@ -1641,14 +1763,15 @@ void deep_copy
dst.span_is_contiguous() && dst.span_is_contiguous() &&
src.span_is_contiguous() && src.span_is_contiguous() &&
dst.span() == src.span() && dst.span() == src.span() &&
dst.dimension_0() == src.dimension_0() && dst.extent(0) == src.extent(0) &&
dst.dimension_1() == src.dimension_1() &&
dst.dimension_2() == src.dimension_2() && dst.extent(1) == src.extent(1) &&
dst.dimension_3() == src.dimension_3() && dst.extent(2) == src.extent(2) &&
dst.dimension_4() == src.dimension_4() && dst.extent(3) == src.extent(3) &&
dst.dimension_5() == src.dimension_5() && dst.extent(4) == src.extent(4) &&
dst.dimension_6() == src.dimension_6() && dst.extent(5) == src.extent(5) &&
dst.dimension_7() == src.dimension_7() ) { dst.extent(6) == src.extent(6) &&
dst.extent(7) == src.extent(7) ) {
const size_t nbytes = sizeof(typename dst_type::value_type) * dst.span(); const size_t nbytes = sizeof(typename dst_type::value_type) * dst.span();
@ -1673,14 +1796,14 @@ void deep_copy
dst.span_is_contiguous() && dst.span_is_contiguous() &&
src.span_is_contiguous() && src.span_is_contiguous() &&
dst.span() == src.span() && dst.span() == src.span() &&
dst.dimension_0() == src.dimension_0() && dst.extent(0) == src.extent(0) &&
dst.dimension_1() == src.dimension_1() && dst.extent(1) == src.extent(1) &&
dst.dimension_2() == src.dimension_2() && dst.extent(2) == src.extent(2) &&
dst.dimension_3() == src.dimension_3() && dst.extent(3) == src.extent(3) &&
dst.dimension_4() == src.dimension_4() && dst.extent(4) == src.extent(4) &&
dst.dimension_5() == src.dimension_5() && dst.extent(5) == src.extent(5) &&
dst.dimension_6() == src.dimension_6() && dst.extent(6) == src.extent(6) &&
dst.dimension_7() == src.dimension_7() && dst.extent(7) == src.extent(7) &&
dst.stride_0() == src.stride_0() && dst.stride_0() == src.stride_0() &&
dst.stride_1() == src.stride_1() && dst.stride_1() == src.stride_1() &&
dst.stride_2() == src.stride_2() && dst.stride_2() == src.stride_2() &&
@ -1697,11 +1820,11 @@ void deep_copy
} }
else if ( DstExecCanAccessSrc ) { else if ( DstExecCanAccessSrc ) {
// Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape. // Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape.
Kokkos::Experimental::Impl::DynRankViewRemap< dst_type , src_type >( dst , src ); Kokkos::Impl::DynRankViewRemap< dst_type , src_type >( dst , src );
} }
else if ( SrcExecCanAccessDst ) { else if ( SrcExecCanAccessDst ) {
// Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape. // Copying data between views in accessible memory spaces and either non-contiguous or incompatible shape.
Kokkos::Experimental::Impl::DynRankViewRemap< dst_type , src_type , src_execution_space >( dst , src ); Kokkos::Impl::DynRankViewRemap< dst_type , src_type , src_execution_space >( dst , src );
} }
else { else {
Kokkos::Impl::throw_runtime_exception("deep_copy given views that would require a temporary allocation"); Kokkos::Impl::throw_runtime_exception("deep_copy given views that would require a temporary allocation");
@ -1709,7 +1832,6 @@ void deep_copy
} }
} }
} //end Experimental
} //end Kokkos } //end Kokkos
@ -1717,8 +1839,6 @@ void deep_copy
//---------------------------------------------------------------------------- //----------------------------------------------------------------------------
namespace Kokkos { namespace Kokkos {
namespace Experimental {
namespace Impl { namespace Impl {
@ -1726,7 +1846,7 @@ namespace Impl {
template<class Space, class T, class ... P> template<class Space, class T, class ... P>
struct MirrorDRViewType { struct MirrorDRViewType {
// The incoming view_type // The incoming view_type
typedef typename Kokkos::Experimental::DynRankView<T,P...> src_view_type; typedef typename Kokkos::DynRankView<T,P...> src_view_type;
// The memory space for the mirror view // The memory space for the mirror view
typedef typename Space::memory_space memory_space; typedef typename Space::memory_space memory_space;
// Check whether it is the same memory space // Check whether it is the same memory space
@ -1736,7 +1856,7 @@ struct MirrorDRViewType {
// The data type (we probably want it non-const since otherwise we can't even deep_copy to it). // The data type (we probably want it non-const since otherwise we can't even deep_copy to it).
typedef typename src_view_type::non_const_data_type data_type; typedef typename src_view_type::non_const_data_type data_type;
// The destination view type if it is not the same memory space // The destination view type if it is not the same memory space
typedef Kokkos::Experimental::DynRankView<data_type,array_layout,Space> dest_view_type; typedef Kokkos::DynRankView<data_type,array_layout,Space> dest_view_type;
// If it is the same memory_space return the existing view_type // If it is the same memory_space return the existing view_type
// This will also keep the unmanaged trait if necessary // This will also keep the unmanaged trait if necessary
typedef typename std::conditional<is_same_memspace,src_view_type,dest_view_type>::type view_type; typedef typename std::conditional<is_same_memspace,src_view_type,dest_view_type>::type view_type;
@ -1745,7 +1865,7 @@ struct MirrorDRViewType {
template<class Space, class T, class ... P> template<class Space, class T, class ... P>
struct MirrorDRVType { struct MirrorDRVType {
// The incoming view_type // The incoming view_type
typedef typename Kokkos::Experimental::DynRankView<T,P...> src_view_type; typedef typename Kokkos::DynRankView<T,P...> src_view_type;
// The memory space for the mirror view // The memory space for the mirror view
typedef typename Space::memory_space memory_space; typedef typename Space::memory_space memory_space;
// Check whether it is the same memory space // Check whether it is the same memory space
@ -1755,12 +1875,11 @@ struct MirrorDRVType {
// The data type (we probably want it non-const since otherwise we can't even deep_copy to it). // The data type (we probably want it non-const since otherwise we can't even deep_copy to it).
typedef typename src_view_type::non_const_data_type data_type; typedef typename src_view_type::non_const_data_type data_type;
// The destination view type if it is not the same memory space // The destination view type if it is not the same memory space
typedef Kokkos::Experimental::DynRankView<data_type,array_layout,Space> view_type; typedef Kokkos::DynRankView<data_type,array_layout,Space> view_type;
}; };
} }
template< class T , class ... P > template< class T , class ... P >
inline inline
typename DynRankView<T,P...>::HostMirror typename DynRankView<T,P...>::HostMirror
@ -1799,7 +1918,7 @@ create_mirror( const DynRankView<T,P...> & src
// Create a mirror in a new space (specialization for different space) // Create a mirror in a new space (specialization for different space)
template<class Space, class T, class ... P> template<class Space, class T, class ... P>
typename Impl::MirrorDRVType<Space,T,P ...>::view_type create_mirror(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src) { typename Impl::MirrorDRVType<Space,T,P ...>::view_type create_mirror(const Space& , const Kokkos::DynRankView<T,P...> & src) {
return typename Impl::MirrorDRVType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) ); return typename Impl::MirrorDRVType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) );
} }
@ -1836,13 +1955,13 @@ create_mirror_view( const DynRankView<T,P...> & src
)>::type * = 0 )>::type * = 0
) )
{ {
return Kokkos::Experimental::create_mirror( src ); return Kokkos::create_mirror( src );
} }
// Create a mirror view in a new space (specialization for same space) // Create a mirror view in a new space (specialization for same space)
template<class Space, class T, class ... P> template<class Space, class T, class ... P>
typename Impl::MirrorDRViewType<Space,T,P ...>::view_type typename Impl::MirrorDRViewType<Space,T,P ...>::view_type
create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src create_mirror_view(const Space& , const Kokkos::DynRankView<T,P...> & src
, typename std::enable_if<Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) { , typename std::enable_if<Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) {
return src; return src;
} }
@ -1850,12 +1969,11 @@ create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...
// Create a mirror view in a new space (specialization for different space) // Create a mirror view in a new space (specialization for different space)
template<class Space, class T, class ... P> template<class Space, class T, class ... P>
typename Impl::MirrorDRViewType<Space,T,P ...>::view_type typename Impl::MirrorDRViewType<Space,T,P ...>::view_type
create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...> & src create_mirror_view(const Space& , const Kokkos::DynRankView<T,P...> & src
, typename std::enable_if<!Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) { , typename std::enable_if<!Impl::MirrorDRViewType<Space,T,P ...>::is_same_memspace>::type* = 0 ) {
return typename Impl::MirrorDRViewType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) ); return typename Impl::MirrorDRViewType<Space,T,P ...>::view_type(src.label(), Impl::reconstructLayout(src.layout(), src.rank()) );
} }
} //end Experimental
} //end Kokkos } //end Kokkos
@ -1863,7 +1981,6 @@ create_mirror_view(const Space& , const Kokkos::Experimental::DynRankView<T,P...
//---------------------------------------------------------------------------- //----------------------------------------------------------------------------
namespace Kokkos { namespace Kokkos {
namespace Experimental {
/** \brief Resize a view with copying old data to new data at the corresponding indices. */ /** \brief Resize a view with copying old data to new data at the corresponding indices. */
template< class T , class ... P > template< class T , class ... P >
inline inline
@ -1877,13 +1994,13 @@ void resize( DynRankView<T,P...> & v ,
const size_t n6 = ~size_t(0) , const size_t n6 = ~size_t(0) ,
const size_t n7 = ~size_t(0) ) const size_t n7 = ~size_t(0) )
{ {
typedef DynRankView<T,P...> drview_type ; typedef DynRankView<T,P...> drview_type ;
static_assert( Kokkos::ViewTraits<T,P...>::is_managed , "Can only resize managed views" ); static_assert( Kokkos::ViewTraits<T,P...>::is_managed , "Can only resize managed views" );
drview_type v_resized( v.label(), n0, n1, n2, n3, n4, n5, n6 ); drview_type v_resized( v.label(), n0, n1, n2, n3, n4, n5, n6 );
Kokkos::Experimental::Impl::DynRankViewRemap< drview_type , drview_type >( v_resized, v ); Kokkos::Impl::DynRankViewRemap< drview_type , drview_type >( v_resized, v );
v = v_resized ; v = v_resized ;
} }
@ -1911,25 +2028,7 @@ void realloc( DynRankView<T,P...> & v ,
v = drview_type( label, n0, n1, n2, n3, n4, n5, n6 ); v = drview_type( label, n0, n1, n2, n3, n4, n5, n6 );
} }
} //end Experimental
} //end Kokkos } //end Kokkos
using Kokkos::Experimental::is_dyn_rank_view ;
namespace Kokkos {
template< typename D , class ... P >
using DynRankView = Kokkos::Experimental::DynRankView< D , P... > ;
using Kokkos::Experimental::deep_copy ;
using Kokkos::Experimental::create_mirror ;
using Kokkos::Experimental::create_mirror_view ;
using Kokkos::Experimental::subdynrankview ;
using Kokkos::Experimental::subview ;
using Kokkos::Experimental::resize ;
using Kokkos::Experimental::realloc ;
} //end Kokkos
#endif #endif


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -52,7 +52,33 @@
namespace Kokkos { namespace Kokkos {
namespace Experimental { namespace Experimental {
// Simple metafunction for choosing memory space
// In the current implementation, if memory_space == CudaSpace,
// use CudaUVMSpace for the chunk 'array' allocation, which
// will contain pointers to chunks of memory allocated
// in CudaSpace
namespace Impl {
template < class MemSpace >
struct ChunkArraySpace {
using memory_space = MemSpace;
};
#ifdef KOKKOS_ENABLE_CUDA
template <>
struct ChunkArraySpace< Kokkos::CudaSpace > {
using memory_space = typename Kokkos::CudaUVMSpace;
};
#endif
#ifdef KOKKOS_ENABLE_ROCM
template <>
struct ChunkArraySpace< Kokkos::Experimental::ROCmSpace > {
using memory_space = typename Kokkos::Experimental::ROCmHostPinnedSpace;
};
#endif
} // end namespace Impl
/** \brief Dynamic views are restricted to rank-one and no layout. /** \brief Dynamic views are restricted to rank-one and no layout.
* Resize only occurs on host outside of parallel_regions.
* Subviews are not allowed. * Subviews are not allowed.
*/ */
template< typename DataType , typename ... P > template< typename DataType , typename ... P >
@ -66,7 +92,7 @@ private:
template< class , class ... > friend class DynamicView ; template< class , class ... > friend class DynamicView ;
typedef Kokkos::Experimental::Impl::SharedAllocationTracker track_type ; typedef Kokkos::Impl::SharedAllocationTracker track_type ;
static_assert( traits::rank == 1 && traits::rank_dynamic == 1 static_assert( traits::rank == 1 && traits::rank_dynamic == 1
, "DynamicView must be rank-one" ); , "DynamicView must be rank-one" );
@ -86,18 +112,14 @@ private:
{ Kokkos::abort("Kokkos::DynamicView ERROR: attempt to access inaccessible memory space"); }; { Kokkos::abort("Kokkos::DynamicView ERROR: attempt to access inaccessible memory space"); };
}; };
public:
typedef Kokkos::MemoryPool< typename traits::device_type > memory_pool ;
private: private:
memory_pool m_pool ;
track_type m_track ; track_type m_track ;
typename traits::value_type ** m_chunks ; typename traits::value_type ** m_chunks ; // array of pointers to 'chunks' of memory
unsigned m_chunk_shift ; unsigned m_chunk_shift ; // ceil(log2(m_chunk_size))
unsigned m_chunk_mask ; unsigned m_chunk_mask ; // m_chunk_size - 1
unsigned m_chunk_max ; unsigned m_chunk_max ; // number of entries in the chunk array - each pointing to a chunk of extent == m_chunk_size entries
unsigned m_chunk_size ; // 2 << (m_chunk_shift - 1)
public: public:
@ -125,28 +147,24 @@ public:
enum { Rank = 1 }; enum { Rank = 1 };
KOKKOS_INLINE_FUNCTION
size_t allocation_extent() const noexcept
{
uintptr_t n = *reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max );
return (n << m_chunk_shift);
}
KOKKOS_INLINE_FUNCTION
size_t chunk_size() const noexcept
{
return m_chunk_size;
}
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
size_t size() const noexcept size_t size() const noexcept
{ {
uintptr_t n = 0 ; size_t extent_0 = *reinterpret_cast<const size_t*>( m_chunks + m_chunk_max +1 );
return extent_0;
if ( Kokkos::Impl::MemorySpaceAccess
< Kokkos::Impl::ActiveExecutionMemorySpace
, typename traits::memory_space
>::accessible ) {
n = *reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max );
}
#if defined( KOKKOS_ACTIVE_EXECUTION_MEMORY_SPACE_HOST )
else {
Kokkos::Impl::DeepCopy< Kokkos::HostSpace
, typename traits::memory_space
, Kokkos::HostSpace::execution_space >
( & n
, reinterpret_cast<const uintptr_t*>( m_chunks + m_chunk_max )
, sizeof(uintptr_t) );
}
#endif
return n << m_chunk_shift ;
} }
template< typename iType > template< typename iType >
@ -159,6 +177,7 @@ public:
size_t extent_int( const iType & r ) const size_t extent_int( const iType & r ) const
{ return r == 0 ? size() : 1 ; } { return r == 0 ? size() : 1 ; }
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE
KOKKOS_INLINE_FUNCTION size_t dimension_0() const { return size(); } KOKKOS_INLINE_FUNCTION size_t dimension_0() const { return size(); }
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_1() const { return 1 ; } KOKKOS_INLINE_FUNCTION constexpr size_t dimension_1() const { return 1 ; }
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_2() const { return 1 ; } KOKKOS_INLINE_FUNCTION constexpr size_t dimension_2() const { return 1 ; }
@ -167,6 +186,7 @@ public:
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_5() const { return 1 ; } KOKKOS_INLINE_FUNCTION constexpr size_t dimension_5() const { return 1 ; }
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_6() const { return 1 ; } KOKKOS_INLINE_FUNCTION constexpr size_t dimension_6() const { return 1 ; }
KOKKOS_INLINE_FUNCTION constexpr size_t dimension_7() const { return 1 ; } KOKKOS_INLINE_FUNCTION constexpr size_t dimension_7() const { return 1 ; }
#endif
KOKKOS_INLINE_FUNCTION constexpr size_t stride_0() const { return 0 ; } KOKKOS_INLINE_FUNCTION constexpr size_t stride_0() const { return 0 ; }
KOKKOS_INLINE_FUNCTION constexpr size_t stride_1() const { return 0 ; } KOKKOS_INLINE_FUNCTION constexpr size_t stride_1() const { return 0 ; }
@ -180,6 +200,17 @@ public:
template< typename iType > template< typename iType >
KOKKOS_INLINE_FUNCTION void stride( iType * const s ) const { *s = 0 ; } KOKKOS_INLINE_FUNCTION void stride( iType * const s ) const { *s = 0 ; }
//----------------------------------------
// Allocation tracking properties
KOKKOS_INLINE_FUNCTION
int use_count() const
{ return m_track.use_count(); }
inline
const std::string label() const
{ return m_track.template get_label< typename traits::memory_space >(); }
//---------------------------------------------------------------------- //----------------------------------------------------------------------
// Range span is the span which contains all members. // Range span is the span which contains all members.
@ -234,65 +265,15 @@ public:
} }
//---------------------------------------- //----------------------------------------
/** \brief Resizing in parallel only increases the array size, /** \brief Resizing in serial can grow or shrink the array size
* never decrease. * up to the maximum number of chunks
*/ * */
KOKKOS_INLINE_FUNCTION
void resize_parallel( size_t n ) const
{
typedef typename traits::value_type value_type ;
DynamicView::template verify_space< Kokkos::Impl::ActiveExecutionMemorySpace >::check();
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ;
if ( m_chunk_max < NC ) {
#if defined( KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK )
printf("DynamicView::resize_parallel(%lu) m_chunk_max(%u) NC(%lu)\n"
, n , m_chunk_max , NC );
#endif
Kokkos::abort("DynamicView::resize_parallel exceeded maximum size");
}
typename traits::value_type * volatile * const ch = m_chunks ;
// The allocated chunk counter is m_chunks[ m_chunk_max ]
uintptr_t volatile * const pc =
reinterpret_cast<uintptr_t volatile*>( m_chunks + m_chunk_max );
// Potentially concurrent iteration of allocation to the required size.
for ( uintptr_t jc = *pc ; jc < NC ; ) {
// Claim the 'jc' chunk to-be-allocated index
const uintptr_t jc_try = jc ;
// Jump iteration to the chunk counter.
jc = atomic_compare_exchange( pc , jc_try , jc_try + 1 );
if ( jc_try == jc ) {
ch[jc_try] = reinterpret_cast<value_type*>(
m_pool.allocate( sizeof(value_type) << m_chunk_shift ));
if ( 0 == ch[jc_try] ) {
Kokkos::abort("DynamicView::resize_parallel exhausted memory pool");
}
Kokkos::memory_fence();
}
}
}
/** \brief Resizing in serial can grow or shrink the array size, */
template< typename IntType > template< typename IntType >
inline inline
typename std::enable_if typename std::enable_if
< std::is_integral<IntType>::value && < std::is_integral<IntType>::value &&
Kokkos::Impl::MemorySpaceAccess< Kokkos::HostSpace Kokkos::Impl::MemorySpaceAccess< Kokkos::HostSpace
, typename traits::memory_space , typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space
>::accessible >::accessible
>::type >::type
resize_serial( IntType const & n ) resize_serial( IntType const & n )
@ -300,108 +281,35 @@ public:
typedef typename traits::value_type value_type ; typedef typename traits::value_type value_type ;
typedef value_type * pointer_type ; typedef value_type * pointer_type ;
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ; const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ; // New total number of chunks needed for resize
if ( m_chunk_max < NC ) { if ( m_chunk_max < NC ) {
Kokkos::abort("DynamicView::resize_serial exceeded maximum size"); Kokkos::abort("DynamicView::resize_serial exceeded maximum size");
} }
// *m_chunks[m_chunk_max] stores the current number of chunks being used
uintptr_t * const pc = uintptr_t * const pc =
reinterpret_cast<uintptr_t*>( m_chunks + m_chunk_max ); reinterpret_cast<uintptr_t*>( m_chunks + m_chunk_max );
if ( *pc < NC ) { if ( *pc < NC ) {
while ( *pc < NC ) { while ( *pc < NC ) {
m_chunks[*pc] = reinterpret_cast<pointer_type> m_chunks[*pc] = reinterpret_cast<pointer_type>
( m_pool.allocate( sizeof(value_type) << m_chunk_shift ) ); (
typename traits::memory_space().allocate( sizeof(value_type) << m_chunk_shift )
);
++*pc ; ++*pc ;
} }
} }
else { else {
while ( NC + 1 <= *pc ) { while ( NC + 1 <= *pc ) {
--*pc ; --*pc ;
m_pool.deallocate( m_chunks[*pc] typename traits::memory_space().deallocate( m_chunks[*pc]
, sizeof(value_type) << m_chunk_shift ); , sizeof(value_type) << m_chunk_shift );
m_chunks[*pc] = 0 ; m_chunks[*pc] = 0 ;
} }
} }
} // *m_chunks[m_chunk_max+1] stores the 'extent' requested by resize
*(pc+1) = n;
//----------------------------------------
struct ResizeSerial {
memory_pool m_pool ;
typename traits::value_type ** m_chunks ;
uintptr_t * m_pc ;
uintptr_t m_nc ;
unsigned m_chunk_shift ;
KOKKOS_INLINE_FUNCTION
void operator()( int ) const
{
typedef typename traits::value_type value_type ;
typedef value_type * pointer_type ;
if ( *m_pc < m_nc ) {
while ( *m_pc < m_nc ) {
m_chunks[*m_pc] = reinterpret_cast<pointer_type>
( m_pool.allocate( sizeof(value_type) << m_chunk_shift ) );
++*m_pc ;
}
}
else {
while ( m_nc + 1 <= *m_pc ) {
--*m_pc ;
m_pool.deallocate( m_chunks[*m_pc]
, sizeof(value_type) << m_chunk_shift );
m_chunks[*m_pc] = 0 ;
}
}
}
ResizeSerial( memory_pool const & arg_pool
, typename traits::value_type ** arg_chunks
, uintptr_t * arg_pc
, uintptr_t arg_nc
, unsigned arg_chunk_shift
)
: m_pool( arg_pool )
, m_chunks( arg_chunks )
, m_pc( arg_pc )
, m_nc( arg_nc )
, m_chunk_shift( arg_chunk_shift )
{}
};
template< typename IntType >
inline
typename std::enable_if
< std::is_integral<IntType>::value &&
! Kokkos::Impl::MemorySpaceAccess< Kokkos::HostSpace
, typename traits::memory_space
>::accessible
>::type
resize_serial( IntType const & n )
{
const uintptr_t NC = ( n + m_chunk_mask ) >> m_chunk_shift ;
if ( m_chunk_max < NC ) {
Kokkos::abort("DynamicView::resize_serial exceeded maximum size");
}
// Must dispatch kernel
typedef Kokkos::RangePolicy< typename traits::execution_space > Range ;
uintptr_t * const pc =
reinterpret_cast<uintptr_t*>( m_chunks + m_chunk_max );
Kokkos::Impl::ParallelFor<ResizeSerial,Range>
closure( ResizeSerial( m_pool, m_chunks, pc, NC, m_chunk_shift )
, Range(0,1) );
closure.execute();
traits::execution_space::fence();
} }
//---------------------------------------------------------------------- //----------------------------------------------------------------------
@ -415,12 +323,12 @@ public:
template< class RT , class ... RP > template< class RT , class ... RP >
DynamicView( const DynamicView<RT,RP...> & rhs ) DynamicView( const DynamicView<RT,RP...> & rhs )
: m_pool( rhs.m_pool ) : m_track( rhs.m_track )
, m_track( rhs.m_track )
, m_chunks( (typename traits::value_type **) rhs.m_chunks ) , m_chunks( (typename traits::value_type **) rhs.m_chunks )
, m_chunk_shift( rhs.m_chunk_shift ) , m_chunk_shift( rhs.m_chunk_shift )
, m_chunk_mask( rhs.m_chunk_mask ) , m_chunk_mask( rhs.m_chunk_mask )
, m_chunk_max( rhs.m_chunk_max ) , m_chunk_max( rhs.m_chunk_max )
, m_chunk_size( rhs.m_chunk_size )
{ {
typedef typename DynamicView<RT,RP...>::traits SrcTraits ; typedef typename DynamicView<RT,RP...>::traits SrcTraits ;
typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ; typedef Kokkos::Impl::ViewMapping< traits , SrcTraits , void > Mapping ;
@ -430,35 +338,36 @@ public:
//---------------------------------------------------------------------- //----------------------------------------------------------------------
struct Destroy { struct Destroy {
memory_pool m_pool ;
typename traits::value_type ** m_chunks ; typename traits::value_type ** m_chunks ;
unsigned m_chunk_max ; unsigned m_chunk_max ;
bool m_destroy ; bool m_destroy ;
unsigned m_chunk_size ;
// Initialize or destroy array of chunk pointers. // Initialize or destroy array of chunk pointers.
// Two entries beyond the max chunks are allocation counters. // Two entries beyond the max chunks are allocation counters.
inline
KOKKOS_INLINE_FUNCTION
void operator()( unsigned i ) const void operator()( unsigned i ) const
{ {
if ( m_destroy && i < m_chunk_max && 0 != m_chunks[i] ) { if ( m_destroy && i < m_chunk_max && 0 != m_chunks[i] ) {
m_pool.deallocate( m_chunks[i] , m_pool.min_block_size() ); typename traits::memory_space().deallocate( m_chunks[i], m_chunk_size );
} }
m_chunks[i] = 0 ; m_chunks[i] = 0 ;
} }
void execute( bool arg_destroy ) void execute( bool arg_destroy )
{ {
typedef Kokkos::RangePolicy< typename traits::execution_space > Range ; typedef Kokkos::RangePolicy< typename HostSpace::execution_space > Range ;
//typedef Kokkos::RangePolicy< typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space::execution_space > Range ;
m_destroy = arg_destroy ; m_destroy = arg_destroy ;
Kokkos::Impl::ParallelFor<Destroy,Range> Kokkos::Impl::ParallelFor<Destroy,Range>
closure( *this , Range(0, m_chunk_max + 1) ); closure( *this , Range(0, m_chunk_max + 2) ); // Add 2 to 'destroy' extra slots storing num_chunks and extent; previously + 1
closure.execute(); closure.execute();
traits::execution_space::fence(); traits::execution_space::fence();
//Impl::ChunkArraySpace< typename traits::memory_space >::memory_space::execution_space::fence();
} }
void construct_shared_allocation() void construct_shared_allocation()
@ -473,66 +382,64 @@ public:
Destroy & operator = ( Destroy && ) = default ; Destroy & operator = ( Destroy && ) = default ;
Destroy & operator = ( const Destroy & ) = default ; Destroy & operator = ( const Destroy & ) = default ;
Destroy( const memory_pool & arg_pool Destroy( typename traits::value_type ** arg_chunk
, typename traits::value_type ** arg_chunk , const unsigned arg_chunk_max
, const unsigned arg_chunk_max ) , const unsigned arg_chunk_size )
: m_pool( arg_pool ) : m_chunks( arg_chunk )
, m_chunks( arg_chunk )
, m_chunk_max( arg_chunk_max ) , m_chunk_max( arg_chunk_max )
, m_destroy( false ) , m_destroy( false )
, m_chunk_size( arg_chunk_size )
{} {}
}; };
/**\brief Allocation constructor /**\brief Allocation constructor
* *
* Memory is allocated in chunks from the memory pool. * Memory is allocated in chunks
* The chunk size conforms to the memory pool's chunk size.
* A maximum size is required in order to allocate a * A maximum size is required in order to allocate a
* chunk-pointer array. * chunk-pointer array.
*/ */
explicit inline explicit inline
DynamicView( const std::string & arg_label DynamicView( const std::string & arg_label
, const memory_pool & arg_pool , const unsigned min_chunk_size
, const size_t arg_size_max ) , const unsigned max_extent )
: m_pool( arg_pool ) : m_track()
, m_track()
, m_chunks(0) , m_chunks(0)
// The memory pool chunk is guaranteed to be a power of two // The chunk size is guaranteed to be a power of two
, m_chunk_shift( , m_chunk_shift(
Kokkos::Impl::integral_power_of_two( Kokkos::Impl::integral_power_of_two_that_contains( min_chunk_size ) ) // div ceil(log2(min_chunk_size))
m_pool.min_block_size()/sizeof(typename traits::value_type)) ) , m_chunk_mask( ( 1 << m_chunk_shift ) - 1 ) // mod
, m_chunk_mask( ( 1 << m_chunk_shift ) - 1 ) , m_chunk_max( ( max_extent + m_chunk_mask ) >> m_chunk_shift ) // max num pointers-to-chunks in array
, m_chunk_max( ( arg_size_max + m_chunk_mask ) >> m_chunk_shift ) , m_chunk_size ( 2 << (m_chunk_shift - 1) )
{ {
typedef typename Impl::ChunkArraySpace< typename traits::memory_space >::memory_space chunk_array_memory_space;
// A functor to deallocate all of the chunks upon final destruction // A functor to deallocate all of the chunks upon final destruction
typedef Kokkos::Impl::SharedAllocationRecord< chunk_array_memory_space , Destroy > record_type ;
typedef typename traits::memory_space memory_space ;
typedef Kokkos::Experimental::Impl::SharedAllocationRecord< memory_space , Destroy > record_type ;
// Allocate chunk pointers and allocation counter // Allocate chunk pointers and allocation counter
record_type * const record = record_type * const record =
record_type::allocate( memory_space() record_type::allocate( chunk_array_memory_space()
, arg_label , arg_label
, ( sizeof(pointer_type) * ( m_chunk_max + 1 ) ) ); , ( sizeof(pointer_type) * ( m_chunk_max + 2 ) ) );
// Allocate + 2 extra slots so that *m_chunk[m_chunk_max] == num_chunks_alloc and *m_chunk[m_chunk_max+1] == extent
// This must match in Destroy's execute(...) method
m_chunks = reinterpret_cast<pointer_type*>( record->data() ); m_chunks = reinterpret_cast<pointer_type*>( record->data() );
record->m_destroy = Destroy( m_pool , m_chunks , m_chunk_max ); record->m_destroy = Destroy( m_chunks , m_chunk_max, m_chunk_size );
// Initialize to zero // Initialize to zero
record->m_destroy.construct_shared_allocation(); record->m_destroy.construct_shared_allocation();
m_track.assign_allocated_record_to_uninitialized( record ); m_track.assign_allocated_record_to_uninitialized( record );
} }
}; };
} // namespace Experimental } // namespace Experimental
} // namespace Kokkos } // namespace Kokkos
namespace Kokkos { namespace Kokkos {
namespace Experimental {
template< class T , class ... P > template< class T , class ... P >
inline inline
@ -545,11 +452,11 @@ create_mirror_view( const Kokkos::Experimental::DynamicView<T,P...> & src )
template< class T , class ... DP , class ... SP > template< class T , class ... DP , class ... SP >
inline inline
void deep_copy( const View<T,DP...> & dst void deep_copy( const View<T,DP...> & dst
, const DynamicView<T,SP...> & src , const Kokkos::Experimental::DynamicView<T,SP...> & src
) )
{ {
typedef View<T,DP...> dst_type ; typedef View<T,DP...> dst_type ;
typedef DynamicView<T,SP...> src_type ; typedef Kokkos::Experimental::DynamicView<T,SP...> src_type ;
typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ; typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ;
typedef typename ViewTraits<T,SP...>::memory_space src_memory_space ; typedef typename ViewTraits<T,SP...>::memory_space src_memory_space ;
@ -568,11 +475,11 @@ void deep_copy( const View<T,DP...> & dst
template< class T , class ... DP , class ... SP > template< class T , class ... DP , class ... SP >
inline inline
void deep_copy( const DynamicView<T,DP...> & dst void deep_copy( const Kokkos::Experimental::DynamicView<T,DP...> & dst
, const View<T,SP...> & src , const View<T,SP...> & src
) )
{ {
typedef DynamicView<T,SP...> dst_type ; typedef Kokkos::Experimental::DynamicView<T,SP...> dst_type ;
typedef View<T,DP...> src_type ; typedef View<T,DP...> src_type ;
typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ; typedef typename ViewTraits<T,DP...>::execution_space dst_execution_space ;
@ -590,7 +497,81 @@ void deep_copy( const DynamicView<T,DP...> & dst
} }
} }
} // namespace Experimental namespace Impl {
template<class Arg0, class ... DP , class ... SP>
struct CommonSubview<Kokkos::Experimental::DynamicView<DP...>,Kokkos::Experimental::DynamicView<SP...>,1,Arg0> {
typedef Kokkos::Experimental::DynamicView<DP...> DstType;
typedef Kokkos::Experimental::DynamicView<SP...> SrcType;
typedef DstType dst_subview_type;
typedef SrcType src_subview_type;
dst_subview_type dst_sub;
src_subview_type src_sub;
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
dst_sub(dst),src_sub(src) {}
};
template<class ...DP, class SrcType, class Arg0>
struct CommonSubview<Kokkos::Experimental::DynamicView<DP...>,SrcType,1,Arg0> {
typedef Kokkos::Experimental::DynamicView<DP...> DstType;
typedef DstType dst_subview_type;
typedef typename Kokkos::Subview<SrcType,Arg0> src_subview_type;
dst_subview_type dst_sub;
src_subview_type src_sub;
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
dst_sub(dst),src_sub(src,arg0) {}
};
template<class DstType, class ...SP, class Arg0>
struct CommonSubview<DstType,Kokkos::Experimental::DynamicView<SP...>,1,Arg0> {
typedef Kokkos::Experimental::DynamicView<SP...> SrcType;
typedef typename Kokkos::Subview<DstType,Arg0> dst_subview_type;
typedef SrcType src_subview_type;
dst_subview_type dst_sub;
src_subview_type src_sub;
CommonSubview(const DstType& dst, const SrcType& src, const Arg0& arg0):
dst_sub(dst,arg0),src_sub(src) {}
};
template<class ...DP,class ViewTypeB, class Layout, class ExecSpace,typename iType>
struct ViewCopy<Kokkos::Experimental::DynamicView<DP...>,ViewTypeB,Layout,ExecSpace,1,iType> {
Kokkos::Experimental::DynamicView<DP...> a;
ViewTypeB b;
typedef Kokkos::RangePolicy<ExecSpace,Kokkos::IndexType<iType>> policy_type;
ViewCopy(const Kokkos::Experimental::DynamicView<DP...>& a_, const ViewTypeB& b_):a(a_),b(b_) {
Kokkos::parallel_for("Kokkos::ViewCopy-2D",
policy_type(0,b.extent(0)),*this);
}
KOKKOS_INLINE_FUNCTION
void operator() (const iType& i0) const {
a(i0) = b(i0);
};
};
template<class ...DP,class ...SP, class Layout, class ExecSpace,typename iType>
struct ViewCopy<Kokkos::Experimental::DynamicView<DP...>,
Kokkos::Experimental::DynamicView<SP...>,Layout,ExecSpace,1,iType> {
Kokkos::Experimental::DynamicView<DP...> a;
Kokkos::Experimental::DynamicView<SP...> b;
typedef Kokkos::RangePolicy<ExecSpace,Kokkos::IndexType<iType>> policy_type;
ViewCopy(const Kokkos::Experimental::DynamicView<DP...>& a_,
const Kokkos::Experimental::DynamicView<SP...>& b_):a(a_),b(b_) {
const iType n = std::min(a.extent(0),b.extent(0));
Kokkos::parallel_for("Kokkos::ViewCopy-2D",
policy_type(0,n),*this);
}
KOKKOS_INLINE_FUNCTION
void operator() (const iType& i0) const {
a(i0) = b(i0);
};
};
} // namespace Impl
} // namespace Kokkos } // namespace Kokkos
#endif /* #ifndef KOKKOS_DYNAMIC_VIEW_HPP */ #endif /* #ifndef KOKKOS_DYNAMIC_VIEW_HPP */
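
The new `DynamicView` constructor above derives all of its chunk bookkeeping from `min_chunk_size` and `max_extent` instead of from a memory pool. The sketch below reproduces that arithmetic in isolation. It is not the Kokkos implementation: `ceil_log2`, `ChunkLayout`, `make_chunk_layout`, `chunk_of`, and `offset_in_chunk` are hypothetical names, and `ceil_log2` only assumes the behaviour the diff comment attributes to `integral_power_of_two_that_contains` (ceil(log2(n))). The chunk size is written as `1u << shift`, which equals the diff's `2 << (shift - 1)` whenever `shift >= 1`.

```cpp
#include <cassert>

// Hypothetical stand-in for the helper named in the diff:
// smallest shift s such that (1u << s) >= n.
inline unsigned ceil_log2(unsigned n) {
  assert(n > 0 && n <= (1u << 31));
  unsigned s = 0;
  while ((1u << s) < n) ++s;
  return s;
}

// Hypothetical aggregate mirroring m_chunk_shift, m_chunk_mask,
// m_chunk_max and m_chunk_size from the constructor.
struct ChunkLayout {
  unsigned shift;  // log2(elements per chunk); used for division
  unsigned mask;   // elements per chunk - 1; used for modulo
  unsigned max;    // number of chunk pointers needed for max_extent
  unsigned size;   // elements per chunk (a power of two)
};

inline ChunkLayout make_chunk_layout(unsigned min_chunk_size,
                                     unsigned max_extent) {
  ChunkLayout c;
  c.shift = ceil_log2(min_chunk_size);
  c.mask  = (1u << c.shift) - 1u;
  c.max   = (max_extent + c.mask) >> c.shift;  // round up to whole chunks
  c.size  = 1u << c.shift;
  return c;
}

// Which chunk holds element i, and where inside that chunk.
inline unsigned chunk_of(const ChunkLayout& c, unsigned i) {
  return i >> c.shift;
}
inline unsigned offset_in_chunk(const ChunkLayout& c, unsigned i) {
  return i & c.mask;
}
```

Under these assumptions, a requested minimum of 1000 elements is rounded up to chunks of 1024, and the pointer array additionally reserves the two trailing counter slots described in the constructor comments.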


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -69,7 +69,7 @@ public:
clear(); clear();
} }
int getCapacity() const { return m_reports.h_view.dimension_0(); } int getCapacity() const { return m_reports.h_view.extent(0); }
int getNumReports(); int getNumReports();
@ -90,7 +90,7 @@ public:
{ {
int idx = Kokkos::atomic_fetch_add(&m_numReportsAttempted(), 1); int idx = Kokkos::atomic_fetch_add(&m_numReportsAttempted(), 1);
if (idx >= 0 && (idx < static_cast<int>(m_reports.d_view.dimension_0()))) { if (idx >= 0 && (idx < static_cast<int>(m_reports.d_view.extent(0)))) {
m_reporters.d_view(idx) = reporter_id; m_reporters.d_view(idx) = reporter_id;
m_reports.d_view(idx) = report; m_reports.d_view(idx) = report;
return true; return true;
@ -118,8 +118,8 @@ inline int ErrorReporter<ReportType, DeviceType>::getNumReports()
{ {
int num_reports = 0; int num_reports = 0;
Kokkos::deep_copy(num_reports,m_numReportsAttempted); Kokkos::deep_copy(num_reports,m_numReportsAttempted);
if (num_reports > static_cast<int>(m_reports.h_view.dimension_0())) { if (num_reports > static_cast<int>(m_reports.h_view.extent(0))) {
num_reports = m_reports.h_view.dimension_0(); num_reports = m_reports.h_view.extent(0);
} }
return num_reports; return num_reports;
} }
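
The `ErrorReporter` hunks above only rename `dimension_0()` to `extent(0)`, but the surrounding logic is worth seeing on its own: a slot is reserved with an atomic fetch-and-add, the write happens only if the slot is in range, and the reported count is clamped to the capacity. The sketch below is a host-only analogue using `std::atomic`; `BoundedReporter` and its members are hypothetical names, not part of Kokkos.

```cpp
#include <atomic>
#include <vector>

// Hypothetical host-only analogue of the reporter's bounded append.
class BoundedReporter {
 public:
  explicit BoundedReporter(int capacity)
      : reports_(capacity), attempted_(0) {}

  // Reserve a slot atomically; store only if the slot is within capacity.
  bool add_report(int report) {
    const int idx = attempted_.fetch_add(1);
    if (idx >= 0 && idx < static_cast<int>(reports_.size())) {
      reports_[idx] = report;
      return true;
    }
    return false;  // buffer full: the attempt is counted but not stored
  }

  // Number of stored reports: attempts clamped to capacity.
  int num_reports() const {
    const int n   = attempted_.load();
    const int cap = static_cast<int>(reports_.size());
    return n > cap ? cap : n;
  }

  // Total attempts, including those that did not fit.
  int num_attempts() const { return attempted_.load(); }

 private:
  std::vector<int> reports_;
  std::atomic<int> attempted_;
};
```

Counting attempts separately from stored reports lets a caller detect overflow after the fact by comparing the two numbers.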


@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -623,14 +623,12 @@ public:
typename ExecSpace::memory_space, typename ExecSpace::memory_space,
typename dest_type::memory_space>::value, typename dest_type::memory_space>::value,
"ScatterView deep_copy destination memory space not accessible"); "ScatterView deep_copy destination memory space not accessible");
size_t strides[8];
internal_view.stride(strides);
bool is_equal = (dest.data() == internal_view.data()); bool is_equal = (dest.data() == internal_view.data());
size_t start = is_equal ? 1 : 0; size_t start = is_equal ? 1 : 0;
Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>( Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>(
internal_view.data(), internal_view.data(),
dest.data(), dest.data(),
strides[0], internal_view.stride(0),
start, start,
internal_view.extent(0), internal_view.extent(0),
internal_view.label()); internal_view.label());
@ -772,9 +770,6 @@ public:
typename ExecSpace::memory_space, typename ExecSpace::memory_space,
typename dest_type::memory_space>::value, typename dest_type::memory_space>::value,
"ScatterView deep_copy destination memory space not accessible"); "ScatterView deep_copy destination memory space not accessible");
size_t strides[8];
internal_view.stride(strides);
size_t stride = strides[internal_view_type::rank - 1];
auto extent = internal_view.extent( auto extent = internal_view.extent(
internal_view_type::rank - 1); internal_view_type::rank - 1);
bool is_equal = (dest.data() == internal_view.data()); bool is_equal = (dest.data() == internal_view.data());
@ -782,7 +777,7 @@ public:
Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>( Kokkos::Impl::Experimental::ReduceDuplicates<ExecSpace, original_value_type, Op>(
internal_view.data(), internal_view.data(),
dest.data(), dest.data(),
stride, internal_view.stride(internal_view_type::rank - 1),
start, start,
extent, extent,
internal_view.label()); internal_view.label());


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -70,7 +70,7 @@ namespace Impl {
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator() (const int_type& iRow) const { void operator() (const int_type& iRow) const {
const int_type num_rows = row_offsets.dimension_0()-1; const int_type num_rows = row_offsets.extent(0)-1;
const int_type num_entries = row_offsets(num_rows); const int_type num_entries = row_offsets(num_rows);
const int_type total_cost = num_entries + num_rows*cost_per_row; const int_type total_cost = num_entries + num_rows*cost_per_row;
@ -105,7 +105,7 @@ namespace Impl {
} }
} else { } else {
if((count >= (current_block + 1) * cost_per_workset) || if((count >= (current_block + 1) * cost_per_workset) ||
(iRow+2 == row_offsets.dimension_0())) { (iRow+2 == row_offsets.extent(0))) {
if(end_block>current_block+1) { if(end_block>current_block+1) {
int_type num_block = end_block-current_block; int_type num_block = end_block-current_block;
row_block_offsets(current_block+1) = iRow; row_block_offsets(current_block+1) = iRow;
@ -330,8 +330,8 @@ public:
*/ */
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
size_type numRows() const { size_type numRows() const {
return (row_map.dimension_0 () != 0) ? return (row_map.extent(0) != 0) ?
row_map.dimension_0 () - static_cast<size_type> (1) : row_map.extent(0) - static_cast<size_type> (1) :
static_cast<size_type> (0); static_cast<size_type> (0);
} }
@ -458,7 +458,7 @@ DataType maximum_entry( const StaticCrsGraph< DataType , Arg1Type , Arg2Type , S
typedef Impl::StaticCrsGraphMaximumEntry< GraphType > FunctorType ; typedef Impl::StaticCrsGraphMaximumEntry< GraphType > FunctorType ;
DataType result = 0 ; DataType result = 0 ;
Kokkos::parallel_reduce( graph.entries.dimension_0(), Kokkos::parallel_reduce( graph.entries.extent(0),
FunctorType(graph), result ); FunctorType(graph), result );
return result ; return result ;
} }
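
`numRows()` in the hunk above relies on the usual compressed-row convention: a row map with N + 1 offsets describes N rows, and an empty row map describes zero rows. A minimal sketch of that convention with plain `std::vector` follows; `num_rows` and `row_length` are hypothetical free functions used for illustration, not Kokkos API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rows described by a CRS row map: one fewer than the number of offsets,
// or zero when the map is empty (mirrors the guard in numRows()).
inline std::size_t num_rows(const std::vector<std::size_t>& row_map) {
  return row_map.empty() ? 0 : row_map.size() - 1;
}

// Entries in a given row: difference of consecutive offsets.
inline std::size_t row_length(const std::vector<std::size_t>& row_map,
                              std::size_t row) {
  assert(row + 1 < row_map.size());
  return row_map[row + 1] - row_map[row];
}
```

The explicit empty check matters because the extent is unsigned; subtracting one from zero would wrap around rather than yield zero rows.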


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -477,7 +477,7 @@ public:
/// kernel. /// kernel.
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
size_type hash_capacity() const size_type hash_capacity() const
{ return m_hash_lists.dimension_0(); } { return m_hash_lists.extent(0); }
//--------------------------------------------------------------------------- //---------------------------------------------------------------------------
//--------------------------------------------------------------------------- //---------------------------------------------------------------------------
@ -507,13 +507,13 @@ public:
int volatile & failed_insert_ref = m_scalars((int)failed_insert_idx) ; int volatile & failed_insert_ref = m_scalars((int)failed_insert_idx) ;
const size_type hash_value = m_hasher(k); const size_type hash_value = m_hasher(k);
const size_type hash_list = hash_value % m_hash_lists.dimension_0(); const size_type hash_list = hash_value % m_hash_lists.extent(0);
size_type * curr_ptr = & m_hash_lists[ hash_list ]; size_type * curr_ptr = & m_hash_lists[ hash_list ];
size_type new_index = invalid_index ; size_type new_index = invalid_index ;
// Force integer multiply to long // Force integer multiply to long
size_type index_hint = static_cast<size_type>( (static_cast<double>(hash_list) * capacity()) / m_hash_lists.dimension_0()); size_type index_hint = static_cast<size_type>( (static_cast<double>(hash_list) * capacity()) / m_hash_lists.extent(0));
size_type find_attempts = 0; size_type find_attempts = 0;
@ -645,7 +645,7 @@ public:
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
size_type find( const key_type & k) const size_type find( const key_type & k) const
{ {
size_type curr = 0u < capacity() ? m_hash_lists( m_hasher(k) % m_hash_lists.dimension_0() ) : invalid_index ; size_type curr = 0u < capacity() ? m_hash_lists( m_hasher(k) % m_hash_lists.extent(0) ) : invalid_index ;
KOKKOS_NONTEMPORAL_PREFETCH_LOAD(&m_keys[curr != invalid_index ? curr : 0]); KOKKOS_NONTEMPORAL_PREFETCH_LOAD(&m_keys[curr != invalid_index ? curr : 0]);
while (curr != invalid_index && !m_equal_to( m_keys[curr], k) ) { while (curr != invalid_index && !m_equal_to( m_keys[curr], k) ) {
@ -741,7 +741,7 @@ public:
>::type >::type
create_copy_view( UnorderedMap<SKey, SValue, SDevice, Hasher,EqualTo> const& src) create_copy_view( UnorderedMap<SKey, SValue, SDevice, Hasher,EqualTo> const& src)
{ {
if (m_hash_lists.ptr_on_device() != src.m_hash_lists.ptr_on_device()) { if (m_hash_lists.data() != src.m_hash_lists.data()) {
insertable_map_type tmp; insertable_map_type tmp;
@ -750,23 +750,23 @@ public:
tmp.m_equal_to = src.m_equal_to; tmp.m_equal_to = src.m_equal_to;
tmp.m_size = src.size(); tmp.m_size = src.size();
tmp.m_available_indexes = bitset_type( src.capacity() ); tmp.m_available_indexes = bitset_type( src.capacity() );
tmp.m_hash_lists = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap hash list"), src.m_hash_lists.dimension_0() ); tmp.m_hash_lists = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap hash list"), src.m_hash_lists.extent(0) );
tmp.m_next_index = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap next index"), src.m_next_index.dimension_0() ); tmp.m_next_index = size_type_view( ViewAllocateWithoutInitializing("UnorderedMap next index"), src.m_next_index.extent(0) );
tmp.m_keys = key_type_view( ViewAllocateWithoutInitializing("UnorderedMap keys"), src.m_keys.dimension_0() ); tmp.m_keys = key_type_view( ViewAllocateWithoutInitializing("UnorderedMap keys"), src.m_keys.extent(0) );
tmp.m_values = value_type_view( ViewAllocateWithoutInitializing("UnorderedMap values"), src.m_values.dimension_0() ); tmp.m_values = value_type_view( ViewAllocateWithoutInitializing("UnorderedMap values"), src.m_values.extent(0) );
tmp.m_scalars = scalars_view("UnorderedMap scalars"); tmp.m_scalars = scalars_view("UnorderedMap scalars");
Kokkos::deep_copy(tmp.m_available_indexes, src.m_available_indexes); Kokkos::deep_copy(tmp.m_available_indexes, src.m_available_indexes);
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, typename SDevice::memory_space > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, typename SDevice::memory_space > raw_deep_copy;
raw_deep_copy(tmp.m_hash_lists.ptr_on_device(), src.m_hash_lists.ptr_on_device(), sizeof(size_type)*src.m_hash_lists.dimension_0()); raw_deep_copy(tmp.m_hash_lists.data(), src.m_hash_lists.data(), sizeof(size_type)*src.m_hash_lists.extent(0));
raw_deep_copy(tmp.m_next_index.ptr_on_device(), src.m_next_index.ptr_on_device(), sizeof(size_type)*src.m_next_index.dimension_0()); raw_deep_copy(tmp.m_next_index.data(), src.m_next_index.data(), sizeof(size_type)*src.m_next_index.extent(0));
raw_deep_copy(tmp.m_keys.ptr_on_device(), src.m_keys.ptr_on_device(), sizeof(key_type)*src.m_keys.dimension_0()); raw_deep_copy(tmp.m_keys.data(), src.m_keys.data(), sizeof(key_type)*src.m_keys.extent(0));
if (!is_set) { if (!is_set) {
raw_deep_copy(tmp.m_values.ptr_on_device(), src.m_values.ptr_on_device(), sizeof(impl_value_type)*src.m_values.dimension_0()); raw_deep_copy(tmp.m_values.data(), src.m_values.data(), sizeof(impl_value_type)*src.m_values.extent(0));
} }
raw_deep_copy(tmp.m_scalars.ptr_on_device(), src.m_scalars.ptr_on_device(), sizeof(int)*num_scalars ); raw_deep_copy(tmp.m_scalars.data(), src.m_scalars.data(), sizeof(int)*num_scalars );
*this = tmp; *this = tmp;
} }
@ -784,21 +784,21 @@ private: // private member functions
{ {
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy;
const int true_ = true; const int true_ = true;
raw_deep_copy(m_scalars.ptr_on_device() + flag, &true_, sizeof(int)); raw_deep_copy(m_scalars.data() + flag, &true_, sizeof(int));
} }
void reset_flag(int flag) const void reset_flag(int flag) const
{ {
typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< typename device_type::memory_space, Kokkos::HostSpace > raw_deep_copy;
const int false_ = false; const int false_ = false;
raw_deep_copy(m_scalars.ptr_on_device() + flag, &false_, sizeof(int)); raw_deep_copy(m_scalars.data() + flag, &false_, sizeof(int));
} }
bool get_flag(int flag) const bool get_flag(int flag) const
{ {
typedef Kokkos::Impl::DeepCopy< Kokkos::HostSpace, typename device_type::memory_space > raw_deep_copy; typedef Kokkos::Impl::DeepCopy< Kokkos::HostSpace, typename device_type::memory_space > raw_deep_copy;
int result = false; int result = false;
raw_deep_copy(&result, m_scalars.ptr_on_device() + flag, sizeof(int)); raw_deep_copy(&result, m_scalars.data() + flag, sizeof(int));
return result; return result;
} }
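
The insert path in the `UnorderedMap` hunks above picks a hash list with a modulo and then scales that list number into the index range to get a starting hint for the free-slot search. The sketch below isolates those two expressions; `hash_list_of` and `index_hint_of` are hypothetical helper names, and the double-precision intermediate follows the cast shown in the diff.

```cpp
#include <cassert>
#include <cstddef>

// Which hash list a hash value falls into.
inline std::size_t hash_list_of(std::size_t hash_value,
                                std::size_t num_lists) {
  assert(num_lists > 0);
  return hash_value % num_lists;
}

// Scale the list number into [0, capacity) as a search starting point.
// The product is formed in double, as in the diff, so it cannot
// overflow a narrow integer type.
inline std::size_t index_hint_of(std::size_t hash_list,
                                 std::size_t capacity,
                                 std::size_t num_lists) {
  assert(num_lists > 0);
  return static_cast<std::size_t>(
      (static_cast<double>(hash_list) * capacity) / num_lists);
}
```

Spreading the hints proportionally across the capacity means keys that land in different hash lists start their search in different regions of the index space.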


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -80,7 +80,7 @@ struct BitsetCount
size_type apply() const size_type apply() const
{ {
size_type count = 0u; size_type count = 0u;
parallel_reduce( m_bitset.m_blocks.dimension_0(), *this, count ); parallel_reduce( m_bitset.m_blocks.extent(0), *this, count );
return count; return count;
} }


@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -102,7 +102,7 @@ struct UnorderedMapErase
void apply() const void apply() const
{ {
parallel_for(m_map.m_hash_lists.dimension_0(), *this); parallel_for(m_map.m_hash_lists.extent(0), *this);
} }
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
@ -170,7 +170,7 @@ struct UnorderedMapHistogram
void calculate() void calculate()
{ {
parallel_for(m_map.m_hash_lists.dimension_0(), *this); parallel_for(m_map.m_hash_lists.extent(0), *this);
} }
void clear() void clear()
@ -185,7 +185,7 @@ struct UnorderedMapHistogram
host_histogram_view host_copy = create_mirror_view(m_length); host_histogram_view host_copy = create_mirror_view(m_length);
Kokkos::deep_copy(host_copy, m_length); Kokkos::deep_copy(host_copy, m_length);
for (int i=0, size = host_copy.dimension_0(); i<size; ++i) for (int i=0, size = host_copy.extent(0); i<size; ++i)
{ {
out << host_copy[i] << " , "; out << host_copy[i] << " , ";
} }
@ -197,7 +197,7 @@ struct UnorderedMapHistogram
host_histogram_view host_copy = create_mirror_view(m_distance); host_histogram_view host_copy = create_mirror_view(m_distance);
Kokkos::deep_copy(host_copy, m_distance); Kokkos::deep_copy(host_copy, m_distance);
for (int i=0, size = host_copy.dimension_0(); i<size; ++i) for (int i=0, size = host_copy.extent(0); i<size; ++i)
{ {
out << host_copy[i] << " , "; out << host_copy[i] << " , ";
} }
@ -209,7 +209,7 @@ struct UnorderedMapHistogram
host_histogram_view host_copy = create_mirror_view(m_block_distance); host_histogram_view host_copy = create_mirror_view(m_block_distance);
Kokkos::deep_copy(host_copy, m_block_distance); Kokkos::deep_copy(host_copy, m_block_distance);
for (int i=0, size = host_copy.dimension_0(); i<size; ++i) for (int i=0, size = host_copy.extent(0); i<size; ++i)
{ {
out << host_copy[i] << " , "; out << host_copy[i] << " , ";
} }
@ -261,7 +261,7 @@ struct UnorderedMapPrint
void apply() void apply()
{ {
parallel_for(m_map.m_hash_lists.dimension_0(), *this); parallel_for(m_map.m_hash_lists.extent(0), *this);
} }
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION


@ -34,7 +34,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -83,13 +83,9 @@ protected:
static void SetUpTestCase() static void SetUpTestCase()
{ {
std::cout << std::setprecision(5) << std::scientific; std::cout << std::setprecision(5) << std::scientific;
Kokkos::HostSpace::execution_space::initialize();
Kokkos::Cuda::initialize( Kokkos::Cuda::SelectDevice(0) );
} }
static void TearDownTestCase() static void TearDownTestCase()
{ {
Kokkos::Cuda::finalize();
Kokkos::HostSpace::execution_space::finalize();
} }
}; };


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -88,10 +88,10 @@ namespace Impl {
a.template sync<typename ViewType::host_mirror_space>(); a.template sync<typename ViewType::host_mirror_space>();
Scalar count = 0; Scalar count = 0;
for(unsigned int i = 0; i<a.d_view.dimension_0(); i++) for(unsigned int i = 0; i<a.d_view.extent(0); i++)
for(unsigned int j = 0; j<a.d_view.dimension_1(); j++) for(unsigned int j = 0; j<a.d_view.extent(1); j++)
count += a.h_view(i,j); count += a.h_view(i,j);
return count - a.d_view.dimension_0()*a.d_view.dimension_1()-2-4-3*2; return count - a.d_view.extent(0)*a.d_view.extent(1)-2-4-3*2;
} }


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -56,7 +56,7 @@
namespace Test { namespace Test {
template< class T , class ... P > template< class T , class ... P >
size_t allocation_count( const Kokkos::Experimental::DynRankView<T,P...> & view ) size_t allocation_count( const Kokkos::DynRankView<T,P...> & view )
{ {
const size_t card = view.size(); const size_t card = view.size();
const size_t alloc = view.span(); const size_t alloc = view.span();
@ -74,7 +74,7 @@ struct TestViewOperator
static const unsigned N = 100 ; static const unsigned N = 100 ;
static const unsigned D = 3 ; static const unsigned D = 3 ;
typedef Kokkos::Experimental::DynRankView< T , execution_space > view_type ; typedef Kokkos::DynRankView< T , execution_space > view_type ;
const view_type v1 ; const view_type v1 ;
const view_type v2 ; const view_type v2 ;
@ -129,10 +129,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 7 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -163,13 +163,13 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 7 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i6 = 0 ; i6 < unsigned(left.dimension_6()) ; ++i6 ) for ( unsigned i6 = 0 ; i6 < unsigned(left.extent(6)) ; ++i6 )
for ( unsigned i5 = 0 ; i5 < unsigned(left.dimension_5()) ; ++i5 ) for ( unsigned i5 = 0 ; i5 < unsigned(left.extent(5)) ; ++i5 )
for ( unsigned i4 = 0 ; i4 < unsigned(left.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(left.extent(4)) ; ++i4 )
for ( unsigned i3 = 0 ; i3 < unsigned(left.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(left.extent(3)) ; ++i3 )
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1, i2, i3, i4, i5, i6 ) - const long j = & left( i0, i1, i2, i3, i4, i5, i6 ) -
& left( 0, 0, 0, 0, 0, 0, 0 ); & left( 0, 0, 0, 0, 0, 0, 0 );
@ -178,13 +178,13 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 7 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(right.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(right.extent(2)) ; ++i2 )
for ( unsigned i3 = 0 ; i3 < unsigned(right.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(right.extent(3)) ; ++i3 )
for ( unsigned i4 = 0 ; i4 < unsigned(right.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(right.extent(4)) ; ++i4 )
for ( unsigned i5 = 0 ; i5 < unsigned(right.dimension_5()) ; ++i5 ) for ( unsigned i5 = 0 ; i5 < unsigned(right.extent(5)) ; ++i5 )
for ( unsigned i6 = 0 ; i6 < unsigned(right.dimension_6()) ; ++i6 ) for ( unsigned i6 = 0 ; i6 < unsigned(right.extent(6)) ; ++i6 )
{ {
const long j = & right( i0, i1, i2, i3, i4, i5, i6 ) - const long j = & right( i0, i1, i2, i3, i4, i5, i6 ) -
& right( 0, 0, 0, 0, 0, 0, 0 ); & right( 0, 0, 0, 0, 0, 0, 0 );
@ -214,10 +214,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 6 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -248,12 +248,12 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 6 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i5 = 0 ; i5 < unsigned(left.dimension_5()) ; ++i5 ) for ( unsigned i5 = 0 ; i5 < unsigned(left.extent(5)) ; ++i5 )
for ( unsigned i4 = 0 ; i4 < unsigned(left.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(left.extent(4)) ; ++i4 )
for ( unsigned i3 = 0 ; i3 < unsigned(left.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(left.extent(3)) ; ++i3 )
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1, i2, i3, i4, i5 ) - const long j = & left( i0, i1, i2, i3, i4, i5 ) -
& left( 0, 0, 0, 0, 0, 0 ); & left( 0, 0, 0, 0, 0, 0 );
@ -262,12 +262,12 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 6 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(right.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(right.extent(2)) ; ++i2 )
for ( unsigned i3 = 0 ; i3 < unsigned(right.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(right.extent(3)) ; ++i3 )
for ( unsigned i4 = 0 ; i4 < unsigned(right.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(right.extent(4)) ; ++i4 )
for ( unsigned i5 = 0 ; i5 < unsigned(right.dimension_5()) ; ++i5 ) for ( unsigned i5 = 0 ; i5 < unsigned(right.extent(5)) ; ++i5 )
{ {
const long j = & right( i0, i1, i2, i3, i4, i5 ) - const long j = & right( i0, i1, i2, i3, i4, i5 ) -
& right( 0, 0, 0, 0, 0, 0 ); & right( 0, 0, 0, 0, 0, 0 );
@ -297,13 +297,13 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 5 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ; DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -338,11 +338,11 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 5 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i4 = 0 ; i4 < unsigned(left.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(left.extent(4)) ; ++i4 )
for ( unsigned i3 = 0 ; i3 < unsigned(left.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(left.extent(3)) ; ++i3 )
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1, i2, i3, i4 ) - const long j = & left( i0, i1, i2, i3, i4 ) -
& left( 0, 0, 0, 0, 0 ); & left( 0, 0, 0, 0, 0 );
@ -354,11 +354,11 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 5 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(right.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(right.extent(2)) ; ++i2 )
for ( unsigned i3 = 0 ; i3 < unsigned(right.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(right.extent(3)) ; ++i3 )
for ( unsigned i4 = 0 ; i4 < unsigned(right.dimension_4()) ; ++i4 ) for ( unsigned i4 = 0 ; i4 < unsigned(right.extent(4)) ; ++i4 )
{ {
const long j = & right( i0, i1, i2, i3, i4 ) - const long j = & right( i0, i1, i2, i3, i4 ) -
& right( 0, 0, 0, 0, 0 ); & right( 0, 0, 0, 0, 0 );
@ -391,10 +391,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 4 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -425,10 +425,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 4 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i3 = 0 ; i3 < unsigned(left.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(left.extent(3)) ; ++i3 )
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1, i2, i3 ) - const long j = & left( i0, i1, i2, i3 ) -
& left( 0, 0, 0, 0 ); & left( 0, 0, 0, 0 );
@ -437,10 +437,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 4 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(right.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(right.extent(2)) ; ++i2 )
for ( unsigned i3 = 0 ; i3 < unsigned(right.dimension_3()) ; ++i3 ) for ( unsigned i3 = 0 ; i3 < unsigned(right.extent(3)) ; ++i3 )
{ {
const long j = & right( i0, i1, i2, i3 ) - const long j = & right( i0, i1, i2, i3 ) -
& right( 0, 0, 0, 0 ); & right( 0, 0, 0, 0 );
@ -470,13 +470,13 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 3 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ; DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -511,9 +511,9 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 3 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1, i2 ) - const long j = & left( i0, i1, i2 ) -
& left( 0, 0, 0 ); & left( 0, 0, 0 );
@ -524,9 +524,9 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 3 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(right.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(right.extent(2)) ; ++i2 )
{ {
const long j = & right( i0, i1, i2 ) - const long j = & right( i0, i1, i2 ) -
& right( 0, 0, 0 ); & right( 0, 0, 0 );
@ -536,9 +536,9 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 3 >
if ( & right(i0,i1,i2) != & right_stride(i0,i1,i2) ) { update |= 8 ; } if ( & right(i0,i1,i2) != & right_stride(i0,i1,i2) ) { update |= 8 ; }
} }
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i2 = 0 ; i2 < unsigned(left.dimension_2()) ; ++i2 ) for ( unsigned i2 = 0 ; i2 < unsigned(left.extent(2)) ; ++i2 )
{ {
if ( & left(i0,i1,i2) != & left(i0,i1,i2,0,0,0,0) ) { update |= 3 ; } if ( & left(i0,i1,i2) != & left(i0,i1,i2,0,0,0,0) ) { update |= 3 ; }
if ( & right(i0,i1,i2) != & right(i0,i1,i2,0,0,0,0) ) { update |= 3 ; } if ( & right(i0,i1,i2) != & right(i0,i1,i2,0,0,0,0) ) { update |= 3 ; }
@ -566,10 +566,10 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 2 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -600,8 +600,8 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 2 >
long offset ; long offset ;
offset = -1 ; offset = -1 ;
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
const long j = & left( i0, i1 ) - const long j = & left( i0, i1 ) -
& left( 0, 0 ); & left( 0, 0 );
@ -610,8 +610,8 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 2 >
} }
offset = -1 ; offset = -1 ;
for ( unsigned i0 = 0 ; i0 < unsigned(right.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(right.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(right.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(right.extent(1)) ; ++i1 )
{ {
const long j = & right( i0, i1 ) - const long j = & right( i0, i1 ) -
& right( 0, 0 ); & right( 0, 0 );
@ -619,8 +619,8 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 2 >
offset = j ; offset = j ;
} }
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
for ( unsigned i1 = 0 ; i1 < unsigned(left.dimension_1()) ; ++i1 ) for ( unsigned i1 = 0 ; i1 < unsigned(left.extent(1)) ; ++i1 )
{ {
if ( & left(i0,i1) != & left(i0,i1,0,0,0,0,0) ) { update |= 3 ; } if ( & left(i0,i1) != & left(i0,i1,0,0,0,0,0) ) { update |= 3 ; }
if ( & right(i0,i1) != & right(i0,i1,0,0,0,0,0) ) { update |= 3 ; } if ( & right(i0,i1) != & right(i0,i1,0,0,0,0,0) ) { update |= 3 ; }
@ -648,13 +648,13 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 1 >
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ; DynRankView< DataType, Kokkos::LayoutLeft, execution_space > left_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ; DynRankView< DataType, Kokkos::LayoutRight, execution_space > right_view ;
typedef Kokkos:: typedef Kokkos::
Experimental::DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ; DynRankView< DataType, Kokkos::LayoutStride, execution_space > stride_view ;
left_view left ; left_view left ;
right_view right ; right_view right ;
@ -686,7 +686,7 @@ struct TestViewOperator_LeftAndRight< DataType , DeviceType , 1 >
KOKKOS_INLINE_FUNCTION KOKKOS_INLINE_FUNCTION
void operator()( const size_type , value_type & update ) const void operator()( const size_type , value_type & update ) const
{ {
for ( unsigned i0 = 0 ; i0 < unsigned(left.dimension_0()) ; ++i0 ) for ( unsigned i0 = 0 ; i0 < unsigned(left.extent(0)) ; ++i0 )
{ {
if ( & left(i0) != & left(i0,0,0,0,0,0,0) ) { update |= 3 ; } if ( & left(i0) != & left(i0,0,0,0,0,0,0) ) { update |= 3 ; }
if ( & right(i0) != & right(i0,0,0,0,0,0,0) ) { update |= 3 ; } if ( & right(i0) != & right(i0,0,0,0,0,0,0) ) { update |= 3 ; }
@ -709,10 +709,10 @@ public:
N2 = 5 , N2 = 5 ,
N3 = 7 }; N3 = 7 };
typedef Kokkos::Experimental::DynRankView< T , device > dView0 ; typedef Kokkos::DynRankView< T , device > dView0 ;
typedef Kokkos::Experimental::DynRankView< const T , device > const_dView0 ; typedef Kokkos::DynRankView< const T , device > const_dView0 ;
typedef Kokkos::Experimental::DynRankView< T, device, Kokkos::MemoryUnmanaged > dView0_unmanaged ; typedef Kokkos::DynRankView< T, device, Kokkos::MemoryUnmanaged > dView0_unmanaged ;
typedef typename dView0::host_mirror_space host_drv_space ; typedef typename dView0::host_mirror_space host_drv_space ;
typedef Kokkos::View< T , device > View0 ; typedef Kokkos::View< T , device > View0 ;
@ -747,27 +747,27 @@ public:
dView0 drv0("drv0", 10, 20, 30); dView0 drv0("drv0", 10, 20, 30);
ASSERT_EQ( drv0.rank(), 3); ASSERT_EQ( drv0.rank(), 3);
Kokkos::Experimental::resize(drv0, 5, 10); Kokkos::resize(drv0, 5, 10);
ASSERT_EQ( drv0.rank(), 2); ASSERT_EQ( drv0.rank(), 2);
ASSERT_EQ( drv0.dimension_0(), 5); ASSERT_EQ( drv0.extent(0), 5);
ASSERT_EQ( drv0.dimension_1(), 10); ASSERT_EQ( drv0.extent(1), 10);
ASSERT_EQ( drv0.dimension_2(), 1); ASSERT_EQ( drv0.extent(2), 1);
Kokkos::Experimental::realloc(drv0, 10, 20); Kokkos::realloc(drv0, 10, 20);
ASSERT_EQ( drv0.rank(), 2); ASSERT_EQ( drv0.rank(), 2);
ASSERT_EQ( drv0.dimension_0(), 10); ASSERT_EQ( drv0.extent(0), 10);
ASSERT_EQ( drv0.dimension_1(), 20); ASSERT_EQ( drv0.extent(1), 20);
ASSERT_EQ( drv0.dimension_2(), 1); ASSERT_EQ( drv0.extent(2), 1);
} }
static void run_test_mirror() static void run_test_mirror()
{ {
typedef Kokkos::Experimental::DynRankView< int , host_drv_space > view_type ; typedef Kokkos::DynRankView< int , host_drv_space > view_type ;
typedef typename view_type::HostMirror mirror_type ; typedef typename view_type::HostMirror mirror_type ;
view_type a("a"); view_type a("a");
mirror_type am = Kokkos::Experimental::create_mirror_view(a); mirror_type am = Kokkos::create_mirror_view(a);
mirror_type ax = Kokkos::Experimental::create_mirror(a); mirror_type ax = Kokkos::create_mirror(a);
ASSERT_EQ( & a() , & am() ); ASSERT_EQ( & a() , & am() );
ASSERT_EQ( a.rank() , am.rank() ); ASSERT_EQ( a.rank() , am.rank() );
ASSERT_EQ( ax.rank() , am.rank() ); ASSERT_EQ( ax.rank() , am.rank() );
@ -786,8 +786,8 @@ public:
ASSERT_EQ(equal_ptr_h_d ,0); ASSERT_EQ(equal_ptr_h_d ,0);
ASSERT_EQ(equal_ptr_h2_d,0); ASSERT_EQ(equal_ptr_h2_d,0);
ASSERT_EQ(a_h.dimension_0(),a_h2.dimension_0()); ASSERT_EQ(a_h.extent(0),a_h2.extent(0));
ASSERT_EQ(a_h.dimension_0(),a_d .dimension_0()); ASSERT_EQ(a_h.extent(0),a_d .extent(0));
ASSERT_EQ(a_h.rank(),a_h2.rank()); ASSERT_EQ(a_h.rank(),a_h2.rank());
ASSERT_EQ(a_h.rank(),a_d.rank()); ASSERT_EQ(a_h.rank(),a_d.rank());
@ -806,8 +806,8 @@ public:
ASSERT_EQ(equal_ptr_h_d ,0); ASSERT_EQ(equal_ptr_h_d ,0);
ASSERT_EQ(equal_ptr_h2_d,0); ASSERT_EQ(equal_ptr_h2_d,0);
ASSERT_EQ(a_h.dimension_0(),a_h2.dimension_0()); ASSERT_EQ(a_h.extent(0),a_h2.extent(0));
ASSERT_EQ(a_h.dimension_0(),a_d .dimension_0()); ASSERT_EQ(a_h.extent(0),a_d .extent(0));
ASSERT_EQ(a_h.rank(),a_h2.rank()); ASSERT_EQ(a_h.rank(),a_h2.rank());
ASSERT_EQ(a_h.rank(),a_d.rank()); ASSERT_EQ(a_h.rank(),a_d.rank());
@ -828,8 +828,8 @@ public:
ASSERT_EQ(equal_ptr_h_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h_d ,is_same_memspace);
ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace);
ASSERT_EQ(a_h.dimension_0(),a_h2.dimension_0()); ASSERT_EQ(a_h.extent(0),a_h2.extent(0));
ASSERT_EQ(a_h.dimension_0(),a_d .dimension_0()); ASSERT_EQ(a_h.extent(0),a_d .extent(0));
ASSERT_EQ(a_h.rank(),a_h2.rank()); ASSERT_EQ(a_h.rank(),a_h2.rank());
ASSERT_EQ(a_h.rank(),a_d.rank()); ASSERT_EQ(a_h.rank(),a_d.rank());
@ -849,8 +849,9 @@ public:
ASSERT_EQ(equal_ptr_h_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h_d ,is_same_memspace);
ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace);
ASSERT_EQ(a_h.dimension_0(),a_h2.dimension_0());
ASSERT_EQ(a_h.dimension_0(),a_d .dimension_0()); ASSERT_EQ(a_h.extent(0),a_h2.extent(0));
ASSERT_EQ(a_h.extent(0),a_d .extent(0));
ASSERT_EQ(a_h.rank(),a_h2.rank()); ASSERT_EQ(a_h.rank(),a_h2.rank());
ASSERT_EQ(a_h.rank(),a_d.rank()); ASSERT_EQ(a_h.rank(),a_d.rank());
@ -872,8 +873,8 @@ public:
ASSERT_EQ(equal_ptr_h_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h_d ,is_same_memspace);
ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace); ASSERT_EQ(equal_ptr_h2_d ,is_same_memspace);
ASSERT_EQ(a_h.dimension_0(),a_h2.dimension_0()); ASSERT_EQ(a_h.extent(0),a_h2.extent(0));
ASSERT_EQ(a_h.dimension_0(),a_d .dimension_0()); ASSERT_EQ(a_h.extent(0),a_d .extent(0));
ASSERT_EQ(a_h.rank(),a_h2.rank()); ASSERT_EQ(a_h.rank(),a_h2.rank());
ASSERT_EQ(a_h.rank(),a_d.rank()); ASSERT_EQ(a_h.rank(),a_d.rank());
@ -890,14 +891,14 @@ public:
dx = dView0( "dx" ); dx = dView0( "dx" );
dy = dView0( "dy" ); dy = dView0( "dy" );
hx = Kokkos::Experimental::create_mirror( dx ); hx = Kokkos::create_mirror( dx );
hy = Kokkos::Experimental::create_mirror( dy ); hy = Kokkos::create_mirror( dy );
hx() = 1 ; hx() = 1 ;
Kokkos::Experimental::deep_copy( dx , hx ); Kokkos::deep_copy( dx , hx );
Kokkos::Experimental::deep_copy( dy , dx ); Kokkos::deep_copy( dy , dx );
Kokkos::Experimental::deep_copy( hy , dy ); Kokkos::deep_copy( hy , dy );
ASSERT_EQ( hx(), hy() ); ASSERT_EQ( hx(), hy() );
ASSERT_EQ( dx.rank() , hx.rank() ); ASSERT_EQ( dx.rank() , hx.rank() );
@ -920,18 +921,18 @@ public:
View7 vcast = dx.ConstDownCast(); View7 vcast = dx.ConstDownCast();
ASSERT_EQ( dx.dimension_0() , vcast.dimension_0() ); ASSERT_EQ( dx.extent(0) , vcast.extent(0) );
ASSERT_EQ( dx.dimension_1() , vcast.dimension_1() ); ASSERT_EQ( dx.extent(1) , vcast.extent(1) );
ASSERT_EQ( dx.dimension_2() , vcast.dimension_2() ); ASSERT_EQ( dx.extent(2) , vcast.extent(2) );
ASSERT_EQ( dx.dimension_3() , vcast.dimension_3() ); ASSERT_EQ( dx.extent(3) , vcast.extent(3) );
ASSERT_EQ( dx.dimension_4() , vcast.dimension_4() ); ASSERT_EQ( dx.extent(4) , vcast.extent(4) );
View7 vcast1( dy.ConstDownCast() ); View7 vcast1( dy.ConstDownCast() );
ASSERT_EQ( dy.dimension_0() , vcast1.dimension_0() ); ASSERT_EQ( dy.extent(0) , vcast1.extent(0) );
ASSERT_EQ( dy.dimension_1() , vcast1.dimension_1() ); ASSERT_EQ( dy.extent(1) , vcast1.extent(1) );
ASSERT_EQ( dy.dimension_2() , vcast1.dimension_2() ); ASSERT_EQ( dy.extent(2) , vcast1.extent(2) );
ASSERT_EQ( dy.dimension_3() , vcast1.dimension_3() ); ASSERT_EQ( dy.extent(3) , vcast1.extent(3) );
ASSERT_EQ( dy.dimension_4() , vcast1.dimension_4() ); ASSERT_EQ( dy.extent(4) , vcast1.extent(4) );
//View - DynRankView Interoperability tests //View - DynRankView Interoperability tests
// copy View to DynRankView // copy View to DynRankView
@ -941,8 +942,8 @@ public:
auto hvx = Kokkos::create_mirror_view(vx) ; auto hvx = Kokkos::create_mirror_view(vx) ;
Kokkos::deep_copy(hvx , vx); Kokkos::deep_copy(hvx , vx);
ASSERT_EQ( rank(hvx) , rank(hmx) ); ASSERT_EQ( rank(hvx) , rank(hmx) );
ASSERT_EQ( hvx.dimension_0() , hmx.dimension_0() ); ASSERT_EQ( hvx.extent(0) , hmx.extent(0) );
ASSERT_EQ( hvx.dimension_1() , hmx.dimension_1() ); ASSERT_EQ( hvx.extent(1) , hmx.extent(1) );
// copy-assign View to DynRankView // copy-assign View to DynRankView
dView0 dfromvy = vy ; dView0 dfromvy = vy ;
@ -951,27 +952,27 @@ public:
auto hvy = Kokkos::create_mirror_view(vy) ; auto hvy = Kokkos::create_mirror_view(vy) ;
Kokkos::deep_copy(hvy , vy); Kokkos::deep_copy(hvy , vy);
ASSERT_EQ( rank(hvy) , rank(hmy) ); ASSERT_EQ( rank(hvy) , rank(hmy) );
ASSERT_EQ( hvy.dimension_0() , hmy.dimension_0() ); ASSERT_EQ( hvy.extent(0) , hmy.extent(0) );
ASSERT_EQ( hvy.dimension_1() , hmy.dimension_1() ); ASSERT_EQ( hvy.extent(1) , hmy.extent(1) );
View7 vtest1("vtest1",2,2,2,2,2,2,2); View7 vtest1("vtest1",2,2,2,2,2,2,2);
dView0 dfromv1( vtest1 ); dView0 dfromv1( vtest1 );
ASSERT_EQ( dfromv1.rank() , vtest1.Rank ); ASSERT_EQ( dfromv1.rank() , vtest1.Rank );
ASSERT_EQ( dfromv1.dimension_0() , vtest1.dimension_0() ); ASSERT_EQ( dfromv1.extent(0) , vtest1.extent(0) );
ASSERT_EQ( dfromv1.dimension_1() , vtest1.dimension_1() ); ASSERT_EQ( dfromv1.extent(1) , vtest1.extent(1) );
ASSERT_EQ( dfromv1.use_count() , vtest1.use_count() ); ASSERT_EQ( dfromv1.use_count() , vtest1.use_count() );
dView0 dfromv2( vcast ); dView0 dfromv2( vcast );
ASSERT_EQ( dfromv2.rank() , vcast.Rank ); ASSERT_EQ( dfromv2.rank() , vcast.Rank );
ASSERT_EQ( dfromv2.dimension_0() , vcast.dimension_0() ); ASSERT_EQ( dfromv2.extent(0) , vcast.extent(0) );
ASSERT_EQ( dfromv2.dimension_1() , vcast.dimension_1() ); ASSERT_EQ( dfromv2.extent(1) , vcast.extent(1) );
ASSERT_EQ( dfromv2.use_count() , vcast.use_count() ); ASSERT_EQ( dfromv2.use_count() , vcast.use_count() );
dView0 dfromv3 = vcast1; dView0 dfromv3 = vcast1;
ASSERT_EQ( dfromv3.rank() , vcast1.Rank ); ASSERT_EQ( dfromv3.rank() , vcast1.Rank );
ASSERT_EQ( dfromv3.dimension_0() , vcast1.dimension_0() ); ASSERT_EQ( dfromv3.extent(0) , vcast1.extent(0) );
ASSERT_EQ( dfromv3.dimension_1() , vcast1.dimension_1() ); ASSERT_EQ( dfromv3.extent(1) , vcast1.extent(1) );
ASSERT_EQ( dfromv3.use_count() , vcast1.use_count() ); ASSERT_EQ( dfromv3.use_count() , vcast1.use_count() );
} }
@ -993,15 +994,15 @@ public:
dView0 d_uninitialized(Kokkos::ViewAllocateWithoutInitializing("uninit"),10,20); dView0 d_uninitialized(Kokkos::ViewAllocateWithoutInitializing("uninit"),10,20);
ASSERT_TRUE( d_uninitialized.data() != nullptr ); ASSERT_TRUE( d_uninitialized.data() != nullptr );
ASSERT_EQ( d_uninitialized.rank() , 2 ); ASSERT_EQ( d_uninitialized.rank() , 2 );
ASSERT_EQ( d_uninitialized.dimension_0() , 10 ); ASSERT_EQ( d_uninitialized.extent(0) , 10 );
ASSERT_EQ( d_uninitialized.dimension_1() , 20 ); ASSERT_EQ( d_uninitialized.extent(1) , 20 );
ASSERT_EQ( d_uninitialized.dimension_2() , 1 ); ASSERT_EQ( d_uninitialized.extent(2) , 1 );
dView0 dx , dy , dz ; dView0 dx , dy , dz ;
hView0 hx , hy , hz ; hView0 hx , hy , hz ;
ASSERT_TRUE( Kokkos::Experimental::is_dyn_rank_view<dView0>::value ); ASSERT_TRUE( Kokkos::is_dyn_rank_view<dView0>::value );
ASSERT_FALSE( Kokkos::Experimental::is_dyn_rank_view< Kokkos::View<double> >::value ); ASSERT_FALSE( Kokkos::is_dyn_rank_view< Kokkos::View<double> >::value );
ASSERT_TRUE( dx.ptr_on_device() == 0 ); //Okay with UVM ASSERT_TRUE( dx.ptr_on_device() == 0 ); //Okay with UVM
ASSERT_TRUE( dy.ptr_on_device() == 0 ); //Okay with UVM ASSERT_TRUE( dy.ptr_on_device() == 0 ); //Okay with UVM
@ -1009,12 +1010,12 @@ public:
ASSERT_TRUE( hx.ptr_on_device() == 0 ); ASSERT_TRUE( hx.ptr_on_device() == 0 );
ASSERT_TRUE( hy.ptr_on_device() == 0 ); ASSERT_TRUE( hy.ptr_on_device() == 0 );
ASSERT_TRUE( hz.ptr_on_device() == 0 ); ASSERT_TRUE( hz.ptr_on_device() == 0 );
ASSERT_EQ( dx.dimension_0() , 0u ); //Okay with UVM ASSERT_EQ( dx.extent(0) , 0u ); //Okay with UVM
ASSERT_EQ( dy.dimension_0() , 0u ); //Okay with UVM ASSERT_EQ( dy.extent(0) , 0u ); //Okay with UVM
ASSERT_EQ( dz.dimension_0() , 0u ); //Okay with UVM ASSERT_EQ( dz.extent(0) , 0u ); //Okay with UVM
ASSERT_EQ( hx.dimension_0() , 0u ); ASSERT_EQ( hx.extent(0) , 0u );
ASSERT_EQ( hy.dimension_0() , 0u ); ASSERT_EQ( hy.extent(0) , 0u );
ASSERT_EQ( hz.dimension_0() , 0u ); ASSERT_EQ( hz.extent(0) , 0u );
ASSERT_EQ( dx.rank() , 0u ); //Okay with UVM ASSERT_EQ( dx.rank() , 0u ); //Okay with UVM
ASSERT_EQ( hx.rank() , 0u ); ASSERT_EQ( hx.rank() , 0u );
@ -1024,10 +1025,10 @@ public:
hx = hView0( "hx" , N1 , N2 , N3 ); hx = hView0( "hx" , N1 , N2 , N3 );
hy = hView0( "hy" , N1 , N2 , N3 ); hy = hView0( "hy" , N1 , N2 , N3 );
ASSERT_EQ( dx.dimension_0() , unsigned(N1) ); //Okay with UVM ASSERT_EQ( dx.extent(0) , unsigned(N1) ); //Okay with UVM
ASSERT_EQ( dy.dimension_0() , unsigned(N1) ); //Okay with UVM ASSERT_EQ( dy.extent(0) , unsigned(N1) ); //Okay with UVM
ASSERT_EQ( hx.dimension_0() , unsigned(N1) ); ASSERT_EQ( hx.extent(0) , unsigned(N1) );
ASSERT_EQ( hy.dimension_0() , unsigned(N1) ); ASSERT_EQ( hy.extent(0) , unsigned(N1) );
ASSERT_EQ( dx.rank() , 3 ); //Okay with UVM ASSERT_EQ( dx.rank() , 3 ); //Okay with UVM
ASSERT_EQ( hx.rank() , 3 ); ASSERT_EQ( hx.rank() , 3 );
@ -1036,10 +1037,10 @@ public:
hx = hView0( "hx" , N0 , N1 , N2 , N3 ); hx = hView0( "hx" , N0 , N1 , N2 , N3 );
hy = hView0( "hy" , N0 , N1 , N2 , N3 ); hy = hView0( "hy" , N0 , N1 , N2 , N3 );
ASSERT_EQ( dx.dimension_0() , unsigned(N0) ); ASSERT_EQ( dx.extent(0) , unsigned(N0) );
ASSERT_EQ( dy.dimension_0() , unsigned(N0) ); ASSERT_EQ( dy.extent(0) , unsigned(N0) );
ASSERT_EQ( hx.dimension_0() , unsigned(N0) ); ASSERT_EQ( hx.extent(0) , unsigned(N0) );
ASSERT_EQ( hy.dimension_0() , unsigned(N0) ); ASSERT_EQ( hy.extent(0) , unsigned(N0) );
ASSERT_EQ( dx.rank() , 4 ); ASSERT_EQ( dx.rank() , 4 );
ASSERT_EQ( dy.rank() , 4 ); ASSERT_EQ( dy.rank() , 4 );
ASSERT_EQ( hx.rank() , 4 ); ASSERT_EQ( hx.rank() , 4 );
@ -1052,19 +1053,19 @@ public:
dView0_unmanaged unmanaged_from_ptr_dx = dView0_unmanaged(dx.ptr_on_device(), dView0_unmanaged unmanaged_from_ptr_dx = dView0_unmanaged(dx.ptr_on_device(),
dx.dimension_0(), dx.extent(0),
dx.dimension_1(), dx.extent(1),
dx.dimension_2(), dx.extent(2),
dx.dimension_3()); dx.extent(3));
{ {
// Destruction of this view should be harmless // Destruction of this view should be harmless
const_dView0 unmanaged_from_ptr_const_dx( dx.ptr_on_device() , const_dView0 unmanaged_from_ptr_const_dx( dx.ptr_on_device() ,
dx.dimension_0() , dx.extent(0) ,
dx.dimension_1() , dx.extent(1) ,
dx.dimension_2() , dx.extent(2) ,
dx.dimension_3() ); dx.extent(3) );
} }
const_dView0 const_dx = dx ; const_dView0 const_dx = dx ;
@ -1095,33 +1096,33 @@ public:
ASSERT_FALSE( dy.ptr_on_device() == 0 ); ASSERT_FALSE( dy.ptr_on_device() == 0 );
ASSERT_NE( dx , dy ); ASSERT_NE( dx , dy );
ASSERT_EQ( dx.dimension_0() , unsigned(N0) ); ASSERT_EQ( dx.extent(0) , unsigned(N0) );
ASSERT_EQ( dx.dimension_1() , unsigned(N1) ); ASSERT_EQ( dx.extent(1) , unsigned(N1) );
ASSERT_EQ( dx.dimension_2() , unsigned(N2) ); ASSERT_EQ( dx.extent(2) , unsigned(N2) );
ASSERT_EQ( dx.dimension_3() , unsigned(N3) ); ASSERT_EQ( dx.extent(3) , unsigned(N3) );
ASSERT_EQ( dy.dimension_0() , unsigned(N0) ); ASSERT_EQ( dy.extent(0) , unsigned(N0) );
ASSERT_EQ( dy.dimension_1() , unsigned(N1) ); ASSERT_EQ( dy.extent(1) , unsigned(N1) );
ASSERT_EQ( dy.dimension_2() , unsigned(N2) ); ASSERT_EQ( dy.extent(2) , unsigned(N2) );
ASSERT_EQ( dy.dimension_3() , unsigned(N3) ); ASSERT_EQ( dy.extent(3) , unsigned(N3) );
ASSERT_EQ( unmanaged_from_ptr_dx.capacity(),unsigned(N0)*unsigned(N1)*unsigned(N2)*unsigned(N3) ); ASSERT_EQ( unmanaged_from_ptr_dx.capacity(),unsigned(N0)*unsigned(N1)*unsigned(N2)*unsigned(N3) );
hx = Kokkos::Experimental::create_mirror( dx ); hx = Kokkos::create_mirror( dx );
hy = Kokkos::Experimental::create_mirror( dy ); hy = Kokkos::create_mirror( dy );
ASSERT_EQ( hx.rank() , dx.rank() ); ASSERT_EQ( hx.rank() , dx.rank() );
ASSERT_EQ( hy.rank() , dy.rank() ); ASSERT_EQ( hy.rank() , dy.rank() );
ASSERT_EQ( hx.dimension_0() , unsigned(N0) ); ASSERT_EQ( hx.extent(0) , unsigned(N0) );
ASSERT_EQ( hx.dimension_1() , unsigned(N1) ); ASSERT_EQ( hx.extent(1) , unsigned(N1) );
ASSERT_EQ( hx.dimension_2() , unsigned(N2) ); ASSERT_EQ( hx.extent(2) , unsigned(N2) );
ASSERT_EQ( hx.dimension_3() , unsigned(N3) ); ASSERT_EQ( hx.extent(3) , unsigned(N3) );
ASSERT_EQ( hy.dimension_0() , unsigned(N0) ); ASSERT_EQ( hy.extent(0) , unsigned(N0) );
ASSERT_EQ( hy.dimension_1() , unsigned(N1) ); ASSERT_EQ( hy.extent(1) , unsigned(N1) );
ASSERT_EQ( hy.dimension_2() , unsigned(N2) ); ASSERT_EQ( hy.extent(2) , unsigned(N2) );
ASSERT_EQ( hy.dimension_3() , unsigned(N3) ); ASSERT_EQ( hy.extent(3) , unsigned(N3) );
// T v1 = hx() ; // Generates compile error as intended // T v1 = hx() ; // Generates compile error as intended
// T v2 = hx(0,0) ; // Generates compile error as intended // T v2 = hx(0,0) ; // Generates compile error as intended
@ -1132,9 +1133,9 @@ public:
{ {
size_t count = 0 ; size_t count = 0 ;
for ( size_t ip = 0 ; ip < N0 ; ++ip ) { for ( size_t ip = 0 ; ip < N0 ; ++ip ) {
for ( size_t i1 = 0 ; i1 < hx.dimension_1() ; ++i1 ) { for ( size_t i1 = 0 ; i1 < hx.extent(1) ; ++i1 ) {
for ( size_t i2 = 0 ; i2 < hx.dimension_2() ; ++i2 ) { for ( size_t i2 = 0 ; i2 < hx.extent(2) ; ++i2 ) {
for ( size_t i3 = 0 ; i3 < hx.dimension_3() ; ++i3 ) { for ( size_t i3 = 0 ; i3 < hx.extent(3) ; ++i3 ) {
hx(ip,i1,i2,i3) = ++count ; hx(ip,i1,i2,i3) = ++count ;
}}}} }}}}
@ -1165,9 +1166,9 @@ public:
{ {
size_t count = 0 ; size_t count = 0 ;
for ( size_t ip = 0 ; ip < N0 ; ++ip ) { for ( size_t ip = 0 ; ip < N0 ; ++ip ) {
for ( size_t i1 = 0 ; i1 < hx.dimension_1() ; ++i1 ) { for ( size_t i1 = 0 ; i1 < hx.extent(1) ; ++i1 ) {
for ( size_t i2 = 0 ; i2 < hx.dimension_2() ; ++i2 ) { for ( size_t i2 = 0 ; i2 < hx.extent(2) ; ++i2 ) {
for ( size_t i3 = 0 ; i3 < hx.dimension_3() ; ++i3 ) { for ( size_t i3 = 0 ; i3 < hx.extent(3) ; ++i3 ) {
hx(ip,i1,i2,i3) = ++count ; hx(ip,i1,i2,i3) = ++count ;
}}}} }}}}
@ -1198,15 +1199,15 @@ public:
{ {
size_t count = 0 ; size_t count = 0 ;
for ( size_t ip = 0 ; ip < N0 ; ++ip ) { for ( size_t ip = 0 ; ip < N0 ; ++ip ) {
for ( size_t i1 = 0 ; i1 < hx.dimension_1() ; ++i1 ) { for ( size_t i1 = 0 ; i1 < hx.extent(1) ; ++i1 ) {
for ( size_t i2 = 0 ; i2 < hx.dimension_2() ; ++i2 ) { for ( size_t i2 = 0 ; i2 < hx.extent(2) ; ++i2 ) {
for ( size_t i3 = 0 ; i3 < hx.dimension_3() ; ++i3 ) { for ( size_t i3 = 0 ; i3 < hx.extent(3) ; ++i3 ) {
hx(ip,i1,i2,i3) = ++count ; hx(ip,i1,i2,i3) = ++count ;
}}}} }}}}
Kokkos::Experimental::deep_copy( dx , hx ); Kokkos::deep_copy( dx , hx );
Kokkos::Experimental::deep_copy( dy , dx ); Kokkos::deep_copy( dy , dx );
Kokkos::Experimental::deep_copy( hy , dy ); Kokkos::deep_copy( hy , dy );
for ( size_t ip = 0 ; ip < N0 ; ++ip ) { for ( size_t ip = 0 ; ip < N0 ; ++ip ) {
for ( size_t i1 = 0 ; i1 < N1 ; ++i1 ) { for ( size_t i1 = 0 ; i1 < N1 ; ++i1 ) {
@ -1215,8 +1216,8 @@ public:
{ ASSERT_EQ( hx(ip,i1,i2,i3) , hy(ip,i1,i2,i3) ); } { ASSERT_EQ( hx(ip,i1,i2,i3) , hy(ip,i1,i2,i3) ); }
}}}} }}}}
Kokkos::Experimental::deep_copy( dx , T(0) ); Kokkos::deep_copy( dx , T(0) );
Kokkos::Experimental::deep_copy( hx , dx ); Kokkos::deep_copy( hx , dx );
for ( size_t ip = 0 ; ip < N0 ; ++ip ) { for ( size_t ip = 0 ; ip < N0 ; ++ip ) {
for ( size_t i1 = 0 ; i1 < N1 ; ++i1 ) { for ( size_t i1 = 0 ; i1 < N1 ; ++i1 ) {
@ -1259,16 +1260,16 @@ public:
{ ASSERT_EQ( hvxx(i) , hdxx(i) ); } { ASSERT_EQ( hvxx(i) , hdxx(i) ); }
ASSERT_EQ( rank(hdxx) , rank(hvxx) ); ASSERT_EQ( rank(hdxx) , rank(hvxx) );
ASSERT_EQ( hdxx.dimension_0() , testdim ); ASSERT_EQ( hdxx.extent(0) , testdim );
ASSERT_EQ( hdxx.dimension_0() , hvxx.dimension_0() ); ASSERT_EQ( hdxx.extent(0) , hvxx.extent(0) );
// deep_copy from dynrankview to view // deep_copy from dynrankview to view
View1 vdxx("vdxx",testdim); View1 vdxx("vdxx",testdim);
auto hvdxx = Kokkos::create_mirror_view(vdxx); auto hvdxx = Kokkos::create_mirror_view(vdxx);
Kokkos::deep_copy(hvdxx , hdxx); Kokkos::deep_copy(hvdxx , hdxx);
ASSERT_EQ( rank(hdxx) , rank(hvdxx) ); ASSERT_EQ( rank(hdxx) , rank(hvdxx) );
ASSERT_EQ( hvdxx.dimension_0() , testdim ); ASSERT_EQ( hvdxx.extent(0) , testdim );
ASSERT_EQ( hdxx.dimension_0() , hvdxx.dimension_0() ); ASSERT_EQ( hdxx.extent(0) , hvdxx.extent(0) );
for (int i = 0; i < testdim; ++i) for (int i = 0; i < testdim; ++i)
{ ASSERT_EQ( hvxx(i) , hvdxx(i) ); } { ASSERT_EQ( hvxx(i) , hvdxx(i) ); }
} }
@ -1277,17 +1278,17 @@ public:
static void static void
check_auto_conversion_to_const( check_auto_conversion_to_const(
const Kokkos::Experimental::DynRankView< const DataType , device > & arg_const , const Kokkos::DynRankView< const DataType , device > & arg_const ,
const Kokkos::Experimental::DynRankView< DataType , device > & arg ) const Kokkos::DynRankView< DataType , device > & arg )
{ {
ASSERT_TRUE( arg_const == arg ); ASSERT_TRUE( arg_const == arg );
} }
static void run_test_const() static void run_test_const()
{ {
typedef Kokkos::Experimental::DynRankView< DataType , device > typeX ; typedef Kokkos::DynRankView< DataType , device > typeX ;
typedef Kokkos::Experimental::DynRankView< const DataType , device > const_typeX ; typedef Kokkos::DynRankView< const DataType , device > const_typeX ;
typedef Kokkos::Experimental::DynRankView< const DataType , device , Kokkos::MemoryRandomAccess > const_typeR ; typedef Kokkos::DynRankView< const DataType , device , Kokkos::MemoryRandomAccess > const_typeR ;
typeX x( "X", 2 ); typeX x( "X", 2 );
const_typeX xc = x ; const_typeX xc = x ;
const_typeR xr = x ; const_typeR xr = x ;
@ -1313,10 +1314,10 @@ public:
static void run_test_subview() static void run_test_subview()
{ {
typedef Kokkos::Experimental::DynRankView< const T , device > cdView ; typedef Kokkos::DynRankView< const T , device > cdView ;
typedef Kokkos::Experimental::DynRankView< T , device > dView ; typedef Kokkos::DynRankView< T , device > dView ;
// LayoutStride required for all returned DynRankView subdynrankview's // LayoutStride required for all returned DynRankView subdynrankview's
typedef Kokkos::Experimental::DynRankView< T , Kokkos::LayoutStride , device > sdView ; typedef Kokkos::DynRankView< T , Kokkos::LayoutStride , device > sdView ;
dView0 d0( "d0" ); dView0 d0( "d0" );
cdView s0 = d0 ; cdView s0 = d0 ;
@ -1330,25 +1331,25 @@ public:
ASSERT_EQ( ds0.rank() , 0 ); ASSERT_EQ( ds0.rank() , 0 );
//Basic test - ALL //Basic test - ALL
sdView dsALL = Kokkos::Experimental::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() ); sdView dsALL = Kokkos::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() );
ASSERT_EQ( dsALL.rank() , 7 ); ASSERT_EQ( dsALL.rank() , 7 );
// Send a value to final rank returning rank 6 subview // Send a value to final rank returning rank 6 subview
sdView dsm1 = Kokkos::Experimental::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 1 ); sdView dsm1 = Kokkos::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 1 );
ASSERT_EQ( dsm1.rank() , 6 ); ASSERT_EQ( dsm1.rank() , 6 );
// Send a std::pair as argument to a rank // Send a std::pair as argument to a rank
sdView dssp = Kokkos::Experimental::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , std::pair<unsigned,unsigned>(1,2) ); sdView dssp = Kokkos::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , std::pair<unsigned,unsigned>(1,2) );
ASSERT_EQ( dssp.rank() , 7 ); ASSERT_EQ( dssp.rank() , 7 );
// Send a kokkos::pair as argument to a rank; take default layout as input // Send a kokkos::pair as argument to a rank; take default layout as input
dView0 dd0("dd0" , N0 , N1 , N2 , 2 , 2 , 2 , 2 ); //default layout dView0 dd0("dd0" , N0 , N1 , N2 , 2 , 2 , 2 , 2 ); //default layout
ASSERT_EQ( dd0.rank() , 7 ); ASSERT_EQ( dd0.rank() , 7 );
sdView dtkp = Kokkos::Experimental::subdynrankview( dd0 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) ); sdView dtkp = Kokkos::subdynrankview( dd0 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) );
ASSERT_EQ( dtkp.rank() , 7 ); ASSERT_EQ( dtkp.rank() , 7 );
// Return rank 7 subview, taking a pair as one argument, layout stride input // Return rank 7 subview, taking a pair as one argument, layout stride input
sdView ds7 = Kokkos::Experimental::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) ); sdView ds7 = Kokkos::subdynrankview( d7 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) );
ASSERT_EQ( ds7.rank() , 7 ); ASSERT_EQ( ds7.rank() , 7 );
// Default Layout DynRankView // Default Layout DynRankView
@ -1356,7 +1357,7 @@ public:
ASSERT_EQ( dv6.rank() , 6 ); ASSERT_EQ( dv6.rank() , 6 );
// DynRankView with LayoutRight // DynRankView with LayoutRight
typedef Kokkos::Experimental::DynRankView< T , Kokkos::LayoutRight , device > drView ; typedef Kokkos::DynRankView< T , Kokkos::LayoutRight , device > drView ;
drView dr5( "dr5" , N0 , N1 , N2 , 2 , 2 ); drView dr5( "dr5" , N0 , N1 , N2 , 2 , 2 );
ASSERT_EQ( dr5.rank() , 5 ); ASSERT_EQ( dr5.rank() , 5 );
@ -1386,27 +1387,27 @@ public:
// (i.e. rank 7 rather than 5). // (i.e. rank 7 rather than 5).
// Check LayoutRight dr5 and LayoutStride d5 dimensions agree (as they should) // Check LayoutRight dr5 and LayoutStride d5 dimensions agree (as they should)
ASSERT_EQ( d5.dimension_0() , dr5.dimension_0() ); ASSERT_EQ( d5.extent(0) , dr5.extent(0) );
ASSERT_EQ( d5.dimension_1() , dr5.dimension_1() ); ASSERT_EQ( d5.extent(1) , dr5.extent(1) );
ASSERT_EQ( d5.dimension_2() , dr5.dimension_2() ); ASSERT_EQ( d5.extent(2) , dr5.extent(2) );
ASSERT_EQ( d5.dimension_3() , dr5.dimension_3() ); ASSERT_EQ( d5.extent(3) , dr5.extent(3) );
ASSERT_EQ( d5.dimension_4() , dr5.dimension_4() ); ASSERT_EQ( d5.extent(4) , dr5.extent(4) );
ASSERT_EQ( d5.dimension_5() , dr5.dimension_5() ); ASSERT_EQ( d5.extent(5) , dr5.extent(5) );
ASSERT_EQ( d5.rank() , dr5.rank() ); ASSERT_EQ( d5.rank() , dr5.rank() );
// Rank 5 subview of rank 5 dynamic rank view, layout stride input // Rank 5 subview of rank 5 dynamic rank view, layout stride input
sdView ds5 = Kokkos::Experimental::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) ); sdView ds5 = Kokkos::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) );
ASSERT_EQ( ds5.rank() , 5 ); ASSERT_EQ( ds5.rank() , 5 );
// Pass in extra ALL arguments beyond the rank of the DynRank View. // Pass in extra ALL arguments beyond the rank of the DynRank View.
// This behavior is allowed - ignore the extra ALL arguments when // This behavior is allowed - ignore the extra ALL arguments when
// the src.rank() < number of arguments, but be careful! // the src.rank() < number of arguments, but be careful!
sdView ds5plus = Kokkos::Experimental::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) , Kokkos::ALL() ); sdView ds5plus = Kokkos::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::pair<unsigned,unsigned>(0,1) , Kokkos::ALL() );
ASSERT_EQ( ds5.rank() , ds5plus.rank() ); ASSERT_EQ( ds5.rank() , ds5plus.rank() );
ASSERT_EQ( ds5.dimension_0() , ds5plus.dimension_0() ); ASSERT_EQ( ds5.extent(0) , ds5plus.extent(0) );
ASSERT_EQ( ds5.dimension_4() , ds5plus.dimension_4() ); ASSERT_EQ( ds5.extent(4) , ds5plus.extent(4) );
ASSERT_EQ( ds5.dimension_5() , ds5plus.dimension_5() ); ASSERT_EQ( ds5.extent(5) , ds5plus.extent(5) );
#if ! defined( KOKKOS_ENABLE_CUDA ) || defined ( KOKKOS_ENABLE_CUDA_UVM ) #if ! defined( KOKKOS_ENABLE_CUDA ) || defined ( KOKKOS_ENABLE_CUDA_UVM )
ASSERT_EQ( & ds5(1,1,1,1,0) - & ds5plus(1,1,1,1,0) , 0 ); ASSERT_EQ( & ds5(1,1,1,1,0) - & ds5plus(1,1,1,1,0) , 0 );
@ -1415,36 +1416,36 @@ public:
// Similar test to rank 5 above, but create rank 4 subview // Similar test to rank 5 above, but create rank 4 subview
// Check that the rank contracts (ds4 and ds4plus) and that subdynrankview can accept extra args (ds4plus) // Check that the rank contracts (ds4 and ds4plus) and that subdynrankview can accept extra args (ds4plus)
sdView ds4 = Kokkos::Experimental::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 0 ); sdView ds4 = Kokkos::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 0 );
sdView ds4plus = Kokkos::Experimental::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 0 , Kokkos::ALL() ); sdView ds4plus = Kokkos::subdynrankview( d5 , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , Kokkos::ALL() , 0 , Kokkos::ALL() );
ASSERT_EQ( ds4.rank() , ds4plus.rank() ); ASSERT_EQ( ds4.rank() , ds4plus.rank() );
ASSERT_EQ( ds4.rank() , 4 ); ASSERT_EQ( ds4.rank() , 4 );
ASSERT_EQ( ds4.dimension_0() , ds4plus.dimension_0() ); ASSERT_EQ( ds4.extent(0) , ds4plus.extent(0) );
ASSERT_EQ( ds4.dimension_4() , ds4plus.dimension_4() ); ASSERT_EQ( ds4.extent(4) , ds4plus.extent(4) );
ASSERT_EQ( ds4.dimension_5() , ds4plus.dimension_5() ); ASSERT_EQ( ds4.extent(5) , ds4plus.extent(5) );
} }
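The rank assertions in run_test_subview follow one rule: an integer argument removes a dimension, `Kokkos::ALL()` or a pair keeps it, and arguments past the source rank are ignored. Below is a minimal sketch of that rule in plain C++, without Kokkos. `SliceArg` and `subview_rank` are hypothetical names introduced here for illustration; they are not part of the library, and the sketch assumes the caller passes at least as many arguments as the source rank.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical tags for the three kinds of slice argument the test passes:
// Kokkos::ALL(), a std::pair / Kokkos::pair range, or a single integer index.
enum class SliceArg { All, Range, Index };

// Rank of the resulting subview under the rule described above.
// Each Index argument removes one dimension; All and Range keep it.
// Arguments beyond src_rank are ignored, as the test comments describe.
inline unsigned subview_rank(unsigned src_rank,
                             std::initializer_list<SliceArg> args) {
  // Assumption of this sketch: one argument per source dimension at least.
  assert(args.size() >= src_rank);
  unsigned rank = 0;
  unsigned pos = 0;
  for (SliceArg a : args) {
    if (pos++ >= src_rank) break;      // extra trailing arguments are dropped
    if (a != SliceArg::Index) ++rank;  // only an integer index contracts the rank
  }
  return rank;
}
```

Under this rule, four `All` arguments followed by one `Index` on a rank-5 source give rank 4, with or without an extra trailing `All`, which is what the ds4 and ds4plus assertions check.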
static void run_test_subview_strided() static void run_test_subview_strided()
{ {
typedef Kokkos::Experimental::DynRankView < int , Kokkos::LayoutLeft , host_drv_space > drview_left ; typedef Kokkos::DynRankView < int , Kokkos::LayoutLeft , host_drv_space > drview_left ;
typedef Kokkos::Experimental::DynRankView < int , Kokkos::LayoutRight , host_drv_space > drview_right ; typedef Kokkos::DynRankView < int , Kokkos::LayoutRight , host_drv_space > drview_right ;
typedef Kokkos::Experimental::DynRankView < int , Kokkos::LayoutStride , host_drv_space > drview_stride ; typedef Kokkos::DynRankView < int , Kokkos::LayoutStride , host_drv_space > drview_stride ;
drview_left xl2( "xl2", 100 , 200 ); drview_left xl2( "xl2", 100 , 200 );
drview_right xr2( "xr2", 100 , 200 ); drview_right xr2( "xr2", 100 , 200 );
drview_stride yl1 = Kokkos::Experimental::subdynrankview( xl2 , 0 , Kokkos::ALL() ); drview_stride yl1 = Kokkos::subdynrankview( xl2 , 0 , Kokkos::ALL() );
drview_stride yl2 = Kokkos::Experimental::subdynrankview( xl2 , 1 , Kokkos::ALL() ); drview_stride yl2 = Kokkos::subdynrankview( xl2 , 1 , Kokkos::ALL() );
drview_stride ys1 = Kokkos::Experimental::subdynrankview( xr2 , 0 , Kokkos::ALL() ); drview_stride ys1 = Kokkos::subdynrankview( xr2 , 0 , Kokkos::ALL() );
drview_stride ys2 = Kokkos::Experimental::subdynrankview( xr2 , 1 , Kokkos::ALL() ); drview_stride ys2 = Kokkos::subdynrankview( xr2 , 1 , Kokkos::ALL() );
drview_stride yr1 = Kokkos::Experimental::subdynrankview( xr2 , 0 , Kokkos::ALL() ); drview_stride yr1 = Kokkos::subdynrankview( xr2 , 0 , Kokkos::ALL() );
drview_stride yr2 = Kokkos::Experimental::subdynrankview( xr2 , 1 , Kokkos::ALL() ); drview_stride yr2 = Kokkos::subdynrankview( xr2 , 1 , Kokkos::ALL() );
ASSERT_EQ( yl1.dimension_0() , xl2.dimension_1() ); ASSERT_EQ( yl1.extent(0) , xl2.extent(1) );
ASSERT_EQ( yl2.dimension_0() , xl2.dimension_1() ); ASSERT_EQ( yl2.extent(0) , xl2.extent(1) );
ASSERT_EQ( yr1.dimension_0() , xr2.dimension_1() ); ASSERT_EQ( yr1.extent(0) , xr2.extent(1) );
ASSERT_EQ( yr2.dimension_0() , xr2.dimension_1() ); ASSERT_EQ( yr2.extent(0) , xr2.extent(1) );
ASSERT_EQ( & yl1(0) - & xl2(0,0) , 0 ); ASSERT_EQ( & yl1(0) - & xl2(0,0) , 0 );
ASSERT_EQ( & yl2(0) - & xl2(1,0) , 0 ); ASSERT_EQ( & yl2(0) - & xl2(1,0) , 0 );
@ -1456,13 +1457,13 @@ public:
drview_right xr4( "xr4", 10 , 20 , 30 , 40 ); drview_right xr4( "xr4", 10 , 20 , 30 , 40 );
//Replace subdynrankview with subview - test //Replace subdynrankview with subview - test
drview_stride yl4 = Kokkos::Experimental::subview( xl4 , 1 , Kokkos::ALL() , 2 , Kokkos::ALL() ); drview_stride yl4 = Kokkos::subview( xl4 , 1 , Kokkos::ALL() , 2 , Kokkos::ALL() );
drview_stride yr4 = Kokkos::Experimental::subview( xr4 , 1 , Kokkos::ALL() , 2 , Kokkos::ALL() ); drview_stride yr4 = Kokkos::subview( xr4 , 1 , Kokkos::ALL() , 2 , Kokkos::ALL() );
ASSERT_EQ( yl4.dimension_0() , xl4.dimension_1() ); ASSERT_EQ( yl4.extent(0) , xl4.extent(1) );
ASSERT_EQ( yl4.dimension_1() , xl4.dimension_3() ); ASSERT_EQ( yl4.extent(1) , xl4.extent(3) );
ASSERT_EQ( yr4.dimension_0() , xr4.dimension_1() ); ASSERT_EQ( yr4.extent(0) , xr4.extent(1) );
ASSERT_EQ( yr4.dimension_1() , xr4.dimension_3() ); ASSERT_EQ( yr4.extent(1) , xr4.extent(3) );
ASSERT_EQ( yl4.rank() , 2); ASSERT_EQ( yl4.rank() , 2);
ASSERT_EQ( yr4.rank() , 2); ASSERT_EQ( yr4.rank() , 2);
@ -1474,46 +1475,46 @@ public:
{ {
static const unsigned Length = 1000 , Count = 8 ; static const unsigned Length = 1000 , Count = 8 ;
typedef typename Kokkos::Experimental::DynRankView< T , Kokkos::LayoutLeft , host_drv_space > multivector_type ; typedef typename Kokkos::DynRankView< T , Kokkos::LayoutLeft , host_drv_space > multivector_type ;
typedef typename Kokkos::Experimental::DynRankView< T , Kokkos::LayoutRight , host_drv_space > multivector_right_type ; typedef typename Kokkos::DynRankView< T , Kokkos::LayoutRight , host_drv_space > multivector_right_type ;
multivector_type mv = multivector_type( "mv" , Length , Count ); multivector_type mv = multivector_type( "mv" , Length , Count );
multivector_right_type mv_right = multivector_right_type( "mv" , Length , Count ); multivector_right_type mv_right = multivector_right_type( "mv" , Length , Count );
typedef typename Kokkos::Experimental::DynRankView< T , Kokkos::LayoutStride , host_drv_space > svector_type ; typedef typename Kokkos::DynRankView< T , Kokkos::LayoutStride , host_drv_space > svector_type ;
typedef typename Kokkos::Experimental::DynRankView< T , Kokkos::LayoutStride , host_drv_space > smultivector_type ; typedef typename Kokkos::DynRankView< T , Kokkos::LayoutStride , host_drv_space > smultivector_type ;
typedef typename Kokkos::Experimental::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_svector_right_type ; typedef typename Kokkos::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_svector_right_type ;
typedef typename Kokkos::Experimental::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_svector_type ; typedef typename Kokkos::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_svector_type ;
typedef typename Kokkos::Experimental::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_smultivector_type ; typedef typename Kokkos::DynRankView< const T , Kokkos::LayoutStride , host_drv_space > const_smultivector_type ;
svector_type v1 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 0 ); svector_type v1 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 0 );
svector_type v2 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 1 ); svector_type v2 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 1 );
svector_type v3 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 2 ); svector_type v3 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 2 );
svector_type rv1 = Kokkos::Experimental::subdynrankview( mv_right , 0 , Kokkos::ALL() ); svector_type rv1 = Kokkos::subdynrankview( mv_right , 0 , Kokkos::ALL() );
svector_type rv2 = Kokkos::Experimental::subdynrankview( mv_right , 1 , Kokkos::ALL() ); svector_type rv2 = Kokkos::subdynrankview( mv_right , 1 , Kokkos::ALL() );
svector_type rv3 = Kokkos::Experimental::subdynrankview( mv_right , 2 , Kokkos::ALL() ); svector_type rv3 = Kokkos::subdynrankview( mv_right , 2 , Kokkos::ALL() );
smultivector_type mv1 = Kokkos::Experimental::subdynrankview( mv , std::make_pair( 1 , 998 ) , smultivector_type mv1 = Kokkos::subdynrankview( mv , std::make_pair( 1 , 998 ) ,
std::make_pair( 2 , 5 ) ); std::make_pair( 2 , 5 ) );
smultivector_type mvr1 = smultivector_type mvr1 =
Kokkos::Experimental::subdynrankview( mv_right , Kokkos::subdynrankview( mv_right ,
std::make_pair( 1 , 998 ) , std::make_pair( 1 , 998 ) ,
std::make_pair( 2 , 5 ) ); std::make_pair( 2 , 5 ) );
const_svector_type cv1 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL(), 0 ); const_svector_type cv1 = Kokkos::subdynrankview( mv , Kokkos::ALL(), 0 );
const_svector_type cv2 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL(), 1 ); const_svector_type cv2 = Kokkos::subdynrankview( mv , Kokkos::ALL(), 1 );
const_svector_type cv3 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL(), 2 ); const_svector_type cv3 = Kokkos::subdynrankview( mv , Kokkos::ALL(), 2 );
svector_type vr1 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 0 ); svector_type vr1 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 0 );
svector_type vr2 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 1 ); svector_type vr2 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 1 );
svector_type vr3 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 2 ); svector_type vr3 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 2 );
const_svector_right_type cvr1 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 0 ); const_svector_right_type cvr1 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 0 );
const_svector_right_type cvr2 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 1 ); const_svector_right_type cvr2 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 1 );
const_svector_right_type cvr3 = Kokkos::Experimental::subdynrankview( mv , Kokkos::ALL() , 2 ); const_svector_right_type cvr3 = Kokkos::subdynrankview( mv , Kokkos::ALL() , 2 );
ASSERT_TRUE( & v1[0] == & v1(0) ); ASSERT_TRUE( & v1[0] == & v1(0) );
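Most hunks in this file replace the per-rank accessors `dimension_0()` ... `dimension_5()` with the indexed accessor `extent(r)`. The sketch below shows the practical difference using a toy type, since Kokkos itself is not assumed to be available here. `ToyView4` and `element_count` are hypothetical stand-ins introduced for illustration and do not reproduce the Kokkos implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a rank-4 view; this is NOT the Kokkos class.
struct ToyView4 {
  std::array<std::size_t, 4> dims;

  // Old style: one accessor per rank, so the rank is fixed in the name.
  std::size_t dimension_0() const { return dims[0]; }
  std::size_t dimension_1() const { return dims[1]; }
  std::size_t dimension_2() const { return dims[2]; }
  std::size_t dimension_3() const { return dims[3]; }

  // New style: a single accessor indexed by rank.
  std::size_t extent(std::size_t r) const {
    assert(r < dims.size());  // this toy type only knows its own four ranks
    return dims[r];
  }
};

// Element count written as a loop over ranks. The loop form is only
// possible with the indexed accessor.
inline std::size_t element_count(const ToyView4& v) {
  std::size_t n = 1;
  for (std::size_t r = 0; r < 4; ++r) n *= v.extent(r);
  return n;
}
```

The product computed by `element_count` is the same quantity the test compares against `capacity()` for the unmanaged view, namely N0*N1*N2*N3.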


@ -35,7 +35,7 @@
// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
// SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
// //
// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov) // Questions? Contact Christian R. Trott (crtrott@sandia.gov)
// //
// ************************************************************************ // ************************************************************************
//@HEADER //@HEADER
@ -61,114 +61,181 @@ struct TestDynamicView
typedef typename Space::execution_space execution_space ; typedef typename Space::execution_space execution_space ;
typedef typename Space::memory_space memory_space ; typedef typename Space::memory_space memory_space ;
typedef Kokkos::MemoryPool<typename Space::device_type> memory_pool_type;
typedef Kokkos::Experimental::DynamicView<Scalar*,Space> view_type; typedef Kokkos::Experimental::DynamicView<Scalar*,Space> view_type;
typedef typename view_type::const_type const_view_type ;
typedef typename Kokkos::TeamPolicy<execution_space>::member_type member_type ;
typedef double value_type; typedef double value_type;
struct TEST {};
struct VERIFY {};
view_type a;
const unsigned total_size ;
TestDynamicView( const view_type & arg_a , const unsigned arg_total )
: a(arg_a), total_size( arg_total ) {}
KOKKOS_INLINE_FUNCTION
void operator() ( const TEST , member_type team_member, double& value) const
{
const unsigned int team_idx = team_member.league_rank() * team_member.team_size();
if ( team_member.team_rank() == 0 ) {
unsigned n = team_idx + team_member.team_size();
if ( total_size < n ) n = total_size ;
a.resize_parallel( n );
if ( a.extent(0) < n ) {
Kokkos::abort("GrowTest TEST failed resize_parallel");
}
}
// Make sure resize is done for all team members:
team_member.team_barrier();
const unsigned int val = team_idx + team_member.team_rank();
if ( val < total_size ) {
value += val ;
a( val ) = val ;
}
}
KOKKOS_INLINE_FUNCTION
void operator() ( const VERIFY , member_type team_member, double& value) const
{
const unsigned int val =
team_member.team_rank() +
team_member.league_rank() * team_member.team_size();
if ( val < total_size ) {
if ( val != a(val) ) {
Kokkos::abort("GrowTest VERIFY failed resize_parallel");
}
value += a(val);
}
}
static void run( unsigned arg_total_size ) static void run( unsigned arg_total_size )
{ {
typedef Kokkos::TeamPolicy<execution_space,TEST> TestPolicy ; // Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
typedef Kokkos::TeamPolicy<execution_space,VERIFY> VerifyPolicy ; // Case 1: min_chunk_size is a power of 2
{
view_type da("da", 1024, arg_total_size );
ASSERT_EQ( da.size(), 0 );
// Init
unsigned da_size = arg_total_size / 8;
da.resize_serial(da_size);
ASSERT_EQ( da.size(), da_size );
// printf("TestDynamicView::run(%d) construct memory pool\n",arg_total_size); #if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
{
da(i) = Scalar(i);
}
);
const size_t total_alloc_size = arg_total_size * sizeof(Scalar) * 1.2 ; value_type result_sum = 0.0;
const size_t superblock = std::min( total_alloc_size , size_t(1000000) ); Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
{
partial_sum += (value_type)da(i);
}
, result_sum
);
memory_pool_type pool( memory_space() ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
, total_alloc_size #endif
, 500 /* min block size in bytes */ #endif
, 30000 /* max block size in bytes */
, superblock
);
// printf("TestDynamicView::run(%d) construct dynamic view\n",arg_total_size); // add 3x more entries i.e. 4x larger than previous size
// the first 1/4 should remain the same
unsigned da_resize = arg_total_size / 2;
da.resize_serial(da_resize);
ASSERT_EQ( da.size(), da_resize );
view_type da("A",pool,arg_total_size); #if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i )
{
da(i) = Scalar(i);
}
);
const_view_type ca(da); value_type new_result_sum = 0.0;
Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
{
partial_sum += (value_type)da(i);
}
, new_result_sum
);
// printf("TestDynamicView::run(%d) construct test functor\n",arg_total_size); ASSERT_EQ(new_result_sum+result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
#endif
#endif
} // end scope
TestDynamicView functor(da,arg_total_size); // Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
// Case 2: min_chunk_size is NOT a power of 2
{
view_type da("da", 1023, arg_total_size );
ASSERT_EQ( da.size(), 0 );
// Init
unsigned da_size = arg_total_size / 8;
da.resize_serial(da_size);
ASSERT_EQ( da.size(), da_size );
const unsigned team_size = TestPolicy::team_size_recommended(functor); #if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
const unsigned league_size = ( arg_total_size + team_size - 1 ) / team_size ; #if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
{
da(i) = Scalar(i);
}
);
double reference = 0; value_type result_sum = 0.0;
double result = 0; Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
{
partial_sum += (value_type)da(i);
}
, result_sum
);
// printf("TestDynamicView::run(%d) run functor test\n",arg_total_size); ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
#endif
#endif
Kokkos::parallel_reduce( TestPolicy(league_size,team_size) , functor , reference); // add 3x more entries i.e. 4x larger than previous size
execution_space::fence(); // the first 1/4 should remain the same
+unsigned da_resize = arg_total_size / 2;
+da.resize_serial(da_resize);
+ASSERT_EQ( da.size(), da_resize );
+#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
+#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
+Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i )
+{
+da(i) = Scalar(i);
+}
+);
-// printf("TestDynamicView::run(%d) run functor verify\n",arg_total_size);
+value_type new_result_sum = 0.0;
+Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(da_size, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
+{
+partial_sum += (value_type)da(i);
+}
+, new_result_sum
+);
-Kokkos::parallel_reduce( VerifyPolicy(league_size,team_size) , functor , result );
+ASSERT_EQ(new_result_sum+result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
-execution_space::fence();
+#endif
+#endif
+} // end scope
-// printf("TestDynamicView::run(%d) done\n",arg_total_size);
+// Test: Create DynamicView, initialize size (via resize), run through parallel_for to set values, check values (via parallel_reduce); resize values and repeat
+// Case 3: resize reduces the size
+{
+view_type da("da", 1023, arg_total_size );
+ASSERT_EQ( da.size(), 0 );
+// Init
+unsigned da_size = arg_total_size / 2;
+da.resize_serial(da_size);
+ASSERT_EQ( da.size(), da_size );
+#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
+#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
+Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i )
+{
+da(i) = Scalar(i);
+}
+);
+value_type result_sum = 0.0;
+Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_size), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
+{
+partial_sum += (value_type)da(i);
+}
+, result_sum
+);
+ASSERT_EQ(result_sum, (value_type)( da_size * (da_size - 1) / 2 ) );
+#endif
+#endif
+// remove the final 3/4 entries i.e. first 1/4 remain
+unsigned da_resize = arg_total_size / 8;
+da.resize_serial(da_resize);
+ASSERT_EQ( da.size(), da_resize );
+#if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
+#if !defined(KOKKOS_ENABLE_CUDA) || ( 8000 <= CUDA_VERSION )
+Kokkos::parallel_for( Kokkos::RangePolicy<execution_space>(0, da_resize), KOKKOS_LAMBDA ( const int i )
+{
+da(i) = Scalar(i);
+}
+);
+value_type new_result_sum = 0.0;
+Kokkos::parallel_reduce( Kokkos::RangePolicy<execution_space>(0, da_resize), KOKKOS_LAMBDA ( const int i, value_type& partial_sum )
+{
+partial_sum += (value_type)da(i);
+}
+, new_result_sum
+);
+ASSERT_EQ(new_result_sum, (value_type)( da_resize * (da_resize - 1) / 2 ) );
+#endif
+#endif
+} // end scope
 }
 };
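
Review note: the assertions in these test cases compare the reduced sum against `n * (n - 1) / 2`, the closed form for `0 + 1 + ... + (n - 1)`, since each entry `i` is set to `Scalar(i)`. A minimal host-only sketch of that check follows. It does not use Kokkos, and `sum_of_indices` and `expected_sum` are hypothetical helper names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the parallel_reduce in the test: adds up
// value(i) = i for i in [begin, end) with a plain serial loop.
inline std::uint64_t sum_of_indices(std::uint64_t begin, std::uint64_t end) {
  std::uint64_t sum = 0;
  for (std::uint64_t i = begin; i < end; ++i) sum += i;
  return sum;
}

// Closed form the test asserts against: 0 + 1 + ... + (n - 1) = n(n-1)/2.
inline std::uint64_t expected_sum(std::uint64_t n) {
  return n * (n - 1) / 2;
}
```

When a view grows from `da_size` to `da_resize`, only the new range is refilled, so the two partial sums over `[0, da_size)` and `[da_size, da_resize)` are added before comparing against `expected_sum(da_resize)`. That mirrors the `new_result_sum+result_sum` assertion above.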

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -79,13 +79,10 @@ protected:
 static void SetUpTestCase()
 {
 std::cout << std::setprecision(5) << std::scientific;
-Kokkos::OpenMP::initialize();
 }
 static void TearDownTestCase()
 {
-Kokkos::OpenMP::finalize();
 }
 };

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -81,7 +81,7 @@ void test_scatter_view_config(int n)
 }
 #if defined( KOKKOS_ENABLE_CXX11_DISPATCH_LAMBDA )
 auto host_view = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace(), original_view);
-for (typename decltype(host_view)::size_type i = 0; i < host_view.dimension_0(); ++i) {
+for (typename decltype(host_view)::size_type i = 0; i < host_view.extent(0); ++i) {
 auto val0 = host_view(i, 0);
 auto val1 = host_view(i, 1);
 auto val2 = host_view(i, 2);

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -76,11 +76,9 @@ class serial : public ::testing::Test {
 protected:
 static void SetUpTestCase () {
 std::cout << std::setprecision(5) << std::scientific;
-Kokkos::Serial::initialize ();
 }
 static void TearDownTestCase () {
-Kokkos::Serial::finalize ();
 }
 };

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -73,7 +73,7 @@ void run_test_graph()
 dx = Kokkos::create_staticcrsgraph<dView>( "dx" , graph );
 hx = Kokkos::create_mirror( dx );
-ASSERT_EQ( hx.row_map.dimension_0() - 1 , LENGTH );
+ASSERT_EQ( hx.row_map.extent(0) - 1 , LENGTH );
 for ( size_t i = 0 ; i < LENGTH ; ++i ) {
 const size_t begin = hx.row_map[i];
@ -115,17 +115,17 @@ void run_test_graph2()
 hView hx = Kokkos::create_mirror( dx );
 hView mx = Kokkos::create_mirror( dx );
-ASSERT_EQ( (size_t) dx.row_map.dimension_0() , (size_t) LENGTH + 1 );
-ASSERT_EQ( (size_t) hx.row_map.dimension_0() , (size_t) LENGTH + 1 );
-ASSERT_EQ( (size_t) mx.row_map.dimension_0() , (size_t) LENGTH + 1 );
+ASSERT_EQ( (size_t) dx.row_map.extent(0) , (size_t) LENGTH + 1 );
+ASSERT_EQ( (size_t) hx.row_map.extent(0) , (size_t) LENGTH + 1 );
+ASSERT_EQ( (size_t) mx.row_map.extent(0) , (size_t) LENGTH + 1 );
-ASSERT_EQ( (size_t) dx.entries.dimension_0() , (size_t) total_length );
-ASSERT_EQ( (size_t) hx.entries.dimension_0() , (size_t) total_length );
-ASSERT_EQ( (size_t) mx.entries.dimension_0() , (size_t) total_length );
+ASSERT_EQ( (size_t) dx.entries.extent(0) , (size_t) total_length );
+ASSERT_EQ( (size_t) hx.entries.extent(0) , (size_t) total_length );
+ASSERT_EQ( (size_t) mx.entries.extent(0) , (size_t) total_length );
-ASSERT_EQ( (size_t) dx.entries.dimension_1() , (size_t) 3 );
-ASSERT_EQ( (size_t) hx.entries.dimension_1() , (size_t) 3 );
-ASSERT_EQ( (size_t) mx.entries.dimension_1() , (size_t) 3 );
+ASSERT_EQ( (size_t) dx.entries.extent(1) , (size_t) 3 );
+ASSERT_EQ( (size_t) hx.entries.extent(1) , (size_t) 3 );
+ASSERT_EQ( (size_t) mx.entries.extent(1) , (size_t) 3 );
 for ( size_t i = 0 ; i < LENGTH ; ++i ) {
 const size_t entry_begin = hx.row_map[i];
@ -140,7 +140,7 @@ void run_test_graph2()
 Kokkos::deep_copy( dx.entries , hx.entries );
 Kokkos::deep_copy( mx.entries , dx.entries );
-ASSERT_EQ( mx.row_map.dimension_0() , (size_t) LENGTH + 1 );
+ASSERT_EQ( mx.row_map.extent(0) , (size_t) LENGTH + 1 );
 for ( size_t i = 0 ; i < LENGTH ; ++i ) {
 const size_t entry_begin = mx.row_map[i];

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -79,25 +79,10 @@ protected:
 static void SetUpTestCase()
 {
 std::cout << std::setprecision(5) << std::scientific;
-unsigned num_threads = 4;
-if (Kokkos::hwloc::available()) {
-num_threads = Kokkos::hwloc::get_available_numa_count()
-* Kokkos::hwloc::get_available_cores_per_numa()
-// * Kokkos::hwloc::get_available_threads_per_core()
-;
-}
-std::cout << "Threads: " << num_threads << std::endl;
-Kokkos::Threads::initialize( num_threads );
 }
 static void TearDownTestCase()
 {
-Kokkos::Threads::finalize();
 }
 };

View File

@ -34,7 +34,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -34,7 +34,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -43,10 +43,13 @@
 #include <gtest/gtest.h>
 #include <cstdlib>
-#include <Kokkos_Macros.hpp>
+#include <Kokkos_Core.hpp>
 int main(int argc, char *argv[]) {
+Kokkos::initialize(argc,argv);
 ::testing::InitGoogleTest(&argc,argv);
-return RUN_ALL_TESTS();
+int result = RUN_ALL_TESTS();
+Kokkos::finalize();
+return result;
 }
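
Review note: this change, together with the removal of the per-backend `initialize()`/`finalize()` calls from the test fixtures, moves runtime setup into `main`: initialize once, run all tests, finalize, then return the saved test result. Below is a host-only sketch of that ordering. `FakeRuntime` and `run_with_runtime` are hypothetical names standing in for Kokkos and for the body of `main`, and the callable stands in for `RUN_ALL_TESTS()`.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the library runtime; it counts calls so the
// ordering can be checked without linking against Kokkos.
struct FakeRuntime {
  int init_calls = 0;
  int finalize_calls = 0;
  bool active = false;
  void initialize() { ++init_calls; active = true; }
  void finalize() { ++finalize_calls; active = false; }
};

// Mirrors the shape of the new main(): initialize, run, finalize, then
// return the saved result instead of returning straight from the run call.
inline int run_with_runtime(FakeRuntime& rt,
                            const std::function<int()>& run_all_tests) {
  rt.initialize();
  int result = run_all_tests();
  rt.finalize();
  return result;
}
```

Saving the result in a local is what lets `finalize()` run before the function returns; the old `return RUN_ALL_TESTS();` form left no place for it.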

View File

@ -33,6 +33,7 @@ OBJ_PERF = PerfTestMain.o gtest-all.o
 OBJ_PERF += PerfTestGramSchmidt.o
 OBJ_PERF += PerfTestHexGrad.o
 OBJ_PERF += PerfTest_CustomReduction.o
+OBJ_PERF += PerfTest_ViewCopy.o
 TARGETS += KokkosCore_PerformanceTest
 TEST_TARGETS += test-performance

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -76,7 +76,7 @@ void axpby( const ConstScalarType & alpha ,
 {
 typedef AXPBY< ConstScalarType , ConstVectorType , VectorType > functor ;
-parallel_for( Y.dimension_0() , functor( alpha , X , beta , Y ) );
+parallel_for( Y.extent(0) , functor( alpha , X , beta , Y ) );
 }
 /** \brief Y *= alpha */
@ -86,7 +86,7 @@ void scale( const ConstScalarType & alpha , const VectorType & Y )
 {
 typedef Scale< ConstScalarType , VectorType > functor ;
-parallel_for( Y.dimension_0() , functor( alpha , Y ) );
+parallel_for( Y.extent(0) , functor( alpha , Y ) );
 }
 template< class ConstVectorType ,
@ -97,7 +97,7 @@ void dot( const ConstVectorType & X ,
 {
 typedef Dot< ConstVectorType > functor ;
-parallel_reduce( X.dimension_0() , functor( X , Y ) , finalize );
+parallel_reduce( X.extent(0) , functor( X , Y ) , finalize );
 }
 template< class ConstVectorType ,
@ -107,7 +107,7 @@ void dot( const ConstVectorType & X ,
 {
 typedef DotSingle< ConstVectorType > functor ;
-parallel_reduce( X.dimension_0() , functor( X ) , finalize );
+parallel_reduce( X.extent(0) , functor( X ) , finalize );
 }
 } /* namespace Kokkos */

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
@ -86,7 +86,7 @@ void invnorm2( const VectorView & x ,
 const ValueView & r ,
 const ValueView & r_inv )
 {
-Kokkos::parallel_reduce( x.dimension_0() , InvNorm2< VectorView , ValueView >( x , r , r_inv ) );
+Kokkos::parallel_reduce( x.extent(0) , InvNorm2< VectorView , ValueView >( x , r , r_inv ) );
 }
 // PostProcess : tmp = - ( R(j,k) = result );
@ -122,7 +122,7 @@ void dot_neg( const VectorView & x ,
 const ValueView & r ,
 const ValueView & r_neg )
 {
-Kokkos::parallel_reduce( x.dimension_0() , DotM< VectorView , ValueView >( x , y , r , r_neg ) );
+Kokkos::parallel_reduce( x.extent(0) , DotM< VectorView , ValueView >( x , y , r , r_neg ) );
 }
@ -151,7 +151,7 @@ struct ModifiedGramSchmidt
 static double factorization( const multivector_type Q_ ,
 const multivector_type R_ )
 {
-const size_type count = Q_.dimension_1();
+const size_type count = Q_.extent(1);
 value_view tmp("tmp");
 value_view one("one");

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER

View File

@ -35,7 +35,7 @@
 // NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //
-// Questions? Contact H. Carter Edwards (hcedwar@sandia.gov)
+// Questions? Contact Christian R. Trott (crtrott@sandia.gov)
 //
 // ************************************************************************
 //@HEADER
