Update Kokkos library in LAMMPS to v4.4.0
This commit is contained in:
@ -1,12 +1,88 @@
|
|||||||
# CHANGELOG
|
# CHANGELOG
|
||||||
|
|
||||||
|
## [4.4.00](https://github.com/kokkos/kokkos/tree/4.4.00)
|
||||||
|
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.3.01...4.4.00)
|
||||||
|
|
||||||
|
### Features:
|
||||||
|
* Add `Kokkos::View` conversions from and to [`std::mdspan`](https://en.cppreference.com/w/cpp/container/mdspan) [\#6830](https://github.com/kokkos/kokkos/pull/6830) [\#7069](https://github.com/kokkos/kokkos/pull/7069)
|
||||||
|
|
||||||
|
### Backend and Architecture Enhancements:
|
||||||
|
|
||||||
|
#### CUDA:
|
||||||
|
* `nvcc_wrapper`: Adding ability to process `--disable-warnings` flag [\#6936](https://github.com/kokkos/kokkos/issues/6936)
|
||||||
|
* Use recommended/max team size functions in Cuda ParallelFor and Reduce constructors [\#6891](https://github.com/kokkos/kokkos/issues/6891)
|
||||||
|
* Improve compile-times when building with `Kokkos_ENABLE_DEBUG_BOUNDS_CHECK` in Cuda [\#7013](https://github.com/kokkos/kokkos/pull/7013)
|
||||||
|
|
||||||
|
#### HIP:
|
||||||
|
* Use HIP builtin atomics [\#6882](https://github.com/kokkos/kokkos/pull/6882) [\#7000](https://github.com/kokkos/kokkos/pull/7000)
|
||||||
|
* Enable user-specified compiler and linker flags for AMD GPUs [\#7127](https://github.com/kokkos/kokkos/pull/7127)
|
||||||
|
|
||||||
|
#### SYCL:
|
||||||
|
* Add support for Graphs [\#6912](https://github.com/kokkos/kokkos/pull/6912)
|
||||||
|
* Fix multi-GPU support [\#6887](https://github.com/kokkos/kokkos/pull/6887)
|
||||||
|
* Improve performance of reduction and scan operations [\#6562](https://github.com/kokkos/kokkos/pull/6562), [\#6750](https://github.com/kokkos/kokkos/pull/6750)
|
||||||
|
* Fix lock for guarding scratch space in `TeamPolicy` `parallel_reduce` [\#6988](https://github.com/kokkos/kokkos/pull/6988)
|
||||||
|
* Include submission command queue property information into `SYCL::print_configuration()` [\#7004](https://github.com/kokkos/kokkos/pull/7004)
|
||||||
|
|
||||||
|
#### OpenACC:
|
||||||
|
* Make `TeamPolicy` `parallel_for` execute on the correct async queue [\#7012](https://github.com/kokkos/kokkos/pull/7012)
|
||||||
|
|
||||||
|
#### OpenMPTarget:
|
||||||
|
* Honor user requested loop ordering in `MDRange` policy [\#6925](https://github.com/kokkos/kokkos/pull/6925)
|
||||||
|
* Prevent data races by guarding the scratch space used in `parallel_scan` [\#6998](https://github.com/kokkos/kokkos/pull/6998)
|
||||||
|
|
||||||
|
#### HPX:
|
||||||
|
* Workaround issue with template argument deduction to support compilation with NVCC [\#7015](https://github.com/kokkos/kokkos/pull/7015)
|
||||||
|
|
||||||
|
### General Enhancements
|
||||||
|
* Improve performance of view copies in host parallel regions [\#6730](https://github.com/kokkos/kokkos/pull/6730)
|
||||||
|
* Harmonize convertibility rules of `Kokkos::RandomAccessIterator` with `View`s [\#6929](https://github.com/kokkos/kokkos/pull/6929)
|
||||||
|
* Add a check precondition non-overlapping ranges for the `adjacent_difference` algorithm in debug mode [\#6922](https://github.com/kokkos/kokkos/pull/6922)
|
||||||
|
* Add deduction guides for `TeamPolicy` [\#7030](https://github.com/kokkos/kokkos/pull/7030)
|
||||||
|
* SIMD: Allow flexible vector width for 32 bit types [\#6802](https://github.com/kokkos/kokkos/pull/6802)
|
||||||
|
* Updates for `Kokkos::Array`: add `kokkos_swap(Array<T, N>)` specialization [\#6943](https://github.com/kokkos/kokkos/pull/6943), add `Kokkos::to_array` [\#6375](https://github.com/kokkos/kokkos/pull/6375), make `Kokkos::Array` equality-comparable [\#7148](https://github.com/kokkos/kokkos/pull/7148)
|
||||||
|
* Structured binding support for `Kokkos::complex` [\#7040](https://github.com/kokkos/kokkos/pull/7040)
|
||||||
|
|
||||||
|
### Build System Changes
|
||||||
|
* Do not require OpenMP support for languages other than CXX [\#6965](https://github.com/kokkos/kokkos/pull/6965)
|
||||||
|
* Update Intel GPU architectures in Makefile [\#6895](https://github.com/kokkos/kokkos/pull/6895)
|
||||||
|
* Fix use of OpenMP with Cuda or HIP as compile language [\#6972](https://github.com/kokkos/kokkos/pull/6972)
|
||||||
|
* Define and enforce new minimum compiler versions for C++20 support [\#7128](https://github.com/kokkos/kokkos/pull/7128), [\#7123](https://github.com/kokkos/kokkos/pull/7123)
|
||||||
|
* Add nvidia Grace CPU architecture: `Kokkos_ARCH_ARMV9_GRACE` [\#7158](https://github.com/kokkos/kokkos/pull/7158)
|
||||||
|
* Fix Makefile.kokkos for Threads [\#6896](https://github.com/kokkos/kokkos/pull/6896)
|
||||||
|
* Remove support for NVHPC as CUDA device compiler [\#6987](https://github.com/kokkos/kokkos/pull/6987)
|
||||||
|
* Fix using CUDAToolkit for CMake 3.28.4 and higher [\#7062](https://github.com/kokkos/kokkos/pull/7062)
|
||||||
|
|
||||||
|
### Incompatibilities (i.e. breaking changes)
|
||||||
|
* Drop `Kokkos::Array` special treatment in `View`s [\#6906](https://github.com/kokkos/kokkos/pull/6906)
|
||||||
|
* Drop `Experimental::RawMemoryAllocationFailure` [\#7145](https://github.com/kokkos/kokkos/pull/7145)
|
||||||
|
|
||||||
|
### Deprecations
|
||||||
|
* Remove `Experimental::LayoutTiled` class template and deprecate `is_layouttiled` trait [\#6907](https://github.com/kokkos/kokkos/pull/6907)
|
||||||
|
* Deprecate `Kokkos::layout_iterate_type_selector` [\#7076](https://github.com/kokkos/kokkos/pull/7076)
|
||||||
|
* Deprecate specialization of `Kokkos::pair` for a single element [\#6947](https://github.com/kokkos/kokkos/pull/6947)
|
||||||
|
* Deprecate `deep_copy` of `UnorderedMap` of different size [\#6812](https://github.com/kokkos/kokkos/pull/6812)
|
||||||
|
* Deprecate trailing `Proxy` template argument of `Kokkos::Array` [\#6934](https://github.com/kokkos/kokkos/pull/6934)
|
||||||
|
* Deprecate implicit conversions of integers to `ChunkSize` [\#7151](https://github.com/kokkos/kokkos/pull/7151)
|
||||||
|
* Deprecate implicit conversions to execution spaces [\#7156](https://github.com/kokkos/kokkos/pull/7156)
|
||||||
|
|
||||||
|
### Bug Fixes
|
||||||
|
* Do not return a copy of the input functor in `Experimental::for_each` [\#6910](https://github.com/kokkos/kokkos/pull/6910)
|
||||||
|
* Fix `realloc` on views of non-default constructible element types [\#6993](https://github.com/kokkos/kokkos/pull/6993)
|
||||||
|
* Fix undefined behavior in `View` initialization or fill with zeros [\#7014](https://github.com/kokkos/kokkos/pull/7014)
|
||||||
|
* Fix `sort_by_key` on host execution spaces when building with NVCC [\#7059](https://github.com/kokkos/kokkos/pull/7059)
|
||||||
|
* Fix using shared libraries and -fvisibility=hidden [\#7065](https://github.com/kokkos/kokkos/pull/7065)
|
||||||
|
* Fix view reference counting when functor copy constructor throws in parallel dispatch [\#6289](https://github.com/kokkos/kokkos/pull/6289)
|
||||||
|
* Fix `initialize(InitializationSetting)` for handling `print_configuration` setting [\#7098](https://github.com/kokkos/kokkos/pull/7098)
|
||||||
|
* Thread safety fixes for the Serial and OpenMP backend [\#7080](https://github.com/kokkos/kokkos/pull/7080), [\#6151](https://github.com/kokkos/kokkos/pull/6151)
|
||||||
|
|
||||||
## [4.3.01](https://github.com/kokkos/kokkos/tree/4.3.01)
|
## [4.3.01](https://github.com/kokkos/kokkos/tree/4.3.01)
|
||||||
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.3.00...4.3.01)
|
[Full Changelog](https://github.com/kokkos/kokkos/compare/4.3.00...4.3.01)
|
||||||
|
|
||||||
### Backend and Architecture Enhancements:
|
### Backend and Architecture Enhancements:
|
||||||
|
|
||||||
#### HIP:
|
#### HIP:
|
||||||
* MI300 support unified memory support [\#6877](https://github.com/kokkos/kokkos/pull/6877)
|
* MI300 support unified memory [\#6877](https://github.com/kokkos/kokkos/pull/6877)
|
||||||
|
|
||||||
### Bug Fixes
|
### Bug Fixes
|
||||||
* Serial: Use the provided execution space instance in TeamPolicy [\#6951](https://github.com/kokkos/kokkos/pull/6951)
|
* Serial: Use the provided execution space instance in TeamPolicy [\#6951](https://github.com/kokkos/kokkos/pull/6951)
|
||||||
|
|||||||
65
lib/kokkos/CITATION.cff
Normal file
65
lib/kokkos/CITATION.cff
Normal file
@ -0,0 +1,65 @@
|
|||||||
|
cff-version: 1.2.0
|
||||||
|
title: Kokkos
|
||||||
|
message: >-
|
||||||
|
If you use this software, please cite the overview paper
|
||||||
|
type: software
|
||||||
|
authors:
|
||||||
|
- name: The Kokkos authors
|
||||||
|
website: https://kokkos.org/community/team/
|
||||||
|
identifiers:
|
||||||
|
- type: url
|
||||||
|
website: https://kokkos.org/kokkos-core-wiki/citation.html
|
||||||
|
repository-code: 'https://github.com/kokkos/kokkos'
|
||||||
|
url: 'https://kokkos.org/'
|
||||||
|
license: Apache-2.0
|
||||||
|
preferred-citation:
|
||||||
|
type: article
|
||||||
|
authors:
|
||||||
|
- given-names: Christian R.
|
||||||
|
family-names: Trott
|
||||||
|
- given-names: Damien
|
||||||
|
family-names: Lebrun-Grandié
|
||||||
|
- given-names: Daniel
|
||||||
|
family-names: Arndt
|
||||||
|
- family-names: Ciesko
|
||||||
|
given-names: Jan
|
||||||
|
- given-names: Vinh
|
||||||
|
family-names: Dang
|
||||||
|
- family-names: Ellingwood
|
||||||
|
given-names: Nathan
|
||||||
|
- given-names: Rahulkumar
|
||||||
|
family-names: Gayatri
|
||||||
|
- given-names: Evan
|
||||||
|
family-names: Harvey
|
||||||
|
- given-names: Daisy S.
|
||||||
|
family-names: Hollman
|
||||||
|
- given-names: Dan
|
||||||
|
family-names: Ibanez
|
||||||
|
- given-names: Nevin
|
||||||
|
family-names: Liber
|
||||||
|
- given-names: Jonathan
|
||||||
|
family-names: Madsen
|
||||||
|
- given-names: Jeff
|
||||||
|
family-names: Miles
|
||||||
|
- given-names: David
|
||||||
|
family-names: Poliakoff
|
||||||
|
- given-names: Amy
|
||||||
|
family-names: Powell
|
||||||
|
- given-names: Sivasankaran
|
||||||
|
family-names: Rajamanickam
|
||||||
|
- given-names: Mikael
|
||||||
|
family-names: Simberg
|
||||||
|
- given-names: Dan
|
||||||
|
family-names: Sunderland
|
||||||
|
- given-names: Bruno
|
||||||
|
family-names: Turcksin
|
||||||
|
- given-names: Jeremiah
|
||||||
|
family-names: Wilke
|
||||||
|
doi: 10.1109/TPDS.2021.3097283
|
||||||
|
journal: IEEE Transactions on Parallel and Distributed Systems
|
||||||
|
start: 805
|
||||||
|
end: 817
|
||||||
|
title: "Kokkos 3: Programming Model Extensions for the Exascale Era"
|
||||||
|
volume: 33
|
||||||
|
issue: 4
|
||||||
|
year: 2022
|
||||||
@ -150,8 +150,8 @@ ENDIF()
|
|||||||
|
|
||||||
|
|
||||||
set(Kokkos_VERSION_MAJOR 4)
|
set(Kokkos_VERSION_MAJOR 4)
|
||||||
set(Kokkos_VERSION_MINOR 3)
|
set(Kokkos_VERSION_MINOR 4)
|
||||||
set(Kokkos_VERSION_PATCH 1)
|
set(Kokkos_VERSION_PATCH 0)
|
||||||
set(Kokkos_VERSION "${Kokkos_VERSION_MAJOR}.${Kokkos_VERSION_MINOR}.${Kokkos_VERSION_PATCH}")
|
set(Kokkos_VERSION "${Kokkos_VERSION_MAJOR}.${Kokkos_VERSION_MINOR}.${Kokkos_VERSION_PATCH}")
|
||||||
message(STATUS "Kokkos version: ${Kokkos_VERSION}")
|
message(STATUS "Kokkos version: ${Kokkos_VERSION}")
|
||||||
math(EXPR KOKKOS_VERSION "${Kokkos_VERSION_MAJOR} * 10000 + ${Kokkos_VERSION_MINOR} * 100 + ${Kokkos_VERSION_PATCH}")
|
math(EXPR KOKKOS_VERSION "${Kokkos_VERSION_MAJOR} * 10000 + ${Kokkos_VERSION_MINOR} * 100 + ${Kokkos_VERSION_PATCH}")
|
||||||
|
|||||||
@ -11,8 +11,8 @@ CXXFLAGS += $(SHFLAGS)
|
|||||||
endif
|
endif
|
||||||
|
|
||||||
KOKKOS_VERSION_MAJOR = 4
|
KOKKOS_VERSION_MAJOR = 4
|
||||||
KOKKOS_VERSION_MINOR = 3
|
KOKKOS_VERSION_MINOR = 4
|
||||||
KOKKOS_VERSION_PATCH = 1
|
KOKKOS_VERSION_PATCH = 0
|
||||||
KOKKOS_VERSION = $(shell echo $(KOKKOS_VERSION_MAJOR)*10000+$(KOKKOS_VERSION_MINOR)*100+$(KOKKOS_VERSION_PATCH) | bc)
|
KOKKOS_VERSION = $(shell echo $(KOKKOS_VERSION_MAJOR)*10000+$(KOKKOS_VERSION_MINOR)*100+$(KOKKOS_VERSION_PATCH) | bc)
|
||||||
|
|
||||||
# Options: Cuda,HIP,SYCL,OpenMPTarget,OpenMP,Threads,Serial
|
# Options: Cuda,HIP,SYCL,OpenMPTarget,OpenMP,Threads,Serial
|
||||||
@ -21,11 +21,11 @@ KOKKOS_DEVICES ?= "OpenMP"
|
|||||||
# Options:
|
# Options:
|
||||||
# Intel: KNC,KNL,SNB,HSW,BDW,SKL,SKX,ICL,ICX,SPR
|
# Intel: KNC,KNL,SNB,HSW,BDW,SKL,SKX,ICL,ICX,SPR
|
||||||
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72,Turing75,Ampere80,Ampere86,Ada89,Hopper90
|
# NVIDIA: Kepler,Kepler30,Kepler32,Kepler35,Kepler37,Maxwell,Maxwell50,Maxwell52,Maxwell53,Pascal60,Pascal61,Volta70,Volta72,Turing75,Ampere80,Ampere86,Ada89,Hopper90
|
||||||
# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2,A64FX
|
# ARM: ARMv80,ARMv81,ARMv8-ThunderX,ARMv8-TX2,A64FX,ARMv9-Grace
|
||||||
# IBM: Power8,Power9
|
# IBM: Power8,Power9
|
||||||
# AMD-GPUS: AMD_GFX906,AMD_GFX908,AMD_GFX90A,AMD_GFX940,AMD_GFX942,AMD_GFX1030,AMD_GFX1100,AMD_GFX1103
|
# AMD-GPUS: AMD_GFX906,AMD_GFX908,AMD_GFX90A,AMD_GFX940,AMD_GFX942,AMD_GFX1030,AMD_GFX1100
|
||||||
# AMD-CPUS: AMDAVX,Zen,Zen2,Zen3
|
# AMD-CPUS: AMDAVX,Zen,Zen2,Zen3
|
||||||
# Intel-GPUs: Gen9,Gen11,Gen12LP,DG1,XeHP,PVC
|
# Intel-GPUs: Intel_Gen,Intel_Gen9,Intel_Gen11,Intel_Gen12LP,Intel_DG1,Intel_XeHP,Intel_PVC
|
||||||
KOKKOS_ARCH ?= ""
|
KOKKOS_ARCH ?= ""
|
||||||
# Options: yes,no
|
# Options: yes,no
|
||||||
KOKKOS_DEBUG ?= "no"
|
KOKKOS_DEBUG ?= "no"
|
||||||
@ -41,7 +41,7 @@ KOKKOS_STANDALONE_CMAKE ?= "no"
|
|||||||
|
|
||||||
# Default settings specific options.
|
# Default settings specific options.
|
||||||
# Options: force_uvm,use_ldg,rdc,enable_lambda,enable_constexpr,disable_malloc_async
|
# Options: force_uvm,use_ldg,rdc,enable_lambda,enable_constexpr,disable_malloc_async
|
||||||
KOKKOS_CUDA_OPTIONS ?= "enable_lambda"
|
KOKKOS_CUDA_OPTIONS ?= "enable_lambda,disable_malloc_async"
|
||||||
|
|
||||||
# Options: rdc
|
# Options: rdc
|
||||||
KOKKOS_HIP_OPTIONS ?= ""
|
KOKKOS_HIP_OPTIONS ?= ""
|
||||||
@ -328,12 +328,43 @@ KOKKOS_INTERNAL_USE_ARCH_ICL := $(call kokkos_has_string,$(KOKKOS_ARCH),ICL)
|
|||||||
KOKKOS_INTERNAL_USE_ARCH_ICX := $(call kokkos_has_string,$(KOKKOS_ARCH),ICX)
|
KOKKOS_INTERNAL_USE_ARCH_ICX := $(call kokkos_has_string,$(KOKKOS_ARCH),ICX)
|
||||||
KOKKOS_INTERNAL_USE_ARCH_SPR := $(call kokkos_has_string,$(KOKKOS_ARCH),SPR)
|
KOKKOS_INTERNAL_USE_ARCH_SPR := $(call kokkos_has_string,$(KOKKOS_ARCH),SPR)
|
||||||
|
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen)
|
# Traditionally, we supported, e.g., IntelGen9 instead of Intel_Gen9. The latter
|
||||||
|
# matches the CMake option but we also accept the former for backward-compatibility.
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen9)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen9)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9 := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_Gen9)
|
||||||
|
endif
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN11 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen11)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN11 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen11)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN11), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN11 := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_Gen11)
|
||||||
|
endif
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN12LP := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen12LP)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN12LP := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen12LP)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN12LP), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN12LP := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_Gen12LP)
|
||||||
|
endif
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen9)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9 := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_Gen9)
|
||||||
|
endif
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN_SET := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN9) \
|
||||||
|
+ $(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN11) \
|
||||||
|
+ $(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN12LP))
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN_SET), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelGen)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_GEN := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_Gen)
|
||||||
|
endif
|
||||||
|
endif
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_DG1 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelDG1)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_DG1 := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelDG1)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_DG1), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_DG1 := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_DG1)
|
||||||
|
endif
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_XEHP := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelXeHP)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_XEHP := $(call kokkos_has_string,$(KOKKOS_ARCH),IntelXeHP)
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_INTEL_XEHP), 0)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_XEHP := $(call kokkos_has_string,$(KOKKOS_ARCH),Intel_XeHP)
|
||||||
|
endif
|
||||||
|
# Traditionally the architecture was called PVC instead of Intel_PVC. This
|
||||||
|
# version makes us accept IntelPVC and Intel_PVC as well.
|
||||||
KOKKOS_INTERNAL_USE_ARCH_INTEL_PVC := $(call kokkos_has_string,$(KOKKOS_ARCH),PVC)
|
KOKKOS_INTERNAL_USE_ARCH_INTEL_PVC := $(call kokkos_has_string,$(KOKKOS_ARCH),PVC)
|
||||||
|
|
||||||
# NVIDIA based.
|
# NVIDIA based.
|
||||||
@ -394,7 +425,8 @@ KOKKOS_INTERNAL_USE_ARCH_ARMV81 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8
|
|||||||
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-ThunderX)
|
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-ThunderX)
|
||||||
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-TX2)
|
KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2 := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv8-TX2)
|
||||||
KOKKOS_INTERNAL_USE_ARCH_A64FX := $(call kokkos_has_string,$(KOKKOS_ARCH),A64FX)
|
KOKKOS_INTERNAL_USE_ARCH_A64FX := $(call kokkos_has_string,$(KOKKOS_ARCH),A64FX)
|
||||||
KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2)+$(KOKKOS_INTERNAL_USE_ARCH_A64FX) | bc))
|
KOKKOS_INTERNAL_USE_ARCH_ARMV9_GRACE := $(call kokkos_has_string,$(KOKKOS_ARCH),ARMv9-Grace)
|
||||||
|
KOKKOS_INTERNAL_USE_ARCH_ARM := $(strip $(shell echo $(KOKKOS_INTERNAL_USE_ARCH_ARMV80)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV81)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV8_THUNDERX2)+$(KOKKOS_INTERNAL_USE_ARCH_A64FX)+$(KOKKOS_INTERNAL_USE_ARCH_ARMV9_GRACE) | bc))
|
||||||
|
|
||||||
# IBM based.
|
# IBM based.
|
||||||
KOKKOS_INTERNAL_USE_ARCH_POWER8 := $(call kokkos_has_string,$(KOKKOS_ARCH),Power8)
|
KOKKOS_INTERNAL_USE_ARCH_POWER8 := $(call kokkos_has_string,$(KOKKOS_ARCH),Power8)
|
||||||
@ -433,7 +465,6 @@ KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100 := $(call kokkos_has_string,$(KOKKOS_ARCH),
|
|||||||
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100), 0)
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100), 0)
|
||||||
KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100 := $(call kokkos_has_string,$(KOKKOS_ARCH),NAVI1100)
|
KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100 := $(call kokkos_has_string,$(KOKKOS_ARCH),NAVI1100)
|
||||||
endif
|
endif
|
||||||
KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1103 := $(call kokkos_has_string,$(KOKKOS_ARCH),AMD_GFX1103)
|
|
||||||
|
|
||||||
# Any AVX?
|
# Any AVX?
|
||||||
KOKKOS_INTERNAL_USE_ARCH_AVX := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_AMDAVX))
|
KOKKOS_INTERNAL_USE_ARCH_AVX := $(shell expr $(KOKKOS_INTERNAL_USE_ARCH_SNB) + $(KOKKOS_INTERNAL_USE_ARCH_AMDAVX))
|
||||||
@ -758,6 +789,14 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_A64FX), 1)
|
|||||||
endif
|
endif
|
||||||
endif
|
endif
|
||||||
|
|
||||||
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ARMV9_GRACE), 1)
|
||||||
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_ARMV9_GRACE")
|
||||||
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_ARM_NEON")
|
||||||
|
|
||||||
|
KOKKOS_CXXFLAGS += -mcpu=neoverse-v2 -msve-vector-bits=128
|
||||||
|
KOKKOS_LDFLAGS += -mcpu=neoverse-v2 -msve-vector-bits=128
|
||||||
|
endif
|
||||||
|
|
||||||
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ZEN), 1)
|
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_ZEN), 1)
|
||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_ZEN")
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_ZEN")
|
||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AVX2")
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AVX2")
|
||||||
@ -1119,11 +1158,6 @@ ifeq ($(KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1100), 1)
|
|||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_GPU")
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_GPU")
|
||||||
KOKKOS_INTERNAL_HIP_ARCH_FLAG := --offload-arch=gfx1100
|
KOKKOS_INTERNAL_HIP_ARCH_FLAG := --offload-arch=gfx1100
|
||||||
endif
|
endif
|
||||||
ifeq ($(KOKKOS_INTERNAL_USE_ARCH_AMD_GFX1103), 1)
|
|
||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_GFX1103")
|
|
||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ARCH_AMD_GPU")
|
|
||||||
KOKKOS_INTERNAL_HIP_ARCH_FLAG := --offload-arch=gfx1103
|
|
||||||
endif
|
|
||||||
|
|
||||||
|
|
||||||
ifeq ($(KOKKOS_INTERNAL_USE_HIP), 1)
|
ifeq ($(KOKKOS_INTERNAL_USE_HIP), 1)
|
||||||
@ -1216,6 +1250,8 @@ ifeq ($(KOKKOS_INTERNAL_DISABLE_BUNDLED_MDSPAN), 0)
|
|||||||
endif
|
endif
|
||||||
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ENABLE_IMPL_MDSPAN")
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ENABLE_IMPL_MDSPAN")
|
||||||
|
|
||||||
|
tmp := $(call kokkos_append_header,"$H""define KOKKOS_ENABLE_IMPL_REF_COUNT_BRANCH_UNLIKELY")
|
||||||
|
|
||||||
KOKKOS_INTERNAL_LS_CONFIG := $(shell ls KokkosCore_config.h 2>&1)
|
KOKKOS_INTERNAL_LS_CONFIG := $(shell ls KokkosCore_config.h 2>&1)
|
||||||
|
|
||||||
ifeq ($(KOKKOS_INTERNAL_LS_CONFIG), KokkosCore_config.h)
|
ifeq ($(KOKKOS_INTERNAL_LS_CONFIG), KokkosCore_config.h)
|
||||||
|
|||||||
@ -81,7 +81,7 @@ ifeq ($(KOKKOS_INTERNAL_USE_THREADS), 1)
|
|||||||
Kokkos_Threads_Instance.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Instance.cpp
|
Kokkos_Threads_Instance.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Instance.cpp
|
||||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Instance.cpp
|
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Instance.cpp
|
||||||
Kokkos_Threads_Spinwait.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Spinwait.cpp
|
Kokkos_Threads_Spinwait.o: $(KOKKOS_CPP_DEPENDS) $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Spinwait.cpp
|
||||||
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/Threads/Kokkos_Spinwait.cpp
|
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) -c $(KOKKOS_PATH)/core/src/Threads/Kokkos_Threads_Spinwait.cpp
|
||||||
endif
|
endif
|
||||||
|
|
||||||
ifeq ($(KOKKOS_INTERNAL_USE_OPENMP), 1)
|
ifeq ($(KOKKOS_INTERNAL_USE_OPENMP), 1)
|
||||||
|
|||||||
@ -1,4 +1,4 @@
|
|||||||

|
[](https://kokkos.org)
|
||||||
|
|
||||||
# Kokkos: Core Libraries
|
# Kokkos: Core Libraries
|
||||||
|
|
||||||
@ -10,43 +10,66 @@ hierarchies and multiple types of execution resources. It currently can use
|
|||||||
CUDA, HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models with several other
|
CUDA, HIP, SYCL, HPX, OpenMP and C++ threads as backend programming models with several other
|
||||||
backends in development.
|
backends in development.
|
||||||
|
|
||||||
**Kokkos Core is part of the Kokkos C++ Performance Portability Programming EcoSystem.**
|
**Kokkos Core is part of the [Kokkos C++ Performance Portability Programming Ecosystem](https://kokkos.org/about/abstract/).**
|
||||||
|
|
||||||
For the complete documentation, click below:
|
Kokkos is a [Linux Foundation](https://linuxfoundation.org) project.
|
||||||
|
|
||||||
# [kokkos.github.io/kokkos-core-wiki](https://kokkos.github.io/kokkos-core-wiki)
|
## Learning about Kokkos
|
||||||
|
|
||||||
# Learning about Kokkos
|
|
||||||
|
|
||||||
To start learning about Kokkos:
|
To start learning about Kokkos:
|
||||||
|
|
||||||
- [Kokkos Lectures](https://kokkos.github.io/kokkos-core-wiki/videolectures.html): they contain a mix of lecture videos and hands-on exercises covering all the important Kokkos Ecosystem capabilities.
|
- [Kokkos Lectures](https://kokkos.org/kokkos-core-wiki/videolectures.html): they contain a mix of lecture videos and hands-on exercises covering all the important capabilities.
|
||||||
|
|
||||||
- [Programming guide](https://kokkos.github.io/kokkos-core-wiki/programmingguide.html): contains in "narrative" form a technical description of the programming model, machine model, and the main building blocks like the Views and parallel dispatch.
|
- [Programming guide](https://kokkos.org/kokkos-core-wiki/programmingguide.html): contains in "narrative" form a technical description of the programming model, machine model, and the main building blocks like the Views and parallel dispatch.
|
||||||
|
|
||||||
- [API reference](https://kokkos.github.io/kokkos-core-wiki/): organized by category, i.e., [core](https://kokkos.github.io/kokkos-core-wiki/API/core-index.html), [algorithms](https://kokkos.github.io/kokkos-core-wiki/API/algorithms-index.html) and [containers](https://kokkos.github.io/kokkos-core-wiki/API/containers-index.html) or, if you prefer, in [alphabetical order](https://kokkos.github.io/kokkos-core-wiki/API/alphabetical.html).
|
- [API reference](https://kokkos.org/kokkos-core-wiki/): organized by category, i.e., [core](https://kokkos.org/kokkos-core-wiki/API/core-index.html), [algorithms](https://kokkos.org/kokkos-core-wiki/API/algorithms-index.html) and [containers](https://kokkos.org/kokkos-core-wiki/API/containers-index.html) or, if you prefer, in [alphabetical order](https://kokkos.org/kokkos-core-wiki/API/alphabetical.html).
|
||||||
|
|
||||||
- [Use cases and Examples](https://kokkos.github.io/kokkos-core-wiki/usecases.html): a series of examples ranging from how to use Kokkos with MPI to Fortran interoperability.
|
- [Use cases and Examples](https://kokkos.org/kokkos-core-wiki/usecases.html): a serie of examples ranging from how to use Kokkos with MPI to Fortran interoperability.
|
||||||
|
|
||||||
|
## Obtaining Kokkos
|
||||||
|
|
||||||
|
The latest release of Kokkos can be obtained from the [GitHub releases page](https://github.com/kokkos/kokkos/releases/latest).
|
||||||
|
|
||||||
|
The current release is [4.3.01](https://github.com/kokkos/kokkos/releases/tag/4.3.01).
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -OJ -L https://github.com/kokkos/kokkos/archive/refs/tags/4.3.01.tar.gz
|
||||||
|
# Or with wget
|
||||||
|
wget https://github.com/kokkos/kokkos/archive/refs/tags/4.3.01.tar.gz
|
||||||
|
```
|
||||||
|
|
||||||
|
To clone the latest development version of Kokkos from GitHub:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone -b develop https://github.com/kokkos/kokkos.git
|
||||||
|
```
|
||||||
|
|
||||||
|
### Building Kokkos
|
||||||
|
|
||||||
|
To build Kokkos, you will need to have a C++ compiler that supports C++17 or later.
|
||||||
|
All requirements including minimum and primary tested compiler versions can be found [here](https://kokkos.org/kokkos-core-wiki/requirements.html).
|
||||||
|
|
||||||
|
Building and installation instructions are described [here](https://kokkos.org/kokkos-core-wiki/building.html).
|
||||||
|
|
||||||
|
You can also install Kokkos using [Spack](https://spack.io/): `spack install kokkos`. [Available configuration options](https://packages.spack.io/package.html?name=kokkos) can be displayed using `spack info kokkos`.
|
||||||
|
|
||||||
|
## For the complete documentation: [kokkos.org/kokkos-core-wiki/](https://kokkos.org/kokkos-core-wiki/)
|
||||||
|
|
||||||
|
## Support
|
||||||
|
|
||||||
For questions find us on Slack: https://kokkosteam.slack.com or open a GitHub issue.
|
For questions find us on Slack: https://kokkosteam.slack.com or open a GitHub issue.
|
||||||
|
|
||||||
For non-public questions send an email to: *crtrott(at)sandia.gov*
|
For non-public questions send an email to: *crtrott(at)sandia.gov*
|
||||||
|
|
||||||
# Contributing to Kokkos
|
## Contributing
|
||||||
|
|
||||||
Please see [this page](https://kokkos.github.io/kokkos-core-wiki/contributing.html) for details on how to contribute.
|
Please see [this page](https://kokkos.org/kokkos-core-wiki/contributing.html) for details on how to contribute.
|
||||||
|
|
||||||
# Requirements, Building and Installing
|
## Citing Kokkos
|
||||||
|
|
||||||
All requirements including minimum and primary tested compiler versions can be found [here](https://kokkos.github.io/kokkos-core-wiki/requirements.html).
|
Please see the [following page](https://kokkos.org/kokkos-core-wiki/citation.html).
|
||||||
|
|
||||||
Building and installation instructions are described [here](https://kokkos.github.io/kokkos-core-wiki/building.html).
|
## License
|
||||||
|
|
||||||
# Citing Kokkos
|
|
||||||
|
|
||||||
Please see the [following page](https://kokkos.github.io/kokkos-core-wiki/citation.html).
|
|
||||||
|
|
||||||
# License
|
|
||||||
|
|
||||||
[](https://spdx.org/licenses/LLVM-exception.html)
|
[](https://spdx.org/licenses/LLVM-exception.html)
|
||||||
|
|
||||||
|
|||||||
@ -189,6 +189,33 @@ void applyPermutation(const ExecutionSpace& space,
|
|||||||
KOKKOS_LAMBDA(int i) { view(i) = view_copy(permutation(i)); });
|
KOKKOS_LAMBDA(int i) { view(i) = view_copy(permutation(i)); });
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// FIXME_NVCC: nvcc has trouble compiling lambdas inside a function with
|
||||||
|
// variadic templates (sort_by_key_via_sort). Switch to using functors instead.
|
||||||
|
template <typename Permute>
|
||||||
|
struct IotaFunctor {
|
||||||
|
Permute _permute;
|
||||||
|
KOKKOS_FUNCTION void operator()(int i) const { _permute(i) = i; }
|
||||||
|
};
|
||||||
|
template <typename Keys>
|
||||||
|
struct LessFunctor {
|
||||||
|
Keys _keys;
|
||||||
|
KOKKOS_FUNCTION bool operator()(int i, int j) const {
|
||||||
|
return _keys(i) < _keys(j);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// FIXME_NVCC+MSVC: We can't use a lambda instead of a functor which gave us
|
||||||
|
// "For this host platform/dialect, an extended lambda cannot be defined inside
|
||||||
|
// the 'if' or 'else' block of a constexpr if statement"
|
||||||
|
template <typename Keys, typename Comparator>
|
||||||
|
struct KeyComparisonFunctor {
|
||||||
|
Keys m_keys;
|
||||||
|
Comparator m_comparator;
|
||||||
|
KOKKOS_FUNCTION bool operator()(int i, int j) const {
|
||||||
|
return m_comparator(m_keys(i), m_keys(j));
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
template <class ExecutionSpace, class KeysDataType, class... KeysProperties,
|
template <class ExecutionSpace, class KeysDataType, class... KeysProperties,
|
||||||
class ValuesDataType, class... ValuesProperties,
|
class ValuesDataType, class... ValuesProperties,
|
||||||
class... MaybeComparator>
|
class... MaybeComparator>
|
||||||
@ -207,10 +234,9 @@ void sort_by_key_via_sort(
|
|||||||
n);
|
n);
|
||||||
|
|
||||||
// iota
|
// iota
|
||||||
Kokkos::parallel_for(
|
Kokkos::parallel_for("Kokkos::sort_by_key_via_sort::iota",
|
||||||
"Kokkos::sort_by_key_via_sort::iota",
|
Kokkos::RangePolicy<ExecutionSpace>(exec, 0, n),
|
||||||
Kokkos::RangePolicy<ExecutionSpace>(exec, 0, n),
|
IotaFunctor<decltype(permute)>{permute});
|
||||||
KOKKOS_LAMBDA(int i) { permute(i) = i; });
|
|
||||||
|
|
||||||
using Layout =
|
using Layout =
|
||||||
typename Kokkos::View<unsigned int*, ExecutionSpace>::array_layout;
|
typename Kokkos::View<unsigned int*, ExecutionSpace>::array_layout;
|
||||||
@ -228,16 +254,15 @@ void sort_by_key_via_sort(
|
|||||||
Kokkos::DefaultHostExecutionSpace host_exec;
|
Kokkos::DefaultHostExecutionSpace host_exec;
|
||||||
|
|
||||||
if constexpr (sizeof...(MaybeComparator) == 0) {
|
if constexpr (sizeof...(MaybeComparator) == 0) {
|
||||||
Kokkos::sort(
|
Kokkos::sort(host_exec, host_permute,
|
||||||
host_exec, host_permute,
|
LessFunctor<decltype(host_keys)>{host_keys});
|
||||||
KOKKOS_LAMBDA(int i, int j) { return host_keys(i) < host_keys(j); });
|
|
||||||
} else {
|
} else {
|
||||||
auto keys_comparator =
|
auto keys_comparator =
|
||||||
std::get<0>(std::tuple<MaybeComparator...>(maybeComparator...));
|
std::get<0>(std::tuple<MaybeComparator...>(maybeComparator...));
|
||||||
Kokkos::sort(
|
Kokkos::sort(
|
||||||
host_exec, host_permute, KOKKOS_LAMBDA(int i, int j) {
|
host_exec, host_permute,
|
||||||
return keys_comparator(host_keys(i), host_keys(j));
|
KeyComparisonFunctor<decltype(host_keys), decltype(keys_comparator)>{
|
||||||
});
|
host_keys, keys_comparator});
|
||||||
}
|
}
|
||||||
host_exec.fence("Kokkos::Impl::sort_by_key_via_sort: after host sort");
|
host_exec.fence("Kokkos::Impl::sort_by_key_via_sort: after host sort");
|
||||||
Kokkos::deep_copy(exec, permute, host_permute);
|
Kokkos::deep_copy(exec, permute, host_permute);
|
||||||
@ -262,16 +287,14 @@ void sort_by_key_via_sort(
|
|||||||
}
|
}
|
||||||
#else
|
#else
|
||||||
if constexpr (sizeof...(MaybeComparator) == 0) {
|
if constexpr (sizeof...(MaybeComparator) == 0) {
|
||||||
Kokkos::sort(
|
Kokkos::sort(exec, permute, LessFunctor<decltype(keys)>{keys});
|
||||||
exec, permute,
|
|
||||||
KOKKOS_LAMBDA(int i, int j) { return keys(i) < keys(j); });
|
|
||||||
} else {
|
} else {
|
||||||
auto keys_comparator =
|
auto keys_comparator =
|
||||||
std::get<0>(std::tuple<MaybeComparator...>(maybeComparator...));
|
std::get<0>(std::tuple<MaybeComparator...>(maybeComparator...));
|
||||||
Kokkos::sort(
|
Kokkos::sort(
|
||||||
exec, permute, KOKKOS_LAMBDA(int i, int j) {
|
exec, permute,
|
||||||
return keys_comparator(keys(i), keys(j));
|
KeyComparisonFunctor<decltype(keys), decltype(keys_comparator)>{
|
||||||
});
|
keys, keys_comparator});
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|||||||
@ -29,49 +29,46 @@ namespace Experimental {
|
|||||||
template <
|
template <
|
||||||
class ExecutionSpace, class IteratorType, class UnaryFunctorType,
|
class ExecutionSpace, class IteratorType, class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
||||||
UnaryFunctorType for_each(const std::string& label, const ExecutionSpace& ex,
|
void for_each(const std::string& label, const ExecutionSpace& ex,
|
||||||
IteratorType first, IteratorType last,
|
IteratorType first, IteratorType last, UnaryFunctorType functor) {
|
||||||
UnaryFunctorType functor) {
|
Impl::for_each_exespace_impl(label, ex, first, last, std::move(functor));
|
||||||
return Impl::for_each_exespace_impl(label, ex, first, last,
|
|
||||||
std::move(functor));
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <
|
template <
|
||||||
class ExecutionSpace, class IteratorType, class UnaryFunctorType,
|
class ExecutionSpace, class IteratorType, class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
||||||
UnaryFunctorType for_each(const ExecutionSpace& ex, IteratorType first,
|
void for_each(const ExecutionSpace& ex, IteratorType first, IteratorType last,
|
||||||
IteratorType last, UnaryFunctorType functor) {
|
UnaryFunctorType functor) {
|
||||||
return Impl::for_each_exespace_impl("Kokkos::for_each_iterator_api_default",
|
Impl::for_each_exespace_impl("Kokkos::for_each_iterator_api_default", ex,
|
||||||
ex, first, last, std::move(functor));
|
first, last, std::move(functor));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <
|
template <
|
||||||
class ExecutionSpace, class DataType, class... Properties,
|
class ExecutionSpace, class DataType, class... Properties,
|
||||||
class UnaryFunctorType,
|
class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
||||||
UnaryFunctorType for_each(const std::string& label, const ExecutionSpace& ex,
|
void for_each(const std::string& label, const ExecutionSpace& ex,
|
||||||
const ::Kokkos::View<DataType, Properties...>& v,
|
const ::Kokkos::View<DataType, Properties...>& v,
|
||||||
UnaryFunctorType functor) {
|
UnaryFunctorType functor) {
|
||||||
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
||||||
|
|
||||||
namespace KE = ::Kokkos::Experimental;
|
namespace KE = ::Kokkos::Experimental;
|
||||||
return Impl::for_each_exespace_impl(label, ex, KE::begin(v), KE::end(v),
|
Impl::for_each_exespace_impl(label, ex, KE::begin(v), KE::end(v),
|
||||||
std::move(functor));
|
std::move(functor));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <
|
template <
|
||||||
class ExecutionSpace, class DataType, class... Properties,
|
class ExecutionSpace, class DataType, class... Properties,
|
||||||
class UnaryFunctorType,
|
class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
std::enable_if_t<Kokkos::is_execution_space_v<ExecutionSpace>, int> = 0>
|
||||||
UnaryFunctorType for_each(const ExecutionSpace& ex,
|
void for_each(const ExecutionSpace& ex,
|
||||||
const ::Kokkos::View<DataType, Properties...>& v,
|
const ::Kokkos::View<DataType, Properties...>& v,
|
||||||
UnaryFunctorType functor) {
|
UnaryFunctorType functor) {
|
||||||
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
||||||
|
|
||||||
namespace KE = ::Kokkos::Experimental;
|
namespace KE = ::Kokkos::Experimental;
|
||||||
return Impl::for_each_exespace_impl("Kokkos::for_each_view_api_default", ex,
|
Impl::for_each_exespace_impl("Kokkos::for_each_view_api_default", ex,
|
||||||
KE::begin(v), KE::end(v),
|
KE::begin(v), KE::end(v), std::move(functor));
|
||||||
std::move(functor));
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//
|
//
|
||||||
@ -82,24 +79,23 @@ UnaryFunctorType for_each(const ExecutionSpace& ex,
|
|||||||
|
|
||||||
template <class TeamHandleType, class IteratorType, class UnaryFunctorType,
|
template <class TeamHandleType, class IteratorType, class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_team_handle_v<TeamHandleType>, int> = 0>
|
std::enable_if_t<Kokkos::is_team_handle_v<TeamHandleType>, int> = 0>
|
||||||
KOKKOS_FUNCTION UnaryFunctorType for_each(const TeamHandleType& teamHandle,
|
KOKKOS_FUNCTION void for_each(const TeamHandleType& teamHandle,
|
||||||
IteratorType first, IteratorType last,
|
IteratorType first, IteratorType last,
|
||||||
UnaryFunctorType functor) {
|
UnaryFunctorType functor) {
|
||||||
return Impl::for_each_team_impl(teamHandle, first, last, std::move(functor));
|
Impl::for_each_team_impl(teamHandle, first, last, std::move(functor));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class TeamHandleType, class DataType, class... Properties,
|
template <class TeamHandleType, class DataType, class... Properties,
|
||||||
class UnaryFunctorType,
|
class UnaryFunctorType,
|
||||||
std::enable_if_t<Kokkos::is_team_handle_v<TeamHandleType>, int> = 0>
|
std::enable_if_t<Kokkos::is_team_handle_v<TeamHandleType>, int> = 0>
|
||||||
KOKKOS_FUNCTION UnaryFunctorType
|
KOKKOS_FUNCTION void for_each(const TeamHandleType& teamHandle,
|
||||||
for_each(const TeamHandleType& teamHandle,
|
const ::Kokkos::View<DataType, Properties...>& v,
|
||||||
const ::Kokkos::View<DataType, Properties...>& v,
|
UnaryFunctorType functor) {
|
||||||
UnaryFunctorType functor) {
|
|
||||||
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
Impl::static_assert_is_admissible_to_kokkos_std_algorithms(v);
|
||||||
|
|
||||||
namespace KE = ::Kokkos::Experimental;
|
namespace KE = ::Kokkos::Experimental;
|
||||||
return Impl::for_each_team_impl(teamHandle, KE::begin(v), KE::end(v),
|
Impl::for_each_team_impl(teamHandle, KE::begin(v), KE::end(v),
|
||||||
std::move(functor));
|
std::move(functor));
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Experimental
|
} // namespace Experimental
|
||||||
|
|||||||
@ -82,6 +82,11 @@ OutputIteratorType adjacent_difference_exespace_impl(
|
|||||||
return first_dest;
|
return first_dest;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEBUG
|
||||||
|
// check for overlapping iterators
|
||||||
|
Impl::expect_no_overlap(first_from, last_from, first_dest);
|
||||||
|
#endif
|
||||||
|
|
||||||
// run
|
// run
|
||||||
const auto num_elements =
|
const auto num_elements =
|
||||||
Kokkos::Experimental::distance(first_from, last_from);
|
Kokkos::Experimental::distance(first_from, last_from);
|
||||||
@ -114,6 +119,11 @@ KOKKOS_FUNCTION OutputIteratorType adjacent_difference_team_impl(
|
|||||||
return first_dest;
|
return first_dest;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEBUG
|
||||||
|
// check for overlapping iterators
|
||||||
|
Impl::expect_no_overlap(first_from, last_from, first_dest);
|
||||||
|
#endif
|
||||||
|
|
||||||
// run
|
// run
|
||||||
const auto num_elements =
|
const auto num_elements =
|
||||||
Kokkos::Experimental::distance(first_from, last_from);
|
Kokkos::Experimental::distance(first_from, last_from);
|
||||||
|
|||||||
@ -24,18 +24,21 @@ namespace Kokkos {
|
|||||||
namespace Experimental {
|
namespace Experimental {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
|
template <class T>
|
||||||
|
class RandomAccessIterator;
|
||||||
|
|
||||||
template <typename T, typename enable = void>
|
template <typename T, typename enable = void>
|
||||||
struct is_admissible_to_kokkos_std_algorithms : std::false_type {};
|
struct is_admissible_to_kokkos_std_algorithms : std::false_type {};
|
||||||
|
|
||||||
template <typename T>
|
template <typename T>
|
||||||
struct is_admissible_to_kokkos_std_algorithms<
|
struct is_admissible_to_kokkos_std_algorithms<
|
||||||
T, std::enable_if_t< ::Kokkos::is_view<T>::value && T::rank() == 1 &&
|
T, std::enable_if_t<::Kokkos::is_view<T>::value && T::rank() == 1 &&
|
||||||
(std::is_same<typename T::traits::array_layout,
|
(std::is_same<typename T::traits::array_layout,
|
||||||
Kokkos::LayoutLeft>::value ||
|
Kokkos::LayoutLeft>::value ||
|
||||||
std::is_same<typename T::traits::array_layout,
|
std::is_same<typename T::traits::array_layout,
|
||||||
Kokkos::LayoutRight>::value ||
|
Kokkos::LayoutRight>::value ||
|
||||||
std::is_same<typename T::traits::array_layout,
|
std::is_same<typename T::traits::array_layout,
|
||||||
Kokkos::LayoutStride>::value)> >
|
Kokkos::LayoutStride>::value)>>
|
||||||
: std::true_type {};
|
: std::true_type {};
|
||||||
|
|
||||||
template <class ViewType>
|
template <class ViewType>
|
||||||
@ -58,6 +61,18 @@ using is_iterator = Kokkos::is_detected<iterator_category_t, T>;
|
|||||||
template <class T>
|
template <class T>
|
||||||
inline constexpr bool is_iterator_v = is_iterator<T>::value;
|
inline constexpr bool is_iterator_v = is_iterator<T>::value;
|
||||||
|
|
||||||
|
template <typename ViewType>
|
||||||
|
struct is_kokkos_iterator : std::false_type {};
|
||||||
|
|
||||||
|
template <typename ViewType>
|
||||||
|
struct is_kokkos_iterator<RandomAccessIterator<ViewType>> {
|
||||||
|
static constexpr bool value =
|
||||||
|
is_admissible_to_kokkos_std_algorithms<ViewType>::value;
|
||||||
|
};
|
||||||
|
|
||||||
|
template <class T>
|
||||||
|
inline constexpr bool is_kokkos_iterator_v = is_kokkos_iterator<T>::value;
|
||||||
|
|
||||||
//
|
//
|
||||||
// are_iterators
|
// are_iterators
|
||||||
//
|
//
|
||||||
@ -215,6 +230,38 @@ KOKKOS_INLINE_FUNCTION void expect_valid_range(IteratorType first,
|
|||||||
(void)last;
|
(void)last;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
//
|
||||||
|
// Check if kokkos iterators are overlapping
|
||||||
|
//
|
||||||
|
template <typename IteratorType1, typename IteratorType2>
|
||||||
|
KOKKOS_INLINE_FUNCTION void expect_no_overlap(
|
||||||
|
[[maybe_unused]] IteratorType1 first, [[maybe_unused]] IteratorType1 last,
|
||||||
|
[[maybe_unused]] IteratorType2 s_first) {
|
||||||
|
if constexpr (is_kokkos_iterator_v<IteratorType1> &&
|
||||||
|
is_kokkos_iterator_v<IteratorType2>) {
|
||||||
|
auto const view1 = first.view();
|
||||||
|
auto const view2 = s_first.view();
|
||||||
|
|
||||||
|
std::size_t stride1 = view1.stride(0);
|
||||||
|
std::size_t stride2 = view2.stride(0);
|
||||||
|
ptrdiff_t first_diff = view1.data() - view2.data();
|
||||||
|
|
||||||
|
// FIXME If strides are not identical, checks may not be made
|
||||||
|
// with the cost of O(1)
|
||||||
|
// Currently, checks are made only if strides are identical
|
||||||
|
// If first_diff == 0, there is already an overlap
|
||||||
|
if (stride1 == stride2 || first_diff == 0) {
|
||||||
|
[[maybe_unused]] bool is_no_overlap = (first_diff % stride1);
|
||||||
|
auto* first_pointer1 = view1.data();
|
||||||
|
auto* first_pointer2 = view2.data();
|
||||||
|
[[maybe_unused]] auto* last_pointer1 = first_pointer1 + (last - first);
|
||||||
|
[[maybe_unused]] auto* last_pointer2 = first_pointer2 + (last - first);
|
||||||
|
KOKKOS_EXPECTS(first_pointer1 >= last_pointer2 ||
|
||||||
|
last_pointer1 <= first_pointer2 || is_no_overlap);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
} // namespace Experimental
|
} // namespace Experimental
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|||||||
@ -150,8 +150,9 @@ KOKKOS_FUNCTION OutputIterator copy_if_team_impl(
|
|||||||
return d_first + count;
|
return d_first + count;
|
||||||
}
|
}
|
||||||
|
|
||||||
#if defined KOKKOS_COMPILER_INTEL || \
|
#if defined KOKKOS_COMPILER_INTEL || \
|
||||||
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130)
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
__builtin_unreachable();
|
__builtin_unreachable();
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|||||||
@ -42,10 +42,9 @@ struct StdForEachFunctor {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <class HandleType, class IteratorType, class UnaryFunctorType>
|
template <class HandleType, class IteratorType, class UnaryFunctorType>
|
||||||
UnaryFunctorType for_each_exespace_impl(const std::string& label,
|
void for_each_exespace_impl(const std::string& label, const HandleType& handle,
|
||||||
const HandleType& handle,
|
IteratorType first, IteratorType last,
|
||||||
IteratorType first, IteratorType last,
|
UnaryFunctorType functor) {
|
||||||
UnaryFunctorType functor) {
|
|
||||||
// checks
|
// checks
|
||||||
Impl::static_assert_random_access_and_accessible(handle, first);
|
Impl::static_assert_random_access_and_accessible(handle, first);
|
||||||
Impl::expect_valid_range(first, last);
|
Impl::expect_valid_range(first, last);
|
||||||
@ -56,8 +55,6 @@ UnaryFunctorType for_each_exespace_impl(const std::string& label,
|
|||||||
label, RangePolicy<HandleType>(handle, 0, num_elements),
|
label, RangePolicy<HandleType>(handle, 0, num_elements),
|
||||||
StdForEachFunctor<IteratorType, UnaryFunctorType>(first, functor));
|
StdForEachFunctor<IteratorType, UnaryFunctorType>(first, functor));
|
||||||
handle.fence("Kokkos::for_each: fence after operation");
|
handle.fence("Kokkos::for_each: fence after operation");
|
||||||
|
|
||||||
return functor;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class ExecutionSpace, class IteratorType, class SizeType,
|
template <class ExecutionSpace, class IteratorType, class SizeType,
|
||||||
@ -75,7 +72,7 @@ IteratorType for_each_n_exespace_impl(const std::string& label,
|
|||||||
}
|
}
|
||||||
|
|
||||||
for_each_exespace_impl(label, ex, first, last, std::move(functor));
|
for_each_exespace_impl(label, ex, first, last, std::move(functor));
|
||||||
// no neeed to fence since for_each_exespace_impl fences already
|
// no need to fence since for_each_exespace_impl fences already
|
||||||
|
|
||||||
return last;
|
return last;
|
||||||
}
|
}
|
||||||
@ -84,9 +81,9 @@ IteratorType for_each_n_exespace_impl(const std::string& label,
|
|||||||
// team impl
|
// team impl
|
||||||
//
|
//
|
||||||
template <class TeamHandleType, class IteratorType, class UnaryFunctorType>
|
template <class TeamHandleType, class IteratorType, class UnaryFunctorType>
|
||||||
KOKKOS_FUNCTION UnaryFunctorType
|
KOKKOS_FUNCTION void for_each_team_impl(const TeamHandleType& teamHandle,
|
||||||
for_each_team_impl(const TeamHandleType& teamHandle, IteratorType first,
|
IteratorType first, IteratorType last,
|
||||||
IteratorType last, UnaryFunctorType functor) {
|
UnaryFunctorType functor) {
|
||||||
// checks
|
// checks
|
||||||
Impl::static_assert_random_access_and_accessible(teamHandle, first);
|
Impl::static_assert_random_access_and_accessible(teamHandle, first);
|
||||||
Impl::expect_valid_range(first, last);
|
Impl::expect_valid_range(first, last);
|
||||||
@ -96,7 +93,6 @@ for_each_team_impl(const TeamHandleType& teamHandle, IteratorType first,
|
|||||||
TeamThreadRange(teamHandle, 0, num_elements),
|
TeamThreadRange(teamHandle, 0, num_elements),
|
||||||
StdForEachFunctor<IteratorType, UnaryFunctorType>(first, functor));
|
StdForEachFunctor<IteratorType, UnaryFunctorType>(first, functor));
|
||||||
teamHandle.team_barrier();
|
teamHandle.team_barrier();
|
||||||
return functor;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class TeamHandleType, class IteratorType, class SizeType,
|
template <class TeamHandleType, class IteratorType, class SizeType,
|
||||||
@ -113,7 +109,7 @@ for_each_n_team_impl(const TeamHandleType& teamHandle, IteratorType first,
|
|||||||
}
|
}
|
||||||
|
|
||||||
for_each_team_impl(teamHandle, first, last, std::move(functor));
|
for_each_team_impl(teamHandle, first, last, std::move(functor));
|
||||||
// no neeed to fence since for_each_team_impl fences already
|
// no need to fence since for_each_team_impl fences already
|
||||||
|
|
||||||
return last;
|
return last;
|
||||||
}
|
}
|
||||||
|
|||||||
@ -59,6 +59,30 @@ class RandomAccessIterator< ::Kokkos::View<DataType, Args...> > {
|
|||||||
ptrdiff_t current_index)
|
ptrdiff_t current_index)
|
||||||
: m_view(view), m_current_index(current_index) {}
|
: m_view(view), m_current_index(current_index) {}
|
||||||
|
|
||||||
|
#ifndef KOKKOS_ENABLE_CXX17 // C++20 and beyond
|
||||||
|
template <class OtherViewType>
|
||||||
|
requires(std::is_constructible_v<view_type, OtherViewType>) KOKKOS_FUNCTION
|
||||||
|
explicit(!std::is_convertible_v<OtherViewType, view_type>)
|
||||||
|
RandomAccessIterator(const RandomAccessIterator<OtherViewType>& other)
|
||||||
|
: m_view(other.m_view), m_current_index(other.m_current_index) {}
|
||||||
|
#else
|
||||||
|
template <
|
||||||
|
class OtherViewType,
|
||||||
|
std::enable_if_t<std::is_constructible_v<view_type, OtherViewType> &&
|
||||||
|
!std::is_convertible_v<OtherViewType, view_type>,
|
||||||
|
int> = 0>
|
||||||
|
KOKKOS_FUNCTION explicit RandomAccessIterator(
|
||||||
|
const RandomAccessIterator<OtherViewType>& other)
|
||||||
|
: m_view(other.m_view), m_current_index(other.m_current_index) {}
|
||||||
|
|
||||||
|
template <class OtherViewType,
|
||||||
|
std::enable_if_t<std::is_convertible_v<OtherViewType, view_type>,
|
||||||
|
int> = 0>
|
||||||
|
KOKKOS_FUNCTION RandomAccessIterator(
|
||||||
|
const RandomAccessIterator<OtherViewType>& other)
|
||||||
|
: m_view(other.m_view), m_current_index(other.m_current_index) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
KOKKOS_FUNCTION
|
KOKKOS_FUNCTION
|
||||||
iterator_type& operator++() {
|
iterator_type& operator++() {
|
||||||
++m_current_index;
|
++m_current_index;
|
||||||
@ -152,9 +176,16 @@ class RandomAccessIterator< ::Kokkos::View<DataType, Args...> > {
|
|||||||
KOKKOS_FUNCTION
|
KOKKOS_FUNCTION
|
||||||
reference operator*() const { return m_view(m_current_index); }
|
reference operator*() const { return m_view(m_current_index); }
|
||||||
|
|
||||||
|
KOKKOS_FUNCTION
|
||||||
|
view_type view() const { return m_view; }
|
||||||
|
|
||||||
private:
|
private:
|
||||||
view_type m_view;
|
view_type m_view;
|
||||||
ptrdiff_t m_current_index = 0;
|
ptrdiff_t m_current_index = 0;
|
||||||
|
|
||||||
|
// Needed for the converting constructor accepting another iterator
|
||||||
|
template <class>
|
||||||
|
friend class RandomAccessIterator;
|
||||||
};
|
};
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|||||||
@ -175,8 +175,9 @@ KOKKOS_FUNCTION OutputIterator unique_copy_team_impl(
|
|||||||
d_first + count);
|
d_first + count);
|
||||||
}
|
}
|
||||||
|
|
||||||
#if defined KOKKOS_COMPILER_INTEL || \
|
#if defined KOKKOS_COMPILER_INTEL || \
|
||||||
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130)
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
__builtin_unreachable();
|
__builtin_unreachable();
|
||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
|
|||||||
@ -46,6 +46,44 @@ TEST_F(random_access_iterator_test, constructor) {
|
|||||||
EXPECT_TRUE(true);
|
EXPECT_TRUE(true);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
TEST_F(random_access_iterator_test, constructiblity) {
|
||||||
|
auto first_d = KE::begin(m_dynamic_view);
|
||||||
|
auto cfirst_d = KE::cbegin(m_dynamic_view);
|
||||||
|
|
||||||
|
static_assert(std::is_constructible_v<decltype(cfirst_d), decltype(first_d)>);
|
||||||
|
static_assert(
|
||||||
|
!std::is_constructible_v<decltype(first_d), decltype(cfirst_d)>);
|
||||||
|
[[maybe_unused]] decltype(cfirst_d) tmp_cfirst_d(first_d);
|
||||||
|
|
||||||
|
auto first_s = KE::begin(m_static_view);
|
||||||
|
auto cfirst_s = KE::cbegin(m_static_view);
|
||||||
|
|
||||||
|
static_assert(std::is_constructible_v<decltype(cfirst_s), decltype(first_s)>);
|
||||||
|
static_assert(
|
||||||
|
!std::is_constructible_v<decltype(first_s), decltype(cfirst_s)>);
|
||||||
|
[[maybe_unused]] decltype(cfirst_s) tmp_cfirst_s(first_s);
|
||||||
|
|
||||||
|
auto first_st = KE::begin(m_strided_view);
|
||||||
|
auto cfirst_st = KE::cbegin(m_strided_view);
|
||||||
|
|
||||||
|
static_assert(
|
||||||
|
std::is_constructible_v<decltype(cfirst_st), decltype(first_st)>);
|
||||||
|
static_assert(
|
||||||
|
!std::is_constructible_v<decltype(first_st), decltype(cfirst_st)>);
|
||||||
|
[[maybe_unused]] decltype(cfirst_st) tmp_cfirst_st(first_st);
|
||||||
|
|
||||||
|
// [FIXME] Better to have tests for the explicit specifier with an expression.
|
||||||
|
// As soon as View converting constructors are re-implemented with a
|
||||||
|
// conditional explicit, we may add those tests.
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_s), decltype(first_d)>);
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_st), decltype(first_d)>);
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_d), decltype(first_s)>);
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_st), decltype(first_s)>);
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_d), decltype(first_st)>);
|
||||||
|
static_assert(std::is_constructible_v<decltype(first_s), decltype(first_st)>);
|
||||||
|
EXPECT_TRUE(true);
|
||||||
|
}
|
||||||
|
|
||||||
template <class IteratorType, class ValueType>
|
template <class IteratorType, class ValueType>
|
||||||
void test_random_access_it_verify(IteratorType it, ValueType gold_value) {
|
void test_random_access_it_verify(IteratorType it, ValueType gold_value) {
|
||||||
using view_t = Kokkos::View<typename IteratorType::value_type>;
|
using view_t = Kokkos::View<typename IteratorType::value_type>;
|
||||||
|
|||||||
@ -69,7 +69,7 @@ void iota(ExecutionSpace const &space, ViewType const &v,
|
|||||||
typename ViewType::value_type value = 0) {
|
typename ViewType::value_type value = 0) {
|
||||||
using ValueType = typename ViewType::value_type;
|
using ValueType = typename ViewType::value_type;
|
||||||
Kokkos::parallel_for(
|
Kokkos::parallel_for(
|
||||||
"ArborX::Algorithms::iota",
|
"Kokkos::Algorithms::iota",
|
||||||
Kokkos::RangePolicy<ExecutionSpace>(space, 0, v.extent(0)),
|
Kokkos::RangePolicy<ExecutionSpace>(space, 0, v.extent(0)),
|
||||||
KOKKOS_LAMBDA(int i) { v(i) = value + (ValueType)i; });
|
KOKKOS_LAMBDA(int i) { v(i) = value + (ValueType)i; });
|
||||||
}
|
}
|
||||||
@ -87,6 +87,18 @@ TEST(TEST_CATEGORY, SortByKeyEmptyView) {
|
|||||||
Kokkos::Experimental::sort_by_key(ExecutionSpace(), keys, values));
|
Kokkos::Experimental::sort_by_key(ExecutionSpace(), keys, values));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Test #7036
|
||||||
|
TEST(TEST_CATEGORY, SortByKeyEmptyViewHost) {
|
||||||
|
using ExecutionSpace = Kokkos::DefaultHostExecutionSpace;
|
||||||
|
|
||||||
|
// does not matter if we use int or something else
|
||||||
|
Kokkos::View<int *, ExecutionSpace> keys("keys", 0);
|
||||||
|
Kokkos::View<float *, ExecutionSpace> values("values", 0);
|
||||||
|
|
||||||
|
ASSERT_NO_THROW(
|
||||||
|
Kokkos::Experimental::sort_by_key(ExecutionSpace(), keys, values));
|
||||||
|
}
|
||||||
|
|
||||||
TEST(TEST_CATEGORY, SortByKey) {
|
TEST(TEST_CATEGORY, SortByKey) {
|
||||||
using ExecutionSpace = TEST_EXECSPACE;
|
using ExecutionSpace = TEST_EXECSPACE;
|
||||||
using MemorySpace = typename ExecutionSpace::memory_space;
|
using MemorySpace = typename ExecutionSpace::memory_space;
|
||||||
|
|||||||
@ -81,5 +81,114 @@ TEST(std_algorithms, is_admissible_to_std_algorithms) {
|
|||||||
strided_view_3d_t>::value);
|
strided_view_3d_t>::value);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
TEST(std_algorithms, expect_no_overlap) {
|
||||||
|
namespace KE = Kokkos::Experimental;
|
||||||
|
using value_type = double;
|
||||||
|
|
||||||
|
static constexpr size_t extent0 = 13;
|
||||||
|
|
||||||
|
//-------------
|
||||||
|
// 1d views
|
||||||
|
//-------------
|
||||||
|
using static_view_1d_t = Kokkos::View<value_type[extent0]>;
|
||||||
|
[[maybe_unused]] static_view_1d_t static_view_1d{
|
||||||
|
"std-algo-test-1d-contiguous-view-static"};
|
||||||
|
|
||||||
|
using dyn_view_1d_t = Kokkos::View<value_type*>;
|
||||||
|
[[maybe_unused]] dyn_view_1d_t dynamic_view_1d{
|
||||||
|
"std-algo-test-1d-contiguous-view-dynamic", extent0};
|
||||||
|
|
||||||
|
using strided_view_1d_t = Kokkos::View<value_type*, Kokkos::LayoutStride>;
|
||||||
|
Kokkos::LayoutStride layout1d{extent0, 2};
|
||||||
|
strided_view_1d_t strided_view_1d{"std-algo-test-1d-strided-view", layout1d};
|
||||||
|
|
||||||
|
// Overlapping because iterators are identical
|
||||||
|
#if defined(KOKKOS_ENABLE_DEBUG)
|
||||||
|
auto first_s = KE::begin(static_view_1d);
|
||||||
|
auto last_s = first_s + extent0;
|
||||||
|
EXPECT_DEATH({ KE::Impl::expect_no_overlap(first_s, last_s, first_s); },
|
||||||
|
"Kokkos contract violation:.*");
|
||||||
|
|
||||||
|
auto first_d = KE::begin(dynamic_view_1d);
|
||||||
|
auto last_d = first_d + extent0;
|
||||||
|
EXPECT_DEATH({ KE::Impl::expect_no_overlap(first_d, last_d, first_d); },
|
||||||
|
"Kokkos contract violation:.*");
|
||||||
|
|
||||||
|
auto first_st = KE::begin(strided_view_1d);
|
||||||
|
auto last_st = first_st + extent0;
|
||||||
|
EXPECT_DEATH({ KE::Impl::expect_no_overlap(first_st, last_st, first_st); },
|
||||||
|
"Kokkos contract violation:.*");
|
||||||
|
#endif
|
||||||
|
|
||||||
|
// Ranges are overlapped
|
||||||
|
static constexpr size_t sub_extent0 = 6, offset0 = 3;
|
||||||
|
std::pair<size_t, size_t> range0(0, sub_extent0),
|
||||||
|
range1(offset0, offset0 + sub_extent0);
|
||||||
|
#if defined(KOKKOS_ENABLE_DEBUG)
|
||||||
|
auto static_view_1d_0 = Kokkos::subview(static_view_1d, range0);
|
||||||
|
auto static_view_1d_1 = Kokkos::subview(static_view_1d, range1);
|
||||||
|
auto first_s0 = KE::begin(static_view_1d_0); // [0, 6)
|
||||||
|
auto last_s0 = first_s0 + static_view_1d_0.extent(0);
|
||||||
|
auto first_s1 = KE::begin(static_view_1d_1); // [3, 9)
|
||||||
|
EXPECT_DEATH({ KE::Impl::expect_no_overlap(first_s0, last_s0, first_s1); },
|
||||||
|
"Kokkos contract violation:.*");
|
||||||
|
|
||||||
|
auto dynamic_view_1d_0 = Kokkos::subview(dynamic_view_1d, range0);
|
||||||
|
auto dynamic_view_1d_1 = Kokkos::subview(dynamic_view_1d, range1);
|
||||||
|
auto first_d0 = KE::begin(dynamic_view_1d_0); // [0, 6)
|
||||||
|
auto last_d0 = first_d0 + dynamic_view_1d_0.extent(0);
|
||||||
|
auto first_d1 = KE::begin(dynamic_view_1d_1); // [3, 9)
|
||||||
|
EXPECT_DEATH({ KE::Impl::expect_no_overlap(first_d0, last_d0, first_d1); },
|
||||||
|
"Kokkos contract violation:.*");
|
||||||
|
#endif
|
||||||
|
|
||||||
|
auto strided_view_1d_0 = Kokkos::subview(strided_view_1d, range0);
|
||||||
|
auto strided_view_1d_1 = Kokkos::subview(strided_view_1d, range1);
|
||||||
|
auto first_st0 = KE::begin(strided_view_1d_0); // [0, 12)
|
||||||
|
auto last_st0 = first_st0 + strided_view_1d_0.extent(0);
|
||||||
|
auto first_st1 = KE::begin(strided_view_1d_1); // [3, 15)
|
||||||
|
// Does not overlap since offset (=3) is not divisible by stride (=2)
|
||||||
|
EXPECT_NO_THROW(
|
||||||
|
{ KE::Impl::expect_no_overlap(first_st0, last_st0, first_st1); });
|
||||||
|
|
||||||
|
// Iterating over the same range without overlapping
|
||||||
|
Kokkos::View<value_type[2][extent0], Kokkos::LayoutLeft> static_view_2d{
|
||||||
|
"std-algo-test-2d-contiguous-view-static"};
|
||||||
|
auto sub_static_view_1d_0 = Kokkos::subview(static_view_2d, 0, Kokkos::ALL);
|
||||||
|
auto sub_static_view_1d_1 = Kokkos::subview(static_view_2d, 1, Kokkos::ALL);
|
||||||
|
auto sub_first_s0 = KE::begin(sub_static_view_1d_0); // 0, 2, 4, ...
|
||||||
|
auto sub_last_s0 = sub_first_s0 + sub_static_view_1d_0.extent(0);
|
||||||
|
auto sub_first_s1 = KE::begin(sub_static_view_1d_1); // 1, 3, 5, ...
|
||||||
|
|
||||||
|
EXPECT_NO_THROW({
|
||||||
|
KE::Impl::expect_no_overlap(sub_first_s0, sub_last_s0, sub_first_s1);
|
||||||
|
});
|
||||||
|
|
||||||
|
Kokkos::View<value_type**, Kokkos::LayoutLeft> dynamic_view_2d{
|
||||||
|
"std-algo-test-2d-contiguous-view-dynamic", 2, extent0};
|
||||||
|
auto sub_dynamic_view_1d_0 = Kokkos::subview(dynamic_view_2d, 0, Kokkos::ALL);
|
||||||
|
auto sub_dynamic_view_1d_1 = Kokkos::subview(dynamic_view_2d, 1, Kokkos::ALL);
|
||||||
|
auto sub_first_d0 = KE::begin(sub_dynamic_view_1d_0); // 0, 2, 4, ...
|
||||||
|
auto sub_last_d0 = sub_first_d0 + sub_dynamic_view_1d_0.extent(0);
|
||||||
|
auto sub_first_d1 = KE::begin(sub_dynamic_view_1d_1); // 1, 3, 5, ...
|
||||||
|
|
||||||
|
EXPECT_NO_THROW({
|
||||||
|
KE::Impl::expect_no_overlap(sub_first_d0, sub_last_d0, sub_first_d1);
|
||||||
|
});
|
||||||
|
|
||||||
|
Kokkos::LayoutStride layout2d{2, 3, extent0, 2 * 3};
|
||||||
|
Kokkos::View<value_type**, Kokkos::LayoutStride> strided_view_2d{
|
||||||
|
"std-algo-test-2d-contiguous-view-strided", layout2d};
|
||||||
|
auto sub_strided_view_1d_0 = Kokkos::subview(strided_view_2d, 0, Kokkos::ALL);
|
||||||
|
auto sub_strided_view_1d_1 = Kokkos::subview(strided_view_2d, 1, Kokkos::ALL);
|
||||||
|
auto sub_first_st0 = KE::begin(sub_strided_view_1d_0); // 0, 6, 12, ...
|
||||||
|
auto sub_last_st0 = sub_first_st0 + sub_strided_view_1d_0.extent(0);
|
||||||
|
auto sub_first_st1 = KE::begin(sub_strided_view_1d_1); // 1, 7, 13, ...
|
||||||
|
|
||||||
|
EXPECT_NO_THROW({
|
||||||
|
KE::Impl::expect_no_overlap(sub_first_st0, sub_last_st0, sub_first_st1);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace stdalgos
|
} // namespace stdalgos
|
||||||
} // namespace Test
|
} // namespace Test
|
||||||
|
|||||||
@ -85,7 +85,7 @@ struct TestFunctorA {
|
|||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
|
|
||||||
case 2: {
|
case 2: {
|
||||||
auto it = KE::exclusive_scan(
|
auto it = KE::exclusive_scan(
|
||||||
@ -213,7 +213,7 @@ void test_A(std::size_t numTeams, std::size_t numCols, int apiId) {
|
|||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
case 2:
|
case 2:
|
||||||
case 3: {
|
case 3: {
|
||||||
auto it = exclusive_scan(KE::cbegin(rowFrom), KE::cend(rowFrom),
|
auto it = exclusive_scan(KE::cbegin(rowFrom), KE::cend(rowFrom),
|
||||||
@ -242,7 +242,7 @@ template <class LayoutTag, class ValueType, class InPlaceOrVoid = void>
|
|||||||
void run_all_scenarios() {
|
void run_all_scenarios() {
|
||||||
for (int numTeams : teamSizesToTest) {
|
for (int numTeams : teamSizesToTest) {
|
||||||
for (const auto& numCols : {0, 1, 2, 13, 101, 1444, 8153}) {
|
for (const auto& numCols : {0, 1, 2, 13, 101, 1444, 8153}) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
for (int apiId : {0, 1, 2, 3}) {
|
for (int apiId : {0, 1, 2, 3}) {
|
||||||
#else
|
#else
|
||||||
for (int apiId : {0, 1}) {
|
for (int apiId : {0, 1}) {
|
||||||
|
|||||||
@ -52,7 +52,7 @@ struct TestFunctorA {
|
|||||||
Kokkos::single(Kokkos::PerTeam(member),
|
Kokkos::single(Kokkos::PerTeam(member),
|
||||||
[=, *this]() { m_returnsView(myRowIndex) = result; });
|
[=, *this]() { m_returnsView(myRowIndex) = result; });
|
||||||
}
|
}
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
else if (m_apiPick == 2) {
|
else if (m_apiPick == 2) {
|
||||||
using value_type = typename ViewType::value_type;
|
using value_type = typename ViewType::value_type;
|
||||||
result = KE::is_sorted(member, KE::cbegin(myRowView), KE::cend(myRowView),
|
result = KE::is_sorted(member, KE::cbegin(myRowView), KE::cend(myRowView),
|
||||||
@ -179,7 +179,7 @@ template <class LayoutTag, class ValueType>
|
|||||||
void run_all_scenarios(bool makeDataSortedOnPurpose) {
|
void run_all_scenarios(bool makeDataSortedOnPurpose) {
|
||||||
for (int numTeams : teamSizesToTest) {
|
for (int numTeams : teamSizesToTest) {
|
||||||
for (const auto& numCols : {0, 1, 2, 13, 101, 1444, 5153}) {
|
for (const auto& numCols : {0, 1, 2, 13, 101, 1444, 5153}) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
for (int apiId : {0, 1, 2, 3}) {
|
for (int apiId : {0, 1, 2, 3}) {
|
||||||
#else
|
#else
|
||||||
for (int apiId : {0, 1}) {
|
for (int apiId : {0, 1}) {
|
||||||
|
|||||||
@ -73,7 +73,7 @@ struct TestFunctorA {
|
|||||||
m_distancesView(myRowIndex) = resultDist;
|
m_distancesView(myRowIndex) = resultDist;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
else if (m_apiPick == 2) {
|
else if (m_apiPick == 2) {
|
||||||
using value_type = typename ViewType::value_type;
|
using value_type = typename ViewType::value_type;
|
||||||
auto it = KE::is_sorted_until(member, KE::cbegin(myRowView),
|
auto it = KE::is_sorted_until(member, KE::cbegin(myRowView),
|
||||||
@ -226,7 +226,7 @@ template <class LayoutTag, class ValueType>
|
|||||||
void run_all_scenarios(const std::string& name, const std::vector<int>& cols) {
|
void run_all_scenarios(const std::string& name, const std::vector<int>& cols) {
|
||||||
for (int numTeams : teamSizesToTest) {
|
for (int numTeams : teamSizesToTest) {
|
||||||
for (const auto& numCols : cols) {
|
for (const auto& numCols : cols) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
for (int apiId : {0, 1, 2, 3}) {
|
for (int apiId : {0, 1, 2, 3}) {
|
||||||
#else
|
#else
|
||||||
for (int apiId : {0, 1}) {
|
for (int apiId : {0, 1}) {
|
||||||
|
|||||||
@ -59,7 +59,7 @@ struct TestFunctorA {
|
|||||||
m_distancesView(myRowIndex) = resultDist;
|
m_distancesView(myRowIndex) = resultDist;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
else if (m_apiPick == 2) {
|
else if (m_apiPick == 2) {
|
||||||
using value_type = typename ViewType::value_type;
|
using value_type = typename ViewType::value_type;
|
||||||
auto it =
|
auto it =
|
||||||
@ -170,7 +170,7 @@ void run_all_scenarios() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
TEST(std_algorithms_max_element_team_test, test) {
|
TEST(std_algorithms_max_element_team_test, test) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
run_all_scenarios<DynamicTag, int>();
|
run_all_scenarios<DynamicTag, int>();
|
||||||
run_all_scenarios<StridedTwoRowsTag, double>();
|
run_all_scenarios<StridedTwoRowsTag, double>();
|
||||||
run_all_scenarios<StridedThreeRowsTag, int>();
|
run_all_scenarios<StridedThreeRowsTag, int>();
|
||||||
|
|||||||
@ -59,7 +59,7 @@ struct TestFunctorA {
|
|||||||
m_distancesView(myRowIndex) = resultDist;
|
m_distancesView(myRowIndex) = resultDist;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
else if (m_apiPick == 2) {
|
else if (m_apiPick == 2) {
|
||||||
using value_type = typename ViewType::value_type;
|
using value_type = typename ViewType::value_type;
|
||||||
auto it =
|
auto it =
|
||||||
@ -169,7 +169,7 @@ void run_all_scenarios() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
TEST(std_algorithms_min_element_team_test, test) {
|
TEST(std_algorithms_min_element_team_test, test) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
run_all_scenarios<DynamicTag, int>();
|
run_all_scenarios<DynamicTag, int>();
|
||||||
run_all_scenarios<StridedTwoRowsTag, double>();
|
run_all_scenarios<StridedTwoRowsTag, double>();
|
||||||
run_all_scenarios<StridedThreeRowsTag, int>();
|
run_all_scenarios<StridedThreeRowsTag, int>();
|
||||||
|
|||||||
@ -66,7 +66,7 @@ struct TestFunctorA {
|
|||||||
m_distancesView(myRowIndex, 1) = resultDist2;
|
m_distancesView(myRowIndex, 1) = resultDist2;
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
else if (m_apiPick == 2) {
|
else if (m_apiPick == 2) {
|
||||||
using value_type = typename ViewType::value_type;
|
using value_type = typename ViewType::value_type;
|
||||||
auto itPair =
|
auto itPair =
|
||||||
@ -188,7 +188,7 @@ void run_all_scenarios() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
TEST(std_algorithms_minmax_element_team_test, test) {
|
TEST(std_algorithms_minmax_element_team_test, test) {
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
run_all_scenarios<DynamicTag, int>();
|
run_all_scenarios<DynamicTag, int>();
|
||||||
run_all_scenarios<StridedTwoRowsTag, double>();
|
run_all_scenarios<StridedTwoRowsTag, double>();
|
||||||
run_all_scenarios<StridedThreeRowsTag, int>();
|
run_all_scenarios<StridedThreeRowsTag, int>();
|
||||||
|
|||||||
@ -16,7 +16,7 @@
|
|||||||
|
|
||||||
#include <TestStdAlgorithmsCommon.hpp>
|
#include <TestStdAlgorithmsCommon.hpp>
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
|
|
||||||
namespace Test {
|
namespace Test {
|
||||||
namespace stdalgos {
|
namespace stdalgos {
|
||||||
|
|||||||
@ -16,7 +16,7 @@
|
|||||||
|
|
||||||
#include <TestStdAlgorithmsCommon.hpp>
|
#include <TestStdAlgorithmsCommon.hpp>
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
|
|
||||||
namespace Test {
|
namespace Test {
|
||||||
namespace stdalgos {
|
namespace stdalgos {
|
||||||
|
|||||||
@ -16,7 +16,7 @@
|
|||||||
|
|
||||||
#include <TestStdAlgorithmsCommon.hpp>
|
#include <TestStdAlgorithmsCommon.hpp>
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
|
|
||||||
namespace Test {
|
namespace Test {
|
||||||
namespace stdalgos {
|
namespace stdalgos {
|
||||||
|
|||||||
@ -16,7 +16,7 @@
|
|||||||
|
|
||||||
#include <TestStdAlgorithmsCommon.hpp>
|
#include <TestStdAlgorithmsCommon.hpp>
|
||||||
|
|
||||||
#if not defined KOKKOS_ENABLE_OPENMPTARGET
|
#ifndef KOKKOS_ENABLE_OPENMPTARGET
|
||||||
|
|
||||||
namespace Test {
|
namespace Test {
|
||||||
namespace stdalgos {
|
namespace stdalgos {
|
||||||
|
|||||||
@ -5,6 +5,6 @@ build_script:
|
|||||||
- cmd: >-
|
- cmd: >-
|
||||||
mkdir build &&
|
mkdir build &&
|
||||||
cd build &&
|
cd build &&
|
||||||
cmake c:\projects\source -DKokkos_ENABLE_TESTS=ON -DCMAKE_CXX_FLAGS="/W0 /EHsc" -DKokkos_ENABLE_DEPRECATED_CODE_4=ON -DKokkos_ENABLE_DEPRECATION_WARNINGS=OFF &&
|
cmake c:\projects\source -DKokkos_ENABLE_IMPL_MDSPAN=OFF -DKokkos_ENABLE_TESTS=ON -DCMAKE_CXX_FLAGS="/W0 /EHsc" -DKokkos_ENABLE_DEPRECATED_CODE_4=ON -DKokkos_ENABLE_DEPRECATION_WARNINGS=OFF &&
|
||||||
cmake --build . --target install &&
|
cmake --build . --target install &&
|
||||||
ctest -C Debug --output-on-failure
|
ctest -C Debug --output-on-failure
|
||||||
|
|||||||
@ -4,7 +4,7 @@ KOKKOS_ADD_BENCHMARK_DIRECTORIES(gather)
|
|||||||
KOKKOS_ADD_BENCHMARK_DIRECTORIES(gups)
|
KOKKOS_ADD_BENCHMARK_DIRECTORIES(gups)
|
||||||
KOKKOS_ADD_BENCHMARK_DIRECTORIES(launch_latency)
|
KOKKOS_ADD_BENCHMARK_DIRECTORIES(launch_latency)
|
||||||
KOKKOS_ADD_BENCHMARK_DIRECTORIES(stream)
|
KOKKOS_ADD_BENCHMARK_DIRECTORIES(stream)
|
||||||
|
KOKKOS_ADD_BENCHMARK_DIRECTORIES(view_copy_constructor)
|
||||||
#FIXME_OPENMPTARGET - These two benchmarks cause ICE. Commenting them for now but a deeper analysis on the cause and a possible fix will follow.
|
#FIXME_OPENMPTARGET - These two benchmarks cause ICE. Commenting them for now but a deeper analysis on the cause and a possible fix will follow.
|
||||||
IF(NOT Kokkos_ENABLE_OPENMPTARGET)
|
IF(NOT Kokkos_ENABLE_OPENMPTARGET)
|
||||||
KOKKOS_ADD_BENCHMARK_DIRECTORIES(policy_performance)
|
KOKKOS_ADD_BENCHMARK_DIRECTORIES(policy_performance)
|
||||||
|
|||||||
@ -0,0 +1,4 @@
|
|||||||
|
KOKKOS_ADD_EXECUTABLE(
|
||||||
|
view_copy_constructor
|
||||||
|
SOURCES view_copy_constructor.cpp
|
||||||
|
)
|
||||||
46
lib/kokkos/benchmarks/view_copy_constructor/Makefile
Normal file
46
lib/kokkos/benchmarks/view_copy_constructor/Makefile
Normal file
@ -0,0 +1,46 @@
|
|||||||
|
KOKKOS_DEVICES=Serial
|
||||||
|
KOKKOS_ARCH = ""
|
||||||
|
|
||||||
|
|
||||||
|
MAKEFILE_PATH := $(subst Makefile,,$(abspath $(lastword $(MAKEFILE_LIST))))
|
||||||
|
|
||||||
|
ifndef KOKKOS_PATH
|
||||||
|
KOKKOS_PATH = $(MAKEFILE_PATH)../..
|
||||||
|
endif
|
||||||
|
|
||||||
|
SRC = $(wildcard $(MAKEFILE_PATH)*.cpp)
|
||||||
|
HEADERS = $(wildcard $(MAKEFILE_PATH)*.hpp)
|
||||||
|
|
||||||
|
vpath %.cpp $(sort $(dir $(SRC)))
|
||||||
|
|
||||||
|
default: build
|
||||||
|
echo "Start Build"
|
||||||
|
|
||||||
|
CXX = clang++
|
||||||
|
EXE = view_copy_constructor.exe
|
||||||
|
|
||||||
|
CXXFLAGS ?= -Ofast
|
||||||
|
override CXXFLAGS += -I$(MAKEFILE_PATH)
|
||||||
|
|
||||||
|
DEPFLAGS = -M
|
||||||
|
LINK = ${CXX}
|
||||||
|
LINKFLAGS = -Ofast
|
||||||
|
KOKKOS_CXX_STANDARD=c++20
|
||||||
|
|
||||||
|
OBJ = $(notdir $(SRC:.cpp=.o))
|
||||||
|
LIB =
|
||||||
|
|
||||||
|
include $(KOKKOS_PATH)/Makefile.kokkos
|
||||||
|
|
||||||
|
build: $(EXE)
|
||||||
|
|
||||||
|
$(EXE): $(OBJ) $(KOKKOS_LINK_DEPENDS)
|
||||||
|
$(LINK) $(KOKKOS_LDFLAGS) $(LINKFLAGS) $(EXTRA_PATH) $(OBJ) $(KOKKOS_LIBS) $(LIB) -o $(EXE)
|
||||||
|
|
||||||
|
clean: kokkos-clean
|
||||||
|
rm -f *.o view_copy_constructor.cuda view_copy_constructor.exe
|
||||||
|
|
||||||
|
# Compilation rules
|
||||||
|
|
||||||
|
%.o:%.cpp $(KOKKOS_CPP_DEPENDS) $(HEADERS)
|
||||||
|
$(CXX) $(KOKKOS_CPPFLAGS) $(KOKKOS_CXXFLAGS) $(CXXFLAGS) $(EXTRA_INC) -c $< -o $(notdir $@)
|
||||||
@ -0,0 +1,310 @@
|
|||||||
|
//@HEADER
|
||||||
|
// ************************************************************************
|
||||||
|
//
|
||||||
|
// Kokkos v. 4.0
|
||||||
|
// Copyright (2022) National Technology & Engineering
|
||||||
|
// Solutions of Sandia, LLC (NTESS).
|
||||||
|
//
|
||||||
|
// Under the terms of Contract DE-NA0003525 with NTESS,
|
||||||
|
// the U.S. Government retains certain rights in this software.
|
||||||
|
//
|
||||||
|
// Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions.
|
||||||
|
// See https://kokkos.org/LICENSE for license information.
|
||||||
|
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
||||||
|
//
|
||||||
|
//@HEADER
|
||||||
|
|
||||||
|
// The function "test_view_collection" exposes the copy constructor
|
||||||
|
// and destructor overheads in Kokkos View objects
|
||||||
|
// Please see the lines marked by "NOTE".
|
||||||
|
|
||||||
|
#include <limits>
|
||||||
|
#include <cstdio>
|
||||||
|
#include <cstdlib>
|
||||||
|
#include <cstring>
|
||||||
|
#include <sys/time.h>
|
||||||
|
#include <Kokkos_Core.hpp>
|
||||||
|
#include <iostream>
|
||||||
|
|
||||||
|
// NVIEWS is the number of Kokkos View objects in our ViewCollection object
|
||||||
|
// We have chosen a large value of 40 to make it easier to see performance
|
||||||
|
// differences when using the likelihood attribute
|
||||||
|
#define NVIEWS 40
|
||||||
|
|
||||||
|
class ViewCollection {
|
||||||
|
public:
|
||||||
|
Kokkos::View<double*> v1, v2, v3, v4, v5, v6, v7, v8, v9, v10, v11, v12, v13,
|
||||||
|
v14, v15, v16, v17, v18, v19, v20, v21, v22, v23, v24, v25, v26, v27, v28,
|
||||||
|
v29, v30, v31, v32, v33, v34, v35, v36, v37, v38, v39, v40;
|
||||||
|
double m_expected_sum;
|
||||||
|
double m_side_effect;
|
||||||
|
int m_N;
|
||||||
|
|
||||||
|
ViewCollection(int N)
|
||||||
|
: v1("v1", N),
|
||||||
|
v2("v2", N),
|
||||||
|
v3("v3", N),
|
||||||
|
v4("v4", N),
|
||||||
|
v5("v5", N),
|
||||||
|
v6("v6", N),
|
||||||
|
v7("v7", N),
|
||||||
|
v8("v8", N),
|
||||||
|
v9("v9", N),
|
||||||
|
v10("v10", N),
|
||||||
|
v11("v11", N),
|
||||||
|
v12("v12", N),
|
||||||
|
v13("v13", N),
|
||||||
|
v14("v14", N),
|
||||||
|
v15("v15", N),
|
||||||
|
v16("v16", N),
|
||||||
|
v17("v17", N),
|
||||||
|
v18("v18", N),
|
||||||
|
v19("v19", N),
|
||||||
|
v20("v20", N),
|
||||||
|
v21("v21", N),
|
||||||
|
v22("v22", N),
|
||||||
|
v23("v23", N),
|
||||||
|
v24("v24", N),
|
||||||
|
v25("v25", N),
|
||||||
|
v26("v26", N),
|
||||||
|
v27("v27", N),
|
||||||
|
v28("v28", N),
|
||||||
|
v29("v29", N),
|
||||||
|
v30("v30", N),
|
||||||
|
v31("v31", N),
|
||||||
|
v32("v32", N),
|
||||||
|
v33("v33", N),
|
||||||
|
v34("v34", N),
|
||||||
|
v35("v35", N),
|
||||||
|
v36("v36", N),
|
||||||
|
v37("v37", N),
|
||||||
|
v38("v38", N),
|
||||||
|
v39("v39", N),
|
||||||
|
v40("v40", N),
|
||||||
|
m_expected_sum(N * NVIEWS),
|
||||||
|
m_side_effect(0.0),
|
||||||
|
m_N(N) {
|
||||||
|
for (int i = 0; i < N; ++i) {
|
||||||
|
v1(i) = 1;
|
||||||
|
v2(i) = 1;
|
||||||
|
v3(i) = 1;
|
||||||
|
v4(i) = 1;
|
||||||
|
v5(i) = 1;
|
||||||
|
v6(i) = 1;
|
||||||
|
v7(i) = 1;
|
||||||
|
v8(i) = 1;
|
||||||
|
v9(i) = 1;
|
||||||
|
v10(i) = 1;
|
||||||
|
v11(i) = 1;
|
||||||
|
v12(i) = 1;
|
||||||
|
v13(i) = 1;
|
||||||
|
v14(i) = 1;
|
||||||
|
v15(i) = 1;
|
||||||
|
v16(i) = 1;
|
||||||
|
v17(i) = 1;
|
||||||
|
v18(i) = 1;
|
||||||
|
v19(i) = 1;
|
||||||
|
v20(i) = 1;
|
||||||
|
v21(i) = 1;
|
||||||
|
v22(i) = 1;
|
||||||
|
v23(i) = 1;
|
||||||
|
v24(i) = 1;
|
||||||
|
v25(i) = 1;
|
||||||
|
v26(i) = 1;
|
||||||
|
v27(i) = 1;
|
||||||
|
v28(i) = 1;
|
||||||
|
v29(i) = 1;
|
||||||
|
v30(i) = 1;
|
||||||
|
v31(i) = 1;
|
||||||
|
v32(i) = 1;
|
||||||
|
v33(i) = 1;
|
||||||
|
v34(i) = 1;
|
||||||
|
v35(i) = 1;
|
||||||
|
v36(i) = 1;
|
||||||
|
v37(i) = 1;
|
||||||
|
v38(i) = 1;
|
||||||
|
v39(i) = 1;
|
||||||
|
v40(i) = 1;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// The ADD_COPY_CONSTRUCTOR macro is helpful to compare time in the copy
|
||||||
|
// constructor between compilers. We have found that the GNU compiler
|
||||||
|
// is sometimes able to inline the default copy constructor.
|
||||||
|
#ifdef ADD_COPY_CONSTRUCTOR
|
||||||
|
__attribute__((noinline)) ViewCollection(const ViewCollection& other)
|
||||||
|
: v1(other.v1),
|
||||||
|
v2(other.v2),
|
||||||
|
v3(other.v3),
|
||||||
|
v4(other.v4),
|
||||||
|
v5(other.v5),
|
||||||
|
v6(other.v6),
|
||||||
|
v7(other.v7),
|
||||||
|
v8(other.v8),
|
||||||
|
v9(other.v9),
|
||||||
|
v10(other.v10),
|
||||||
|
v11(other.v11),
|
||||||
|
v12(other.v12),
|
||||||
|
v13(other.v13),
|
||||||
|
v14(other.v14),
|
||||||
|
v15(other.v15),
|
||||||
|
v16(other.v16),
|
||||||
|
v17(other.v17),
|
||||||
|
v18(other.v18),
|
||||||
|
v19(other.v19),
|
||||||
|
v20(other.v20),
|
||||||
|
v21(other.v21),
|
||||||
|
v22(other.v22),
|
||||||
|
v23(other.v23),
|
||||||
|
v24(other.v24),
|
||||||
|
v25(other.v25),
|
||||||
|
v26(other.v26),
|
||||||
|
v27(other.v27),
|
||||||
|
v28(other.v28),
|
||||||
|
v29(other.v29),
|
||||||
|
v30(other.v30),
|
||||||
|
v31(other.v31),
|
||||||
|
v32(other.v32),
|
||||||
|
v33(other.v33),
|
||||||
|
v34(other.v34),
|
||||||
|
v35(other.v35),
|
||||||
|
v36(other.v36),
|
||||||
|
v37(other.v37),
|
||||||
|
v38(other.v38),
|
||||||
|
v39(other.v39),
|
||||||
|
v40(other.v40),
|
||||||
|
m_expected_sum(other.m_expected_sum),
|
||||||
|
m_side_effect(other.m_side_effect),
|
||||||
|
m_N(other.m_N) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
KOKKOS_INLINE_FUNCTION
|
||||||
|
double sum_views(int ii, bool execute_kernel) {
|
||||||
|
double result = 0.0;
|
||||||
|
if (execute_kernel) {
|
||||||
|
// This code is only executed when using the command line option -k
|
||||||
|
// The computation references all Kokkos views. This may help our
|
||||||
|
// effort to stop compilers from optimizing away the Kokkos views
|
||||||
|
for (int i = 0; i < m_N; ++i) {
|
||||||
|
result += v1(i) + v2(i) + v3(i) + v4(i) + v5(i) + v6(i) + v7(i) +
|
||||||
|
v8(i) + v9(i) + v10(i) + v11(i) + v12(i) + v13(i) + v14(i) +
|
||||||
|
v15(i) + v16(i) + v17(i) + v18(i) + v19(i) + v20(i) + v21(i) +
|
||||||
|
v22(i) + v23(i) + v24(i) + v25(i) + v26(i) + v27(i) + v28(i) +
|
||||||
|
v29(i) + v30(i) + v31(i) + v32(i) + v33(i) + v34(i) + v35(i) +
|
||||||
|
v36(i) + v37(i) + v38(i) + v39(i) + v40(i);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
result = m_expected_sum;
|
||||||
|
}
|
||||||
|
// This statement introduces a side effect that may help our effort to
|
||||||
|
// stop compilers from optimizing away the temporary ViewCollection object
|
||||||
|
m_side_effect = result * (ii + 1);
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
void test_view_collection_kk(int N, int num_iter, bool execute_kernel) {
|
||||||
|
ViewCollection view_collection(N);
|
||||||
|
|
||||||
|
Kokkos::Timer view_collection_timer;
|
||||||
|
double max_value = 0.0;
|
||||||
|
// Max Reduction boilerplate code taken from slide 53 of
|
||||||
|
// kokkos-tutorials/LectureSeries/KokkosTutorial_02_ViewsAndSpaces.pdf
|
||||||
|
Kokkos::parallel_reduce(
|
||||||
|
"collection-reduction", num_iter,
|
||||||
|
KOKKOS_LAMBDA(int i, double& valueToUpdate) {
|
||||||
|
// NOTE: The following lines expose the Kokkos View overheads
|
||||||
|
ViewCollection tmp_view_collection = view_collection;
|
||||||
|
double my_value = tmp_view_collection.sum_views(i, execute_kernel);
|
||||||
|
if (my_value > valueToUpdate) valueToUpdate = my_value;
|
||||||
|
},
|
||||||
|
Kokkos::Max<double>(max_value));
|
||||||
|
double view_collection_time = view_collection_timer.seconds();
|
||||||
|
|
||||||
|
bool success = std::fabs(max_value - N * NVIEWS) < 1.E-6;
|
||||||
|
std::cout << "View Time = " << view_collection_time << " seconds"
|
||||||
|
<< std::endl;
|
||||||
|
if (success) {
|
||||||
|
std::cout << "Kokkos run:" << std::endl;
|
||||||
|
std::cout << "SUCCESS" << std::endl;
|
||||||
|
} else {
|
||||||
|
std::cout << "FAILURE" << std::endl;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
void test_view_collection_serial(int N, int num_iter, bool execute_kernel) {
|
||||||
|
ViewCollection view_collection(N);
|
||||||
|
|
||||||
|
Kokkos::Timer view_collection_timer;
|
||||||
|
double max_value = 0.0;
|
||||||
|
// Max Reduction boilerplate code taken from slide 53 of
|
||||||
|
// kokkos-tutorials/LectureSeries/KokkosTutorial_02_ViewsAndSpaces.pdf
|
||||||
|
for (int i = 0; i < num_iter; ++i) {
|
||||||
|
// NOTE: The following lines expose the Kokkos View overheads
|
||||||
|
ViewCollection tmp_view_collection = view_collection;
|
||||||
|
double my_value = tmp_view_collection.sum_views(i, execute_kernel);
|
||||||
|
if (my_value > max_value) max_value = my_value;
|
||||||
|
}
|
||||||
|
double view_collection_time = view_collection_timer.seconds();
|
||||||
|
|
||||||
|
bool success = std::fabs(max_value - N * NVIEWS) < 1.E-6;
|
||||||
|
std::cout << "View Time 2 = " << view_collection_time << " seconds"
|
||||||
|
<< std::endl;
|
||||||
|
if (success) {
|
||||||
|
std::cout << "Serial run:" << std::endl;
|
||||||
|
std::cout << "SUCCESS" << std::endl;
|
||||||
|
} else {
|
||||||
|
std::cout << "FAILURE" << std::endl;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
int main(int argc, char* argv[]) {
|
||||||
|
// The benchmark is only testing reference counting for views on host.
|
||||||
|
#if defined(KOKKOS_ENABLE_OPENMP) || defined(KOKKOS_ENABLE_SERIAL) || \
|
||||||
|
defined(KOKKOS_ENABLE_THREADS) || defined(KOKKOS_ENABLE_HPX)
|
||||||
|
int N = 1;
|
||||||
|
int num_iter = 1 << 27;
|
||||||
|
bool execute_kernel = false;
|
||||||
|
|
||||||
|
for (int i = 0; i < argc; i++) {
|
||||||
|
if ((strcmp(argv[i], "-N") == 0)) {
|
||||||
|
N = atoi(argv[++i]);
|
||||||
|
if (N < 1) {
|
||||||
|
std::cout << "Array extent must be >= 1" << std::endl;
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
} else if (strcmp(argv[i], "-i") == 0) {
|
||||||
|
num_iter = atoi(argv[++i]);
|
||||||
|
if (num_iter < 1) {
|
||||||
|
std::cout << "Number of iterations must be >= 1" << std::endl;
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
} else if (strcmp(argv[i], "-k") == 0) {
|
||||||
|
execute_kernel = true;
|
||||||
|
} else if ((strcmp(argv[i], "-h") == 0)) {
|
||||||
|
printf(" Options:\n");
|
||||||
|
printf(" -N <int>: Array extent\n");
|
||||||
|
printf(" -i <int>: Number of iterations\n");
|
||||||
|
printf(" -k: Execute the summation kernel\n");
|
||||||
|
printf(" -h: Print this message\n\n");
|
||||||
|
exit(1);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
std::cout << "Array extent = " << N << std::endl;
|
||||||
|
std::cout << "Iterations = " << num_iter << std::endl;
|
||||||
|
std::cout << "Execute summation kernel = " << std::boolalpha << execute_kernel
|
||||||
|
<< std::noboolalpha << std::endl;
|
||||||
|
|
||||||
|
// Test inside a Kokkos kernel.
|
||||||
|
Kokkos::initialize(argc, argv);
|
||||||
|
{ test_view_collection_kk(N, num_iter, execute_kernel); }
|
||||||
|
|
||||||
|
// Test outside Kokkos kernel.
|
||||||
|
test_view_collection_serial(N, num_iter, execute_kernel);
|
||||||
|
|
||||||
|
Kokkos::finalize();
|
||||||
|
#endif
|
||||||
|
|
||||||
|
return 0;
|
||||||
|
}
|
||||||
@ -233,7 +233,7 @@ do
|
|||||||
cuda_args="$cuda_args $1"
|
cuda_args="$cuda_args $1"
|
||||||
;;
|
;;
|
||||||
#Handle more known nvcc args
|
#Handle more known nvcc args
|
||||||
--extended-lambda|--expt-extended-lambda|--expt-relaxed-constexpr|--Wno-deprecated-gpu-targets|-Wno-deprecated-gpu-targets|-allow-unsupported-compiler|--allow-unsupported-compiler)
|
--extended-lambda|--expt-extended-lambda|--expt-relaxed-constexpr|--Wno-deprecated-gpu-targets|-Wno-deprecated-gpu-targets|-allow-unsupported-compiler|--allow-unsupported-compiler|--disable-warnings)
|
||||||
cuda_args="$cuda_args $1"
|
cuda_args="$cuda_args $1"
|
||||||
;;
|
;;
|
||||||
#Handle known nvcc args that have an argument
|
#Handle known nvcc args that have an argument
|
||||||
|
|||||||
@ -1,6 +1,5 @@
|
|||||||
TRIBITS_PACKAGE_DEFINE_DEPENDENCIES(
|
TRIBITS_PACKAGE_DEFINE_DEPENDENCIES(
|
||||||
LIB_OPTIONAL_TPLS Pthread CUDA HWLOC DLlib
|
LIB_OPTIONAL_TPLS Pthread CUDA HWLOC DLlib
|
||||||
TEST_OPTIONAL_TPLS CUSPARSE
|
|
||||||
)
|
)
|
||||||
|
|
||||||
TRIBITS_TPL_TENTATIVELY_ENABLE(DLlib)
|
TRIBITS_TPL_TENTATIVELY_ENABLE(DLlib)
|
||||||
|
|||||||
@ -225,8 +225,13 @@ FUNCTION(kokkos_compilation)
|
|||||||
# if built w/o CUDA support, we want to basically make this a no-op
|
# if built w/o CUDA support, we want to basically make this a no-op
|
||||||
SET(_Kokkos_ENABLE_CUDA @Kokkos_ENABLE_CUDA@)
|
SET(_Kokkos_ENABLE_CUDA @Kokkos_ENABLE_CUDA@)
|
||||||
|
|
||||||
|
|
||||||
|
IF(CMAKE_VERSION VERSION_GREATER_EQUAL 3.17)
|
||||||
|
SET(MAYBE_CURRENT_INSTALLATION_ROOT "${CMAKE_CURRENT_FUNCTION_LIST_DIR}/../../..")
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
# search relative first and then absolute
|
# search relative first and then absolute
|
||||||
SET(_HINTS "${CMAKE_CURRENT_LIST_DIR}/../.." "@CMAKE_INSTALL_PREFIX@")
|
SET(_HINTS "${MAYBE_CURRENT_INSTALLATION_ROOT}" "@CMAKE_INSTALL_PREFIX@")
|
||||||
|
|
||||||
# find kokkos_launch_compiler
|
# find kokkos_launch_compiler
|
||||||
FIND_PROGRAM(Kokkos_COMPILE_LAUNCHER
|
FIND_PROGRAM(Kokkos_COMPILE_LAUNCHER
|
||||||
|
|||||||
@ -52,6 +52,8 @@
|
|||||||
#cmakedefine KOKKOS_OPT_RANGE_AGGRESSIVE_VECTORIZATION // deprecated
|
#cmakedefine KOKKOS_OPT_RANGE_AGGRESSIVE_VECTORIZATION // deprecated
|
||||||
#cmakedefine KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION
|
#cmakedefine KOKKOS_ENABLE_AGGRESSIVE_VECTORIZATION
|
||||||
#cmakedefine KOKKOS_ENABLE_IMPL_MDSPAN
|
#cmakedefine KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
#cmakedefine KOKKOS_ENABLE_IMPL_REF_COUNT_BRANCH_UNLIKELY
|
||||||
|
#cmakedefine KOKKOS_ENABLE_IMPL_VIEW_OF_VIEWS_DESTRUCTOR_PRECONDITION_VIOLATION_WORKAROUND
|
||||||
#cmakedefine KOKKOS_ENABLE_ATOMICS_BYPASS
|
#cmakedefine KOKKOS_ENABLE_ATOMICS_BYPASS
|
||||||
|
|
||||||
/* TPL Settings */
|
/* TPL Settings */
|
||||||
@ -65,6 +67,7 @@
|
|||||||
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX
|
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX
|
||||||
#cmakedefine KOKKOS_ARCH_ARMV81
|
#cmakedefine KOKKOS_ARCH_ARMV81
|
||||||
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX2
|
#cmakedefine KOKKOS_ARCH_ARMV8_THUNDERX2
|
||||||
|
#cmakedefine KOKKOS_ARCH_ARMV9_GRACE
|
||||||
#cmakedefine KOKKOS_ARCH_A64FX
|
#cmakedefine KOKKOS_ARCH_A64FX
|
||||||
#cmakedefine KOKKOS_ARCH_AVX
|
#cmakedefine KOKKOS_ARCH_AVX
|
||||||
#cmakedefine KOKKOS_ARCH_AVX2
|
#cmakedefine KOKKOS_ARCH_AVX2
|
||||||
@ -116,7 +119,6 @@
|
|||||||
#cmakedefine KOKKOS_ARCH_AMD_GFX942
|
#cmakedefine KOKKOS_ARCH_AMD_GFX942
|
||||||
#cmakedefine KOKKOS_ARCH_AMD_GFX1030
|
#cmakedefine KOKKOS_ARCH_AMD_GFX1030
|
||||||
#cmakedefine KOKKOS_ARCH_AMD_GFX1100
|
#cmakedefine KOKKOS_ARCH_AMD_GFX1100
|
||||||
#cmakedefine KOKKOS_ARCH_AMD_GFX1103
|
|
||||||
#cmakedefine KOKKOS_ARCH_AMD_GPU
|
#cmakedefine KOKKOS_ARCH_AMD_GPU
|
||||||
#cmakedefine KOKKOS_ARCH_VEGA // deprecated
|
#cmakedefine KOKKOS_ARCH_VEGA // deprecated
|
||||||
#cmakedefine KOKKOS_ARCH_VEGA906 // deprecated
|
#cmakedefine KOKKOS_ARCH_VEGA906 // deprecated
|
||||||
|
|||||||
@ -7,37 +7,38 @@ IF (NOT CUDAToolkit_ROOT)
|
|||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
# FIXME CMake 3.28.4 creates more targets than we export
|
IF(KOKKOS_CXX_HOST_COMPILER_ID STREQUAL NVHPC AND CMAKE_VERSION VERSION_LESS "3.20.1")
|
||||||
IF(CMAKE_VERSION VERSION_GREATER_EQUAL "3.17.0" AND CMAKE_VERSION VERSION_LESS "3.28.4")
|
MESSAGE(FATAL_ERROR "Using NVHPC as host compiler requires at least CMake 3.20.1")
|
||||||
find_package(CUDAToolkit)
|
|
||||||
ELSE()
|
|
||||||
include(${CMAKE_CURRENT_LIST_DIR}/CudaToolkit.cmake)
|
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
|
IF(CMAKE_VERSION VERSION_GREATER_EQUAL "3.17.0")
|
||||||
IF (TARGET CUDA::cudart)
|
find_package(CUDAToolkit REQUIRED)
|
||||||
SET(FOUND_CUDART TRUE)
|
|
||||||
KOKKOS_EXPORT_IMPORTED_TPL(CUDA::cudart)
|
|
||||||
ELSE()
|
|
||||||
SET(FOUND_CUDART FALSE)
|
|
||||||
ENDIF()
|
|
||||||
|
|
||||||
IF (TARGET CUDA::cuda_driver)
|
|
||||||
SET(FOUND_CUDA_DRIVER TRUE)
|
|
||||||
KOKKOS_EXPORT_IMPORTED_TPL(CUDA::cuda_driver)
|
|
||||||
ELSE()
|
|
||||||
SET(FOUND_CUDA_DRIVER FALSE)
|
|
||||||
ENDIF()
|
|
||||||
|
|
||||||
include(FindPackageHandleStandardArgs)
|
|
||||||
IF(KOKKOS_CXX_HOST_COMPILER_ID STREQUAL NVHPC)
|
|
||||||
SET(KOKKOS_CUDA_ERROR "Using NVHPC as host compiler requires at least CMake 3.20.1")
|
|
||||||
ELSE()
|
|
||||||
SET(KOKKOS_CUDA_ERROR DEFAULT_MSG)
|
|
||||||
ENDIF()
|
|
||||||
FIND_PACKAGE_HANDLE_STANDARD_ARGS(TPLCUDA ${KOKKOS_CUDA_ERROR} FOUND_CUDART FOUND_CUDA_DRIVER)
|
|
||||||
IF (FOUND_CUDA_DRIVER AND FOUND_CUDART)
|
|
||||||
KOKKOS_CREATE_IMPORTED_TPL(CUDA INTERFACE
|
KOKKOS_CREATE_IMPORTED_TPL(CUDA INTERFACE
|
||||||
LINK_LIBRARIES CUDA::cuda_driver CUDA::cudart
|
LINK_LIBRARIES CUDA::cuda_driver CUDA::cudart
|
||||||
)
|
)
|
||||||
|
KOKKOS_EXPORT_CMAKE_TPL(CUDAToolkit REQUIRED)
|
||||||
|
ELSE()
|
||||||
|
include(${CMAKE_CURRENT_LIST_DIR}/CudaToolkit.cmake)
|
||||||
|
|
||||||
|
IF (TARGET CUDA::cudart)
|
||||||
|
SET(FOUND_CUDART TRUE)
|
||||||
|
KOKKOS_EXPORT_IMPORTED_TPL(CUDA::cudart)
|
||||||
|
ELSE()
|
||||||
|
SET(FOUND_CUDART FALSE)
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
|
IF (TARGET CUDA::cuda_driver)
|
||||||
|
SET(FOUND_CUDA_DRIVER TRUE)
|
||||||
|
KOKKOS_EXPORT_IMPORTED_TPL(CUDA::cuda_driver)
|
||||||
|
ELSE()
|
||||||
|
SET(FOUND_CUDA_DRIVER FALSE)
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
|
include(FindPackageHandleStandardArgs)
|
||||||
|
FIND_PACKAGE_HANDLE_STANDARD_ARGS(TPLCUDA ${DEFAULT_MSG} FOUND_CUDART FOUND_CUDA_DRIVER)
|
||||||
|
IF (FOUND_CUDA_DRIVER AND FOUND_CUDART)
|
||||||
|
KOKKOS_CREATE_IMPORTED_TPL(CUDA INTERFACE
|
||||||
|
LINK_LIBRARIES CUDA::cuda_driver CUDA::cudart
|
||||||
|
)
|
||||||
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|||||||
@ -35,7 +35,6 @@ IF(NOT _CUDA_FAILURE)
|
|||||||
GLOBAL_SET(TPL_CUDA_LIBRARY_DIRS)
|
GLOBAL_SET(TPL_CUDA_LIBRARY_DIRS)
|
||||||
GLOBAL_SET(TPL_CUDA_INCLUDE_DIRS ${CUDA_TOOLKIT_INCLUDE})
|
GLOBAL_SET(TPL_CUDA_INCLUDE_DIRS ${CUDA_TOOLKIT_INCLUDE})
|
||||||
GLOBAL_SET(TPL_CUDA_LIBRARIES ${CUDA_CUDART_LIBRARY} ${CUDA_cublas_LIBRARY} ${CUDA_cufft_LIBRARY})
|
GLOBAL_SET(TPL_CUDA_LIBRARIES ${CUDA_CUDART_LIBRARY} ${CUDA_cublas_LIBRARY} ${CUDA_cufft_LIBRARY})
|
||||||
KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
|
||||||
ELSE()
|
ELSE()
|
||||||
SET(TPL_ENABLE_CUDA OFF)
|
SET(TPL_ENABLE_CUDA OFF)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|||||||
@ -1,26 +0,0 @@
|
|||||||
#@HEADER
|
|
||||||
# ************************************************************************
|
|
||||||
#
|
|
||||||
# Kokkos v. 4.0
|
|
||||||
# Copyright (2022) National Technology & Engineering
|
|
||||||
# Solutions of Sandia, LLC (NTESS).
|
|
||||||
#
|
|
||||||
# Under the terms of Contract DE-NA0003525 with NTESS,
|
|
||||||
# the U.S. Government retains certain rights in this software.
|
|
||||||
#
|
|
||||||
# Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions.
|
|
||||||
#
|
|
||||||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
||||||
#
|
|
||||||
# ************************************************************************
|
|
||||||
# @HEADER
|
|
||||||
|
|
||||||
#include(${TRIBITS_DEPS_DIR}/CUDA.cmake)
|
|
||||||
|
|
||||||
#IF (TPL_ENABLE_CUDA)
|
|
||||||
# GLOBAL_SET(TPL_CUSPARSE_LIBRARY_DIRS)
|
|
||||||
# GLOBAL_SET(TPL_CUSPARSE_INCLUDE_DIRS ${TPL_CUDA_INCLUDE_DIRS})
|
|
||||||
# GLOBAL_SET(TPL_CUSPARSE_LIBRARIES ${CUDA_cusparse_LIBRARY})
|
|
||||||
# KOKKOS_CREATE_IMPORTED_TPL_LIBRARY(CUSPARSE)
|
|
||||||
#ENDIF()
|
|
||||||
|
|
||||||
@ -118,14 +118,6 @@ FUNCTION(KOKKOS_ADD_TEST)
|
|||||||
ENDIF()
|
ENDIF()
|
||||||
ENDFUNCTION()
|
ENDFUNCTION()
|
||||||
|
|
||||||
FUNCTION(KOKKOS_ADD_ADVANCED_TEST)
|
|
||||||
if (KOKKOS_HAS_TRILINOS)
|
|
||||||
TRIBITS_ADD_ADVANCED_TEST(${ARGN})
|
|
||||||
else()
|
|
||||||
# TODO Write this
|
|
||||||
endif()
|
|
||||||
ENDFUNCTION()
|
|
||||||
|
|
||||||
MACRO(KOKKOS_CREATE_IMPORTED_TPL_LIBRARY TPL_NAME)
|
MACRO(KOKKOS_CREATE_IMPORTED_TPL_LIBRARY TPL_NAME)
|
||||||
ADD_INTERFACE_LIBRARY(TPL_LIB_${TPL_NAME})
|
ADD_INTERFACE_LIBRARY(TPL_LIB_${TPL_NAME})
|
||||||
TARGET_LINK_LIBRARIES(TPL_LIB_${TPL_NAME} LINK_PUBLIC ${TPL_${TPL_NAME}_LIBRARIES})
|
TARGET_LINK_LIBRARIES(TPL_LIB_${TPL_NAME} LINK_PUBLIC ${TPL_${TPL_NAME}_LIBRARIES})
|
||||||
|
|||||||
@ -28,6 +28,7 @@ KOKKOS_CHECK_DEPRECATED_OPTIONS(
|
|||||||
#-------------------------------------------------------------------------------
|
#-------------------------------------------------------------------------------
|
||||||
SET(KOKKOS_ARCH_LIST)
|
SET(KOKKOS_ARCH_LIST)
|
||||||
|
|
||||||
|
include(CheckCXXCompilerFlag)
|
||||||
|
|
||||||
KOKKOS_DEPRECATED_LIST(ARCH ARCH)
|
KOKKOS_DEPRECATED_LIST(ARCH ARCH)
|
||||||
|
|
||||||
@ -49,6 +50,7 @@ DECLARE_AND_CHECK_HOST_ARCH(ARMV81 "ARMv8.1 Compatible CPU")
|
|||||||
DECLARE_AND_CHECK_HOST_ARCH(ARMV8_THUNDERX "ARMv8 Cavium ThunderX CPU")
|
DECLARE_AND_CHECK_HOST_ARCH(ARMV8_THUNDERX "ARMv8 Cavium ThunderX CPU")
|
||||||
DECLARE_AND_CHECK_HOST_ARCH(ARMV8_THUNDERX2 "ARMv8 Cavium ThunderX2 CPU")
|
DECLARE_AND_CHECK_HOST_ARCH(ARMV8_THUNDERX2 "ARMv8 Cavium ThunderX2 CPU")
|
||||||
DECLARE_AND_CHECK_HOST_ARCH(A64FX "ARMv8.2 with SVE Support")
|
DECLARE_AND_CHECK_HOST_ARCH(A64FX "ARMv8.2 with SVE Support")
|
||||||
|
DECLARE_AND_CHECK_HOST_ARCH(ARMV9_GRACE "ARMv9 NVIDIA Grace CPU")
|
||||||
DECLARE_AND_CHECK_HOST_ARCH(SNB "Intel Sandy/Ivy Bridge CPUs")
|
DECLARE_AND_CHECK_HOST_ARCH(SNB "Intel Sandy/Ivy Bridge CPUs")
|
||||||
DECLARE_AND_CHECK_HOST_ARCH(HSW "Intel Haswell CPUs")
|
DECLARE_AND_CHECK_HOST_ARCH(HSW "Intel Haswell CPUs")
|
||||||
DECLARE_AND_CHECK_HOST_ARCH(BDW "Intel Broadwell Xeon E-class CPUs")
|
DECLARE_AND_CHECK_HOST_ARCH(BDW "Intel Broadwell Xeon E-class CPUs")
|
||||||
@ -101,9 +103,9 @@ LIST(APPEND CORRESPONDING_AMD_FLAGS gfx90a gfx90a gfx908 gfx908)
|
|||||||
LIST(APPEND SUPPORTED_AMD_GPUS MI50/60 MI50/60)
|
LIST(APPEND SUPPORTED_AMD_GPUS MI50/60 MI50/60)
|
||||||
LIST(APPEND SUPPORTED_AMD_ARCHS VEGA906 AMD_GFX906)
|
LIST(APPEND SUPPORTED_AMD_ARCHS VEGA906 AMD_GFX906)
|
||||||
LIST(APPEND CORRESPONDING_AMD_FLAGS gfx906 gfx906)
|
LIST(APPEND CORRESPONDING_AMD_FLAGS gfx906 gfx906)
|
||||||
LIST(APPEND SUPPORTED_AMD_GPUS PHOENIX RX7900XTX V620/W6800 V620/W6800)
|
LIST(APPEND SUPPORTED_AMD_GPUS RX7900XTX RX7900XTX V620/W6800 V620/W6800)
|
||||||
LIST(APPEND SUPPORTED_AMD_ARCHS AMD_GFX1103 AMD_GFX1100 NAVI1030 AMD_GFX1030)
|
LIST(APPEND SUPPORTED_AMD_ARCHS NAVI1100 AMD_GFX1100 NAVI1030 AMD_GFX1030)
|
||||||
LIST(APPEND CORRESPONDING_AMD_FLAGS gfx1103 gfx1100 gfx1030 gfx1030)
|
LIST(APPEND CORRESPONDING_AMD_FLAGS gfx1100 gfx1100 gfx1030 gfx1030)
|
||||||
|
|
||||||
#FIXME CAN BE REPLACED WITH LIST_ZIP IN CMAKE 3.17
|
#FIXME CAN BE REPLACED WITH LIST_ZIP IN CMAKE 3.17
|
||||||
FOREACH(ARCH IN LISTS SUPPORTED_AMD_ARCHS)
|
FOREACH(ARCH IN LISTS SUPPORTED_AMD_ARCHS)
|
||||||
@ -189,12 +191,6 @@ IF (KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
|||||||
ELSEIF(CUDAToolkit_BIN_DIR)
|
ELSEIF(CUDAToolkit_BIN_DIR)
|
||||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS --cuda-path=${CUDAToolkit_BIN_DIR}/..)
|
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS --cuda-path=${CUDAToolkit_BIN_DIR}/..)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF (KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
|
||||||
SET(CUDA_ARCH_FLAG "-gpu")
|
|
||||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS -cuda)
|
|
||||||
IF (KOKKOS_ENABLE_CUDA) # FIXME ideally unreachable when CUDA not enabled
|
|
||||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS -cuda)
|
|
||||||
ENDIF()
|
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||||
SET(CUDA_ARCH_FLAG "-arch")
|
SET(CUDA_ARCH_FLAG "-arch")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
@ -209,6 +205,11 @@ ENDIF()
|
|||||||
|
|
||||||
|
|
||||||
#------------------------------- KOKKOS_HIP_OPTIONS ---------------------------
|
#------------------------------- KOKKOS_HIP_OPTIONS ---------------------------
|
||||||
|
KOKKOS_OPTION(IMPL_AMDGPU_FLAGS "" STRING "Set compiler flags for AMD GPUs")
|
||||||
|
KOKKOS_OPTION(IMPL_AMDGPU_LINK "" STRING "Set linker flags for AMD GPUs")
|
||||||
|
MARK_AS_ADVANCED(Kokkos_IMPL_AMDGPU_FLAGS)
|
||||||
|
MARK_AS_ADVANCED(Kokkos_IMPL_AMDGPU_LINK)
|
||||||
|
|
||||||
#clear anything that might be in the cache
|
#clear anything that might be in the cache
|
||||||
GLOBAL_SET(KOKKOS_AMDGPU_OPTIONS)
|
GLOBAL_SET(KOKKOS_AMDGPU_OPTIONS)
|
||||||
IF(KOKKOS_ENABLE_HIP)
|
IF(KOKKOS_ENABLE_HIP)
|
||||||
@ -301,6 +302,20 @@ IF (KOKKOS_ARCH_A64FX)
|
|||||||
)
|
)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
|
IF (KOKKOS_ARCH_ARMV9_GRACE)
|
||||||
|
SET(KOKKOS_ARCH_ARM_NEON ON)
|
||||||
|
check_cxx_compiler_flag("-mcpu=neoverse-n2" COMPILER_SUPPORTS_NEOVERSE_N2)
|
||||||
|
check_cxx_compiler_flag("-msve-vector-bits=128" COMPILER_SUPPORTS_SVE_VECTOR_BITS)
|
||||||
|
IF (COMPILER_SUPPORTS_NEOVERSE_N2 AND COMPILER_SUPPORTS_SVE_VECTOR_BITS)
|
||||||
|
COMPILER_SPECIFIC_FLAGS(
|
||||||
|
COMPILER_ID KOKKOS_CXX_HOST_COMPILER_ID
|
||||||
|
DEFAULT -mcpu=neoverse-n2 -msve-vector-bits=128
|
||||||
|
)
|
||||||
|
ELSE()
|
||||||
|
MESSAGE(WARNING "Compiler does not support ARMv9 Grace architecture")
|
||||||
|
ENDIF()
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
IF (KOKKOS_ARCH_ZEN)
|
IF (KOKKOS_ARCH_ZEN)
|
||||||
COMPILER_SPECIFIC_FLAGS(
|
COMPILER_SPECIFIC_FLAGS(
|
||||||
COMPILER_ID KOKKOS_CXX_HOST_COMPILER_ID
|
COMPILER_ID KOKKOS_CXX_HOST_COMPILER_ID
|
||||||
@ -535,17 +550,17 @@ IF (KOKKOS_CXX_HOST_COMPILER_ID STREQUAL NVHPC)
|
|||||||
SET(KOKKOS_ARCH_AVX512XEON OFF)
|
SET(KOKKOS_ARCH_AVX512XEON OFF)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
|
# FIXME_NVCC nvcc doesn't seem to support Arm Neon.
|
||||||
|
IF(KOKKOS_ARCH_ARM_NEON AND KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||||
|
UNSET(KOKKOS_ARCH_ARM_NEON)
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
IF (NOT KOKKOS_COMPILE_LANGUAGE STREQUAL CUDA)
|
IF (NOT KOKKOS_COMPILE_LANGUAGE STREQUAL CUDA)
|
||||||
IF (KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
|
IF (KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE)
|
||||||
COMPILER_SPECIFIC_FLAGS(
|
COMPILER_SPECIFIC_FLAGS(
|
||||||
Clang -fcuda-rdc
|
Clang -fcuda-rdc
|
||||||
NVIDIA --relocatable-device-code=true
|
NVIDIA --relocatable-device-code=true
|
||||||
NVHPC -gpu=rdc
|
|
||||||
)
|
)
|
||||||
ELSEIF(KOKKOS_ENABLE_CUDA)
|
|
||||||
COMPILER_SPECIFIC_FLAGS(
|
|
||||||
NVHPC -gpu=nordc
|
|
||||||
)
|
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
@ -571,7 +586,7 @@ IF (KOKKOS_ENABLE_HIP)
|
|||||||
COMPILER_SPECIFIC_FLAGS(
|
COMPILER_SPECIFIC_FLAGS(
|
||||||
DEFAULT -fgpu-rdc
|
DEFAULT -fgpu-rdc
|
||||||
)
|
)
|
||||||
IF (NOT KOKKOS_CXX_COMPILER_ID STREQUAL HIPCC)
|
IF (NOT KOKKOS_CXX_COMPILER_ID STREQUAL HIPCC AND NOT KOKKOS_IMPL_AMDGPU_FLAGS)
|
||||||
COMPILER_SPECIFIC_LINK_OPTIONS(
|
COMPILER_SPECIFIC_LINK_OPTIONS(
|
||||||
DEFAULT --hip-link
|
DEFAULT --hip-link
|
||||||
)
|
)
|
||||||
@ -654,15 +669,9 @@ FUNCTION(CHECK_CUDA_ARCH ARCH FLAG)
|
|||||||
IF(KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE)
|
IF(KOKKOS_ENABLE_COMPILE_AS_CMAKE_LANGUAGE)
|
||||||
SET(CMAKE_CUDA_ARCHITECTURES ${KOKKOS_CUDA_ARCHITECTURES} PARENT_SCOPE)
|
SET(CMAKE_CUDA_ARCHITECTURES ${KOKKOS_CUDA_ARCHITECTURES} PARENT_SCOPE)
|
||||||
ELSE()
|
ELSE()
|
||||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
||||||
STRING(REPLACE "sm_" "cc" NVHPC_CUDA_ARCH ${FLAG})
|
IF(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE OR KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "${CUDA_ARCH_FLAG}=${NVHPC_CUDA_ARCH}")
|
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
||||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${CUDA_ARCH_FLAG}=${NVHPC_CUDA_ARCH}")
|
|
||||||
ELSE()
|
|
||||||
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
|
||||||
IF(KOKKOS_ENABLE_CUDA_RELOCATABLE_DEVICE_CODE OR KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
|
||||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${CUDA_ARCH_FLAG}=${FLAG}")
|
|
||||||
ENDIF()
|
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
@ -704,14 +713,16 @@ FUNCTION(CHECK_AMDGPU_ARCH ARCH FLAG)
|
|||||||
MESSAGE(WARNING "Given AMD GPU architecture ${ARCH}, but Kokkos_ENABLE_HIP, Kokkos_ENABLE_SYCL, Kokkos_ENABLE_OPENACC, and Kokkos_ENABLE_OPENMPTARGET are OFF. Option will be ignored.")
|
MESSAGE(WARNING "Given AMD GPU architecture ${ARCH}, but Kokkos_ENABLE_HIP, Kokkos_ENABLE_SYCL, Kokkos_ENABLE_OPENACC, and Kokkos_ENABLE_OPENMPTARGET are OFF. Option will be ignored.")
|
||||||
UNSET(KOKKOS_ARCH_${ARCH} PARENT_SCOPE)
|
UNSET(KOKKOS_ARCH_${ARCH} PARENT_SCOPE)
|
||||||
ELSE()
|
ELSE()
|
||||||
IF(KOKKOS_ENABLE_HIP)
|
IF(KOKKOS_ENABLE_HIP)
|
||||||
SET(KOKKOS_HIP_ARCHITECTURES ${FLAG} PARENT_SCOPE)
|
SET(KOKKOS_HIP_ARCHITECTURES ${FLAG} PARENT_SCOPE)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
SET(KOKKOS_AMDGPU_ARCH_FLAG ${FLAG} PARENT_SCOPE)
|
IF(NOT KOKKOS_IMPL_AMDGPU_FLAGS)
|
||||||
GLOBAL_APPEND(KOKKOS_AMDGPU_OPTIONS "${AMDGPU_ARCH_FLAG}=${FLAG}")
|
SET(KOKKOS_AMDGPU_ARCH_FLAG ${FLAG} PARENT_SCOPE)
|
||||||
IF(KOKKOS_ENABLE_HIP_RELOCATABLE_DEVICE_CODE)
|
GLOBAL_APPEND(KOKKOS_AMDGPU_OPTIONS "${AMDGPU_ARCH_FLAG}=${FLAG}")
|
||||||
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${AMDGPU_ARCH_FLAG}=${FLAG}")
|
ENDIF()
|
||||||
ENDIF()
|
IF(KOKKOS_ENABLE_HIP_RELOCATABLE_DEVICE_CODE)
|
||||||
|
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${AMDGPU_ARCH_FLAG}=${FLAG}")
|
||||||
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDFUNCTION()
|
ENDFUNCTION()
|
||||||
@ -724,6 +735,15 @@ FOREACH(ARCH IN LISTS SUPPORTED_AMD_ARCHS)
|
|||||||
CHECK_AMDGPU_ARCH(${ARCH} ${FLAG})
|
CHECK_AMDGPU_ARCH(${ARCH} ${FLAG})
|
||||||
ENDFOREACH()
|
ENDFOREACH()
|
||||||
|
|
||||||
|
IF(KOKKOS_IMPL_AMDGPU_FLAGS)
|
||||||
|
IF (NOT AMDGPU_ARCH_ALREADY_SPECIFIED)
|
||||||
|
MESSAGE(FATAL_ERROR "When IMPL_AMDGPU_FLAGS is set the architecture autodectection is disabled. "
|
||||||
|
"Please explicitly set the GPU architecture.")
|
||||||
|
ENDIF()
|
||||||
|
GLOBAL_APPEND(KOKKOS_AMDGPU_OPTIONS "${KOKKOS_IMPL_AMDGPU_FLAGS}")
|
||||||
|
GLOBAL_APPEND(KOKKOS_LINK_OPTIONS "${KOKKOS_IMPL_AMDGPU_LINK}")
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
MACRO(SET_AND_CHECK_AMD_ARCH ARCH FLAG)
|
MACRO(SET_AND_CHECK_AMD_ARCH ARCH FLAG)
|
||||||
KOKKOS_SET_OPTION(ARCH_${ARCH} ON)
|
KOKKOS_SET_OPTION(ARCH_${ARCH} ON)
|
||||||
CHECK_AMDGPU_ARCH(${ARCH} ${FLAG})
|
CHECK_AMDGPU_ARCH(${ARCH} ${FLAG})
|
||||||
@ -984,7 +1004,7 @@ IF (KOKKOS_ARCH_HOPPER90)
|
|||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
#HIP detection of gpu arch
|
#HIP detection of gpu arch
|
||||||
IF(KOKKOS_ENABLE_HIP AND NOT AMDGPU_ARCH_ALREADY_SPECIFIED)
|
IF(KOKKOS_ENABLE_HIP AND NOT AMDGPU_ARCH_ALREADY_SPECIFIED AND NOT KOKKOS_IMPL_AMDGPU_FLAGS)
|
||||||
FIND_PROGRAM(ROCM_ENUMERATOR rocm_agent_enumerator)
|
FIND_PROGRAM(ROCM_ENUMERATOR rocm_agent_enumerator)
|
||||||
IF(NOT ROCM_ENUMERATOR)
|
IF(NOT ROCM_ENUMERATOR)
|
||||||
MESSAGE(FATAL_ERROR "Autodetection of AMD GPU architecture not possible as "
|
MESSAGE(FATAL_ERROR "Autodetection of AMD GPU architecture not possible as "
|
||||||
|
|||||||
@ -42,12 +42,8 @@ IF(Kokkos_ENABLE_CUDA)
|
|||||||
# If launcher was found and nvcc_wrapper was not specified as
|
# If launcher was found and nvcc_wrapper was not specified as
|
||||||
# compiler and `CMAKE_CXX_COMPILIER_LAUNCHER` is not set, set to use launcher.
|
# compiler and `CMAKE_CXX_COMPILIER_LAUNCHER` is not set, set to use launcher.
|
||||||
# Will ensure CMAKE_CXX_COMPILER is replaced by nvcc_wrapper
|
# Will ensure CMAKE_CXX_COMPILER is replaced by nvcc_wrapper
|
||||||
IF(Kokkos_COMPILE_LAUNCHER AND NOT INTERNAL_HAVE_COMPILER_NVCC AND NOT KOKKOS_CXX_COMPILER_ID STREQUAL Clang
|
IF(Kokkos_COMPILE_LAUNCHER AND NOT INTERNAL_HAVE_COMPILER_NVCC AND NOT KOKKOS_CXX_COMPILER_ID STREQUAL Clang)
|
||||||
AND NOT (Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER AND KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC))
|
|
||||||
IF(CMAKE_CXX_COMPILER_LAUNCHER)
|
IF(CMAKE_CXX_COMPILER_LAUNCHER)
|
||||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
|
||||||
MESSAGE(STATUS "Using nvc++ as device compiler requires Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON!")
|
|
||||||
ENDIF()
|
|
||||||
MESSAGE(FATAL_ERROR "Cannot use CMAKE_CXX_COMPILER_LAUNCHER if the CMAKE_CXX_COMPILER is not able to compile CUDA code, i.e. nvcc_wrapper or clang++!")
|
MESSAGE(FATAL_ERROR "Cannot use CMAKE_CXX_COMPILER_LAUNCHER if the CMAKE_CXX_COMPILER is not able to compile CUDA code, i.e. nvcc_wrapper or clang++!")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
# the first argument to launcher is always the C++ compiler defined by cmake
|
# the first argument to launcher is always the C++ compiler defined by cmake
|
||||||
@ -149,56 +145,85 @@ IF(KOKKOS_CXX_COMPILER_ID STREQUAL Fujitsu)
|
|||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
# Enforce the minimum compilers supported by Kokkos.
|
# Enforce the minimum compilers supported by Kokkos.
|
||||||
SET(KOKKOS_MESSAGE_TEXT "Compiler not supported by Kokkos. Required compiler versions:")
|
IF(NOT CMAKE_CXX_STANDARD)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(CPU) 8.0.0 or higher")
|
SET(CMAKE_CXX_STANDARD 17)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(CUDA) 10.0.0 or higher")
|
ENDIF()
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(OpenMPTarget) 15.0.0 or higher")
|
IF(CMAKE_CXX_STANDARD EQUAL 17)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n GCC 8.2.0 or higher")
|
SET(KOKKOS_CLANG_CPU_MINIMUM 8.0.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Intel 19.0.5 or higher")
|
SET(KOKKOS_CLANG_CUDA_MINIMUM 10.0.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n IntelLLVM(CPU) 2021.1.1 or higher")
|
SET(KOKKOS_CLANG_OPENMPTARGET_MINIMUM 15.0.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n IntelLLVM(SYCL) 2023.0.0 or higher")
|
SET(KOKKOS_GCC_MINIMUM 8.2.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n NVCC 11.0.0 or higher")
|
SET(KOKKOS_INTEL_MINIMUM 19.0.5)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n HIPCC 5.2.0 or higher")
|
SET(KOKKOS_INTEL_LLVM_CPU_MINIMUM 2021.1.1)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n NVHPC/PGI 22.3 or higher")
|
SET(KOKKOS_INTEL_LLVM_SYCL_MINIMUM 2023.0.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n MSVC 19.29 or higher")
|
SET(KOKKOS_NVCC_MINIMUM 11.0.0)
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n XL/XLClang not supported")
|
SET(KOKKOS_HIPCC_MINIMUM 5.2.0)
|
||||||
|
SET(KOKKOS_NVHPC_MINIMUM 22.3)
|
||||||
|
SET(KOKKOS_MSVC_MINIMUM 19.29)
|
||||||
|
ELSE()
|
||||||
|
SET(KOKKOS_CLANG_CPU_MINIMUM 14.0.0)
|
||||||
|
SET(KOKKOS_CLANG_CUDA_MINIMUM 14.0.0)
|
||||||
|
SET(KOKKOS_CLANG_OPENMPTARGET_MINIMUM 15.0.0)
|
||||||
|
SET(KOKKOS_GCC_MINIMUM 10.1.0)
|
||||||
|
SET(KOKKOS_INTEL_MINIMUM "not supported")
|
||||||
|
SET(KOKKOS_INTEL_LLVM_CPU_MINIMUM 2022.0.0)
|
||||||
|
SET(KOKKOS_INTEL_LLVM_SYCL_MINIMUM 2023.0.0)
|
||||||
|
SET(KOKKOS_NVCC_MINIMUM 12.0.0)
|
||||||
|
SET(KOKKOS_HIPCC_MINIMUM 5.2.0)
|
||||||
|
SET(KOKKOS_NVHPC_MINIMUM 22.3)
|
||||||
|
SET(KOKKOS_MSVC_MINIMUM 19.30)
|
||||||
|
ENDIF()
|
||||||
|
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "Compiler not supported by Kokkos for C++${CMAKE_CXX_STANDARD}. Required minimum compiler versions:")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(CPU) ${KOKKOS_CLANG_CPU_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(CUDA) ${KOKKOS_CLANG_CUDA_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Clang(OpenMPTarget) ${KOKKOS_CLANG_OPENMPTARGET_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n GCC ${KOKKOS_GCC_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n Intel ${KOKKOS_INTEL_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n IntelLLVM(CPU) ${KOKKOS_INTEL_LLVM_CPU_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n IntelLLVM(SYCL) ${KOKKOS_INTEL_LLVM_SYCL_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n NVCC ${KOKKOS_NVCC_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n HIPCC ${KOKKOS_HIPCC_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n NVHPC/PGI ${KOKKOS_NVHPC_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n MSVC ${KOKKOS_MSVC_MINIMUM}")
|
||||||
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\n XL/XLClang not supported")
|
||||||
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\nCompiler: ${KOKKOS_CXX_COMPILER_ID} ${KOKKOS_CXX_COMPILER_VERSION}\n")
|
SET(KOKKOS_MESSAGE_TEXT "${KOKKOS_MESSAGE_TEXT}\nCompiler: ${KOKKOS_CXX_COMPILER_ID} ${KOKKOS_CXX_COMPILER_VERSION}\n")
|
||||||
|
|
||||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND NOT Kokkos_ENABLE_CUDA)
|
IF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND NOT Kokkos_ENABLE_CUDA)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 8.0.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_CLANG_CPU_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND Kokkos_ENABLE_CUDA)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND Kokkos_ENABLE_CUDA)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 10.0.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_CLANG_CUDA_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL GNU)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL GNU)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 8.2.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_GCC_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Intel)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Intel)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 19.0.5)
|
IF((NOT CMAKE_CXX_STANDARD EQUAL 17) OR (KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_INTEL_MINIMUM}))
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL IntelLLVM AND NOT Kokkos_ENABLE_SYCL)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL IntelLLVM AND NOT Kokkos_ENABLE_SYCL)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 2021.1.1)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_INTEL_LLVM_CPU_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL IntelLLVM AND Kokkos_ENABLE_SYCL)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL IntelLLVM AND Kokkos_ENABLE_SYCL)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 2023.0.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_INTEL_LLVM_SYCL_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 11.0.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_NVCC_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
SET(CMAKE_CXX_EXTENSIONS OFF CACHE BOOL "Kokkos turns off CXX extensions" FORCE)
|
SET(CMAKE_CXX_EXTENSIONS OFF CACHE BOOL "Kokkos turns off CXX extensions" FORCE)
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL HIPCC)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL HIPCC)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 5.2.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_HIPCC_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL PGI OR KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL PGI OR KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 22.3)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_NVHPC_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
# Treat PGI internally as NVHPC to simplify handling both compilers.
|
# Treat PGI internally as NVHPC to simplify handling both compilers.
|
||||||
@ -206,13 +231,13 @@ ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL PGI OR KOKKOS_CXX_COMPILER_ID STREQUAL NV
|
|||||||
# backward-compatible to pgc++.
|
# backward-compatible to pgc++.
|
||||||
SET(KOKKOS_CXX_COMPILER_ID NVHPC CACHE STRING INTERNAL FORCE)
|
SET(KOKKOS_CXX_COMPILER_ID NVHPC CACHE STRING INTERNAL FORCE)
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL "MSVC")
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL "MSVC")
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 19.29)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS ${KOKKOS_MSVC_MINIMUM})
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL XL OR KOKKOS_CXX_COMPILER_ID STREQUAL XLClang)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL XL OR KOKKOS_CXX_COMPILER_ID STREQUAL XLClang)
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND Kokkos_ENABLE_OPENMPTARGET)
|
ELSEIF(KOKKOS_CXX_COMPILER_ID STREQUAL Clang AND Kokkos_ENABLE_OPENMPTARGET)
|
||||||
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS 15.0.0)
|
IF(KOKKOS_CXX_COMPILER_VERSION VERSION_LESS KOKKOS_CLANG_OPENMPTARGET_MINIMUM)
|
||||||
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
MESSAGE(FATAL_ERROR "${KOKKOS_MESSAGE_TEXT}")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|||||||
@ -75,8 +75,12 @@ KOKKOS_ENABLE_OPTION(IMPL_HIP_UNIFIED_MEMORY OFF "Whether to leverage unified me
|
|||||||
# This option will go away eventually, but allows fallback to old implementation when needed.
|
# This option will go away eventually, but allows fallback to old implementation when needed.
|
||||||
KOKKOS_ENABLE_OPTION(DESUL_ATOMICS_EXTERNAL OFF "Whether to use an external desul installation")
|
KOKKOS_ENABLE_OPTION(DESUL_ATOMICS_EXTERNAL OFF "Whether to use an external desul installation")
|
||||||
KOKKOS_ENABLE_OPTION(ATOMICS_BYPASS OFF "**NOT RECOMMENDED** Whether to make atomics non-atomic for non-threaded MPI-only use cases")
|
KOKKOS_ENABLE_OPTION(ATOMICS_BYPASS OFF "**NOT RECOMMENDED** Whether to make atomics non-atomic for non-threaded MPI-only use cases")
|
||||||
|
KOKKOS_ENABLE_OPTION(IMPL_REF_COUNT_BRANCH_UNLIKELY ON "Whether to use the C++20 `[[unlikely]]` attribute in the view reference counting")
|
||||||
|
mark_as_advanced(Kokkos_ENABLE_IMPL_REF_COUNT_BRANCH_UNLIKELY)
|
||||||
|
KOKKOS_ENABLE_OPTION(IMPL_VIEW_OF_VIEWS_DESTRUCTOR_PRECONDITION_VIOLATION_WORKAROUND OFF "Whether to enable a workaround for invalid use of View of Views that causes program hang on destruction.")
|
||||||
|
mark_as_advanced(Kokkos_ENABLE_IMPL_VIEW_OF_VIEWS_DESTRUCTOR_PRECONDITION_VIOLATION_WORKAROUND)
|
||||||
|
|
||||||
KOKKOS_ENABLE_OPTION(IMPL_MDSPAN OFF "Whether to enable experimental mdspan support")
|
KOKKOS_ENABLE_OPTION(IMPL_MDSPAN ON "Whether to enable experimental mdspan support")
|
||||||
KOKKOS_ENABLE_OPTION(MDSPAN_EXTERNAL OFF BOOL "Whether to use an external version of mdspan")
|
KOKKOS_ENABLE_OPTION(MDSPAN_EXTERNAL OFF BOOL "Whether to use an external version of mdspan")
|
||||||
KOKKOS_ENABLE_OPTION(IMPL_SKIP_COMPILER_MDSPAN ON BOOL "Whether to use an internal version of mdspan even if the compiler supports mdspan")
|
KOKKOS_ENABLE_OPTION(IMPL_SKIP_COMPILER_MDSPAN ON BOOL "Whether to use an internal version of mdspan even if the compiler supports mdspan")
|
||||||
mark_as_advanced(Kokkos_ENABLE_IMPL_MDSPAN)
|
mark_as_advanced(Kokkos_ENABLE_IMPL_MDSPAN)
|
||||||
|
|||||||
@ -709,7 +709,12 @@ MACRO(kokkos_find_imported NAME)
|
|||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
IF (NOT TPL_LIBRARY_SUFFIXES)
|
IF (NOT TPL_LIBRARY_SUFFIXES)
|
||||||
SET(TPL_LIBRARY_SUFFIXES lib lib64)
|
SET(TPL_LIBRARY_SUFFIXES lib)
|
||||||
|
IF(KOKKOS_IMPL_32BIT)
|
||||||
|
LIST(APPEND TPL_LIBRARY_SUFFIXES lib32)
|
||||||
|
ELSE()
|
||||||
|
LIST(APPEND TPL_LIBRARY_SUFFIXES lib64)
|
||||||
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
SET(${NAME}_INCLUDE_DIRS)
|
SET(${NAME}_INCLUDE_DIRS)
|
||||||
|
|||||||
@ -124,12 +124,8 @@ IF(KOKKOS_ENABLE_CUDA)
|
|||||||
ELSEIF(CMAKE_CXX_EXTENSIONS)
|
ELSEIF(CMAKE_CXX_EXTENSIONS)
|
||||||
MESSAGE(FATAL_ERROR "Compiling CUDA code with clang doesn't support C++ extensions. Set -DCMAKE_CXX_EXTENSIONS=OFF")
|
MESSAGE(FATAL_ERROR "Compiling CUDA code with clang doesn't support C++ extensions. Set -DCMAKE_CXX_EXTENSIONS=OFF")
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ELSEIF(NOT KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA AND NOT (Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER AND KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC))
|
ELSEIF(NOT KOKKOS_CXX_COMPILER_ID STREQUAL NVIDIA)
|
||||||
IF(KOKKOS_CXX_COMPILER_ID STREQUAL NVHPC)
|
MESSAGE(FATAL_ERROR "Invalid compiler for CUDA. The compiler must be nvcc_wrapper or Clang or use kokkos_launch_compiler, but compiler ID was ${KOKKOS_CXX_COMPILER_ID}")
|
||||||
MESSAGE(FATAL_ERROR "Invalid compiler for CUDA. To allow nvc++ as Cuda compiler, Kokkos_ENABLE_IMPL_NVHPC_AS_DEVICE_COMPILER=ON must be set!")
|
|
||||||
ELSE()
|
|
||||||
MESSAGE(FATAL_ERROR "Invalid compiler for CUDA. The compiler must be nvcc_wrapper or Clang or NVC++ or use kokkos_launch_compiler, but compiler ID was ${KOKKOS_CXX_COMPILER_ID}")
|
|
||||||
ENDIF()
|
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
|
|||||||
@ -103,13 +103,19 @@ if (Kokkos_ENABLE_IMPL_MDSPAN AND Kokkos_ENABLE_MDSPAN_EXTERNAL)
|
|||||||
endif()
|
endif()
|
||||||
|
|
||||||
IF (Kokkos_ENABLE_OPENMP)
|
IF (Kokkos_ENABLE_OPENMP)
|
||||||
find_package(OpenMP REQUIRED)
|
find_package(OpenMP REQUIRED COMPONENTS CXX)
|
||||||
# FIXME_TRILINOS Trilinos doesn't allow for Kokkos to use find_dependency
|
# FIXME_TRILINOS Trilinos doesn't allow for Kokkos to use find_dependency
|
||||||
# so we just append the flags here instead of linking with the OpenMP target.
|
# so we just append the flags here instead of linking with the OpenMP target.
|
||||||
IF(KOKKOS_HAS_TRILINOS)
|
IF(KOKKOS_HAS_TRILINOS)
|
||||||
COMPILER_SPECIFIC_FLAGS(DEFAULT ${OpenMP_CXX_FLAGS})
|
COMPILER_SPECIFIC_FLAGS(DEFAULT ${OpenMP_CXX_FLAGS})
|
||||||
ELSE()
|
ELSE()
|
||||||
KOKKOS_EXPORT_CMAKE_TPL(OpenMP REQUIRED)
|
KOKKOS_EXPORT_CMAKE_TPL(OpenMP REQUIRED COMPONENTS CXX)
|
||||||
|
ENDIF()
|
||||||
|
IF(Kokkos_ENABLE_HIP AND KOKKOS_COMPILE_LANGUAGE STREQUAL HIP)
|
||||||
|
GLOBAL_APPEND(KOKKOS_AMDGPU_OPTIONS ${OpenMP_CXX_FLAGS})
|
||||||
|
ENDIF()
|
||||||
|
IF(Kokkos_ENABLE_CUDA AND KOKKOS_COMPILE_LANGUAGE STREQUAL CUDA)
|
||||||
|
GLOBAL_APPEND(KOKKOS_CUDA_OPTIONS -Xcompiler ${OpenMP_CXX_FLAGS})
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
|
||||||
|
|||||||
@ -160,6 +160,12 @@ FUNCTION(KOKKOS_ADD_EXECUTABLE_AND_TEST ROOT_NAME)
|
|||||||
)
|
)
|
||||||
ENDIF()
|
ENDIF()
|
||||||
ENDIF()
|
ENDIF()
|
||||||
|
# We noticed problems with -fvisibility=hidden for inline static variables
|
||||||
|
# if Kokkos was built as shared library.
|
||||||
|
IF(BUILD_SHARED_LIBS)
|
||||||
|
SET_PROPERTY(TARGET ${PACKAGE_NAME}_${ROOT_NAME} PROPERTY VISIBILITY_INLINES_HIDDEN ON)
|
||||||
|
SET_PROPERTY(TARGET ${PACKAGE_NAME}_${ROOT_NAME} PROPERTY CXX_VISIBILITY_PRESET hidden)
|
||||||
|
ENDIF()
|
||||||
ENDFUNCTION()
|
ENDFUNCTION()
|
||||||
|
|
||||||
FUNCTION(KOKKOS_SET_EXE_PROPERTY ROOT_NAME)
|
FUNCTION(KOKKOS_SET_EXE_PROPERTY ROOT_NAME)
|
||||||
@ -241,34 +247,6 @@ MACRO(KOKKOS_CONFIGURE_CORE)
|
|||||||
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_FwdBackend.hpp "KOKKOS_FWD" "fwd/Kokkos_Fwd" "${KOKKOS_ENABLED_DEVICES}")
|
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_FwdBackend.hpp "KOKKOS_FWD" "fwd/Kokkos_Fwd" "${KOKKOS_ENABLED_DEVICES}")
|
||||||
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_SetupBackend.hpp "KOKKOS_SETUP" "setup/Kokkos_Setup" "${DEVICE_SETUP_LIST}")
|
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_SetupBackend.hpp "KOKKOS_SETUP" "setup/Kokkos_Setup" "${DEVICE_SETUP_LIST}")
|
||||||
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_DeclareBackend.hpp "KOKKOS_DECLARE" "decl/Kokkos_Declare" "${KOKKOS_ENABLED_DEVICES}")
|
KOKKOS_CONFIG_HEADER( KokkosCore_Config_HeaderSet.in KokkosCore_Config_DeclareBackend.hpp "KOKKOS_DECLARE" "decl/Kokkos_Declare" "${KOKKOS_ENABLED_DEVICES}")
|
||||||
SET(_DEFAULT_HOST_MEMSPACE "::Kokkos::HostSpace")
|
|
||||||
KOKKOS_OPTION(DEFAULT_DEVICE_MEMORY_SPACE "" STRING "Override default device memory space")
|
|
||||||
KOKKOS_OPTION(DEFAULT_HOST_MEMORY_SPACE "" STRING "Override default host memory space")
|
|
||||||
KOKKOS_OPTION(DEFAULT_DEVICE_EXECUTION_SPACE "" STRING "Override default device execution space")
|
|
||||||
KOKKOS_OPTION(DEFAULT_HOST_PARALLEL_EXECUTION_SPACE "" STRING "Override default host parallel execution space")
|
|
||||||
IF (NOT Kokkos_DEFAULT_DEVICE_EXECUTION_SPACE STREQUAL "")
|
|
||||||
SET(_DEVICE_PARALLEL ${Kokkos_DEFAULT_DEVICE_EXECUTION_SPACE})
|
|
||||||
MESSAGE(STATUS "Override default device execution space: ${_DEVICE_PARALLEL}")
|
|
||||||
SET(KOKKOS_DEVICE_SPACE_ACTIVE ON)
|
|
||||||
ELSE()
|
|
||||||
IF (_DEVICE_PARALLEL STREQUAL "NoTypeDefined")
|
|
||||||
SET(KOKKOS_DEVICE_SPACE_ACTIVE OFF)
|
|
||||||
ELSE()
|
|
||||||
SET(KOKKOS_DEVICE_SPACE_ACTIVE ON)
|
|
||||||
ENDIF()
|
|
||||||
ENDIF()
|
|
||||||
IF (NOT Kokkos_DEFAULT_HOST_PARALLEL_EXECUTION_SPACE STREQUAL "")
|
|
||||||
SET(_HOST_PARALLEL ${Kokkos_DEFAULT_HOST_PARALLEL_EXECUTION_SPACE})
|
|
||||||
MESSAGE(STATUS "Override default host parallel execution space: ${_HOST_PARALLEL}")
|
|
||||||
SET(KOKKOS_HOSTPARALLEL_SPACE_ACTIVE ON)
|
|
||||||
ELSE()
|
|
||||||
IF (_HOST_PARALLEL STREQUAL "NoTypeDefined")
|
|
||||||
SET(KOKKOS_HOSTPARALLEL_SPACE_ACTIVE OFF)
|
|
||||||
ELSE()
|
|
||||||
SET(KOKKOS_HOSTPARALLEL_SPACE_ACTIVE ON)
|
|
||||||
ENDIF()
|
|
||||||
ENDIF()
|
|
||||||
#We are ready to configure the header
|
|
||||||
CONFIGURE_FILE(cmake/KokkosCore_config.h.in KokkosCore_config.h @ONLY)
|
CONFIGURE_FILE(cmake/KokkosCore_config.h.in KokkosCore_config.h @ONLY)
|
||||||
ENDMACRO()
|
ENDMACRO()
|
||||||
|
|
||||||
@ -484,15 +462,10 @@ ENDFUNCTION()
|
|||||||
|
|
||||||
|
|
||||||
FUNCTION(KOKKOS_LIB_INCLUDE_DIRECTORIES TARGET)
|
FUNCTION(KOKKOS_LIB_INCLUDE_DIRECTORIES TARGET)
|
||||||
IF(KOKKOS_HAS_TRILINOS)
|
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
||||||
#ignore the target, tribits doesn't do anything directly with targets
|
FOREACH(DIR ${ARGN})
|
||||||
TRIBITS_INCLUDE_DIRECTORIES(${ARGN})
|
TARGET_INCLUDE_DIRECTORIES(${TARGET} ${INCTYPE} $<BUILD_INTERFACE:${DIR}>)
|
||||||
ELSE() #append to a list for later
|
ENDFOREACH()
|
||||||
KOKKOS_LIB_TYPE(${TARGET} INCTYPE)
|
|
||||||
FOREACH(DIR ${ARGN})
|
|
||||||
TARGET_INCLUDE_DIRECTORIES(${TARGET} ${INCTYPE} $<BUILD_INTERFACE:${DIR}>)
|
|
||||||
ENDFOREACH()
|
|
||||||
ENDIF()
|
|
||||||
ENDFUNCTION()
|
ENDFUNCTION()
|
||||||
|
|
||||||
FUNCTION(KOKKOS_LIB_COMPILE_OPTIONS TARGET)
|
FUNCTION(KOKKOS_LIB_COMPILE_OPTIONS TARGET)
|
||||||
|
|||||||
@ -1,26 +0,0 @@
|
|||||||
#@HEADER
|
|
||||||
# ************************************************************************
|
|
||||||
#
|
|
||||||
# Kokkos v. 4.0
|
|
||||||
# Copyright (2022) National Technology & Engineering
|
|
||||||
# Solutions of Sandia, LLC (NTESS).
|
|
||||||
#
|
|
||||||
# Under the terms of Contract DE-NA0003525 with NTESS,
|
|
||||||
# the U.S. Government retains certain rights in this software.
|
|
||||||
#
|
|
||||||
# Part of Kokkos, under the Apache License v2.0 with LLVM Exceptions.
|
|
||||||
#
|
|
||||||
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
|
|
||||||
#
|
|
||||||
#@HEADER
|
|
||||||
|
|
||||||
# Check for CUDA support
|
|
||||||
|
|
||||||
IF (NOT TPL_ENABLE_CUDA)
|
|
||||||
MESSAGE(FATAL_ERROR "\nCUSPARSE requires CUDA")
|
|
||||||
ELSE()
|
|
||||||
GLOBAL_SET(TPL_CUSPARSE_LIBRARY_DIRS)
|
|
||||||
GLOBAL_SET(TPL_CUSPARSE_INCLUDE_DIRS ${TPL_CUDA_INCLUDE_DIRS})
|
|
||||||
GLOBAL_SET(TPL_CUSPARSE_LIBRARIES ${CUDA_cusparse_LIBRARY})
|
|
||||||
ENDIF()
|
|
||||||
|
|
||||||
@ -944,13 +944,13 @@ class DualView : public ViewTraits<DataType, Properties...> {
|
|||||||
|
|
||||||
if (sizeMismatch) {
|
if (sizeMismatch) {
|
||||||
::Kokkos::realloc(arg_prop, d_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
::Kokkos::realloc(arg_prop, d_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
||||||
if (alloc_prop_input::initialize) {
|
if constexpr (alloc_prop_input::initialize) {
|
||||||
h_view = create_mirror_view(typename t_host::memory_space(), d_view);
|
h_view = create_mirror_view(typename t_host::memory_space(), d_view);
|
||||||
} else {
|
} else {
|
||||||
h_view = create_mirror_view(Kokkos::WithoutInitializing,
|
h_view = create_mirror_view(Kokkos::WithoutInitializing,
|
||||||
typename t_host::memory_space(), d_view);
|
typename t_host::memory_space(), d_view);
|
||||||
}
|
}
|
||||||
} else if (alloc_prop_input::initialize) {
|
} else if constexpr (alloc_prop_input::initialize) {
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
const auto& exec_space =
|
const auto& exec_space =
|
||||||
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
||||||
@ -1038,12 +1038,10 @@ class DualView : public ViewTraits<DataType, Properties...> {
|
|||||||
/* Resize on Device */
|
/* Resize on Device */
|
||||||
if (sizeMismatch) {
|
if (sizeMismatch) {
|
||||||
::Kokkos::resize(properties, d_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
::Kokkos::resize(properties, d_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
||||||
if (alloc_prop_input::initialize) {
|
// this part of the lambda was relocated in a method as it contains a
|
||||||
h_view = create_mirror_view(typename t_host::memory_space(), d_view);
|
// `if constexpr`. In some cases, both branches were evaluated
|
||||||
} else {
|
// leading to a compile error
|
||||||
h_view = create_mirror_view(Kokkos::WithoutInitializing,
|
resync_host(properties);
|
||||||
typename t_host::memory_space(), d_view);
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Mark Device copy as modified */
|
/* Mark Device copy as modified */
|
||||||
++modified_flags(1);
|
++modified_flags(1);
|
||||||
@ -1054,13 +1052,10 @@ class DualView : public ViewTraits<DataType, Properties...> {
|
|||||||
/* Resize on Host */
|
/* Resize on Host */
|
||||||
if (sizeMismatch) {
|
if (sizeMismatch) {
|
||||||
::Kokkos::resize(properties, h_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
::Kokkos::resize(properties, h_view, n0, n1, n2, n3, n4, n5, n6, n7);
|
||||||
if (alloc_prop_input::initialize) {
|
// this part of the lambda was relocated in a method as it contains a
|
||||||
d_view = create_mirror_view(typename t_dev::memory_space(), h_view);
|
// `if constexpr`. In some cases, both branches were evaluated
|
||||||
|
// leading to a compile error
|
||||||
} else {
|
resync_device(properties);
|
||||||
d_view = create_mirror_view(Kokkos::WithoutInitializing,
|
|
||||||
typename t_dev::memory_space(), h_view);
|
|
||||||
}
|
|
||||||
|
|
||||||
/* Mark Host copy as modified */
|
/* Mark Host copy as modified */
|
||||||
++modified_flags(0);
|
++modified_flags(0);
|
||||||
@ -1099,6 +1094,39 @@ class DualView : public ViewTraits<DataType, Properties...> {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
private:
|
||||||
|
// resync host mirror from device
|
||||||
|
// this code was relocated from a lambda as it contains a `if constexpr`.
|
||||||
|
// In some cases, both branches were evaluated, leading to a compile error
|
||||||
|
template <class... ViewCtorArgs>
|
||||||
|
inline void resync_host(Impl::ViewCtorProp<ViewCtorArgs...> const&) {
|
||||||
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
|
|
||||||
|
if constexpr (alloc_prop_input::initialize) {
|
||||||
|
h_view = create_mirror_view(typename t_host::memory_space(), d_view);
|
||||||
|
} else {
|
||||||
|
h_view = create_mirror_view(Kokkos::WithoutInitializing,
|
||||||
|
typename t_host::memory_space(), d_view);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// resync device mirror from host
|
||||||
|
// this code was relocated from a lambda as it contains a `if constexpr`
|
||||||
|
// In some cases, both branches were evaluated leading to a compile error
|
||||||
|
template <class... ViewCtorArgs>
|
||||||
|
inline void resync_device(Impl::ViewCtorProp<ViewCtorArgs...> const&) {
|
||||||
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
|
|
||||||
|
if constexpr (alloc_prop_input::initialize) {
|
||||||
|
d_view = create_mirror_view(typename t_dev::memory_space(), h_view);
|
||||||
|
|
||||||
|
} else {
|
||||||
|
d_view = create_mirror_view(Kokkos::WithoutInitializing,
|
||||||
|
typename t_dev::memory_space(), h_view);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
public:
|
||||||
void resize(const size_t n0 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
void resize(const size_t n0 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||||
const size_t n1 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
const size_t n1 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||||
const size_t n2 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
const size_t n2 = KOKKOS_IMPL_CTOR_DEFAULT_ARG,
|
||||||
|
|||||||
@ -1657,8 +1657,7 @@ KOKKOS_FUNCTION auto as_view_of_rank_n(
|
|||||||
|
|
||||||
if constexpr (std::is_same_v<decltype(layout), Kokkos::LayoutLeft> ||
|
if constexpr (std::is_same_v<decltype(layout), Kokkos::LayoutLeft> ||
|
||||||
std::is_same_v<decltype(layout), Kokkos::LayoutRight> ||
|
std::is_same_v<decltype(layout), Kokkos::LayoutRight> ||
|
||||||
std::is_same_v<decltype(layout), Kokkos::LayoutStride> ||
|
std::is_same_v<decltype(layout), Kokkos::LayoutStride>) {
|
||||||
is_layouttiled<decltype(layout)>::value) {
|
|
||||||
for (int i = N; i < 7; ++i)
|
for (int i = N; i < 7; ++i)
|
||||||
layout.dimension[i] = KOKKOS_IMPL_CTOR_DEFAULT_ARG;
|
layout.dimension[i] = KOKKOS_IMPL_CTOR_DEFAULT_ARG;
|
||||||
}
|
}
|
||||||
@ -1933,254 +1932,155 @@ struct MirrorDRVType {
|
|||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
|
// create a mirror
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline typename DynRankView<T, P...>::HostMirror create_mirror(
|
inline auto create_mirror(const DynRankView<T, P...>& src,
|
||||||
const DynRankView<T, P...>& src,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
std::enable_if_t<!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* =
|
|
||||||
nullptr) {
|
|
||||||
using src_type = DynRankView<T, P...>;
|
|
||||||
using dst_type = typename src_type::HostMirror;
|
|
||||||
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
|
||||||
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::has_label,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror "
|
|
||||||
"must not include a label!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
auto prop_copy = Impl::with_properties_if_unset(
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
arg_prop, std::string(src.label()).append("_mirror"));
|
||||||
|
|
||||||
return dst_type(prop_copy, Impl::reconstructLayout(src.layout(), src.rank()));
|
if constexpr (Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
}
|
using dst_type = typename Impl::MirrorDRVType<
|
||||||
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
|
P...>::view_type;
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
return dst_type(prop_copy,
|
||||||
inline auto create_mirror(
|
Impl::reconstructLayout(src.layout(), src.rank()));
|
||||||
const DynRankView<T, P...>& src,
|
} else {
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
using src_type = DynRankView<T, P...>;
|
||||||
std::enable_if_t<Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* =
|
using dst_type = typename src_type::HostMirror;
|
||||||
nullptr) {
|
|
||||||
using dst_type = typename Impl::MirrorDRVType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::view_type;
|
|
||||||
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
return dst_type(prop_copy,
|
||||||
|
Impl::reconstructLayout(src.layout(), src.rank()));
|
||||||
static_assert(
|
}
|
||||||
!alloc_prop_input::has_label,
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror "
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
"must not include a label!");
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
static_assert(
|
__builtin_unreachable();
|
||||||
!alloc_prop_input::has_pointer,
|
#endif
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
|
||||||
|
|
||||||
return dst_type(prop_copy, Impl::reconstructLayout(src.layout(), src.rank()));
|
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P,
|
||||||
inline typename DynRankView<T, P...>::HostMirror create_mirror(
|
class Enable = std::enable_if_t<
|
||||||
const DynRankView<T, P...>& src,
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
std::enable_if_t<std::is_same<typename ViewTraits<T, P...>::specialize,
|
inline auto create_mirror(const DynRankView<T, P...>& src) {
|
||||||
void>::value>* = nullptr) {
|
return Impl::create_mirror(src, Kokkos::view_alloc());
|
||||||
return Impl::create_mirror(src, Kokkos::Impl::ViewCtorProp<>{});
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface that accepts a without initializing flag
|
||||||
inline typename DynRankView<T, P...>::HostMirror create_mirror(
|
template <class T, class... P,
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, const DynRankView<T, P...>& src,
|
class Enable = std::enable_if_t<
|
||||||
std::enable_if_t<std::is_same<typename ViewTraits<T, P...>::specialize,
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
void>::value>* = nullptr) {
|
inline auto create_mirror(Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
|
const DynRankView<T, P...>& src) {
|
||||||
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
// public interface that accepts a space
|
||||||
inline typename DynRankView<T, P...>::HostMirror create_mirror(
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
|
||||||
const DynRankView<T, P...>& src,
|
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* = nullptr) {
|
|
||||||
return Impl::create_mirror(src, arg_prop);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Create a mirror in a new space
|
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<
|
class Enable = std::enable_if_t<
|
||||||
Kokkos::is_space<Space>::value &&
|
Kokkos::is_space<Space>::value &&
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value>>
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
typename Impl::MirrorDRVType<Space, T, P...>::view_type create_mirror(
|
auto create_mirror(const Space&, const Kokkos::DynRankView<T, P...>& src) {
|
||||||
const Space&, const Kokkos::DynRankView<T, P...>& src) {
|
|
||||||
return Impl::create_mirror(
|
return Impl::create_mirror(
|
||||||
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
// public interface that accepts a space and a without initializing flag
|
||||||
typename Impl::MirrorDRVType<Space, T, P...>::view_type create_mirror(
|
template <class Space, class T, class... P,
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
class Enable = std::enable_if_t<
|
||||||
const Kokkos::DynRankView<T, P...>& src,
|
Kokkos::is_space<Space>::value &&
|
||||||
std::enable_if_t<std::is_same<typename ViewTraits<T, P...>::specialize,
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
void>::value>* = nullptr) {
|
auto create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
||||||
|
const Kokkos::DynRankView<T, P...>& src) {
|
||||||
return Impl::create_mirror(
|
return Impl::create_mirror(
|
||||||
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
inline auto create_mirror(
|
// view_alloc
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
template <class T, class... P, class... ViewCtorArgs,
|
||||||
const DynRankView<T, P...>& src,
|
typename Enable = std::enable_if_t<
|
||||||
std::enable_if_t<
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
inline auto create_mirror(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* = nullptr) {
|
const DynRankView<T, P...>& src) {
|
||||||
using ReturnType = typename Impl::MirrorDRVType<
|
return Impl::create_mirror(src, arg_prop);
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::view_type;
|
|
||||||
return ReturnType{Impl::create_mirror(src, arg_prop)};
|
|
||||||
}
|
}
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
|
||||||
std::is_same<
|
|
||||||
typename DynRankView<T, P...>::memory_space,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename DynRankView<T, P...>::data_type,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::data_type>::value,
|
|
||||||
typename DynRankView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const DynRankView<T, P...>& src,
|
|
||||||
const typename Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
|
// create a mirror view
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline std::enable_if_t<
|
inline auto create_mirror_view(
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
|
||||||
!(std::is_same<
|
|
||||||
typename DynRankView<T, P...>::memory_space,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename DynRankView<T, P...>::data_type,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename DynRankView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(
|
|
||||||
const DynRankView<T, P...>& src,
|
const DynRankView<T, P...>& src,
|
||||||
const typename Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
[[maybe_unused]] const typename Impl::ViewCtorProp<ViewCtorArgs...>&
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
arg_prop) {
|
||||||
|
if constexpr (!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
|
if constexpr (std::is_same<typename DynRankView<T, P...>::memory_space,
|
||||||
|
typename DynRankView<
|
||||||
|
T, P...>::HostMirror::memory_space>::value &&
|
||||||
|
std::is_same<typename DynRankView<T, P...>::data_type,
|
||||||
|
typename DynRankView<
|
||||||
|
T, P...>::HostMirror::data_type>::value) {
|
||||||
|
return typename DynRankView<T, P...>::HostMirror(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if constexpr (Impl::MirrorDRViewType<typename Impl::ViewCtorProp<
|
||||||
|
ViewCtorArgs...>::memory_space,
|
||||||
|
T, P...>::is_same_memspace) {
|
||||||
|
return typename Impl::MirrorDRViewType<
|
||||||
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
|
P...>::view_type(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
Kokkos::is_space<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space>::value &&
|
|
||||||
Impl::MirrorDRViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorDRViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::DynRankView<T, P...>& src,
|
|
||||||
const typename Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
Kokkos::is_space<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space>::value &&
|
|
||||||
!Impl::MirrorDRViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorDRViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::view_type>
|
|
||||||
create_mirror_view(
|
|
||||||
const Kokkos::DynRankView<T, P...>& src,
|
|
||||||
const typename Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
|
||||||
}
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror view in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline std::enable_if_t<
|
inline auto create_mirror_view(const Kokkos::DynRankView<T, P...>& src) {
|
||||||
(std::is_same<
|
return Impl::create_mirror_view(src, Kokkos::view_alloc());
|
||||||
typename DynRankView<T, P...>::memory_space,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<typename DynRankView<T, P...>::data_type,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename DynRankView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::DynRankView<T, P...>& src) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
!(std::is_same<
|
|
||||||
typename DynRankView<T, P...>::memory_space,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename DynRankView<T, P...>::data_type,
|
|
||||||
typename DynRankView<T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename DynRankView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::DynRankView<T, P...>& src) {
|
|
||||||
return Kokkos::create_mirror(src);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a without initializing flag
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
inline auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
const DynRankView<T, P...>& src) {
|
const DynRankView<T, P...>& src) {
|
||||||
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror view in a new space
|
// public interface that accepts a space
|
||||||
// FIXME_C++17 Improve SFINAE here.
|
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline typename Impl::MirrorDRViewType<Space, T, P...>::view_type
|
inline auto create_mirror_view(const Space&,
|
||||||
create_mirror_view(
|
const Kokkos::DynRankView<T, P...>& src) {
|
||||||
const Space&, const Kokkos::DynRankView<T, P...>& src,
|
return Impl::create_mirror_view(
|
||||||
std::enable_if_t<
|
src, Kokkos::view_alloc(typename Space::memory_space()));
|
||||||
Impl::MirrorDRViewType<Space, T, P...>::is_same_memspace>* = nullptr) {
|
|
||||||
return src;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// FIXME_C++17 Improve SFINAE here.
|
// public interface that accepts a space and a without initializing flag
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline typename Impl::MirrorDRViewType<Space, T, P...>::view_type
|
|
||||||
create_mirror_view(
|
|
||||||
const Space& space, const Kokkos::DynRankView<T, P...>& src,
|
|
||||||
std::enable_if_t<
|
|
||||||
!Impl::MirrorDRViewType<Space, T, P...>::is_same_memspace>* = nullptr) {
|
|
||||||
return Kokkos::create_mirror(space, src);
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
|
||||||
inline auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
inline auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
const Space&,
|
const Space&,
|
||||||
const Kokkos::DynRankView<T, P...>& src) {
|
const Kokkos::DynRankView<T, P...>& src) {
|
||||||
@ -2188,6 +2088,8 @@ inline auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
|||||||
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const typename Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
const typename Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
@ -2195,75 +2097,51 @@ inline auto create_mirror_view(
|
|||||||
return Impl::create_mirror_view(src, arg_prop);
|
return Impl::create_mirror_view(src, arg_prop);
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
// create a mirror view and deep copy it
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class... ViewCtorArgs, class T, class... P,
|
||||||
|
class Enable = std::enable_if_t<
|
||||||
|
std::is_void<typename ViewTraits<T, P...>::specialize>::value>>
|
||||||
auto create_mirror_view_and_copy(
|
auto create_mirror_view_and_copy(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&,
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::DynRankView<T, P...>& src,
|
const Kokkos::DynRankView<T, P...>& src) {
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
Impl::MirrorDRViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
static_assert(
|
|
||||||
alloc_prop_input::has_memory_space,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must include a memory space!");
|
|
||||||
static_assert(!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
// same behavior as deep_copy(src, src)
|
Impl::check_view_ctor_args_create_mirror_view_and_copy<ViewCtorArgs...>();
|
||||||
if (!alloc_prop_input::has_execution_space)
|
|
||||||
fence(
|
|
||||||
"Kokkos::create_mirror_view_and_copy: fence before returning src view");
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
if constexpr (Impl::MirrorDRViewType<
|
||||||
auto create_mirror_view_and_copy(
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
T, P...>::is_same_memspace) {
|
||||||
const Kokkos::DynRankView<T, P...>& src,
|
// same behavior as deep_copy(src, src)
|
||||||
std::enable_if_t<
|
if constexpr (!alloc_prop_input::has_execution_space)
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
fence(
|
||||||
!Impl::MirrorDRViewType<
|
"Kokkos::create_mirror_view_and_copy: fence before returning src "
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
"view");
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
return src;
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
} else {
|
||||||
static_assert(
|
using Space = typename alloc_prop_input::memory_space;
|
||||||
alloc_prop_input::has_memory_space,
|
using Mirror = typename Impl::MirrorDRViewType<Space, T, P...>::view_type;
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must include a memory space!");
|
|
||||||
static_assert(!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
using Space = typename alloc_prop_input::memory_space;
|
|
||||||
using Mirror = typename Impl::MirrorDRViewType<Space, T, P...>::view_type;
|
|
||||||
|
|
||||||
auto arg_prop_copy = Impl::with_properties_if_unset(
|
auto arg_prop_copy = Impl::with_properties_if_unset(
|
||||||
arg_prop, std::string{}, WithoutInitializing,
|
arg_prop, std::string{}, WithoutInitializing,
|
||||||
typename Space::execution_space{});
|
typename Space::execution_space{});
|
||||||
|
|
||||||
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
||||||
if (label.empty()) label = src.label();
|
if (label.empty()) label = src.label();
|
||||||
auto mirror = typename Mirror::non_const_type{
|
auto mirror = typename Mirror::non_const_type{
|
||||||
arg_prop_copy, Impl::reconstructLayout(src.layout(), src.rank())};
|
arg_prop_copy, Impl::reconstructLayout(src.layout(), src.rank())};
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
||||||
mirror, src);
|
mirror, src);
|
||||||
} else
|
} else
|
||||||
deep_copy(mirror, src);
|
deep_copy(mirror, src);
|
||||||
return mirror;
|
return mirror;
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC)
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
template <class Space, class T, class... P>
|
||||||
|
|||||||
@ -590,96 +590,81 @@ struct MirrorDynamicViewType {
|
|||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline auto create_mirror(
|
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
|
||||||
std::enable_if_t<!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* =
|
|
||||||
nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
|
||||||
|
|
||||||
static_assert(
|
// create a mirror
|
||||||
!alloc_prop_input::has_label,
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror "
|
// view_alloc
|
||||||
"must not include a label!");
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
static_assert(
|
inline auto create_mirror(const Kokkos::Experimental::DynamicView<T, P...>& src,
|
||||||
!alloc_prop_input::has_pointer,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
"not include a pointer!");
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
auto prop_copy = Impl::with_properties_if_unset(
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
arg_prop, std::string(src.label()).append("_mirror"));
|
||||||
|
|
||||||
auto ret = typename Kokkos::Experimental::DynamicView<T, P...>::HostMirror(
|
if constexpr (Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
prop_copy, src.chunk_size(), src.chunk_max() * src.chunk_size());
|
using MemorySpace = typename alloc_prop_input::memory_space;
|
||||||
|
|
||||||
ret.resize_serial(src.extent(0));
|
auto ret = typename Kokkos::Impl::MirrorDynamicViewType<
|
||||||
|
MemorySpace, T, P...>::view_type(prop_copy, src.chunk_size(),
|
||||||
|
src.chunk_max() * src.chunk_size());
|
||||||
|
|
||||||
return ret;
|
ret.resize_serial(src.extent(0));
|
||||||
|
|
||||||
|
return ret;
|
||||||
|
} else {
|
||||||
|
auto ret = typename Kokkos::Experimental::DynamicView<T, P...>::HostMirror(
|
||||||
|
prop_copy, src.chunk_size(), src.chunk_max() * src.chunk_size());
|
||||||
|
|
||||||
|
ret.resize_serial(src.extent(0));
|
||||||
|
|
||||||
|
return ret;
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline auto create_mirror(
|
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
|
||||||
std::enable_if_t<Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>* =
|
|
||||||
nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
|
||||||
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::has_label,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror "
|
|
||||||
"must not include a label!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
using MemorySpace = typename alloc_prop_input::memory_space;
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
|
||||||
|
|
||||||
auto ret = typename Kokkos::Impl::MirrorDynamicViewType<
|
|
||||||
MemorySpace, T, P...>::view_type(prop_copy, src.chunk_size(),
|
|
||||||
src.chunk_max() * src.chunk_size());
|
|
||||||
|
|
||||||
ret.resize_serial(src.extent(0));
|
|
||||||
|
|
||||||
return ret;
|
|
||||||
}
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
return Impl::create_mirror(src, Impl::ViewCtorProp<>{});
|
return Impl::create_mirror(src, Impl::ViewCtorProp<>{});
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface that accepts a without initializing flag
|
||||||
|
template <class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi,
|
Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror in a new space
|
// public interface that accepts a space
|
||||||
template <class Space, class T, class... P>
|
template <class Space, class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
Kokkos::is_space<Space>::value &&
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
return Impl::create_mirror(
|
return Impl::create_mirror(
|
||||||
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
// public interface that accepts a space and a without initializing flag
|
||||||
|
template <class Space, class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
Kokkos::is_space<Space>::value &&
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
typename Kokkos::Impl::MirrorDynamicViewType<Space, T, P...>::view_type
|
typename Kokkos::Impl::MirrorDynamicViewType<Space, T, P...>::view_type
|
||||||
create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
@ -687,7 +672,11 @@ create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
|||||||
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class T, class... P, class... ViewCtorArgs,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
@ -696,76 +685,56 @@ inline auto create_mirror(
|
|||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
|
// create a mirror view
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline std::enable_if_t<
|
inline auto create_mirror_view(
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
const Kokkos::Experimental::DynamicView<T, P...>& src,
|
||||||
(std::is_same<
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::memory_space,
|
if constexpr (!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
typename Kokkos::Experimental::DynamicView<
|
if constexpr (std::is_same<typename Kokkos::Experimental::DynamicView<
|
||||||
T, P...>::HostMirror::memory_space>::value &&
|
T, P...>::memory_space,
|
||||||
std::is_same<
|
typename Kokkos::Experimental::DynamicView<
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::data_type,
|
T, P...>::HostMirror::memory_space>::value &&
|
||||||
typename Kokkos::Experimental::DynamicView<
|
std::is_same<typename Kokkos::Experimental::DynamicView<
|
||||||
T, P...>::HostMirror::data_type>::value),
|
T, P...>::data_type,
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::HostMirror>
|
typename Kokkos::Experimental::DynamicView<
|
||||||
create_mirror_view(const Kokkos::Experimental::DynamicView<T, P...>& src,
|
T, P...>::HostMirror::data_type>::value) {
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
return
|
||||||
return src;
|
typename Kokkos::Experimental::DynamicView<T, P...>::HostMirror(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if constexpr (Impl::MirrorDynamicViewType<
|
||||||
|
typename Impl::ViewCtorProp<
|
||||||
|
ViewCtorArgs...>::memory_space,
|
||||||
|
T, P...>::is_same_memspace) {
|
||||||
|
return typename Impl::MirrorDynamicViewType<
|
||||||
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
|
P...>::view_type(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
|
||||||
!(std::is_same<
|
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::memory_space,
|
|
||||||
typename Kokkos::Experimental::DynamicView<
|
|
||||||
T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::data_type,
|
|
||||||
typename Kokkos::Experimental::DynamicView<
|
|
||||||
T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename Kokkos::Experimental::DynamicView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::create_mirror(arg_prop, src);
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<!Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
|
||||||
}
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror view in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const typename Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const typename Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
return Impl::create_mirror_view(src, Impl::ViewCtorProp<>{});
|
return Impl::create_mirror_view(src, Impl::ViewCtorProp<>{});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a without initializing flag
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi,
|
Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
@ -773,15 +742,18 @@ inline auto create_mirror_view(
|
|||||||
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror in a new space
|
// public interface that accepts a space
|
||||||
template <class Space, class T, class... P>
|
template <class Space, class T, class... P,
|
||||||
|
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
return Impl::create_mirror_view(src,
|
return Impl::create_mirror_view(src,
|
||||||
view_alloc(typename Space::memory_space{}));
|
view_alloc(typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
// public interface that accepts a space and a without initializing flag
|
||||||
|
template <class Space, class T, class... P,
|
||||||
|
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
@ -789,6 +761,8 @@ inline auto create_mirror_view(
|
|||||||
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
src, Kokkos::view_alloc(wi, typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
@ -985,80 +959,57 @@ struct ViewCopy<Kokkos::Experimental::DynamicView<DP...>,
|
|||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
// create a mirror view and deep copy it
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class... ViewCtorArgs, class T, class... P,
|
||||||
|
class Enable = std::enable_if_t<
|
||||||
|
std::is_void<typename ViewTraits<T, P...>::specialize>::value>>
|
||||||
auto create_mirror_view_and_copy(
|
auto create_mirror_view_and_copy(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&,
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src,
|
const Kokkos::Experimental::DynamicView<T, P...>& src) {
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
static_assert(
|
|
||||||
alloc_prop_input::has_memory_space,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must include a memory space!");
|
|
||||||
static_assert(!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
// same behavior as deep_copy(src, src)
|
Impl::check_view_ctor_args_create_mirror_view_and_copy<ViewCtorArgs...>();
|
||||||
if (!alloc_prop_input::has_execution_space)
|
|
||||||
fence(
|
if constexpr (Impl::MirrorDynamicViewType<
|
||||||
"Kokkos::create_mirror_view_and_copy: fence before returning src view");
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
||||||
return src;
|
T, P...>::is_same_memspace) {
|
||||||
|
// same behavior as deep_copy(src, src)
|
||||||
|
if constexpr (!alloc_prop_input::has_execution_space)
|
||||||
|
fence(
|
||||||
|
"Kokkos::create_mirror_view_and_copy: fence before returning src "
|
||||||
|
"view");
|
||||||
|
return src;
|
||||||
|
} else {
|
||||||
|
using Space = typename alloc_prop_input::memory_space;
|
||||||
|
using Mirror =
|
||||||
|
typename Impl::MirrorDynamicViewType<Space, T, P...>::view_type;
|
||||||
|
|
||||||
|
auto arg_prop_copy = Impl::with_properties_if_unset(
|
||||||
|
arg_prop, std::string{}, WithoutInitializing,
|
||||||
|
typename Space::execution_space{});
|
||||||
|
|
||||||
|
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
||||||
|
if (label.empty()) label = src.label();
|
||||||
|
auto mirror = typename Mirror::non_const_type(
|
||||||
|
arg_prop_copy, src.chunk_size(), src.chunk_max() * src.chunk_size());
|
||||||
|
mirror.resize_serial(src.extent(0));
|
||||||
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
|
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
||||||
|
mirror, src);
|
||||||
|
} else
|
||||||
|
deep_copy(mirror, src);
|
||||||
|
return mirror;
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC)
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
template <class Space, class T, class... P,
|
||||||
auto create_mirror_view_and_copy(
|
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
|
||||||
const Kokkos::Experimental::DynamicView<T, P...>& src,
|
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
!Impl::MirrorDynamicViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
|
||||||
static_assert(
|
|
||||||
alloc_prop_input::has_memory_space,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must include a memory space!");
|
|
||||||
static_assert(!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
using Space = typename alloc_prop_input::memory_space;
|
|
||||||
using Mirror =
|
|
||||||
typename Impl::MirrorDynamicViewType<Space, T, P...>::view_type;
|
|
||||||
|
|
||||||
auto arg_prop_copy = Impl::with_properties_if_unset(
|
|
||||||
arg_prop, std::string{}, WithoutInitializing,
|
|
||||||
typename Space::execution_space{});
|
|
||||||
|
|
||||||
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
|
||||||
if (label.empty()) label = src.label();
|
|
||||||
auto mirror = typename Mirror::non_const_type(
|
|
||||||
arg_prop_copy, src.chunk_size(), src.chunk_max() * src.chunk_size());
|
|
||||||
mirror.resize_serial(src.extent(0));
|
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
|
||||||
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
|
||||||
mirror, src);
|
|
||||||
} else
|
|
||||||
deep_copy(mirror, src);
|
|
||||||
return mirror;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
|
||||||
auto create_mirror_view_and_copy(
|
auto create_mirror_view_and_copy(
|
||||||
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src,
|
const Space&, const Kokkos::Experimental::DynamicView<T, P...>& src,
|
||||||
std::string const& name = "") {
|
std::string const& name = "") {
|
||||||
|
|||||||
@ -471,62 +471,31 @@ class OffsetView : public ViewTraits<DataType, Properties...> {
|
|||||||
template <typename I0, typename I1>
|
template <typename I0, typename I1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
||||||
(Kokkos::Impl::are_integral<I0, I1>::value && (2 == Rank) &&
|
(Kokkos::Impl::are_integral<I0, I1>::value && (2 == Rank) &&
|
||||||
is_default_map && is_layout_left && (traits::rank_dynamic == 0)),
|
is_default_map &&
|
||||||
|
(is_layout_left || is_layout_right || is_layout_stride)),
|
||||||
reference_type>
|
reference_type>
|
||||||
operator()(const I0& i0, const I1& i1) const {
|
operator()(const I0& i0, const I1& i1) const {
|
||||||
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
||||||
const size_t j0 = i0 - m_begins[0];
|
const size_t j0 = i0 - m_begins[0];
|
||||||
const size_t j1 = i1 - m_begins[1];
|
const size_t j1 = i1 - m_begins[1];
|
||||||
return m_map.m_impl_handle[j0 + m_map.m_impl_offset.m_dim.N0 * j1];
|
if constexpr (is_layout_left) {
|
||||||
}
|
if constexpr (traits::rank_dynamic == 0)
|
||||||
|
return m_map.m_impl_handle[j0 + m_map.m_impl_offset.m_dim.N0 * j1];
|
||||||
template <typename I0, typename I1>
|
else
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
return m_map.m_impl_handle[j0 + m_map.m_impl_offset.m_stride * j1];
|
||||||
(Kokkos::Impl::are_integral<I0, I1>::value && (2 == Rank) &&
|
} else if constexpr (is_layout_right) {
|
||||||
is_default_map && is_layout_left && (traits::rank_dynamic != 0)),
|
if constexpr (traits::rank_dynamic == 0)
|
||||||
reference_type>
|
return m_map.m_impl_handle[j1 + m_map.m_impl_offset.m_dim.N1 * j0];
|
||||||
operator()(const I0& i0, const I1& i1) const {
|
else
|
||||||
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
return m_map.m_impl_handle[j1 + m_map.m_impl_offset.m_stride * j0];
|
||||||
const size_t j0 = i0 - m_begins[0];
|
} else {
|
||||||
const size_t j1 = i1 - m_begins[1];
|
static_assert(is_layout_stride);
|
||||||
return m_map.m_impl_handle[j0 + m_map.m_impl_offset.m_stride * j1];
|
return m_map.m_impl_handle[j0 * m_map.m_impl_offset.m_stride.S0 +
|
||||||
}
|
j1 * m_map.m_impl_offset.m_stride.S1];
|
||||||
|
}
|
||||||
template <typename I0, typename I1>
|
#if defined(KOKKOS_COMPILER_INTEL)
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
__builtin_unreachable();
|
||||||
(Kokkos::Impl::are_integral<I0, I1>::value && (2 == Rank) &&
|
#endif
|
||||||
is_default_map && is_layout_right && (traits::rank_dynamic == 0)),
|
|
||||||
reference_type>
|
|
||||||
operator()(const I0& i0, const I1& i1) const {
|
|
||||||
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
|
||||||
const size_t j0 = i0 - m_begins[0];
|
|
||||||
const size_t j1 = i1 - m_begins[1];
|
|
||||||
return m_map.m_impl_handle[j1 + m_map.m_impl_offset.m_dim.N1 * j0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
|
||||||
(Kokkos::Impl::are_integral<I0, I1>::value && (2 == Rank) &&
|
|
||||||
is_default_map && is_layout_right && (traits::rank_dynamic != 0)),
|
|
||||||
reference_type>
|
|
||||||
operator()(const I0& i0, const I1& i1) const {
|
|
||||||
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
|
||||||
const size_t j0 = i0 - m_begins[0];
|
|
||||||
const size_t j1 = i1 - m_begins[1];
|
|
||||||
return m_map.m_impl_handle[j1 + m_map.m_impl_offset.m_stride * j0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION
|
|
||||||
std::enable_if_t<(Kokkos::Impl::are_integral<I0, I1>::value &&
|
|
||||||
(2 == Rank) && is_default_map && is_layout_stride),
|
|
||||||
reference_type>
|
|
||||||
operator()(const I0& i0, const I1& i1) const {
|
|
||||||
KOKKOS_IMPL_OFFSETVIEW_OPERATOR_VERIFY((m_track, m_map, m_begins, i0, i1))
|
|
||||||
const size_t j0 = i0 - m_begins[0];
|
|
||||||
const size_t j1 = i1 - m_begins[1];
|
|
||||||
return m_map.m_impl_handle[j0 * m_map.m_impl_offset.m_stride.S0 +
|
|
||||||
j1 * m_map.m_impl_offset.m_stride.S1];
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//------------------------------
|
//------------------------------
|
||||||
@ -1841,71 +1810,73 @@ struct MirrorOffsetType {
|
|||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space,
|
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror>
|
|
||||||
create_mirror(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror(
|
|
||||||
Kokkos::create_mirror(arg_prop, src.view()), src.begins());
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
// create a mirror
|
||||||
class = std::enable_if_t<
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
// view_alloc
|
||||||
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline auto create_mirror(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
inline auto create_mirror(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
using Space = typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space;
|
|
||||||
|
|
||||||
static_assert(
|
if constexpr (Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
!alloc_prop_input::has_label,
|
using Space = typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space;
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror "
|
|
||||||
"must not include a label!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(
|
|
||||||
!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to Kokkos::create_mirror must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
auto prop_copy = Impl::with_properties_if_unset(
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
arg_prop, std::string(src.label()).append("_mirror"));
|
||||||
|
|
||||||
return typename Kokkos::Impl::MirrorOffsetType<Space, T, P...>::view_type(
|
return typename Kokkos::Impl::MirrorOffsetType<Space, T, P...>::view_type(
|
||||||
prop_copy, src.layout(),
|
prop_copy, src.layout(),
|
||||||
{src.begin(0), src.begin(1), src.begin(2), src.begin(3), src.begin(4),
|
{src.begin(0), src.begin(1), src.begin(2), src.begin(3), src.begin(4),
|
||||||
src.begin(5), src.begin(6), src.begin(7)});
|
src.begin(5), src.begin(6), src.begin(7)});
|
||||||
|
} else {
|
||||||
|
return typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror(
|
||||||
|
Kokkos::create_mirror(arg_prop, src.view()), src.begins());
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P,
|
||||||
|
typename = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
return Impl::create_mirror(src, Impl::ViewCtorProp<>{});
|
return Impl::create_mirror(src, Impl::ViewCtorProp<>{});
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface that accepts a without initializing flag
|
||||||
|
template <class T, class... P,
|
||||||
|
typename = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi,
|
Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror in a new space
|
// public interface that accepts a space
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
typename Enable = std::enable_if_t<
|
||||||
|
Kokkos::is_space<Space>::value &&
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Space&, const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Space&, const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
return Impl::create_mirror(
|
return Impl::create_mirror(
|
||||||
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
// public interface that accepts a space and a without initializing flag
|
||||||
|
template <class Space, class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<
|
||||||
|
Kokkos::is_space<Space>::value &&
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
typename Kokkos::Impl::MirrorOffsetType<Space, T, P...>::view_type
|
typename Kokkos::Impl::MirrorOffsetType<Space, T, P...>::view_type
|
||||||
create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
||||||
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
@ -1913,7 +1884,11 @@ create_mirror(Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
|||||||
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class T, class... P, class... ViewCtorArgs,
|
||||||
|
typename = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
inline auto create_mirror(
|
inline auto create_mirror(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
@ -1921,76 +1896,56 @@ inline auto create_mirror(
|
|||||||
}
|
}
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
|
// create a mirror view
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline std::enable_if_t<
|
inline auto create_mirror_view(
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
const Kokkos::Experimental::OffsetView<T, P...>& src,
|
||||||
(std::is_same<
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::memory_space,
|
if constexpr (!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
typename Kokkos::Experimental::OffsetView<
|
if constexpr (std::is_same<typename Kokkos::Experimental::OffsetView<
|
||||||
T, P...>::HostMirror::memory_space>::value &&
|
T, P...>::memory_space,
|
||||||
std::is_same<
|
typename Kokkos::Experimental::OffsetView<
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::data_type,
|
T, P...>::HostMirror::memory_space>::value &&
|
||||||
typename Kokkos::Experimental::OffsetView<
|
std::is_same<typename Kokkos::Experimental::OffsetView<
|
||||||
T, P...>::HostMirror::data_type>::value),
|
T, P...>::data_type,
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror>
|
typename Kokkos::Experimental::OffsetView<
|
||||||
create_mirror_view(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
T, P...>::HostMirror::data_type>::value) {
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
return
|
||||||
return src;
|
typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
if constexpr (Impl::MirrorOffsetViewType<typename Impl::ViewCtorProp<
|
||||||
|
ViewCtorArgs...>::memory_space,
|
||||||
|
T, P...>::is_same_memspace) {
|
||||||
|
return typename Impl::MirrorOffsetViewType<
|
||||||
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
|
P...>::view_type(src);
|
||||||
|
} else {
|
||||||
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
inline std::enable_if_t<
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
|
||||||
!(std::is_same<
|
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::memory_space,
|
|
||||||
typename Kokkos::Experimental::OffsetView<
|
|
||||||
T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::data_type,
|
|
||||||
typename Kokkos::Experimental::OffsetView<
|
|
||||||
T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename Kokkos::Experimental::OffsetView<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::create_mirror(arg_prop, src);
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<Impl::MirrorOffsetViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorOffsetViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<!Impl::MirrorOffsetViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorOffsetViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::Experimental::OffsetView<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
|
||||||
}
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
// Create a mirror view in host space
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const typename Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const typename Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
return Impl::create_mirror_view(src, Impl::ViewCtorProp<>{});
|
return Impl::create_mirror_view(src, Impl::ViewCtorProp<>{});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a without initializing flag
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi,
|
Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
@ -1998,7 +1953,7 @@ inline auto create_mirror_view(
|
|||||||
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
return Impl::create_mirror_view(src, Kokkos::view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror view in a new space
|
// public interface that accepts a space
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
@ -2007,7 +1962,9 @@ inline auto create_mirror_view(
|
|||||||
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
src, Kokkos::view_alloc(typename Space::memory_space{}));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class Space, class T, class... P>
|
// public interface that accepts a space and a without initializing flag
|
||||||
|
template <class Space, class T, class... P,
|
||||||
|
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
Kokkos::Impl::WithoutInitializing_t wi, const Space&,
|
||||||
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
const Kokkos::Experimental::OffsetView<T, P...>& src) {
|
||||||
@ -2015,6 +1972,8 @@ inline auto create_mirror_view(
|
|||||||
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
src, Kokkos::view_alloc(typename Space::memory_space{}, wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline auto create_mirror_view(
|
inline auto create_mirror_view(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
@ -2022,7 +1981,9 @@ inline auto create_mirror_view(
|
|||||||
return Impl::create_mirror_view(src, arg_prop);
|
return Impl::create_mirror_view(src, arg_prop);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Create a mirror view and deep_copy in a new space
|
// create a mirror view and deep copy it
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
template <class... ViewCtorArgs, class T, class... P>
|
||||||
typename Kokkos::Impl::MirrorOffsetViewType<
|
typename Kokkos::Impl::MirrorOffsetViewType<
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
|
|||||||
@ -805,56 +805,94 @@ class UnorderedMap {
|
|||||||
return *this;
|
return *this;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Re-allocate the views of the calling UnorderedMap according to src
|
||||||
|
// capacity, and deep copy the src data.
|
||||||
template <typename SKey, typename SValue, typename SDevice>
|
template <typename SKey, typename SValue, typename SDevice>
|
||||||
std::enable_if_t<std::is_same<std::remove_const_t<SKey>, key_type>::value &&
|
std::enable_if_t<std::is_same<std::remove_const_t<SKey>, key_type>::value &&
|
||||||
std::is_same<std::remove_const_t<SValue>, value_type>::value>
|
std::is_same<std::remove_const_t<SValue>, value_type>::value>
|
||||||
create_copy_view(
|
create_copy_view(
|
||||||
UnorderedMap<SKey, SValue, SDevice, Hasher, EqualTo> const &src) {
|
UnorderedMap<SKey, SValue, SDevice, Hasher, EqualTo> const &src) {
|
||||||
if (m_hash_lists.data() != src.m_hash_lists.data()) {
|
if (m_hash_lists.data() != src.m_hash_lists.data()) {
|
||||||
insertable_map_type tmp;
|
allocate_view(src);
|
||||||
|
deep_copy_view(src);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
tmp.m_bounded_insert = src.m_bounded_insert;
|
// Allocate views of the calling UnorderedMap with the same capacity as the
|
||||||
tmp.m_hasher = src.m_hasher;
|
// src.
|
||||||
tmp.m_equal_to = src.m_equal_to;
|
template <typename SKey, typename SValue, typename SDevice>
|
||||||
tmp.m_size() = src.m_size();
|
std::enable_if_t<std::is_same<std::remove_const_t<SKey>, key_type>::value &&
|
||||||
tmp.m_available_indexes = bitset_type(src.capacity());
|
std::is_same<std::remove_const_t<SValue>, value_type>::value>
|
||||||
tmp.m_hash_lists = size_type_view(
|
allocate_view(
|
||||||
view_alloc(WithoutInitializing, "UnorderedMap hash list"),
|
UnorderedMap<SKey, SValue, SDevice, Hasher, EqualTo> const &src) {
|
||||||
src.m_hash_lists.extent(0));
|
insertable_map_type tmp;
|
||||||
tmp.m_next_index = size_type_view(
|
|
||||||
view_alloc(WithoutInitializing, "UnorderedMap next index"),
|
|
||||||
src.m_next_index.extent(0));
|
|
||||||
tmp.m_keys =
|
|
||||||
key_type_view(view_alloc(WithoutInitializing, "UnorderedMap keys"),
|
|
||||||
src.m_keys.extent(0));
|
|
||||||
tmp.m_values = value_type_view(
|
|
||||||
view_alloc(WithoutInitializing, "UnorderedMap values"),
|
|
||||||
src.m_values.extent(0));
|
|
||||||
tmp.m_scalars = scalars_view("UnorderedMap scalars");
|
|
||||||
|
|
||||||
Kokkos::deep_copy(tmp.m_available_indexes, src.m_available_indexes);
|
tmp.m_bounded_insert = src.m_bounded_insert;
|
||||||
|
tmp.m_hasher = src.m_hasher;
|
||||||
|
tmp.m_equal_to = src.m_equal_to;
|
||||||
|
tmp.m_size() = src.m_size();
|
||||||
|
tmp.m_available_indexes = bitset_type(src.capacity());
|
||||||
|
tmp.m_hash_lists = size_type_view(
|
||||||
|
view_alloc(WithoutInitializing, "UnorderedMap hash list"),
|
||||||
|
src.m_hash_lists.extent(0));
|
||||||
|
tmp.m_next_index = size_type_view(
|
||||||
|
view_alloc(WithoutInitializing, "UnorderedMap next index"),
|
||||||
|
src.m_next_index.extent(0));
|
||||||
|
tmp.m_keys =
|
||||||
|
key_type_view(view_alloc(WithoutInitializing, "UnorderedMap keys"),
|
||||||
|
src.m_keys.extent(0));
|
||||||
|
tmp.m_values =
|
||||||
|
value_type_view(view_alloc(WithoutInitializing, "UnorderedMap values"),
|
||||||
|
src.m_values.extent(0));
|
||||||
|
tmp.m_scalars = scalars_view("UnorderedMap scalars");
|
||||||
|
|
||||||
|
*this = tmp;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Deep copy view data from src. This requires that the src capacity is
|
||||||
|
// identical to the capacity of the calling UnorderedMap.
|
||||||
|
template <typename SKey, typename SValue, typename SDevice>
|
||||||
|
std::enable_if_t<std::is_same<std::remove_const_t<SKey>, key_type>::value &&
|
||||||
|
std::is_same<std::remove_const_t<SValue>, value_type>::value>
|
||||||
|
deep_copy_view(
|
||||||
|
UnorderedMap<SKey, SValue, SDevice, Hasher, EqualTo> const &src) {
|
||||||
|
#ifndef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
// To deep copy UnorderedMap, capacity must be identical
|
||||||
|
KOKKOS_EXPECTS(capacity() == src.capacity());
|
||||||
|
#else
|
||||||
|
if (capacity() != src.capacity()) {
|
||||||
|
allocate_view(src);
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATION_WARNINGS
|
||||||
|
Kokkos::Impl::log_warning(
|
||||||
|
"Warning: deep_copy_view() allocating views is deprecated. Must call "
|
||||||
|
"with UnorderedMaps of identical capacity, or use "
|
||||||
|
"create_copy_view().\n");
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
if (m_hash_lists.data() != src.m_hash_lists.data()) {
|
||||||
|
Kokkos::deep_copy(m_available_indexes, src.m_available_indexes);
|
||||||
|
|
||||||
using raw_deep_copy =
|
using raw_deep_copy =
|
||||||
Kokkos::Impl::DeepCopy<typename device_type::memory_space,
|
Kokkos::Impl::DeepCopy<typename device_type::memory_space,
|
||||||
typename SDevice::memory_space>;
|
typename SDevice::memory_space>;
|
||||||
|
|
||||||
raw_deep_copy(tmp.m_hash_lists.data(), src.m_hash_lists.data(),
|
raw_deep_copy(m_hash_lists.data(), src.m_hash_lists.data(),
|
||||||
sizeof(size_type) * src.m_hash_lists.extent(0));
|
sizeof(size_type) * src.m_hash_lists.extent(0));
|
||||||
raw_deep_copy(tmp.m_next_index.data(), src.m_next_index.data(),
|
raw_deep_copy(m_next_index.data(), src.m_next_index.data(),
|
||||||
sizeof(size_type) * src.m_next_index.extent(0));
|
sizeof(size_type) * src.m_next_index.extent(0));
|
||||||
raw_deep_copy(tmp.m_keys.data(), src.m_keys.data(),
|
raw_deep_copy(m_keys.data(), src.m_keys.data(),
|
||||||
sizeof(key_type) * src.m_keys.extent(0));
|
sizeof(key_type) * src.m_keys.extent(0));
|
||||||
if (!is_set) {
|
if (!is_set) {
|
||||||
raw_deep_copy(tmp.m_values.data(), src.m_values.data(),
|
raw_deep_copy(m_values.data(), src.m_values.data(),
|
||||||
sizeof(impl_value_type) * src.m_values.extent(0));
|
sizeof(impl_value_type) * src.m_values.extent(0));
|
||||||
}
|
}
|
||||||
raw_deep_copy(tmp.m_scalars.data(), src.m_scalars.data(),
|
raw_deep_copy(m_scalars.data(), src.m_scalars.data(),
|
||||||
sizeof(int) * num_scalars);
|
sizeof(int) * num_scalars);
|
||||||
|
|
||||||
Kokkos::fence(
|
Kokkos::fence(
|
||||||
"Kokkos::UnorderedMap::create_copy_view: fence after copy to tmp");
|
"Kokkos::UnorderedMap::deep_copy_view: fence after copy to dst.");
|
||||||
|
|
||||||
*this = tmp;
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -932,13 +970,25 @@ class UnorderedMap {
|
|||||||
friend struct Impl::UnorderedMapPrint;
|
friend struct Impl::UnorderedMapPrint;
|
||||||
};
|
};
|
||||||
|
|
||||||
// Specialization of deep_copy for two UnorderedMap objects.
|
// Specialization of deep_copy() for two UnorderedMap objects.
|
||||||
template <typename DKey, typename DT, typename DDevice, typename SKey,
|
template <typename DKey, typename DT, typename DDevice, typename SKey,
|
||||||
typename ST, typename SDevice, typename Hasher, typename EqualTo>
|
typename ST, typename SDevice, typename Hasher, typename EqualTo>
|
||||||
inline void deep_copy(
|
inline void deep_copy(
|
||||||
UnorderedMap<DKey, DT, DDevice, Hasher, EqualTo> &dst,
|
UnorderedMap<DKey, DT, DDevice, Hasher, EqualTo> &dst,
|
||||||
const UnorderedMap<SKey, ST, SDevice, Hasher, EqualTo> &src) {
|
const UnorderedMap<SKey, ST, SDevice, Hasher, EqualTo> &src) {
|
||||||
dst.create_copy_view(src);
|
dst.deep_copy_view(src);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Specialization of create_mirror() for an UnorderedMap object.
|
||||||
|
template <typename Key, typename ValueType, typename Device, typename Hasher,
|
||||||
|
typename EqualTo>
|
||||||
|
typename UnorderedMap<Key, ValueType, Device, Hasher, EqualTo>::HostMirror
|
||||||
|
create_mirror(
|
||||||
|
const UnorderedMap<Key, ValueType, Device, Hasher, EqualTo> &src) {
|
||||||
|
typename UnorderedMap<Key, ValueType, Device, Hasher, EqualTo>::HostMirror
|
||||||
|
dst;
|
||||||
|
dst.allocate_view(src);
|
||||||
|
return dst;
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|||||||
@ -55,8 +55,8 @@ struct test_dualview_alloc {
|
|||||||
bool result = false;
|
bool result = false;
|
||||||
|
|
||||||
test_dualview_alloc(unsigned int size) {
|
test_dualview_alloc(unsigned int size) {
|
||||||
result = run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(
|
result =
|
||||||
size, 3);
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(size, 3);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -154,7 +154,7 @@ struct test_dualview_combinations {
|
|||||||
}
|
}
|
||||||
|
|
||||||
test_dualview_combinations(unsigned int size, bool with_init) {
|
test_dualview_combinations(unsigned int size, bool with_init) {
|
||||||
result = run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(
|
result = run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(
|
||||||
size, 3, with_init);
|
size, 3, with_init);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
@ -253,21 +253,18 @@ struct test_dual_view_deep_copy {
|
|||||||
} // end run_me
|
} // end run_me
|
||||||
|
|
||||||
test_dual_view_deep_copy() {
|
test_dual_view_deep_copy() {
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(10, 5,
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(10, 5, true);
|
||||||
true);
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(10, 5,
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(10, 5,
|
false);
|
||||||
false);
|
|
||||||
// Test zero length but allocated (a.d_view.data!=nullptr but
|
// Test zero length but allocated (a.d_view.data!=nullptr but
|
||||||
// a.d_view.span()==0)
|
// a.d_view.span()==0)
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(0, 5, true);
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(0, 5, true);
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(0, 5,
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(0, 5, false);
|
||||||
false);
|
|
||||||
|
|
||||||
// Test default constructed view
|
// Test default constructed view
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(-1, 5,
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(-1, 5, true);
|
||||||
true);
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>(-1, 5,
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >(-1, 5,
|
false);
|
||||||
false);
|
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -282,15 +279,20 @@ struct test_dualview_resize {
|
|||||||
const unsigned int m = 5;
|
const unsigned int m = 5;
|
||||||
const unsigned int factor = 2;
|
const unsigned int factor = 2;
|
||||||
|
|
||||||
ViewType a("A", n, m);
|
ViewType a;
|
||||||
|
if constexpr (Initialize)
|
||||||
|
a = ViewType("A", n, m);
|
||||||
|
else
|
||||||
|
a = ViewType(Kokkos::view_alloc(Kokkos::WithoutInitializing, "A"), n, m);
|
||||||
|
|
||||||
Kokkos::deep_copy(a.d_view, 1);
|
Kokkos::deep_copy(a.d_view, 1);
|
||||||
|
|
||||||
/* Covers case "Resize on Device" */
|
/* Covers case "Resize on Device" */
|
||||||
a.modify_device();
|
a.modify_device();
|
||||||
if (Initialize)
|
if constexpr (Initialize)
|
||||||
Kokkos::resize(Kokkos::WithoutInitializing, a, factor * n, factor * m);
|
|
||||||
else
|
|
||||||
Kokkos::resize(a, factor * n, factor * m);
|
Kokkos::resize(a, factor * n, factor * m);
|
||||||
|
else
|
||||||
|
Kokkos::resize(Kokkos::WithoutInitializing, a, factor * n, factor * m);
|
||||||
ASSERT_EQ(a.extent(0), n * factor);
|
ASSERT_EQ(a.extent(0), n * factor);
|
||||||
ASSERT_EQ(a.extent(1), m * factor);
|
ASSERT_EQ(a.extent(1), m * factor);
|
||||||
|
|
||||||
@ -298,33 +300,38 @@ struct test_dualview_resize {
|
|||||||
a.sync_host();
|
a.sync_host();
|
||||||
|
|
||||||
// Check device view is initialized as expected
|
// Check device view is initialized as expected
|
||||||
scalar_type a_d_sum = 0;
|
|
||||||
// Execute on the execution_space associated with t_dev's memory space
|
// Execute on the execution_space associated with t_dev's memory space
|
||||||
using t_dev_exec_space =
|
using t_dev_exec_space =
|
||||||
typename ViewType::t_dev::memory_space::execution_space;
|
typename ViewType::t_dev::memory_space::execution_space;
|
||||||
Kokkos::parallel_reduce(
|
Kokkos::View<int, typename ViewType::t_dev::memory_space> errors_d(
|
||||||
Kokkos::RangePolicy<t_dev_exec_space>(0, a.d_view.extent(0)),
|
"errors");
|
||||||
SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(a.d_view),
|
Kokkos::parallel_for(
|
||||||
a_d_sum);
|
Kokkos::MDRangePolicy<t_dev_exec_space, Kokkos::Rank<2>>(
|
||||||
|
{0, 0}, {a.d_view.extent(0), a.d_view.extent(1)}),
|
||||||
|
KOKKOS_LAMBDA(int i, int j) {
|
||||||
|
if (a.d_view(i, j) != 1) Kokkos::atomic_inc(errors_d.data());
|
||||||
|
});
|
||||||
|
int errors_d_scalar;
|
||||||
|
Kokkos::deep_copy(errors_d_scalar, errors_d);
|
||||||
|
|
||||||
// Check host view is synced as expected
|
// Check host view is synced as expected
|
||||||
scalar_type a_h_sum = 0;
|
int errors_h_scalar = 0;
|
||||||
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
||||||
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
||||||
a_h_sum += a.h_view(i, j);
|
if (a.h_view(i, j) != 1) ++errors_h_scalar;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check
|
// Check
|
||||||
ASSERT_EQ(a_h_sum, a_d_sum);
|
ASSERT_EQ(errors_d_scalar, 0);
|
||||||
ASSERT_EQ(a_h_sum, scalar_type(a.extent(0) * a.extent(1)));
|
ASSERT_EQ(errors_h_scalar, 0);
|
||||||
|
|
||||||
/* Covers case "Resize on Host" */
|
/* Covers case "Resize on Host" */
|
||||||
a.modify_host();
|
a.modify_host();
|
||||||
|
|
||||||
if (Initialize)
|
if constexpr (Initialize)
|
||||||
Kokkos::resize(Kokkos::WithoutInitializing, a, n / factor, m / factor);
|
|
||||||
else
|
|
||||||
Kokkos::resize(a, n / factor, m / factor);
|
Kokkos::resize(a, n / factor, m / factor);
|
||||||
|
else
|
||||||
|
Kokkos::resize(Kokkos::WithoutInitializing, a, n / factor, m / factor);
|
||||||
ASSERT_EQ(a.extent(0), n / factor);
|
ASSERT_EQ(a.extent(0), n / factor);
|
||||||
ASSERT_EQ(a.extent(1), m / factor);
|
ASSERT_EQ(a.extent(1), m / factor);
|
||||||
|
|
||||||
@ -332,30 +339,33 @@ struct test_dualview_resize {
|
|||||||
a.sync_device(Kokkos::DefaultExecutionSpace{});
|
a.sync_device(Kokkos::DefaultExecutionSpace{});
|
||||||
|
|
||||||
// Check device view is initialized as expected
|
// Check device view is initialized as expected
|
||||||
a_d_sum = 0;
|
Kokkos::deep_copy(errors_d, 0);
|
||||||
// Execute on the execution_space associated with t_dev's memory space
|
// Execute on the execution_space associated with t_dev's memory space
|
||||||
using t_dev_exec_space =
|
using t_dev_exec_space =
|
||||||
typename ViewType::t_dev::memory_space::execution_space;
|
typename ViewType::t_dev::memory_space::execution_space;
|
||||||
Kokkos::parallel_reduce(
|
Kokkos::parallel_for(
|
||||||
Kokkos::RangePolicy<t_dev_exec_space>(0, a.d_view.extent(0)),
|
Kokkos::MDRangePolicy<t_dev_exec_space, Kokkos::Rank<2>>(
|
||||||
SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(a.d_view),
|
{0, 0}, {a.d_view.extent(0), a.d_view.extent(1)}),
|
||||||
a_d_sum);
|
KOKKOS_LAMBDA(int i, int j) {
|
||||||
|
if (a.d_view(i, j) != 1) Kokkos::atomic_inc(errors_d.data());
|
||||||
|
});
|
||||||
|
Kokkos::deep_copy(errors_d_scalar, errors_d);
|
||||||
|
|
||||||
// Check host view is synced as expected
|
// Check host view is synced as expected
|
||||||
a_h_sum = 0;
|
errors_h_scalar = 0;
|
||||||
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
||||||
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
||||||
a_h_sum += a.h_view(i, j);
|
if (a.h_view(i, j) != 1) ++errors_h_scalar;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check
|
// Check
|
||||||
ASSERT_EQ(a_h_sum, scalar_type(a.extent(0) * a.extent(1)));
|
ASSERT_EQ(errors_d_scalar, 0);
|
||||||
ASSERT_EQ(a_h_sum, a_d_sum);
|
ASSERT_EQ(errors_h_scalar, 0);
|
||||||
|
|
||||||
} // end run_me
|
} // end run_me
|
||||||
|
|
||||||
test_dualview_resize() {
|
test_dualview_resize() {
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >();
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>();
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -369,40 +379,51 @@ struct test_dualview_realloc {
|
|||||||
const unsigned int n = 10;
|
const unsigned int n = 10;
|
||||||
const unsigned int m = 5;
|
const unsigned int m = 5;
|
||||||
|
|
||||||
ViewType a("A", n, m);
|
ViewType a;
|
||||||
if (Initialize)
|
if constexpr (Initialize) {
|
||||||
Kokkos::realloc(Kokkos::WithoutInitializing, a, n, m);
|
a = ViewType("A", n, m);
|
||||||
else
|
|
||||||
Kokkos::realloc(a, n, m);
|
Kokkos::realloc(a, n, m);
|
||||||
|
} else {
|
||||||
|
a = ViewType(Kokkos::view_alloc(Kokkos::WithoutInitializing, "A"), n, m);
|
||||||
|
Kokkos::realloc(Kokkos::WithoutInitializing, a, n, m);
|
||||||
|
}
|
||||||
|
ASSERT_EQ(a.extent(0), n);
|
||||||
|
ASSERT_EQ(a.extent(1), m);
|
||||||
|
|
||||||
Kokkos::deep_copy(a.d_view, 1);
|
Kokkos::deep_copy(a.d_view, 1);
|
||||||
|
|
||||||
a.modify_device();
|
a.modify_device();
|
||||||
a.sync_host();
|
a.sync_host();
|
||||||
|
|
||||||
// Check device view is initialized as expected
|
// Check device view is initialized as expected
|
||||||
scalar_type a_d_sum = 0;
|
|
||||||
// Execute on the execution_space associated with t_dev's memory space
|
// Execute on the execution_space associated with t_dev's memory space
|
||||||
using t_dev_exec_space =
|
using t_dev_exec_space =
|
||||||
typename ViewType::t_dev::memory_space::execution_space;
|
typename ViewType::t_dev::memory_space::execution_space;
|
||||||
Kokkos::parallel_reduce(
|
Kokkos::View<int, typename ViewType::t_dev::memory_space> errors_d(
|
||||||
Kokkos::RangePolicy<t_dev_exec_space>(0, a.d_view.extent(0)),
|
"errors");
|
||||||
SumViewEntriesFunctor<scalar_type, typename ViewType::t_dev>(a.d_view),
|
Kokkos::parallel_for(
|
||||||
a_d_sum);
|
Kokkos::MDRangePolicy<t_dev_exec_space, Kokkos::Rank<2>>(
|
||||||
|
{0, 0}, {a.d_view.extent(0), a.d_view.extent(1)}),
|
||||||
|
KOKKOS_LAMBDA(int i, int j) {
|
||||||
|
if (a.d_view(i, j) != 1) Kokkos::atomic_inc(errors_d.data());
|
||||||
|
});
|
||||||
|
int errors_d_scalar;
|
||||||
|
Kokkos::deep_copy(errors_d_scalar, errors_d);
|
||||||
|
|
||||||
// Check host view is synced as expected
|
// Check host view is synced as expected
|
||||||
scalar_type a_h_sum = 0;
|
int errors_h_scalar = 0;
|
||||||
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
for (size_t i = 0; i < a.h_view.extent(0); ++i)
|
||||||
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
for (size_t j = 0; j < a.h_view.extent(1); ++j) {
|
||||||
a_h_sum += a.h_view(i, j);
|
if (a.h_view(i, j) != 1) ++errors_h_scalar;
|
||||||
}
|
}
|
||||||
|
|
||||||
// Check
|
// Check
|
||||||
ASSERT_EQ(a_h_sum, scalar_type(a.extent(0) * a.extent(1)));
|
ASSERT_EQ(errors_d_scalar, 0);
|
||||||
ASSERT_EQ(a_h_sum, a_d_sum);
|
ASSERT_EQ(errors_h_scalar, 0);
|
||||||
} // end run_me
|
} // end run_me
|
||||||
|
|
||||||
test_dualview_realloc() {
|
test_dualview_realloc() {
|
||||||
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device> >();
|
run_me<Kokkos::DualView<Scalar**, Kokkos::LayoutLeft, Device>>();
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -463,12 +484,23 @@ TEST(TEST_CATEGORY, dualview_deep_copy) {
|
|||||||
test_dualview_deep_copy<double, TEST_EXECSPACE>();
|
test_dualview_deep_copy<double, TEST_EXECSPACE>();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
struct NoDefaultConstructor {
|
||||||
|
NoDefaultConstructor(int i_) : i(i_) {}
|
||||||
|
KOKKOS_FUNCTION operator int() const { return i; }
|
||||||
|
|
||||||
|
int i;
|
||||||
|
};
|
||||||
|
|
||||||
TEST(TEST_CATEGORY, dualview_realloc) {
|
TEST(TEST_CATEGORY, dualview_realloc) {
|
||||||
test_dualview_realloc<int, TEST_EXECSPACE>();
|
test_dualview_realloc<int, TEST_EXECSPACE>();
|
||||||
|
Impl::test_dualview_realloc<NoDefaultConstructor, TEST_EXECSPACE,
|
||||||
|
/* Initialize */ false>();
|
||||||
}
|
}
|
||||||
|
|
||||||
TEST(TEST_CATEGORY, dualview_resize) {
|
TEST(TEST_CATEGORY, dualview_resize) {
|
||||||
test_dualview_resize<int, TEST_EXECSPACE>();
|
test_dualview_resize<int, TEST_EXECSPACE>();
|
||||||
|
Impl::test_dualview_resize<NoDefaultConstructor, TEST_EXECSPACE,
|
||||||
|
/* Initialize */ false>();
|
||||||
}
|
}
|
||||||
|
|
||||||
namespace {
|
namespace {
|
||||||
|
|||||||
@ -68,7 +68,7 @@ struct TestInsert {
|
|||||||
} while (rehash_on_fail && failed_count > 0u);
|
} while (rehash_on_fail && failed_count > 0u);
|
||||||
|
|
||||||
// Trigger the m_size mutable bug.
|
// Trigger the m_size mutable bug.
|
||||||
typename map_type::HostMirror map_h;
|
auto map_h = create_mirror(map);
|
||||||
execution_space().fence();
|
execution_space().fence();
|
||||||
Kokkos::deep_copy(map_h, map);
|
Kokkos::deep_copy(map_h, map);
|
||||||
execution_space().fence();
|
execution_space().fence();
|
||||||
@ -367,7 +367,7 @@ void test_deep_copy(uint32_t num_nodes) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
host_map_type hmap;
|
auto hmap = create_mirror(map);
|
||||||
Kokkos::deep_copy(hmap, map);
|
Kokkos::deep_copy(hmap, map);
|
||||||
|
|
||||||
ASSERT_EQ(map.size(), hmap.size());
|
ASSERT_EQ(map.size(), hmap.size());
|
||||||
@ -380,6 +380,7 @@ void test_deep_copy(uint32_t num_nodes) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
map_type mmap;
|
map_type mmap;
|
||||||
|
mmap.allocate_view(hmap);
|
||||||
Kokkos::deep_copy(mmap, hmap);
|
Kokkos::deep_copy(mmap, hmap);
|
||||||
|
|
||||||
const_map_type cmap = mmap;
|
const_map_type cmap = mmap;
|
||||||
@ -424,7 +425,7 @@ TEST(TEST_CATEGORY, UnorderedMap_valid_empty) {
|
|||||||
Map n{};
|
Map n{};
|
||||||
n = Map{m.capacity()};
|
n = Map{m.capacity()};
|
||||||
n.rehash(m.capacity());
|
n.rehash(m.capacity());
|
||||||
Kokkos::deep_copy(n, m);
|
n.create_copy_view(m);
|
||||||
ASSERT_TRUE(m.is_allocated());
|
ASSERT_TRUE(m.is_allocated());
|
||||||
ASSERT_TRUE(n.is_allocated());
|
ASSERT_TRUE(n.is_allocated());
|
||||||
}
|
}
|
||||||
|
|||||||
@ -21,6 +21,8 @@
|
|||||||
#include <iostream>
|
#include <iostream>
|
||||||
#include <cstdlib>
|
#include <cstdlib>
|
||||||
#include <cstdio>
|
#include <cstdio>
|
||||||
|
#include <Kokkos_Macros.hpp>
|
||||||
|
KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH()
|
||||||
#include <Kokkos_Vector.hpp>
|
#include <Kokkos_Vector.hpp>
|
||||||
|
|
||||||
namespace Test {
|
namespace Test {
|
||||||
@ -231,7 +233,7 @@ void test_vector_allocate(unsigned int size) {
|
|||||||
TEST(TEST_CATEGORY, vector_combination) {
|
TEST(TEST_CATEGORY, vector_combination) {
|
||||||
test_vector_allocate<int, TEST_EXECSPACE>(10);
|
test_vector_allocate<int, TEST_EXECSPACE>(10);
|
||||||
test_vector_combinations<int, TEST_EXECSPACE>(10);
|
test_vector_combinations<int, TEST_EXECSPACE>(10);
|
||||||
test_vector_combinations<int, TEST_EXECSPACE>(3057);
|
test_vector_combinations<long long int, TEST_EXECSPACE>(3057);
|
||||||
}
|
}
|
||||||
|
|
||||||
TEST(TEST_CATEGORY, vector_insert) {
|
TEST(TEST_CATEGORY, vector_insert) {
|
||||||
|
|||||||
@ -390,7 +390,7 @@ static void Test_Atomic(benchmark::State& state) {
|
|||||||
|
|
||||||
static constexpr int LOOP = 100'000;
|
static constexpr int LOOP = 100'000;
|
||||||
|
|
||||||
BENCHMARK(Test_Atomic<int>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<int>)->Arg(30'000)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<long int>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<long int>)->Arg(LOOP)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<long long int>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<long long int>)->Arg(LOOP)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<unsigned int>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<unsigned int>)->Arg(LOOP)->Iterations(10);
|
||||||
@ -398,4 +398,3 @@ BENCHMARK(Test_Atomic<unsigned long int>)->Arg(LOOP)->Iterations(10);
|
|||||||
BENCHMARK(Test_Atomic<unsigned long long int>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<unsigned long long int>)->Arg(LOOP)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<float>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<float>)->Arg(LOOP)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<double>)->Arg(LOOP)->Iterations(10);
|
BENCHMARK(Test_Atomic<double>)->Arg(LOOP)->Iterations(10);
|
||||||
BENCHMARK(Test_Atomic<int>)->Arg(LOOP)->Iterations(10);
|
|
||||||
|
|||||||
@ -183,7 +183,8 @@ double atomic_contentious_max_replacement(benchmark::State& state,
|
|||||||
Kokkos::parallel_reduce(
|
Kokkos::parallel_reduce(
|
||||||
con_length,
|
con_length,
|
||||||
KOKKOS_LAMBDA(const int i, T& inner) {
|
KOKKOS_LAMBDA(const int i, T& inner) {
|
||||||
inner = Kokkos::atomic_max_fetch(&(input(0)), inner + 1);
|
inner = Kokkos::atomic_max_fetch(&(input(0)),
|
||||||
|
Kokkos::min(inner, max - 1) + 1);
|
||||||
if (i == con_length - 1) {
|
if (i == con_length - 1) {
|
||||||
Kokkos::atomic_max_fetch(&(input(0)), max);
|
Kokkos::atomic_max_fetch(&(input(0)), max);
|
||||||
inner = max;
|
inner = max;
|
||||||
@ -223,7 +224,8 @@ double atomic_contentious_min_replacement(benchmark::State& state,
|
|||||||
Kokkos::parallel_reduce(
|
Kokkos::parallel_reduce(
|
||||||
con_length,
|
con_length,
|
||||||
KOKKOS_LAMBDA(const int i, T& inner) {
|
KOKKOS_LAMBDA(const int i, T& inner) {
|
||||||
inner = Kokkos::atomic_min_fetch(&(input(0)), inner - 1);
|
inner = Kokkos::atomic_min_fetch(&(input(0)),
|
||||||
|
Kokkos::max(inner, min + 1) - 1);
|
||||||
if (i == con_length - 1) {
|
if (i == con_length - 1) {
|
||||||
Kokkos::atomic_min_fetch(&(input(0)), min);
|
Kokkos::atomic_min_fetch(&(input(0)), min);
|
||||||
inner = min;
|
inner = min;
|
||||||
@ -246,7 +248,7 @@ static void Atomic_ContentiousMinReplacements(benchmark::State& state) {
|
|||||||
auto inp = prepare_input(1, std::numeric_limits<T>::max());
|
auto inp = prepare_input(1, std::numeric_limits<T>::max());
|
||||||
|
|
||||||
for (auto _ : state) {
|
for (auto _ : state) {
|
||||||
const auto time = atomic_contentious_max_replacement(state, inp, length);
|
const auto time = atomic_contentious_min_replacement(state, inp, length);
|
||||||
|
|
||||||
state.SetIterationTime(time);
|
state.SetIterationTime(time);
|
||||||
}
|
}
|
||||||
|
|||||||
@ -166,8 +166,17 @@ class Cuda {
|
|||||||
|
|
||||||
Cuda();
|
Cuda();
|
||||||
|
|
||||||
Cuda(cudaStream_t stream,
|
explicit Cuda(cudaStream_t stream) : Cuda(stream, Impl::ManageStream::no) {}
|
||||||
Impl::ManageStream manage_stream = Impl::ManageStream::no);
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT(
|
||||||
|
"Cuda execution space should be constructed explicitly.")
|
||||||
|
Cuda(cudaStream_t stream)
|
||||||
|
: Cuda(stream) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
Cuda(cudaStream_t stream, Impl::ManageStream manage_stream);
|
||||||
|
|
||||||
KOKKOS_DEPRECATED Cuda(cudaStream_t stream, bool manage_stream);
|
KOKKOS_DEPRECATED Cuda(cudaStream_t stream, bool manage_stream);
|
||||||
|
|
||||||
@ -186,7 +195,7 @@ class Cuda {
|
|||||||
///
|
///
|
||||||
/// This matches the __CUDA_ARCH__ specification.
|
/// This matches the __CUDA_ARCH__ specification.
|
||||||
KOKKOS_DEPRECATED static size_type device_arch() {
|
KOKKOS_DEPRECATED static size_type device_arch() {
|
||||||
const cudaDeviceProp& cudaProp = Cuda().cuda_device_prop();
|
const cudaDeviceProp cudaProp = Cuda().cuda_device_prop();
|
||||||
return cudaProp.major * 100 + cudaProp.minor;
|
return cudaProp.major * 100 + cudaProp.minor;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@ -59,12 +59,6 @@ const std::unique_ptr<Kokkos::Cuda> &Kokkos::Impl::cuda_get_deep_copy_space(
|
|||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
namespace {
|
|
||||||
|
|
||||||
static std::atomic<int> num_uvm_allocations(0);
|
|
||||||
|
|
||||||
} // namespace
|
|
||||||
|
|
||||||
void DeepCopyCuda(void *dst, const void *src, size_t n) {
|
void DeepCopyCuda(void *dst, const void *src, size_t n) {
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL((CudaInternal::singleton().cuda_memcpy_wrapper(
|
KOKKOS_IMPL_CUDA_SAFE_CALL((CudaInternal::singleton().cuda_memcpy_wrapper(
|
||||||
dst, src, n, cudaMemcpyDefault)));
|
dst, src, n, cudaMemcpyDefault)));
|
||||||
@ -204,10 +198,7 @@ void *impl_allocate_common(const int device_id,
|
|||||||
// we should do here since we're turning it into an
|
// we should do here since we're turning it into an
|
||||||
// exception here
|
// exception here
|
||||||
cudaGetLastError();
|
cudaGetLastError();
|
||||||
throw Experimental::CudaRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(arg_handle.name, arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
CudaMalloc);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
||||||
@ -252,8 +243,6 @@ void *CudaUVMSpace::impl_allocate(
|
|||||||
Cuda::impl_static_fence(
|
Cuda::impl_static_fence(
|
||||||
"Kokkos::CudaUVMSpace::impl_allocate: Pre UVM Allocation");
|
"Kokkos::CudaUVMSpace::impl_allocate: Pre UVM Allocation");
|
||||||
if (arg_alloc_size > 0) {
|
if (arg_alloc_size > 0) {
|
||||||
Kokkos::Impl::num_uvm_allocations++;
|
|
||||||
|
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
cudaError_t error_code =
|
cudaError_t error_code =
|
||||||
cudaMallocManaged(&ptr, arg_alloc_size, cudaMemAttachGlobal);
|
cudaMallocManaged(&ptr, arg_alloc_size, cudaMemAttachGlobal);
|
||||||
@ -263,10 +252,7 @@ void *CudaUVMSpace::impl_allocate(
|
|||||||
// we should do here since we're turning it into an
|
// we should do here since we're turning it into an
|
||||||
// exception here
|
// exception here
|
||||||
cudaGetLastError();
|
cudaGetLastError();
|
||||||
throw Experimental::CudaRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
CudaMallocManaged);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
#ifdef KOKKOS_IMPL_DEBUG_CUDA_PIN_UVM_TO_HOST
|
#ifdef KOKKOS_IMPL_DEBUG_CUDA_PIN_UVM_TO_HOST
|
||||||
@ -307,10 +293,7 @@ void *CudaHostPinnedSpace::impl_allocate(
|
|||||||
// we should do here since we're turning it into an
|
// we should do here since we're turning it into an
|
||||||
// exception here
|
// exception here
|
||||||
cudaGetLastError();
|
cudaGetLastError();
|
||||||
throw Experimental::CudaRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
CudaHostAlloc);
|
|
||||||
}
|
}
|
||||||
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
||||||
const size_t reported_size =
|
const size_t reported_size =
|
||||||
@ -341,27 +324,24 @@ void CudaSpace::impl_deallocate(
|
|||||||
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
||||||
reported_size);
|
reported_size);
|
||||||
}
|
}
|
||||||
try {
|
|
||||||
#ifndef CUDART_VERSION
|
#ifndef CUDART_VERSION
|
||||||
#error CUDART_VERSION undefined!
|
#error CUDART_VERSION undefined!
|
||||||
#elif (defined(KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC) && CUDART_VERSION >= 11020)
|
#elif (defined(KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC) && CUDART_VERSION >= 11020)
|
||||||
if (arg_alloc_size >= memory_threshold_g) {
|
if (arg_alloc_size >= memory_threshold_g) {
|
||||||
Impl::cuda_device_synchronize(
|
Impl::cuda_device_synchronize(
|
||||||
"Kokkos::Cuda: backend fence before async free");
|
"Kokkos::Cuda: backend fence before async free");
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFreeAsync(arg_alloc_ptr, m_stream));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFreeAsync(arg_alloc_ptr, m_stream));
|
||||||
Impl::cuda_device_synchronize(
|
Impl::cuda_device_synchronize(
|
||||||
"Kokkos::Cuda: backend fence after async free");
|
"Kokkos::Cuda: backend fence after async free");
|
||||||
} else {
|
} else {
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
|
||||||
}
|
|
||||||
#else
|
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
||||||
#endif
|
|
||||||
} catch (...) {
|
|
||||||
}
|
}
|
||||||
|
#else
|
||||||
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
void CudaUVMSpace::deallocate(void *const arg_alloc_ptr,
|
void CudaUVMSpace::deallocate(void *const arg_alloc_ptr,
|
||||||
const size_t arg_alloc_size) const {
|
const size_t arg_alloc_size) const {
|
||||||
@ -387,13 +367,9 @@ void CudaUVMSpace::impl_deallocate(
|
|||||||
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
||||||
reported_size);
|
reported_size);
|
||||||
}
|
}
|
||||||
try {
|
if (arg_alloc_ptr != nullptr) {
|
||||||
if (arg_alloc_ptr != nullptr) {
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
Kokkos::Impl::num_uvm_allocations--;
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFree(arg_alloc_ptr));
|
|
||||||
}
|
|
||||||
} catch (...) {
|
|
||||||
}
|
}
|
||||||
Cuda::impl_static_fence(
|
Cuda::impl_static_fence(
|
||||||
"Kokkos::CudaUVMSpace::impl_deallocate: Post UVM Deallocation");
|
"Kokkos::CudaUVMSpace::impl_deallocate: Post UVM Deallocation");
|
||||||
@ -420,11 +396,8 @@ void CudaHostPinnedSpace::impl_deallocate(
|
|||||||
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
Kokkos::Profiling::deallocateData(arg_handle, arg_label, arg_alloc_ptr,
|
||||||
reported_size);
|
reported_size);
|
||||||
}
|
}
|
||||||
try {
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaSetDevice(m_device));
|
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFreeHost(arg_alloc_ptr));
|
||||||
KOKKOS_IMPL_CUDA_SAFE_CALL(cudaFreeHost(arg_alloc_ptr));
|
|
||||||
} catch (...) {
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|||||||
@ -22,7 +22,6 @@
|
|||||||
|
|
||||||
#include <impl/Kokkos_Error.hpp>
|
#include <impl/Kokkos_Error.hpp>
|
||||||
#include <impl/Kokkos_Profiling.hpp>
|
#include <impl/Kokkos_Profiling.hpp>
|
||||||
#include <iosfwd>
|
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
@ -69,52 +68,6 @@ inline void cuda_internal_safe_call(cudaError e, const char* name,
|
|||||||
Kokkos::Impl::cuda_internal_safe_call(call, #call, __FILE__, __LINE__)
|
Kokkos::Impl::cuda_internal_safe_call(call, #call, __FILE__, __LINE__)
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
namespace Experimental {
|
|
||||||
|
|
||||||
class CudaRawMemoryAllocationFailure : public RawMemoryAllocationFailure {
|
|
||||||
private:
|
|
||||||
using base_t = RawMemoryAllocationFailure;
|
|
||||||
|
|
||||||
cudaError_t m_error_code = cudaSuccess;
|
|
||||||
|
|
||||||
static FailureMode get_failure_mode(cudaError_t error_code) {
|
|
||||||
switch (error_code) {
|
|
||||||
case cudaErrorMemoryAllocation: return FailureMode::OutOfMemoryError;
|
|
||||||
case cudaErrorInvalidValue: return FailureMode::InvalidAllocationSize;
|
|
||||||
// TODO handle cudaErrorNotSupported for cudaMallocManaged
|
|
||||||
default: return FailureMode::Unknown;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
public:
|
|
||||||
// using base_t::base_t;
|
|
||||||
// would trigger
|
|
||||||
//
|
|
||||||
// error: cannot determine the exception specification of the default
|
|
||||||
// constructor due to a circular dependency
|
|
||||||
//
|
|
||||||
// using NVCC 9.1 and gcc 7.4
|
|
||||||
CudaRawMemoryAllocationFailure(
|
|
||||||
size_t arg_attempted_size, size_t arg_attempted_alignment,
|
|
||||||
FailureMode arg_failure_mode = FailureMode::OutOfMemoryError,
|
|
||||||
AllocationMechanism arg_mechanism =
|
|
||||||
AllocationMechanism::StdMalloc) noexcept
|
|
||||||
: base_t(arg_attempted_size, arg_attempted_alignment, arg_failure_mode,
|
|
||||||
arg_mechanism) {}
|
|
||||||
|
|
||||||
CudaRawMemoryAllocationFailure(size_t arg_attempted_size,
|
|
||||||
cudaError_t arg_error_code,
|
|
||||||
AllocationMechanism arg_mechanism) noexcept
|
|
||||||
: base_t(arg_attempted_size, /* CudaSpace doesn't handle alignment? */ 1,
|
|
||||||
get_failure_mode(arg_error_code), arg_mechanism),
|
|
||||||
m_error_code(arg_error_code) {}
|
|
||||||
|
|
||||||
void append_additional_error_information(std::ostream& o) const override;
|
|
||||||
};
|
|
||||||
|
|
||||||
} // end namespace Experimental
|
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|
||||||
#endif // KOKKOS_ENABLE_CUDA
|
#endif // KOKKOS_ENABLE_CUDA
|
||||||
|
|||||||
@ -72,7 +72,7 @@ struct GraphImpl<Kokkos::Cuda> {
|
|||||||
GraphNodeImpl<Kokkos::Cuda, aggregate_kernel_impl_t,
|
GraphNodeImpl<Kokkos::Cuda, aggregate_kernel_impl_t,
|
||||||
Kokkos::Experimental::TypeErasedTag>;
|
Kokkos::Experimental::TypeErasedTag>;
|
||||||
|
|
||||||
// Not moveable or copyable; it spends its whole life as a shared_ptr in the
|
// Not movable or copyable; it spends its whole life as a shared_ptr in the
|
||||||
// Graph object
|
// Graph object
|
||||||
GraphImpl() = delete;
|
GraphImpl() = delete;
|
||||||
GraphImpl(GraphImpl const&) = delete;
|
GraphImpl(GraphImpl const&) = delete;
|
||||||
@ -115,12 +115,9 @@ struct GraphImpl<Kokkos::Cuda> {
|
|||||||
|
|
||||||
template <class NodeImpl>
|
template <class NodeImpl>
|
||||||
// requires NodeImplPtr is a shared_ptr to specialization of GraphNodeImpl
|
// requires NodeImplPtr is a shared_ptr to specialization of GraphNodeImpl
|
||||||
// Also requires that the kernel has the graph node tag in it's policy
|
// Also requires that the kernel has the graph node tag in its policy
|
||||||
void add_node(std::shared_ptr<NodeImpl> const& arg_node_ptr) {
|
void add_node(std::shared_ptr<NodeImpl> const& arg_node_ptr) {
|
||||||
static_assert(
|
static_assert(NodeImpl::kernel_type::Policy::is_graph_kernel::value);
|
||||||
NodeImpl::kernel_type::Policy::is_graph_kernel::value,
|
|
||||||
"Something has gone horribly wrong, but it's too complicated to "
|
|
||||||
"explain here. Buy Daisy a coffee and she'll explain it to you.");
|
|
||||||
KOKKOS_EXPECTS(bool(arg_node_ptr));
|
KOKKOS_EXPECTS(bool(arg_node_ptr));
|
||||||
// The Kernel launch from the execute() method has been shimmed to insert
|
// The Kernel launch from the execute() method has been shimmed to insert
|
||||||
// the node into the graph
|
// the node into the graph
|
||||||
|
|||||||
@ -737,6 +737,14 @@ namespace Impl {
|
|||||||
int g_cuda_space_factory_initialized =
|
int g_cuda_space_factory_initialized =
|
||||||
initialize_space_factory<Cuda>("150_Cuda");
|
initialize_space_factory<Cuda>("150_Cuda");
|
||||||
|
|
||||||
|
int CudaInternal::m_cudaArch = -1;
|
||||||
|
cudaDeviceProp CudaInternal::m_deviceProp;
|
||||||
|
std::set<int> CudaInternal::cuda_devices = {};
|
||||||
|
std::map<int, unsigned long *> CudaInternal::constantMemHostStagingPerDevice =
|
||||||
|
{};
|
||||||
|
std::map<int, cudaEvent_t> CudaInternal::constantMemReusablePerDevice = {};
|
||||||
|
std::map<int, std::mutex> CudaInternal::constantMemMutexPerDevice = {};
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|||||||
@ -91,10 +91,10 @@ class CudaInternal {
|
|||||||
int m_cudaDev = -1;
|
int m_cudaDev = -1;
|
||||||
|
|
||||||
// Device Properties
|
// Device Properties
|
||||||
inline static int m_cudaArch = -1;
|
static int m_cudaArch;
|
||||||
static int concurrency();
|
static int concurrency();
|
||||||
|
|
||||||
inline static cudaDeviceProp m_deviceProp;
|
static cudaDeviceProp m_deviceProp;
|
||||||
|
|
||||||
// Scratch Spaces for Reductions
|
// Scratch Spaces for Reductions
|
||||||
mutable std::size_t m_scratchSpaceCount;
|
mutable std::size_t m_scratchSpaceCount;
|
||||||
@ -120,11 +120,10 @@ class CudaInternal {
|
|||||||
bool was_initialized = false;
|
bool was_initialized = false;
|
||||||
bool was_finalized = false;
|
bool was_finalized = false;
|
||||||
|
|
||||||
inline static std::set<int> cuda_devices = {};
|
static std::set<int> cuda_devices;
|
||||||
inline static std::map<int, unsigned long*> constantMemHostStagingPerDevice =
|
static std::map<int, unsigned long*> constantMemHostStagingPerDevice;
|
||||||
{};
|
static std::map<int, cudaEvent_t> constantMemReusablePerDevice;
|
||||||
inline static std::map<int, cudaEvent_t> constantMemReusablePerDevice = {};
|
static std::map<int, std::mutex> constantMemMutexPerDevice;
|
||||||
inline static std::map<int, std::mutex> constantMemMutexPerDevice = {};
|
|
||||||
|
|
||||||
static CudaInternal& singleton();
|
static CudaInternal& singleton();
|
||||||
|
|
||||||
@ -421,23 +420,6 @@ class CudaInternal {
|
|||||||
return cudaStreamSynchronize(stream);
|
return cudaStreamSynchronize(stream);
|
||||||
}
|
}
|
||||||
|
|
||||||
// The following are only available for cuda 11.2 and greater
|
|
||||||
#if (defined(KOKKOS_ENABLE_IMPL_CUDA_MALLOC_ASYNC) && CUDART_VERSION >= 11020)
|
|
||||||
template <bool setCudaDevice = true>
|
|
||||||
cudaError_t cuda_malloc_async_wrapper(void** devPtr, size_t size,
|
|
||||||
cudaStream_t hStream = nullptr) const {
|
|
||||||
if constexpr (setCudaDevice) set_cuda_device();
|
|
||||||
return cudaMallocAsync(devPtr, size, get_input_stream(hStream));
|
|
||||||
}
|
|
||||||
|
|
||||||
template <bool setCudaDevice = true>
|
|
||||||
cudaError_t cuda_free_async_wrapper(void* devPtr,
|
|
||||||
cudaStream_t hStream = nullptr) const {
|
|
||||||
if constexpr (setCudaDevice) set_cuda_device();
|
|
||||||
return cudaFreeAsync(devPtr, get_input_stream(hStream));
|
|
||||||
}
|
|
||||||
#endif
|
|
||||||
|
|
||||||
// C++ API routines
|
// C++ API routines
|
||||||
template <typename T, bool setCudaDevice = true>
|
template <typename T, bool setCudaDevice = true>
|
||||||
cudaError_t cuda_func_get_attributes_wrapper(cudaFuncAttributes* attr,
|
cudaError_t cuda_func_get_attributes_wrapper(cudaFuncAttributes* attr,
|
||||||
|
|||||||
@ -539,17 +539,9 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
m_vector_size(arg_policy.impl_vector_length()) {
|
m_vector_size(arg_policy.impl_vector_length()) {
|
||||||
auto internal_space_instance =
|
auto internal_space_instance =
|
||||||
m_policy.space().impl_internal_space_instance();
|
m_policy.space().impl_internal_space_instance();
|
||||||
cudaFuncAttributes attr =
|
m_team_size = m_team_size >= 0 ? m_team_size
|
||||||
CudaParallelLaunch<ParallelFor, LaunchBounds>::get_cuda_func_attributes(
|
: arg_policy.team_size_recommended(
|
||||||
internal_space_instance->m_cudaDev);
|
arg_functor, ParallelForTag());
|
||||||
m_team_size =
|
|
||||||
m_team_size >= 0
|
|
||||||
? m_team_size
|
|
||||||
: Kokkos::Impl::cuda_get_opt_block_size<FunctorType, LaunchBounds>(
|
|
||||||
internal_space_instance, attr, m_functor, m_vector_size,
|
|
||||||
m_policy.team_scratch_size(0),
|
|
||||||
m_policy.thread_scratch_size(0)) /
|
|
||||||
m_vector_size;
|
|
||||||
|
|
||||||
m_shmem_begin = (sizeof(double) * (m_team_size + 2));
|
m_shmem_begin = (sizeof(double) * (m_team_size + 2));
|
||||||
m_shmem_size =
|
m_shmem_size =
|
||||||
@ -585,13 +577,7 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
"Kokkos::Impl::ParallelFor< Cuda > insufficient shared memory"));
|
"Kokkos::Impl::ParallelFor< Cuda > insufficient shared memory"));
|
||||||
}
|
}
|
||||||
|
|
||||||
if (int(m_team_size) >
|
if (m_team_size > arg_policy.team_size_max(arg_functor, ParallelForTag())) {
|
||||||
int(Kokkos::Impl::cuda_get_max_block_size<FunctorType, LaunchBounds>(
|
|
||||||
internal_space_instance, attr, arg_functor,
|
|
||||||
arg_policy.impl_vector_length(),
|
|
||||||
arg_policy.team_scratch_size(0),
|
|
||||||
arg_policy.thread_scratch_size(0)) /
|
|
||||||
arg_policy.impl_vector_length())) {
|
|
||||||
Kokkos::Impl::throw_runtime_exception(std::string(
|
Kokkos::Impl::throw_runtime_exception(std::string(
|
||||||
"Kokkos::Impl::ParallelFor< Cuda > requested too large team size."));
|
"Kokkos::Impl::ParallelFor< Cuda > requested too large team size."));
|
||||||
}
|
}
|
||||||
@ -909,17 +895,11 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
m_vector_size(arg_policy.impl_vector_length()) {
|
m_vector_size(arg_policy.impl_vector_length()) {
|
||||||
auto internal_space_instance =
|
auto internal_space_instance =
|
||||||
m_policy.space().impl_internal_space_instance();
|
m_policy.space().impl_internal_space_instance();
|
||||||
cudaFuncAttributes attr = CudaParallelLaunch<ParallelReduce, LaunchBounds>::
|
m_team_size = m_team_size >= 0 ? m_team_size
|
||||||
get_cuda_func_attributes(internal_space_instance->m_cudaDev);
|
: arg_policy.team_size_recommended(
|
||||||
m_team_size =
|
arg_functor_reducer.get_functor(),
|
||||||
m_team_size >= 0
|
arg_functor_reducer.get_reducer(),
|
||||||
? m_team_size
|
ParallelReduceTag());
|
||||||
: Kokkos::Impl::cuda_get_opt_block_size<FunctorType, LaunchBounds>(
|
|
||||||
internal_space_instance, attr,
|
|
||||||
m_functor_reducer.get_functor(), m_vector_size,
|
|
||||||
m_policy.team_scratch_size(0),
|
|
||||||
m_policy.thread_scratch_size(0)) /
|
|
||||||
m_vector_size;
|
|
||||||
|
|
||||||
m_team_begin =
|
m_team_begin =
|
||||||
UseShflReduction
|
UseShflReduction
|
||||||
|
|||||||
@ -28,35 +28,20 @@ extern "C" {
|
|||||||
/* Cuda runtime function, declared in <crt/device_runtime.h>
|
/* Cuda runtime function, declared in <crt/device_runtime.h>
|
||||||
* Requires capability 2.x or better.
|
* Requires capability 2.x or better.
|
||||||
*/
|
*/
|
||||||
extern __device__ void __assertfail(const void *message, const void *file,
|
[[noreturn]] __device__ void __assertfail(const void *message, const void *file,
|
||||||
unsigned int line, const void *function,
|
unsigned int line,
|
||||||
size_t charsize);
|
const void *function,
|
||||||
|
size_t charsize);
|
||||||
}
|
}
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
// required to workaround failures in random number generator unit tests with
|
[[noreturn]] __device__ static void cuda_abort(const char *const message) {
|
||||||
// pre-volta architectures
|
|
||||||
#if defined(KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK)
|
|
||||||
__device__ inline void cuda_abort(const char *const message) {
|
|
||||||
#else
|
|
||||||
[[noreturn]] __device__ inline void cuda_abort(const char *const message) {
|
|
||||||
#endif
|
|
||||||
const char empty[] = "";
|
const char empty[] = "";
|
||||||
|
|
||||||
__assertfail((const void *)message, (const void *)empty, (unsigned int)0,
|
__assertfail((const void *)message, (const void *)empty, (unsigned int)0,
|
||||||
(const void *)empty, sizeof(char));
|
(const void *)empty, sizeof(char));
|
||||||
|
|
||||||
// This loop is never executed. It's intended to suppress warnings that the
|
|
||||||
// function returns, even though it does not. This is necessary because
|
|
||||||
// __assertfail is not marked as [[noreturn]], even though it does not return.
|
|
||||||
// Disable with KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK to workaround failures
|
|
||||||
// in random number generator unit tests with pre-volta architectures
|
|
||||||
#if !defined(KOKKOS_ENABLE_DEBUG_BOUNDS_CHECK)
|
|
||||||
while (true)
|
|
||||||
;
|
|
||||||
#endif
|
|
||||||
}
|
}
|
||||||
|
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|||||||
@ -48,8 +48,19 @@ class HIP {
|
|||||||
using scratch_memory_space = ScratchMemorySpace<HIP>;
|
using scratch_memory_space = ScratchMemorySpace<HIP>;
|
||||||
|
|
||||||
HIP();
|
HIP();
|
||||||
HIP(hipStream_t stream,
|
|
||||||
Impl::ManageStream manage_stream = Impl::ManageStream::no);
|
explicit HIP(hipStream_t stream) : HIP(stream, Impl::ManageStream::no) {}
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT(
|
||||||
|
"HIP execution space should be constructed explicitly.")
|
||||||
|
HIP(hipStream_t stream)
|
||||||
|
: HIP(stream) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
|
HIP(hipStream_t stream, Impl::ManageStream manage_stream);
|
||||||
|
|
||||||
KOKKOS_DEPRECATED HIP(hipStream_t stream, bool manage_stream);
|
KOKKOS_DEPRECATED HIP(hipStream_t stream, bool manage_stream);
|
||||||
|
|
||||||
//@}
|
//@}
|
||||||
|
|||||||
@ -22,8 +22,6 @@
|
|||||||
|
|
||||||
#include <hip/hip_runtime.h>
|
#include <hip/hip_runtime.h>
|
||||||
|
|
||||||
#include <ostream>
|
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
@ -44,39 +42,4 @@ inline void hip_internal_safe_call(hipError_t e, const char* name,
|
|||||||
#define KOKKOS_IMPL_HIP_SAFE_CALL(call) \
|
#define KOKKOS_IMPL_HIP_SAFE_CALL(call) \
|
||||||
Kokkos::Impl::hip_internal_safe_call(call, #call, __FILE__, __LINE__)
|
Kokkos::Impl::hip_internal_safe_call(call, #call, __FILE__, __LINE__)
|
||||||
|
|
||||||
namespace Kokkos {
|
|
||||||
namespace Experimental {
|
|
||||||
|
|
||||||
class HIPRawMemoryAllocationFailure : public RawMemoryAllocationFailure {
|
|
||||||
private:
|
|
||||||
hipError_t m_error_code = hipSuccess;
|
|
||||||
|
|
||||||
static FailureMode get_failure_mode(hipError_t error_code) {
|
|
||||||
switch (error_code) {
|
|
||||||
case hipErrorMemoryAllocation: return FailureMode::OutOfMemoryError;
|
|
||||||
case hipErrorInvalidValue: return FailureMode::InvalidAllocationSize;
|
|
||||||
default: return FailureMode::Unknown;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
public:
|
|
||||||
HIPRawMemoryAllocationFailure(size_t arg_attempted_size,
|
|
||||||
hipError_t arg_error_code,
|
|
||||||
AllocationMechanism arg_mechanism) noexcept
|
|
||||||
: RawMemoryAllocationFailure(
|
|
||||||
arg_attempted_size, /* HIPSpace doesn't handle alignment? */ 1,
|
|
||||||
get_failure_mode(arg_error_code), arg_mechanism),
|
|
||||||
m_error_code(arg_error_code) {}
|
|
||||||
|
|
||||||
void append_additional_error_information(std::ostream& o) const override {
|
|
||||||
if (m_error_code != hipSuccess) {
|
|
||||||
o << " The HIP allocation returned the error code \""
|
|
||||||
<< hipGetErrorName(m_error_code) << "\".";
|
|
||||||
}
|
|
||||||
}
|
|
||||||
};
|
|
||||||
|
|
||||||
} // namespace Experimental
|
|
||||||
} // namespace Kokkos
|
|
||||||
|
|
||||||
#endif
|
#endif
|
||||||
|
|||||||
@ -40,7 +40,7 @@ class GraphImpl<Kokkos::HIP> {
|
|||||||
GraphNodeImpl<Kokkos::HIP, aggregate_kernel_impl_t,
|
GraphNodeImpl<Kokkos::HIP, aggregate_kernel_impl_t,
|
||||||
Kokkos::Experimental::TypeErasedTag>;
|
Kokkos::Experimental::TypeErasedTag>;
|
||||||
|
|
||||||
// Not moveable or copyable; it spends its whole life as a shared_ptr in the
|
// Not movable or copyable; it spends its whole life as a shared_ptr in the
|
||||||
// Graph object.
|
// Graph object.
|
||||||
GraphImpl() = delete;
|
GraphImpl() = delete;
|
||||||
GraphImpl(GraphImpl const&) = delete;
|
GraphImpl(GraphImpl const&) = delete;
|
||||||
@ -108,7 +108,7 @@ inline void GraphImpl<Kokkos::HIP>::add_node(
|
|||||||
}
|
}
|
||||||
|
|
||||||
// Requires NodeImplPtr is a shared_ptr to specialization of GraphNodeImpl
|
// Requires NodeImplPtr is a shared_ptr to specialization of GraphNodeImpl
|
||||||
// Also requires that the kernel has the graph node tag in it's policy
|
// Also requires that the kernel has the graph node tag in its policy
|
||||||
template <class NodeImpl>
|
template <class NodeImpl>
|
||||||
inline void GraphImpl<Kokkos::HIP>::add_node(
|
inline void GraphImpl<Kokkos::HIP>::add_node(
|
||||||
std::shared_ptr<NodeImpl> const& arg_node_ptr) {
|
std::shared_ptr<NodeImpl> const& arg_node_ptr) {
|
||||||
|
|||||||
@ -353,6 +353,22 @@ void HIPInternal::finalize() {
|
|||||||
m_num_scratch_locks = 0;
|
m_num_scratch_locks = 0;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
int HIPInternal::m_hipDev = -1;
|
||||||
|
unsigned HIPInternal::m_multiProcCount = 0;
|
||||||
|
unsigned HIPInternal::m_maxWarpCount = 0;
|
||||||
|
std::array<HIPInternal::size_type, 3> HIPInternal::m_maxBlock = {0, 0, 0};
|
||||||
|
unsigned HIPInternal::m_maxWavesPerCU = 0;
|
||||||
|
int HIPInternal::m_shmemPerSM = 0;
|
||||||
|
int HIPInternal::m_maxShmemPerBlock = 0;
|
||||||
|
int HIPInternal::m_maxThreadsPerSM = 0;
|
||||||
|
|
||||||
|
hipDeviceProp_t HIPInternal::m_deviceProp;
|
||||||
|
|
||||||
|
std::mutex HIPInternal::scratchFunctorMutex;
|
||||||
|
unsigned long *HIPInternal::constantMemHostStaging = nullptr;
|
||||||
|
hipEvent_t HIPInternal::constantMemReusable = nullptr;
|
||||||
|
std::mutex HIPInternal::constantMemMutex;
|
||||||
|
|
||||||
//----------------------------------------------------------------------------
|
//----------------------------------------------------------------------------
|
||||||
|
|
||||||
Kokkos::HIP::size_type hip_internal_multiprocessor_count() {
|
Kokkos::HIP::size_type hip_internal_multiprocessor_count() {
|
||||||
|
|||||||
@ -35,8 +35,7 @@ struct HIPTraits {
|
|||||||
static constexpr int WarpSize = 64;
|
static constexpr int WarpSize = 64;
|
||||||
static constexpr int WarpIndexMask = 0x003f; /* hexadecimal for 63 */
|
static constexpr int WarpIndexMask = 0x003f; /* hexadecimal for 63 */
|
||||||
static constexpr int WarpIndexShift = 6; /* WarpSize == 1 << WarpShift*/
|
static constexpr int WarpIndexShift = 6; /* WarpSize == 1 << WarpShift*/
|
||||||
#elif defined(KOKKOS_ARCH_AMD_GFX1030) || defined(KOKKOS_ARCH_AMD_GFX1100) || \
|
#elif defined(KOKKOS_ARCH_AMD_GFX1030) || defined(KOKKOS_ARCH_AMD_GFX1100)
|
||||||
defined(KOKKOS_ARCH_AMD_GFX1103)
|
|
||||||
static constexpr int WarpSize = 32;
|
static constexpr int WarpSize = 32;
|
||||||
static constexpr int WarpIndexMask = 0x001f; /* hexadecimal for 31 */
|
static constexpr int WarpIndexMask = 0x001f; /* hexadecimal for 31 */
|
||||||
static constexpr int WarpIndexShift = 5; /* WarpSize == 1 << WarpShift*/
|
static constexpr int WarpIndexShift = 5; /* WarpSize == 1 << WarpShift*/
|
||||||
@ -71,16 +70,16 @@ class HIPInternal {
|
|||||||
public:
|
public:
|
||||||
using size_type = ::Kokkos::HIP::size_type;
|
using size_type = ::Kokkos::HIP::size_type;
|
||||||
|
|
||||||
inline static int m_hipDev = -1;
|
static int m_hipDev;
|
||||||
inline static unsigned m_multiProcCount = 0;
|
static unsigned m_multiProcCount;
|
||||||
inline static unsigned m_maxWarpCount = 0;
|
static unsigned m_maxWarpCount;
|
||||||
inline static std::array<size_type, 3> m_maxBlock = {0, 0, 0};
|
static std::array<size_type, 3> m_maxBlock;
|
||||||
inline static unsigned m_maxWavesPerCU = 0;
|
static unsigned m_maxWavesPerCU;
|
||||||
inline static int m_shmemPerSM = 0;
|
static int m_shmemPerSM;
|
||||||
inline static int m_maxShmemPerBlock = 0;
|
static int m_maxShmemPerBlock;
|
||||||
inline static int m_maxThreadsPerSM = 0;
|
static int m_maxThreadsPerSM;
|
||||||
|
|
||||||
inline static hipDeviceProp_t m_deviceProp;
|
static hipDeviceProp_t m_deviceProp;
|
||||||
|
|
||||||
static int concurrency();
|
static int concurrency();
|
||||||
|
|
||||||
@ -93,7 +92,7 @@ class HIPInternal {
|
|||||||
size_type *m_scratchFlags = nullptr;
|
size_type *m_scratchFlags = nullptr;
|
||||||
mutable size_type *m_scratchFunctor = nullptr;
|
mutable size_type *m_scratchFunctor = nullptr;
|
||||||
mutable size_type *m_scratchFunctorHost = nullptr;
|
mutable size_type *m_scratchFunctorHost = nullptr;
|
||||||
inline static std::mutex scratchFunctorMutex;
|
static std::mutex scratchFunctorMutex;
|
||||||
|
|
||||||
hipStream_t m_stream = nullptr;
|
hipStream_t m_stream = nullptr;
|
||||||
uint32_t m_instance_id =
|
uint32_t m_instance_id =
|
||||||
@ -112,9 +111,9 @@ class HIPInternal {
|
|||||||
|
|
||||||
// FIXME_HIP: these want to be per-device, not per-stream... use of 'static'
|
// FIXME_HIP: these want to be per-device, not per-stream... use of 'static'
|
||||||
// here will break once there are multiple devices though
|
// here will break once there are multiple devices though
|
||||||
inline static unsigned long *constantMemHostStaging = nullptr;
|
static unsigned long *constantMemHostStaging;
|
||||||
inline static hipEvent_t constantMemReusable = nullptr;
|
static hipEvent_t constantMemReusable;
|
||||||
inline static std::mutex constantMemMutex;
|
static std::mutex constantMemMutex;
|
||||||
|
|
||||||
static HIPInternal &singleton();
|
static HIPInternal &singleton();
|
||||||
|
|
||||||
|
|||||||
@ -50,6 +50,7 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
using value_type = typename ReducerType::value_type;
|
using value_type = typename ReducerType::value_type;
|
||||||
using reference_type = typename ReducerType::reference_type;
|
using reference_type = typename ReducerType::reference_type;
|
||||||
using functor_type = FunctorType;
|
using functor_type = FunctorType;
|
||||||
|
using reducer_type = ReducerType;
|
||||||
using size_type = HIP::size_type;
|
using size_type = HIP::size_type;
|
||||||
|
|
||||||
// Conditionally set word_size_type to int16_t or int8_t if value_type is
|
// Conditionally set word_size_type to int16_t or int8_t if value_type is
|
||||||
|
|||||||
@ -31,7 +31,7 @@ template <class CombinedFunctorReducerType, class... Properties>
|
|||||||
class ParallelReduce<CombinedFunctorReducerType,
|
class ParallelReduce<CombinedFunctorReducerType,
|
||||||
Kokkos::TeamPolicy<Properties...>, HIP> {
|
Kokkos::TeamPolicy<Properties...>, HIP> {
|
||||||
public:
|
public:
|
||||||
using Policy = TeamPolicyInternal<HIP, Properties...>;
|
using Policy = TeamPolicy<Properties...>;
|
||||||
using FunctorType = typename CombinedFunctorReducerType::functor_type;
|
using FunctorType = typename CombinedFunctorReducerType::functor_type;
|
||||||
using ReducerType = typename CombinedFunctorReducerType::reducer_type;
|
using ReducerType = typename CombinedFunctorReducerType::reducer_type;
|
||||||
|
|
||||||
@ -46,6 +46,7 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
|
|
||||||
public:
|
public:
|
||||||
using functor_type = FunctorType;
|
using functor_type = FunctorType;
|
||||||
|
using reducer_type = ReducerType;
|
||||||
using size_type = HIP::size_type;
|
using size_type = HIP::size_type;
|
||||||
|
|
||||||
// static int constexpr UseShflReduction = false;
|
// static int constexpr UseShflReduction = false;
|
||||||
|
|||||||
@ -39,6 +39,7 @@
|
|||||||
|
|
||||||
/*--------------------------------------------------------------------------*/
|
/*--------------------------------------------------------------------------*/
|
||||||
/*--------------------------------------------------------------------------*/
|
/*--------------------------------------------------------------------------*/
|
||||||
|
|
||||||
namespace {
|
namespace {
|
||||||
|
|
||||||
static std::atomic<bool> is_first_hip_managed_allocation(true);
|
static std::atomic<bool> is_first_hip_managed_allocation(true);
|
||||||
@ -66,7 +67,6 @@ void* HIPSpace::allocate(
|
|||||||
return impl_allocate(arg_label, arg_alloc_size, arg_logical_size);
|
return impl_allocate(arg_label, arg_alloc_size, arg_logical_size);
|
||||||
}
|
}
|
||||||
void* HIPSpace::impl_allocate(
|
void* HIPSpace::impl_allocate(
|
||||||
|
|
||||||
const char* arg_label, const size_t arg_alloc_size,
|
const char* arg_label, const size_t arg_alloc_size,
|
||||||
const size_t arg_logical_size,
|
const size_t arg_logical_size,
|
||||||
const Kokkos::Tools::SpaceHandle arg_handle) const {
|
const Kokkos::Tools::SpaceHandle arg_handle) const {
|
||||||
@ -77,10 +77,7 @@ void* HIPSpace::impl_allocate(
|
|||||||
// This is the only way to clear the last error, which we should do here
|
// This is the only way to clear the last error, which we should do here
|
||||||
// since we're turning it into an exception here
|
// since we're turning it into an exception here
|
||||||
(void)hipGetLastError();
|
(void)hipGetLastError();
|
||||||
throw Experimental::HIPRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
HIPMalloc);
|
|
||||||
}
|
}
|
||||||
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
||||||
const size_t reported_size =
|
const size_t reported_size =
|
||||||
@ -111,10 +108,7 @@ void* HIPHostPinnedSpace::impl_allocate(
|
|||||||
// This is the only way to clear the last error, which we should do here
|
// This is the only way to clear the last error, which we should do here
|
||||||
// since we're turning it into an exception here
|
// since we're turning it into an exception here
|
||||||
(void)hipGetLastError();
|
(void)hipGetLastError();
|
||||||
throw Experimental::HIPRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
HIPHostMalloc);
|
|
||||||
}
|
}
|
||||||
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
||||||
const size_t reported_size =
|
const size_t reported_size =
|
||||||
@ -178,10 +172,7 @@ Kokkos::HIP::runtime WARNING: Kokkos did not find an environment variable 'HSA_X
|
|||||||
// This is the only way to clear the last error, which we should do here
|
// This is the only way to clear the last error, which we should do here
|
||||||
// since we're turning it into an exception here
|
// since we're turning it into an exception here
|
||||||
(void)hipGetLastError();
|
(void)hipGetLastError();
|
||||||
throw Experimental::HIPRawMemoryAllocationFailure(
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
arg_alloc_size, error_code,
|
|
||||||
Experimental::RawMemoryAllocationFailure::AllocationMechanism::
|
|
||||||
HIPMallocManaged);
|
|
||||||
}
|
}
|
||||||
KOKKOS_IMPL_HIP_SAFE_CALL(hipMemAdvise(
|
KOKKOS_IMPL_HIP_SAFE_CALL(hipMemAdvise(
|
||||||
ptr, arg_alloc_size, hipMemAdviseSetCoarseGrain, m_device));
|
ptr, arg_alloc_size, hipMemAdviseSetCoarseGrain, m_device));
|
||||||
|
|||||||
@ -153,7 +153,7 @@ void HPX::impl_instance_fence_locked(const std::string &name) const {
|
|||||||
auto &s = impl_get_sender();
|
auto &s = impl_get_sender();
|
||||||
|
|
||||||
hpx::this_thread::experimental::sync_wait(std::move(s));
|
hpx::this_thread::experimental::sync_wait(std::move(s));
|
||||||
s = hpx::execution::experimental::unique_any_sender(
|
s = hpx::execution::experimental::unique_any_sender<>(
|
||||||
hpx::execution::experimental::just());
|
hpx::execution::experimental::just());
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
@ -184,7 +184,7 @@ void HPX::impl_static_fence(const std::string &name) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
hpx::this_thread::experimental::sync_wait(std::move(s));
|
hpx::this_thread::experimental::sync_wait(std::move(s));
|
||||||
s = hpx::execution::experimental::unique_any_sender(
|
s = hpx::execution::experimental::unique_any_sender<>(
|
||||||
hpx::execution::experimental::just());
|
hpx::execution::experimental::just());
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|||||||
@ -168,17 +168,31 @@ class HPX {
|
|||||||
: m_instance_data(Kokkos::Impl::HostSharedPtr<instance_data>(
|
: m_instance_data(Kokkos::Impl::HostSharedPtr<instance_data>(
|
||||||
&m_default_instance_data, &default_instance_deleter)) {}
|
&m_default_instance_data, &default_instance_deleter)) {}
|
||||||
~HPX() = default;
|
~HPX() = default;
|
||||||
HPX(instance_mode mode)
|
explicit HPX(instance_mode mode)
|
||||||
: m_instance_data(
|
: m_instance_data(
|
||||||
mode == instance_mode::independent
|
mode == instance_mode::independent
|
||||||
? (Kokkos::Impl::HostSharedPtr<instance_data>(
|
? (Kokkos::Impl::HostSharedPtr<instance_data>(
|
||||||
new instance_data(m_next_instance_id++)))
|
new instance_data(m_next_instance_id++)))
|
||||||
: Kokkos::Impl::HostSharedPtr<instance_data>(
|
: Kokkos::Impl::HostSharedPtr<instance_data>(
|
||||||
&m_default_instance_data, &default_instance_deleter)) {}
|
&m_default_instance_data, &default_instance_deleter)) {}
|
||||||
HPX(hpx::execution::experimental::unique_any_sender<> &&sender)
|
explicit HPX(hpx::execution::experimental::unique_any_sender<> &&sender)
|
||||||
: m_instance_data(Kokkos::Impl::HostSharedPtr<instance_data>(
|
: m_instance_data(Kokkos::Impl::HostSharedPtr<instance_data>(
|
||||||
new instance_data(m_next_instance_id++, std::move(sender)))) {}
|
new instance_data(m_next_instance_id++, std::move(sender)))) {}
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT(
|
||||||
|
"HPX execution space should be constructed explicitly.")
|
||||||
|
HPX(instance_mode mode)
|
||||||
|
: HPX(mode) {}
|
||||||
|
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT(
|
||||||
|
"HPX execution space should be constructed explicitly.")
|
||||||
|
HPX(hpx::execution::experimental::unique_any_sender<> &&sender)
|
||||||
|
: HPX(std::move(sender)) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
HPX(HPX &&other) = default;
|
HPX(HPX &&other) = default;
|
||||||
HPX(const HPX &other) = default;
|
HPX(const HPX &other) = default;
|
||||||
|
|
||||||
|
|||||||
@ -29,7 +29,6 @@
|
|||||||
#include <type_traits>
|
#include <type_traits>
|
||||||
#include <algorithm>
|
#include <algorithm>
|
||||||
#include <utility>
|
#include <utility>
|
||||||
#include <limits>
|
|
||||||
#include <cstddef>
|
#include <cstddef>
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
@ -80,7 +79,11 @@ struct ArrayBoundsCheck<Integral, false> {
|
|||||||
/**\brief Derived from the C++17 'std::array'.
|
/**\brief Derived from the C++17 'std::array'.
|
||||||
* Dropping the iterator interface.
|
* Dropping the iterator interface.
|
||||||
*/
|
*/
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
template <class T = void, size_t N = KOKKOS_INVALID_INDEX, class Proxy = void>
|
template <class T = void, size_t N = KOKKOS_INVALID_INDEX, class Proxy = void>
|
||||||
|
#else
|
||||||
|
template <class T, size_t N>
|
||||||
|
#endif
|
||||||
struct Array {
|
struct Array {
|
||||||
public:
|
public:
|
||||||
/**
|
/**
|
||||||
@ -129,10 +132,38 @@ struct Array {
|
|||||||
KOKKOS_INLINE_FUNCTION constexpr const_pointer data() const {
|
KOKKOS_INLINE_FUNCTION constexpr const_pointer data() const {
|
||||||
return &m_internal_implementation_private_member_data[0];
|
return &m_internal_implementation_private_member_data[0];
|
||||||
}
|
}
|
||||||
|
|
||||||
|
friend KOKKOS_FUNCTION constexpr bool operator==(Array const& lhs,
|
||||||
|
Array const& rhs) noexcept {
|
||||||
|
for (size_t i = 0; i != N; ++i)
|
||||||
|
if (lhs[i] != rhs[i]) return false;
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
friend KOKKOS_FUNCTION constexpr bool operator!=(Array const& lhs,
|
||||||
|
Array const& rhs) noexcept {
|
||||||
|
return !(lhs == rhs);
|
||||||
|
}
|
||||||
|
|
||||||
|
private:
|
||||||
|
template <class U = T>
|
||||||
|
friend KOKKOS_INLINE_FUNCTION constexpr std::enable_if_t<
|
||||||
|
Impl::is_swappable<U>::value>
|
||||||
|
kokkos_swap(Array<T, N>& a,
|
||||||
|
Array<T, N>& b) noexcept(Impl::is_nothrow_swappable_v<U>) {
|
||||||
|
for (std::size_t i = 0; i < N; ++i) {
|
||||||
|
kokkos_swap(a[i], b[i]);
|
||||||
|
}
|
||||||
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
template <class T, class Proxy>
|
template <class T, class Proxy>
|
||||||
struct Array<T, 0, Proxy> {
|
struct Array<T, 0, Proxy> {
|
||||||
|
#else
|
||||||
|
template <class T>
|
||||||
|
struct Array<T, 0> {
|
||||||
|
#endif
|
||||||
public:
|
public:
|
||||||
using reference = T&;
|
using reference = T&;
|
||||||
using const_reference = std::add_const_t<T>&;
|
using const_reference = std::add_const_t<T>&;
|
||||||
@ -167,25 +198,35 @@ struct Array<T, 0, Proxy> {
|
|||||||
KOKKOS_INLINE_FUNCTION pointer data() { return nullptr; }
|
KOKKOS_INLINE_FUNCTION pointer data() { return nullptr; }
|
||||||
KOKKOS_INLINE_FUNCTION const_pointer data() const { return nullptr; }
|
KOKKOS_INLINE_FUNCTION const_pointer data() const { return nullptr; }
|
||||||
|
|
||||||
KOKKOS_DEFAULTED_FUNCTION ~Array() = default;
|
friend KOKKOS_FUNCTION constexpr bool operator==(Array const&,
|
||||||
KOKKOS_DEFAULTED_FUNCTION Array() = default;
|
Array const&) noexcept {
|
||||||
KOKKOS_DEFAULTED_FUNCTION Array(const Array&) = default;
|
return true;
|
||||||
KOKKOS_DEFAULTED_FUNCTION Array& operator=(const Array&) = default;
|
}
|
||||||
|
friend KOKKOS_FUNCTION constexpr bool operator!=(Array const&,
|
||||||
|
Array const&) noexcept {
|
||||||
|
return false;
|
||||||
|
}
|
||||||
|
|
||||||
// Some supported compilers are not sufficiently C++11 compliant
|
private:
|
||||||
// for default move constructor and move assignment operator.
|
friend KOKKOS_INLINE_FUNCTION constexpr void kokkos_swap(
|
||||||
// Array( Array && ) = default ;
|
Array<T, 0>&, Array<T, 0>&) noexcept {}
|
||||||
// Array & operator = ( Array && ) = default ;
|
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
namespace Impl {
|
||||||
|
struct KokkosArrayContiguous {};
|
||||||
|
struct KokkosArrayStrided {};
|
||||||
|
} // namespace Impl
|
||||||
|
|
||||||
template <>
|
template <>
|
||||||
struct Array<void, KOKKOS_INVALID_INDEX, void> {
|
struct KOKKOS_DEPRECATED Array<void, KOKKOS_INVALID_INDEX, void> {
|
||||||
struct contiguous {};
|
using contiguous = Impl::KokkosArrayContiguous;
|
||||||
struct strided {};
|
using strided = Impl::KokkosArrayStrided;
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class T>
|
template <class T>
|
||||||
struct Array<T, KOKKOS_INVALID_INDEX, Array<>::contiguous> {
|
struct KOKKOS_DEPRECATED
|
||||||
|
Array<T, KOKKOS_INVALID_INDEX, Impl::KokkosArrayContiguous> {
|
||||||
private:
|
private:
|
||||||
T* m_elem;
|
T* m_elem;
|
||||||
size_t m_size;
|
size_t m_size;
|
||||||
@ -253,7 +294,8 @@ struct Array<T, KOKKOS_INVALID_INDEX, Array<>::contiguous> {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <class T>
|
template <class T>
|
||||||
struct Array<T, KOKKOS_INVALID_INDEX, Array<>::strided> {
|
struct KOKKOS_DEPRECATED
|
||||||
|
Array<T, KOKKOS_INVALID_INDEX, Impl::KokkosArrayStrided> {
|
||||||
private:
|
private:
|
||||||
T* m_elem;
|
T* m_elem;
|
||||||
size_t m_size;
|
size_t m_size;
|
||||||
@ -320,10 +362,37 @@ struct Array<T, KOKKOS_INVALID_INDEX, Array<>::strided> {
|
|||||||
size_type arg_stride)
|
size_type arg_stride)
|
||||||
: m_elem(arg_ptr), m_size(arg_size), m_stride(arg_stride) {}
|
: m_elem(arg_ptr), m_size(arg_size), m_stride(arg_stride) {}
|
||||||
};
|
};
|
||||||
|
#endif
|
||||||
|
|
||||||
template <typename T, typename... Us>
|
template <typename T, typename... Us>
|
||||||
Array(T, Us...)->Array<T, 1 + sizeof...(Us)>;
|
Array(T, Us...)->Array<T, 1 + sizeof...(Us)>;
|
||||||
|
|
||||||
|
namespace Impl {
|
||||||
|
|
||||||
|
template <typename T, size_t N, size_t... I>
|
||||||
|
KOKKOS_FUNCTION constexpr Array<std::remove_cv_t<T>, N> to_array_impl(
|
||||||
|
T (&a)[N], std::index_sequence<I...>) {
|
||||||
|
return {{a[I]...}};
|
||||||
|
}
|
||||||
|
|
||||||
|
template <typename T, size_t N, size_t... I>
|
||||||
|
KOKKOS_FUNCTION constexpr Array<std::remove_cv_t<T>, N> to_array_impl(
|
||||||
|
T(&&a)[N], std::index_sequence<I...>) {
|
||||||
|
return {{std::move(a[I])...}};
|
||||||
|
}
|
||||||
|
|
||||||
|
} // namespace Impl
|
||||||
|
|
||||||
|
template <typename T, size_t N>
|
||||||
|
KOKKOS_FUNCTION constexpr auto to_array(T (&a)[N]) {
|
||||||
|
return Impl::to_array_impl(a, std::make_index_sequence<N>{});
|
||||||
|
}
|
||||||
|
|
||||||
|
template <typename T, size_t N>
|
||||||
|
KOKKOS_FUNCTION constexpr auto to_array(T(&&a)[N]) {
|
||||||
|
return Impl::to_array_impl(std::move(a), std::make_index_sequence<N>{});
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|
||||||
//<editor-fold desc="Support for structured binding">
|
//<editor-fold desc="Support for structured binding">
|
||||||
@ -333,6 +402,7 @@ struct std::tuple_size<Kokkos::Array<T, N>>
|
|||||||
|
|
||||||
template <std::size_t I, class T, std::size_t N>
|
template <std::size_t I, class T, std::size_t N>
|
||||||
struct std::tuple_element<I, Kokkos::Array<T, N>> {
|
struct std::tuple_element<I, Kokkos::Array<T, N>> {
|
||||||
|
static_assert(I < N);
|
||||||
using type = T;
|
using type = T;
|
||||||
};
|
};
|
||||||
|
|
||||||
@ -340,21 +410,25 @@ namespace Kokkos {
|
|||||||
|
|
||||||
template <std::size_t I, class T, std::size_t N>
|
template <std::size_t I, class T, std::size_t N>
|
||||||
KOKKOS_FUNCTION constexpr T& get(Array<T, N>& a) noexcept {
|
KOKKOS_FUNCTION constexpr T& get(Array<T, N>& a) noexcept {
|
||||||
|
static_assert(I < N);
|
||||||
return a[I];
|
return a[I];
|
||||||
}
|
}
|
||||||
|
|
||||||
template <std::size_t I, class T, std::size_t N>
|
template <std::size_t I, class T, std::size_t N>
|
||||||
KOKKOS_FUNCTION constexpr T const& get(Array<T, N> const& a) noexcept {
|
KOKKOS_FUNCTION constexpr T const& get(Array<T, N> const& a) noexcept {
|
||||||
|
static_assert(I < N);
|
||||||
return a[I];
|
return a[I];
|
||||||
}
|
}
|
||||||
|
|
||||||
template <std::size_t I, class T, std::size_t N>
|
template <std::size_t I, class T, std::size_t N>
|
||||||
KOKKOS_FUNCTION constexpr T&& get(Array<T, N>&& a) noexcept {
|
KOKKOS_FUNCTION constexpr T&& get(Array<T, N>&& a) noexcept {
|
||||||
|
static_assert(I < N);
|
||||||
return std::move(a[I]);
|
return std::move(a[I]);
|
||||||
}
|
}
|
||||||
|
|
||||||
template <std::size_t I, class T, std::size_t N>
|
template <std::size_t I, class T, std::size_t N>
|
||||||
KOKKOS_FUNCTION constexpr T const&& get(Array<T, N> const&& a) noexcept {
|
KOKKOS_FUNCTION constexpr T const&& get(Array<T, N> const&& a) noexcept {
|
||||||
|
static_assert(I < N);
|
||||||
return std::move(a[I]);
|
return std::move(a[I]);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@ -22,7 +22,6 @@ static_assert(false,
|
|||||||
#ifndef KOKKOS_DESUL_ATOMICS_VOLATILE_WRAPPER_HPP_
|
#ifndef KOKKOS_DESUL_ATOMICS_VOLATILE_WRAPPER_HPP_
|
||||||
#define KOKKOS_DESUL_ATOMICS_VOLATILE_WRAPPER_HPP_
|
#define KOKKOS_DESUL_ATOMICS_VOLATILE_WRAPPER_HPP_
|
||||||
#include <Kokkos_Macros.hpp>
|
#include <Kokkos_Macros.hpp>
|
||||||
#include <Kokkos_Atomics_Desul_Config.hpp>
|
|
||||||
#include <desul/atomics.hpp>
|
#include <desul/atomics.hpp>
|
||||||
|
|
||||||
#ifdef KOKKOS_ENABLE_ATOMICS_BYPASS
|
#ifdef KOKKOS_ENABLE_ATOMICS_BYPASS
|
||||||
|
|||||||
@ -22,8 +22,6 @@ static_assert(false,
|
|||||||
#ifndef KOKKOS_DESUL_ATOMICS_WRAPPER_HPP_
|
#ifndef KOKKOS_DESUL_ATOMICS_WRAPPER_HPP_
|
||||||
#define KOKKOS_DESUL_ATOMICS_WRAPPER_HPP_
|
#define KOKKOS_DESUL_ATOMICS_WRAPPER_HPP_
|
||||||
#include <Kokkos_Macros.hpp>
|
#include <Kokkos_Macros.hpp>
|
||||||
|
|
||||||
#include <Kokkos_Atomics_Desul_Config.hpp>
|
|
||||||
#include <desul/atomics.hpp>
|
#include <desul/atomics.hpp>
|
||||||
|
|
||||||
#include <impl/Kokkos_Volatile_Load.hpp>
|
#include <impl/Kokkos_Volatile_Load.hpp>
|
||||||
|
|||||||
@ -28,6 +28,7 @@
|
|||||||
#include <complex>
|
#include <complex>
|
||||||
#include <type_traits>
|
#include <type_traits>
|
||||||
#include <iosfwd>
|
#include <iosfwd>
|
||||||
|
#include <tuple>
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
|
|
||||||
@ -256,6 +257,12 @@ class
|
|||||||
return *this;
|
return *this;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
template <size_t I, typename RT>
|
||||||
|
friend constexpr const RT& get(const complex<RT>&) noexcept;
|
||||||
|
|
||||||
|
template <size_t I, typename RT>
|
||||||
|
friend constexpr const RT&& get(const complex<RT>&&) noexcept;
|
||||||
|
|
||||||
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
//! Copy constructor from volatile.
|
//! Copy constructor from volatile.
|
||||||
template <
|
template <
|
||||||
@ -423,6 +430,75 @@ class
|
|||||||
#endif // KOKKOS_ENABLE_DEPRECATED_CODE_4
|
#endif // KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
};
|
};
|
||||||
|
|
||||||
|
} // namespace Kokkos
|
||||||
|
|
||||||
|
// Tuple protocol for complex based on https://wg21.link/P2819R2 (voted into
|
||||||
|
// the C++26 working draft on 2023-11)
|
||||||
|
|
||||||
|
template <typename RealType>
|
||||||
|
struct std::tuple_size<Kokkos::complex<RealType>>
|
||||||
|
: std::integral_constant<size_t, 2> {};
|
||||||
|
|
||||||
|
template <size_t I, typename RealType>
|
||||||
|
struct std::tuple_element<I, Kokkos::complex<RealType>> {
|
||||||
|
static_assert(I < 2);
|
||||||
|
using type = RealType;
|
||||||
|
};
|
||||||
|
|
||||||
|
namespace Kokkos {
|
||||||
|
|
||||||
|
// get<...>(...) defined here so as not to be hidden friends, as per P2819R2
|
||||||
|
|
||||||
|
template <size_t I, typename RealType>
|
||||||
|
KOKKOS_FUNCTION constexpr RealType& get(complex<RealType>& z) noexcept {
|
||||||
|
static_assert(I < 2);
|
||||||
|
if constexpr (I == 0)
|
||||||
|
return z.real();
|
||||||
|
else
|
||||||
|
return z.imag();
|
||||||
|
#ifdef KOKKOS_COMPILER_INTEL
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
|
template <size_t I, typename RealType>
|
||||||
|
KOKKOS_FUNCTION constexpr RealType&& get(complex<RealType>&& z) noexcept {
|
||||||
|
static_assert(I < 2);
|
||||||
|
if constexpr (I == 0)
|
||||||
|
return std::move(z.real());
|
||||||
|
else
|
||||||
|
return std::move(z.imag());
|
||||||
|
#ifdef KOKKOS_COMPILER_INTEL
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
|
template <size_t I, typename RealType>
|
||||||
|
KOKKOS_FUNCTION constexpr const RealType& get(
|
||||||
|
const complex<RealType>& z) noexcept {
|
||||||
|
static_assert(I < 2);
|
||||||
|
if constexpr (I == 0)
|
||||||
|
return z.re_;
|
||||||
|
else
|
||||||
|
return z.im_;
|
||||||
|
#ifdef KOKKOS_COMPILER_INTEL
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
|
template <size_t I, typename RealType>
|
||||||
|
KOKKOS_FUNCTION constexpr const RealType&& get(
|
||||||
|
const complex<RealType>&& z) noexcept {
|
||||||
|
static_assert(I < 2);
|
||||||
|
if constexpr (I == 0)
|
||||||
|
return std::move(z.re_);
|
||||||
|
else
|
||||||
|
return std::move(z.im_);
|
||||||
|
#ifdef KOKKOS_COMPILER_INTEL
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
//==============================================================================
|
//==============================================================================
|
||||||
// <editor-fold desc="Equality and inequality"> {{{1
|
// <editor-fold desc="Equality and inequality"> {{{1
|
||||||
|
|
||||||
|
|||||||
@ -221,10 +221,12 @@ struct ViewFill<ViewType, Layout, ExecSpace, 7, iType> {
|
|||||||
ViewFill(const ViewType& a_, typename ViewType::const_value_type& val_,
|
ViewFill(const ViewType& a_, typename ViewType::const_value_type& val_,
|
||||||
const ExecSpace& space)
|
const ExecSpace& space)
|
||||||
: a(a_), val(val_) {
|
: a(a_), val(val_) {
|
||||||
|
// MDRangePolicy is not supported for 7D views
|
||||||
|
// Iterate separately over extent(2)
|
||||||
Kokkos::parallel_for("Kokkos::ViewFill-7D",
|
Kokkos::parallel_for("Kokkos::ViewFill-7D",
|
||||||
policy_type(space, {0, 0, 0, 0, 0, 0},
|
policy_type(space, {0, 0, 0, 0, 0, 0},
|
||||||
{a.extent(0), a.extent(1), a.extent(2),
|
{a.extent(0), a.extent(1), a.extent(3),
|
||||||
a.extent(3), a.extent(5), a.extent(6)}),
|
a.extent(4), a.extent(5), a.extent(6)}),
|
||||||
*this);
|
*this);
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -249,6 +251,8 @@ struct ViewFill<ViewType, Layout, ExecSpace, 8, iType> {
|
|||||||
ViewFill(const ViewType& a_, typename ViewType::const_value_type& val_,
|
ViewFill(const ViewType& a_, typename ViewType::const_value_type& val_,
|
||||||
const ExecSpace& space)
|
const ExecSpace& space)
|
||||||
: a(a_), val(val_) {
|
: a(a_), val(val_) {
|
||||||
|
// MDRangePolicy is not supported for 8D views
|
||||||
|
// Iterate separately over extent(2) and extent(4)
|
||||||
Kokkos::parallel_for("Kokkos::ViewFill-8D",
|
Kokkos::parallel_for("Kokkos::ViewFill-8D",
|
||||||
policy_type(space, {0, 0, 0, 0, 0, 0},
|
policy_type(space, {0, 0, 0, 0, 0, 0},
|
||||||
{a.extent(0), a.extent(1), a.extent(3),
|
{a.extent(0), a.extent(1), a.extent(3),
|
||||||
@ -293,9 +297,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 2, iType> {
|
|||||||
ViewTypeA a;
|
ViewTypeA a;
|
||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<2, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<2, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -323,9 +329,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 3, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<3, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<3, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -354,9 +362,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 4, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<4, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<4, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -386,9 +396,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 5, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<5, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<5, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -418,9 +430,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 6, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -450,9 +464,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 7, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -461,6 +477,8 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 7, iType> {
|
|||||||
ViewCopy(const ViewTypeA& a_, const ViewTypeB& b_,
|
ViewCopy(const ViewTypeA& a_, const ViewTypeB& b_,
|
||||||
const ExecSpace space = ExecSpace())
|
const ExecSpace space = ExecSpace())
|
||||||
: a(a_), b(b_) {
|
: a(a_), b(b_) {
|
||||||
|
// MDRangePolicy is not supported for 7D views
|
||||||
|
// Iterate separately over extent(2)
|
||||||
Kokkos::parallel_for("Kokkos::ViewCopy-7D",
|
Kokkos::parallel_for("Kokkos::ViewCopy-7D",
|
||||||
policy_type(space, {0, 0, 0, 0, 0, 0},
|
policy_type(space, {0, 0, 0, 0, 0, 0},
|
||||||
{a.extent(0), a.extent(1), a.extent(3),
|
{a.extent(0), a.extent(1), a.extent(3),
|
||||||
@ -483,9 +501,11 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 8, iType> {
|
|||||||
ViewTypeB b;
|
ViewTypeB b;
|
||||||
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern =
|
static const Kokkos::Iterate outer_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::outer_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::outer_iteration_pattern;
|
||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::layout_iterate_type_selector<Layout>::inner_iteration_pattern;
|
Kokkos::Impl::layout_iterate_type_selector<
|
||||||
|
Layout>::inner_iteration_pattern;
|
||||||
using iterate_type =
|
using iterate_type =
|
||||||
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
Kokkos::Rank<6, outer_iteration_pattern, inner_iteration_pattern>;
|
||||||
using policy_type =
|
using policy_type =
|
||||||
@ -494,6 +514,8 @@ struct ViewCopy<ViewTypeA, ViewTypeB, Layout, ExecSpace, 8, iType> {
|
|||||||
ViewCopy(const ViewTypeA& a_, const ViewTypeB& b_,
|
ViewCopy(const ViewTypeA& a_, const ViewTypeB& b_,
|
||||||
const ExecSpace space = ExecSpace())
|
const ExecSpace space = ExecSpace())
|
||||||
: a(a_), b(b_) {
|
: a(a_), b(b_) {
|
||||||
|
// MDRangePolicy is not supported for 8D views
|
||||||
|
// Iterate separately over extent(2) and extent(4)
|
||||||
Kokkos::parallel_for("Kokkos::ViewCopy-8D",
|
Kokkos::parallel_for("Kokkos::ViewCopy-8D",
|
||||||
policy_type(space, {0, 0, 0, 0, 0, 0},
|
policy_type(space, {0, 0, 0, 0, 0, 0},
|
||||||
{a.extent(0), a.extent(1), a.extent(3),
|
{a.extent(0), a.extent(1), a.extent(3),
|
||||||
@ -539,11 +561,8 @@ void view_copy(const ExecutionSpace& space, const DstType& dst,
|
|||||||
int64_t strides[DstType::rank + 1];
|
int64_t strides[DstType::rank + 1];
|
||||||
dst.stride(strides);
|
dst.stride(strides);
|
||||||
Kokkos::Iterate iterate;
|
Kokkos::Iterate iterate;
|
||||||
if (Kokkos::is_layouttiled<typename DstType::array_layout>::value) {
|
if (std::is_same<typename DstType::array_layout,
|
||||||
iterate = Kokkos::layout_iterate_type_selector<
|
Kokkos::LayoutRight>::value) {
|
||||||
typename DstType::array_layout>::outer_iteration_pattern;
|
|
||||||
} else if (std::is_same<typename DstType::array_layout,
|
|
||||||
Kokkos::LayoutRight>::value) {
|
|
||||||
iterate = Kokkos::Iterate::Right;
|
iterate = Kokkos::Iterate::Right;
|
||||||
} else if (std::is_same<typename DstType::array_layout,
|
} else if (std::is_same<typename DstType::array_layout,
|
||||||
Kokkos::LayoutLeft>::value) {
|
Kokkos::LayoutLeft>::value) {
|
||||||
@ -630,11 +649,8 @@ void view_copy(const DstType& dst, const SrcType& src) {
|
|||||||
int64_t strides[DstType::rank + 1];
|
int64_t strides[DstType::rank + 1];
|
||||||
dst.stride(strides);
|
dst.stride(strides);
|
||||||
Kokkos::Iterate iterate;
|
Kokkos::Iterate iterate;
|
||||||
if (Kokkos::is_layouttiled<typename DstType::array_layout>::value) {
|
if (std::is_same<typename DstType::array_layout,
|
||||||
iterate = Kokkos::layout_iterate_type_selector<
|
Kokkos::LayoutRight>::value) {
|
||||||
typename DstType::array_layout>::outer_iteration_pattern;
|
|
||||||
} else if (std::is_same<typename DstType::array_layout,
|
|
||||||
Kokkos::LayoutRight>::value) {
|
|
||||||
iterate = Kokkos::Iterate::Right;
|
iterate = Kokkos::Iterate::Right;
|
||||||
} else if (std::is_same<typename DstType::array_layout,
|
} else if (std::is_same<typename DstType::array_layout,
|
||||||
Kokkos::LayoutLeft>::value) {
|
Kokkos::LayoutLeft>::value) {
|
||||||
@ -3092,8 +3108,7 @@ inline std::enable_if_t<
|
|||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutRight>::value ||
|
Kokkos::LayoutRight>::value ||
|
||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutStride>::value ||
|
Kokkos::LayoutStride>::value>
|
||||||
is_layouttiled<typename Kokkos::View<T, P...>::array_layout>::value>
|
|
||||||
impl_resize(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
impl_resize(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
Kokkos::View<T, P...>& v,
|
Kokkos::View<T, P...>& v,
|
||||||
const typename Kokkos::View<T, P...>::array_layout& layout) {
|
const typename Kokkos::View<T, P...>::array_layout& layout) {
|
||||||
@ -3139,8 +3154,7 @@ inline std::enable_if_t<
|
|||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutRight>::value ||
|
Kokkos::LayoutRight>::value ||
|
||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutStride>::value ||
|
Kokkos::LayoutStride>::value)>
|
||||||
is_layouttiled<typename Kokkos::View<T, P...>::array_layout>::value)>
|
|
||||||
impl_resize(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
impl_resize(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
Kokkos::View<T, P...>& v,
|
Kokkos::View<T, P...>& v,
|
||||||
const typename Kokkos::View<T, P...>::array_layout& layout) {
|
const typename Kokkos::View<T, P...>::array_layout& layout) {
|
||||||
@ -3235,7 +3249,10 @@ impl_realloc(Kokkos::View<T, P...>& v, const size_t n0, const size_t n1,
|
|||||||
v = view_type(); // Best effort to deallocate in case no other view refers
|
v = view_type(); // Best effort to deallocate in case no other view refers
|
||||||
// to the shared allocation
|
// to the shared allocation
|
||||||
v = view_type(arg_prop_copy, n0, n1, n2, n3, n4, n5, n6, n7);
|
v = view_type(arg_prop_copy, n0, n1, n2, n3, n4, n5, n6, n7);
|
||||||
} else if (alloc_prop_input::initialize) {
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if constexpr (alloc_prop_input::initialize) {
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
const auto& exec_space =
|
const auto& exec_space =
|
||||||
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
||||||
@ -3308,8 +3325,7 @@ inline std::enable_if_t<
|
|||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutRight>::value ||
|
Kokkos::LayoutRight>::value ||
|
||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutStride>::value ||
|
Kokkos::LayoutStride>::value>
|
||||||
is_layouttiled<typename Kokkos::View<T, P...>::array_layout>::value>
|
|
||||||
impl_realloc(Kokkos::View<T, P...>& v,
|
impl_realloc(Kokkos::View<T, P...>& v,
|
||||||
const typename Kokkos::View<T, P...>::array_layout& layout,
|
const typename Kokkos::View<T, P...>::array_layout& layout,
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
@ -3331,7 +3347,10 @@ impl_realloc(Kokkos::View<T, P...>& v,
|
|||||||
if (v.layout() != layout) {
|
if (v.layout() != layout) {
|
||||||
v = view_type(); // Deallocate first, if the only view to allocation
|
v = view_type(); // Deallocate first, if the only view to allocation
|
||||||
v = view_type(arg_prop, layout);
|
v = view_type(arg_prop, layout);
|
||||||
} else if (alloc_prop_input::initialize) {
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
if constexpr (alloc_prop_input::initialize) {
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
const auto& exec_space =
|
const auto& exec_space =
|
||||||
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop);
|
||||||
@ -3351,8 +3370,7 @@ inline std::enable_if_t<
|
|||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutRight>::value ||
|
Kokkos::LayoutRight>::value ||
|
||||||
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
std::is_same<typename Kokkos::View<T, P...>::array_layout,
|
||||||
Kokkos::LayoutStride>::value ||
|
Kokkos::LayoutStride>::value)>
|
||||||
is_layouttiled<typename Kokkos::View<T, P...>::array_layout>::value)>
|
|
||||||
impl_realloc(Kokkos::View<T, P...>& v,
|
impl_realloc(Kokkos::View<T, P...>& v,
|
||||||
const typename Kokkos::View<T, P...>::array_layout& layout,
|
const typename Kokkos::View<T, P...>::array_layout& layout,
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
@ -3452,6 +3470,7 @@ struct MirrorType {
|
|||||||
using view_type = Kokkos::View<data_type, array_layout, Space>;
|
using view_type = Kokkos::View<data_type, array_layout, Space>;
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// collection of static asserts for create_mirror and create_mirror_view
|
||||||
template <class... ViewCtorArgs>
|
template <class... ViewCtorArgs>
|
||||||
void check_view_ctor_args_create_mirror() {
|
void check_view_ctor_args_create_mirror() {
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
@ -3470,232 +3489,231 @@ void check_view_ctor_args_create_mirror() {
|
|||||||
"not explicitly allow padding!");
|
"not explicitly allow padding!");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// create a mirror
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline std::enable_if_t<!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space,
|
inline auto create_mirror(const Kokkos::View<T, P...>& src,
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
create_mirror(const Kokkos::View<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
using src_type = View<T, P...>;
|
|
||||||
using dst_type = typename src_type::HostMirror;
|
|
||||||
|
|
||||||
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
|
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
auto prop_copy = Impl::with_properties_if_unset(
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
arg_prop, std::string(src.label()).append("_mirror"));
|
||||||
|
|
||||||
return dst_type(prop_copy, src.layout());
|
if constexpr (Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
}
|
using memory_space = typename decltype(prop_copy)::memory_space;
|
||||||
|
using dst_type =
|
||||||
// Create a mirror in a new space (specialization for different space)
|
typename Impl::MirrorType<memory_space, T, P...>::view_type;
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
return dst_type(prop_copy, src.layout());
|
||||||
class Enable = std::enable_if_t<
|
} else {
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
using dst_type = typename View<T, P...>::HostMirror;
|
||||||
auto create_mirror(const Kokkos::View<T, P...>& src,
|
return dst_type(prop_copy, src.layout());
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
}
|
||||||
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
auto prop_copy = Impl::with_properties_if_unset(
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
arg_prop, std::string(src.label()).append("_mirror"));
|
__builtin_unreachable();
|
||||||
using alloc_prop = decltype(prop_copy);
|
#endif
|
||||||
|
|
||||||
return typename Impl::MirrorType<typename alloc_prop::memory_space, T,
|
|
||||||
P...>::view_type(prop_copy, src.layout());
|
|
||||||
}
|
}
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface
|
||||||
std::enable_if_t<std::is_void<typename ViewTraits<T, P...>::specialize>::value,
|
template <class T, class... P,
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
typename = std::enable_if_t<
|
||||||
create_mirror(Kokkos::View<T, P...> const& v) {
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
return Impl::create_mirror(v, Impl::ViewCtorProp<>{});
|
auto create_mirror(Kokkos::View<T, P...> const& src) {
|
||||||
|
return Impl::create_mirror(src, Impl::ViewCtorProp<>{});
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface that accepts a without initializing flag
|
||||||
std::enable_if_t<std::is_void<typename ViewTraits<T, P...>::specialize>::value,
|
template <class T, class... P,
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
typename = std::enable_if_t<
|
||||||
create_mirror(Kokkos::Impl::WithoutInitializing_t wi,
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
Kokkos::View<T, P...> const& v) {
|
auto create_mirror(Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
return Impl::create_mirror(v, view_alloc(wi));
|
Kokkos::View<T, P...> const& src) {
|
||||||
|
return Impl::create_mirror(src, view_alloc(wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a space
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
|
||||||
std::enable_if_t<std::is_void<typename ViewTraits<T, P...>::specialize>::value,
|
|
||||||
typename Impl::MirrorType<Space, T, P...>::view_type>
|
|
||||||
create_mirror(Space const&, Kokkos::View<T, P...> const& v) {
|
|
||||||
return Impl::create_mirror(v, view_alloc(typename Space::memory_space{}));
|
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
typename Enable = std::enable_if_t<
|
typename Enable = std::enable_if_t<
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
Kokkos::is_space<Space>::value &&
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
|
auto create_mirror(Space const&, Kokkos::View<T, P...> const& src) {
|
||||||
|
return Impl::create_mirror(src, view_alloc(typename Space::memory_space{}));
|
||||||
|
}
|
||||||
|
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class T, class... P, class... ViewCtorArgs,
|
||||||
|
typename = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
auto create_mirror(Impl::ViewCtorProp<ViewCtorArgs...> const& arg_prop,
|
auto create_mirror(Impl::ViewCtorProp<ViewCtorArgs...> const& arg_prop,
|
||||||
Kokkos::View<T, P...> const& v) {
|
Kokkos::View<T, P...> const& src) {
|
||||||
return Impl::create_mirror(v, arg_prop);
|
return Impl::create_mirror(src, arg_prop);
|
||||||
}
|
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space,
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
|
||||||
create_mirror(Impl::ViewCtorProp<ViewCtorArgs...> const& arg_prop,
|
|
||||||
Kokkos::View<T, P...> const& v) {
|
|
||||||
return Impl::create_mirror(v, arg_prop);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a space and a without initializing flag
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
typename Enable = std::enable_if_t<
|
||||||
std::enable_if_t<std::is_void<typename ViewTraits<T, P...>::specialize>::value,
|
Kokkos::is_space<Space>::value &&
|
||||||
typename Impl::MirrorType<Space, T, P...>::view_type>
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
create_mirror(Kokkos::Impl::WithoutInitializing_t wi, Space const&,
|
auto create_mirror(Kokkos::Impl::WithoutInitializing_t wi, Space const&,
|
||||||
Kokkos::View<T, P...> const& v) {
|
Kokkos::View<T, P...> const& src) {
|
||||||
return Impl::create_mirror(v, view_alloc(typename Space::memory_space{}, wi));
|
return Impl::create_mirror(src,
|
||||||
|
view_alloc(typename Space::memory_space{}, wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
|
// choose a `Kokkos::create_mirror` adapted for the provided view and the
|
||||||
|
// provided arguments
|
||||||
|
template <class View, class... ViewCtorArgs>
|
||||||
|
inline auto choose_create_mirror(
|
||||||
|
const View& src, const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
|
// Due to the fact that users can overload `Kokkos::create_mirror`, but also
|
||||||
|
// that they may not have implemented all of its different possible
|
||||||
|
// variations, this function chooses the correct private or public version of
|
||||||
|
// it to call.
|
||||||
|
// This helper should be used by any overload of
|
||||||
|
// `Kokkos::Impl::create_mirror_view`.
|
||||||
|
|
||||||
|
if constexpr (std::is_void_v<typename View::traits::specialize>) {
|
||||||
|
// if the view is not specialized, just call the Impl function
|
||||||
|
|
||||||
|
// using ADL to find the later defined overload of the function
|
||||||
|
using namespace Kokkos::Impl;
|
||||||
|
|
||||||
|
return create_mirror(src, arg_prop);
|
||||||
|
} else {
|
||||||
|
// otherwise, recreate the public call
|
||||||
|
using ViewProp = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
|
|
||||||
|
// using ADL to find the later defined overload of the function
|
||||||
|
using namespace Kokkos;
|
||||||
|
|
||||||
|
if constexpr (sizeof...(ViewCtorArgs) == 0) {
|
||||||
|
// if there are no view constructor args, call the specific public
|
||||||
|
// function
|
||||||
|
return create_mirror(src);
|
||||||
|
} else if constexpr (sizeof...(ViewCtorArgs) == 1 &&
|
||||||
|
ViewProp::has_memory_space) {
|
||||||
|
// if there is one view constructor arg and it has a memory space, call
|
||||||
|
// the specific public function
|
||||||
|
return create_mirror(typename ViewProp::memory_space{}, src);
|
||||||
|
} else if constexpr (sizeof...(ViewCtorArgs) == 1 &&
|
||||||
|
!ViewProp::initialize) {
|
||||||
|
// if there is one view constructor arg and it has a without initializing
|
||||||
|
// mark, call the specific public function
|
||||||
|
return create_mirror(typename Kokkos::Impl::WithoutInitializing_t{}, src);
|
||||||
|
} else if constexpr (sizeof...(ViewCtorArgs) == 2 &&
|
||||||
|
ViewProp::has_memory_space && !ViewProp::initialize) {
|
||||||
|
// if there is two view constructor args and they have a memory space and
|
||||||
|
// a without initializing mark, call the specific public function
|
||||||
|
return create_mirror(typename Kokkos::Impl::WithoutInitializing_t{},
|
||||||
|
typename ViewProp::memory_space{}, src);
|
||||||
|
} else {
|
||||||
|
// if there are other constructor args, call the generic public function
|
||||||
|
|
||||||
|
// Beware, there are some libraries using Kokkos that don't implement
|
||||||
|
// this overload (hence the reason for this present function to exist).
|
||||||
|
return create_mirror(arg_prop, src);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
|
}
|
||||||
|
|
||||||
|
// create a mirror view
|
||||||
|
// private interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
template <class T, class... P, class... ViewCtorArgs>
|
||||||
inline std::enable_if_t<
|
inline auto create_mirror_view(
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
const Kokkos::View<T, P...>& src,
|
||||||
(std::is_same<
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
||||||
typename Kokkos::View<T, P...>::memory_space,
|
if constexpr (!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space) {
|
||||||
typename Kokkos::View<T, P...>::HostMirror::memory_space>::value &&
|
if constexpr (std::is_same<typename Kokkos::View<T, P...>::memory_space,
|
||||||
std::is_same<
|
typename Kokkos::View<
|
||||||
typename Kokkos::View<T, P...>::data_type,
|
T, P...>::HostMirror::memory_space>::value &&
|
||||||
typename Kokkos::View<T, P...>::HostMirror::data_type>::value),
|
std::is_same<typename Kokkos::View<T, P...>::data_type,
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
typename Kokkos::View<
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src,
|
T, P...>::HostMirror::data_type>::value) {
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
return typename Kokkos::View<T, P...>::HostMirror(src);
|
||||||
return src;
|
} else {
|
||||||
}
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
|
}
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
} else {
|
||||||
inline std::enable_if_t<
|
if constexpr (Impl::MirrorViewType<typename Impl::ViewCtorProp<
|
||||||
!Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space &&
|
ViewCtorArgs...>::memory_space,
|
||||||
!(std::is_same<typename Kokkos::View<T, P...>::memory_space,
|
T, P...>::is_same_memspace) {
|
||||||
typename Kokkos::View<
|
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
||||||
T, P...>::HostMirror::memory_space>::value &&
|
return typename Impl::MirrorViewType<
|
||||||
std::is_same<
|
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
||||||
typename Kokkos::View<T, P...>::data_type,
|
P...>::view_type(src);
|
||||||
typename Kokkos::View<T, P...>::HostMirror::data_type>::value),
|
} else {
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
return Kokkos::Impl::choose_create_mirror(src, arg_prop);
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src,
|
}
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
}
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
#if defined(KOKKOS_COMPILER_INTEL) || \
|
||||||
}
|
(defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC))
|
||||||
// Create a mirror view in a new space (specialization for same space)
|
__builtin_unreachable();
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
#endif
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&) {
|
|
||||||
check_view_ctor_args_create_mirror<ViewCtorArgs...>();
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Create a mirror view in a new space (specialization for different space)
|
|
||||||
template <class T, class... P, class... ViewCtorArgs,
|
|
||||||
class = std::enable_if_t<
|
|
||||||
Impl::ViewCtorProp<ViewCtorArgs...>::has_memory_space>>
|
|
||||||
std::enable_if_t<!Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::is_same_memspace,
|
|
||||||
typename Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space,
|
|
||||||
T, P...>::view_type>
|
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src,
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop) {
|
|
||||||
return Kokkos::Impl::create_mirror(src, arg_prop);
|
|
||||||
}
|
}
|
||||||
} // namespace Impl
|
} // namespace Impl
|
||||||
|
|
||||||
|
// public interface
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
std::enable_if_t<
|
auto create_mirror_view(const Kokkos::View<T, P...>& src) {
|
||||||
std::is_same<
|
return Impl::create_mirror_view(src, view_alloc());
|
||||||
typename Kokkos::View<T, P...>::memory_space,
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename Kokkos::View<T, P...>::data_type,
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror::data_type>::value,
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src) {
|
|
||||||
return src;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a without initializing flag
|
||||||
template <class T, class... P>
|
template <class T, class... P>
|
||||||
std::enable_if_t<
|
auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi,
|
||||||
!(std::is_same<
|
Kokkos::View<T, P...> const& src) {
|
||||||
typename Kokkos::View<T, P...>::memory_space,
|
return Impl::create_mirror_view(src, view_alloc(wi));
|
||||||
typename Kokkos::View<T, P...>::HostMirror::memory_space>::value &&
|
|
||||||
std::is_same<
|
|
||||||
typename Kokkos::View<T, P...>::data_type,
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror::data_type>::value),
|
|
||||||
typename Kokkos::View<T, P...>::HostMirror>
|
|
||||||
create_mirror_view(const Kokkos::View<T, P...>& src) {
|
|
||||||
return Kokkos::create_mirror(src);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P>
|
// public interface that accepts a space
|
||||||
typename Kokkos::View<T, P...>::HostMirror create_mirror_view(
|
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, Kokkos::View<T, P...> const& v) {
|
|
||||||
return Impl::create_mirror_view(v, view_alloc(wi));
|
|
||||||
}
|
|
||||||
|
|
||||||
// FIXME_C++17 Improve SFINAE here.
|
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
typename Impl::MirrorViewType<Space, T, P...>::view_type create_mirror_view(
|
auto create_mirror_view(const Space&, const Kokkos::View<T, P...>& src) {
|
||||||
const Space&, const Kokkos::View<T, P...>& src,
|
return Impl::create_mirror_view(src,
|
||||||
std::enable_if_t<Impl::MirrorViewType<Space, T, P...>::is_same_memspace>* =
|
view_alloc(typename Space::memory_space()));
|
||||||
nullptr) {
|
|
||||||
return src;
|
|
||||||
}
|
|
||||||
|
|
||||||
// FIXME_C++17 Improve SFINAE here.
|
|
||||||
template <class Space, class T, class... P,
|
|
||||||
class Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
|
||||||
typename Impl::MirrorViewType<Space, T, P...>::view_type create_mirror_view(
|
|
||||||
const Space& space, const Kokkos::View<T, P...>& src,
|
|
||||||
std::enable_if_t<!Impl::MirrorViewType<Space, T, P...>::is_same_memspace>* =
|
|
||||||
nullptr) {
|
|
||||||
return Kokkos::create_mirror(space, src);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// public interface that accepts a space and a without initializing flag
|
||||||
template <class Space, class T, class... P,
|
template <class Space, class T, class... P,
|
||||||
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
typename Enable = std::enable_if_t<Kokkos::is_space<Space>::value>>
|
||||||
typename Impl::MirrorViewType<Space, T, P...>::view_type create_mirror_view(
|
auto create_mirror_view(Kokkos::Impl::WithoutInitializing_t wi, Space const&,
|
||||||
Kokkos::Impl::WithoutInitializing_t wi, Space const&,
|
Kokkos::View<T, P...> const& src) {
|
||||||
Kokkos::View<T, P...> const& v) {
|
|
||||||
return Impl::create_mirror_view(
|
return Impl::create_mirror_view(
|
||||||
v, view_alloc(typename Space::memory_space{}, wi));
|
src, view_alloc(typename Space::memory_space{}, wi));
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T, class... P, class... ViewCtorArgs>
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class T, class... P, class... ViewCtorArgs,
|
||||||
|
typename = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
auto create_mirror_view(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
auto create_mirror_view(const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::View<T, P...>& v) {
|
const Kokkos::View<T, P...>& src) {
|
||||||
return Impl::create_mirror_view(v, arg_prop);
|
return Impl::create_mirror_view(src, arg_prop);
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
namespace Impl {
|
||||||
auto create_mirror_view_and_copy(
|
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>&,
|
// collection of static asserts for create_mirror_view_and_copy
|
||||||
const Kokkos::View<T, P...>& src,
|
template <class... ViewCtorArgs>
|
||||||
std::enable_if_t<
|
void check_view_ctor_args_create_mirror_view_and_copy() {
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
|
|
||||||
static_assert(
|
static_assert(
|
||||||
alloc_prop_input::has_memory_space,
|
alloc_prop_input::has_memory_space,
|
||||||
"The view constructor arguments passed to "
|
"The view constructor arguments passed to "
|
||||||
@ -3708,52 +3726,53 @@ auto create_mirror_view_and_copy(
|
|||||||
"The view constructor arguments passed to "
|
"The view constructor arguments passed to "
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
"Kokkos::create_mirror_view_and_copy must "
|
||||||
"not explicitly allow padding!");
|
"not explicitly allow padding!");
|
||||||
|
|
||||||
// same behavior as deep_copy(src, src)
|
|
||||||
if (!alloc_prop_input::has_execution_space)
|
|
||||||
fence(
|
|
||||||
"Kokkos::create_mirror_view_and_copy: fence before returning src view");
|
|
||||||
return src;
|
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class... ViewCtorArgs, class T, class... P>
|
} // namespace Impl
|
||||||
|
|
||||||
|
// create a mirror view and deep copy it
|
||||||
|
// public interface that accepts arbitrary view constructor args passed by a
|
||||||
|
// view_alloc
|
||||||
|
template <class... ViewCtorArgs, class T, class... P,
|
||||||
|
class Enable = std::enable_if_t<
|
||||||
|
std::is_void_v<typename ViewTraits<T, P...>::specialize>>>
|
||||||
auto create_mirror_view_and_copy(
|
auto create_mirror_view_and_copy(
|
||||||
const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
[[maybe_unused]] const Impl::ViewCtorProp<ViewCtorArgs...>& arg_prop,
|
||||||
const Kokkos::View<T, P...>& src,
|
const Kokkos::View<T, P...>& src) {
|
||||||
std::enable_if_t<
|
|
||||||
std::is_void<typename ViewTraits<T, P...>::specialize>::value &&
|
|
||||||
!Impl::MirrorViewType<
|
|
||||||
typename Impl::ViewCtorProp<ViewCtorArgs...>::memory_space, T,
|
|
||||||
P...>::is_same_memspace>* = nullptr) {
|
|
||||||
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
using alloc_prop_input = Impl::ViewCtorProp<ViewCtorArgs...>;
|
||||||
static_assert(
|
|
||||||
alloc_prop_input::has_memory_space,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must include a memory space!");
|
|
||||||
static_assert(!alloc_prop_input::has_pointer,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not include a pointer!");
|
|
||||||
static_assert(!alloc_prop_input::allow_padding,
|
|
||||||
"The view constructor arguments passed to "
|
|
||||||
"Kokkos::create_mirror_view_and_copy must "
|
|
||||||
"not explicitly allow padding!");
|
|
||||||
using Space = typename alloc_prop_input::memory_space;
|
|
||||||
using Mirror = typename Impl::MirrorViewType<Space, T, P...>::view_type;
|
|
||||||
|
|
||||||
auto arg_prop_copy = Impl::with_properties_if_unset(
|
Impl::check_view_ctor_args_create_mirror_view_and_copy<ViewCtorArgs...>();
|
||||||
arg_prop, std::string{}, WithoutInitializing,
|
|
||||||
typename Space::execution_space{});
|
|
||||||
|
|
||||||
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
if constexpr (Impl::MirrorViewType<typename alloc_prop_input::memory_space, T,
|
||||||
if (label.empty()) label = src.label();
|
P...>::is_same_memspace) {
|
||||||
auto mirror = typename Mirror::non_const_type{arg_prop_copy, src.layout()};
|
// same behavior as deep_copy(src, src)
|
||||||
if constexpr (alloc_prop_input::has_execution_space) {
|
if constexpr (!alloc_prop_input::has_execution_space)
|
||||||
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
fence(
|
||||||
mirror, src);
|
"Kokkos::create_mirror_view_and_copy: fence before returning src "
|
||||||
} else
|
"view");
|
||||||
deep_copy(mirror, src);
|
return src;
|
||||||
return mirror;
|
} else {
|
||||||
|
using Space = typename alloc_prop_input::memory_space;
|
||||||
|
using Mirror = typename Impl::MirrorViewType<Space, T, P...>::view_type;
|
||||||
|
|
||||||
|
auto arg_prop_copy = Impl::with_properties_if_unset(
|
||||||
|
arg_prop, std::string{}, WithoutInitializing,
|
||||||
|
typename Space::execution_space{});
|
||||||
|
|
||||||
|
std::string& label = Impl::get_property<Impl::LabelTag>(arg_prop_copy);
|
||||||
|
if (label.empty()) label = src.label();
|
||||||
|
auto mirror = typename Mirror::non_const_type{arg_prop_copy, src.layout()};
|
||||||
|
if constexpr (alloc_prop_input::has_execution_space) {
|
||||||
|
deep_copy(Impl::get_property<Impl::ExecutionSpaceTag>(arg_prop_copy),
|
||||||
|
mirror, src);
|
||||||
|
} else
|
||||||
|
deep_copy(mirror, src);
|
||||||
|
return mirror;
|
||||||
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_NVCC) && KOKKOS_COMPILER_NVCC >= 1130 && \
|
||||||
|
!defined(KOKKOS_COMPILER_MSVC)
|
||||||
|
__builtin_unreachable();
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
|
|
||||||
// Previously when using auto here, the intel compiler 19.3 would
|
// Previously when using auto here, the intel compiler 19.3 would
|
||||||
|
|||||||
@ -40,7 +40,12 @@ struct ParallelReduceTag {};
|
|||||||
|
|
||||||
struct ChunkSize {
|
struct ChunkSize {
|
||||||
int value;
|
int value;
|
||||||
|
explicit ChunkSize(int value_) : value(value_) {}
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT("ChunkSize should be constructed explicitly.")
|
||||||
ChunkSize(int value_) : value(value_) {}
|
ChunkSize(int value_) : value(value_) {}
|
||||||
|
#endif
|
||||||
};
|
};
|
||||||
|
|
||||||
/** \brief Execution policy for work over a range of an integral type.
|
/** \brief Execution policy for work over a range of an integral type.
|
||||||
@ -714,6 +719,58 @@ class TeamPolicy
|
|||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
|
// Execution space not provided deduces to TeamPolicy<>
|
||||||
|
|
||||||
|
TeamPolicy()->TeamPolicy<>;
|
||||||
|
|
||||||
|
TeamPolicy(int, int)->TeamPolicy<>;
|
||||||
|
TeamPolicy(int, int, int)->TeamPolicy<>;
|
||||||
|
TeamPolicy(int, Kokkos::AUTO_t const&)->TeamPolicy<>;
|
||||||
|
TeamPolicy(int, Kokkos::AUTO_t const&, int)->TeamPolicy<>;
|
||||||
|
TeamPolicy(int, Kokkos::AUTO_t const&, Kokkos::AUTO_t const&)->TeamPolicy<>;
|
||||||
|
TeamPolicy(int, int, Kokkos::AUTO_t const&)->TeamPolicy<>;
|
||||||
|
|
||||||
|
// DefaultExecutionSpace deduces to TeamPolicy<>
|
||||||
|
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, int)->TeamPolicy<>;
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, int, int)->TeamPolicy<>;
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, Kokkos::AUTO_t const&)
|
||||||
|
->TeamPolicy<>;
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, Kokkos::AUTO_t const&, int)
|
||||||
|
->TeamPolicy<>;
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, Kokkos::AUTO_t const&,
|
||||||
|
Kokkos::AUTO_t const&)
|
||||||
|
->TeamPolicy<>;
|
||||||
|
TeamPolicy(DefaultExecutionSpace const&, int, int, Kokkos::AUTO_t const&)
|
||||||
|
->TeamPolicy<>;
|
||||||
|
|
||||||
|
// ES != DefaultExecutionSpace deduces to TeamPolicy<ES>
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, int)->TeamPolicy<ES>;
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, int, int)->TeamPolicy<ES>;
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, Kokkos::AUTO_t const&)->TeamPolicy<ES>;
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, Kokkos::AUTO_t const&, int)->TeamPolicy<ES>;
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, Kokkos::AUTO_t const&, Kokkos::AUTO_t const&)
|
||||||
|
->TeamPolicy<ES>;
|
||||||
|
|
||||||
|
template <typename ES,
|
||||||
|
typename = std::enable_if_t<Kokkos::is_execution_space_v<ES>>>
|
||||||
|
TeamPolicy(ES const&, int, int, Kokkos::AUTO_t const&)->TeamPolicy<ES>;
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
template <typename iType, class TeamMemberType>
|
template <typename iType, class TeamMemberType>
|
||||||
@ -968,9 +1025,9 @@ struct TeamThreadMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
static constexpr auto par_vector = Impl::TeamMDRangeParVector::NotParVector;
|
static constexpr auto par_vector = Impl::TeamMDRangeParVector::NotParVector;
|
||||||
|
|
||||||
static constexpr Iterate direction =
|
static constexpr Iterate direction =
|
||||||
OuterDir == Iterate::Default
|
OuterDir == Iterate::Default ? Impl::layout_iterate_type_selector<
|
||||||
? layout_iterate_type_selector<ArrayLayout>::outer_iteration_pattern
|
ArrayLayout>::outer_iteration_pattern
|
||||||
: iter;
|
: iter;
|
||||||
|
|
||||||
template <class... Args>
|
template <class... Args>
|
||||||
KOKKOS_FUNCTION TeamThreadMDRange(TeamHandleType const& team_, Args&&... args)
|
KOKKOS_FUNCTION TeamThreadMDRange(TeamHandleType const& team_, Args&&... args)
|
||||||
@ -983,7 +1040,7 @@ struct TeamThreadMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename TeamHandle, typename... Args>
|
template <typename TeamHandle, typename... Args>
|
||||||
TeamThreadMDRange(TeamHandle const&, Args&&...)
|
KOKKOS_DEDUCTION_GUIDE TeamThreadMDRange(TeamHandle const&, Args&&...)
|
||||||
->TeamThreadMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
->TeamThreadMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
||||||
|
|
||||||
template <typename Rank, typename TeamHandle>
|
template <typename Rank, typename TeamHandle>
|
||||||
@ -1004,9 +1061,9 @@ struct ThreadVectorMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
static constexpr auto par_vector = Impl::TeamMDRangeParVector::ParVector;
|
static constexpr auto par_vector = Impl::TeamMDRangeParVector::ParVector;
|
||||||
|
|
||||||
static constexpr Iterate direction =
|
static constexpr Iterate direction =
|
||||||
OuterDir == Iterate::Default
|
OuterDir == Iterate::Default ? Impl::layout_iterate_type_selector<
|
||||||
? layout_iterate_type_selector<ArrayLayout>::outer_iteration_pattern
|
ArrayLayout>::outer_iteration_pattern
|
||||||
: iter;
|
: iter;
|
||||||
|
|
||||||
template <class... Args>
|
template <class... Args>
|
||||||
KOKKOS_INLINE_FUNCTION ThreadVectorMDRange(TeamHandleType const& team_,
|
KOKKOS_INLINE_FUNCTION ThreadVectorMDRange(TeamHandleType const& team_,
|
||||||
@ -1020,7 +1077,7 @@ struct ThreadVectorMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename TeamHandle, typename... Args>
|
template <typename TeamHandle, typename... Args>
|
||||||
ThreadVectorMDRange(TeamHandle const&, Args&&...)
|
KOKKOS_DEDUCTION_GUIDE ThreadVectorMDRange(TeamHandle const&, Args&&...)
|
||||||
->ThreadVectorMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
->ThreadVectorMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
||||||
|
|
||||||
template <typename Rank, typename TeamHandle>
|
template <typename Rank, typename TeamHandle>
|
||||||
@ -1041,9 +1098,9 @@ struct TeamVectorMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
static constexpr auto par_vector = Impl::TeamMDRangeParVector::ParVector;
|
static constexpr auto par_vector = Impl::TeamMDRangeParVector::ParVector;
|
||||||
|
|
||||||
static constexpr Iterate direction =
|
static constexpr Iterate direction =
|
||||||
iter == Iterate::Default
|
iter == Iterate::Default ? Impl::layout_iterate_type_selector<
|
||||||
? layout_iterate_type_selector<ArrayLayout>::outer_iteration_pattern
|
ArrayLayout>::outer_iteration_pattern
|
||||||
: iter;
|
: iter;
|
||||||
|
|
||||||
template <class... Args>
|
template <class... Args>
|
||||||
KOKKOS_INLINE_FUNCTION TeamVectorMDRange(TeamHandleType const& team_,
|
KOKKOS_INLINE_FUNCTION TeamVectorMDRange(TeamHandleType const& team_,
|
||||||
@ -1057,7 +1114,7 @@ struct TeamVectorMDRange<Rank<N, OuterDir, InnerDir>, TeamHandle> {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename TeamHandle, typename... Args>
|
template <typename TeamHandle, typename... Args>
|
||||||
TeamVectorMDRange(TeamHandle const&, Args&&...)
|
KOKKOS_DEDUCTION_GUIDE TeamVectorMDRange(TeamHandle const&, Args&&...)
|
||||||
->TeamVectorMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
->TeamVectorMDRange<Rank<sizeof...(Args), Iterate::Default>, TeamHandle>;
|
||||||
|
|
||||||
template <typename Rank, typename TeamHandle, typename Lambda,
|
template <typename Rank, typename TeamHandle, typename Lambda,
|
||||||
|
|||||||
@ -25,33 +25,40 @@ static_assert(false,
|
|||||||
#include <cstddef>
|
#include <cstddef>
|
||||||
#include <type_traits>
|
#include <type_traits>
|
||||||
#include <Kokkos_Macros.hpp>
|
#include <Kokkos_Macros.hpp>
|
||||||
|
#ifdef KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
#include <mdspan/mdspan.hpp>
|
||||||
|
#else
|
||||||
|
#include <limits>
|
||||||
|
#endif
|
||||||
|
|
||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
|
|
||||||
|
#ifndef KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
constexpr size_t dynamic_extent = std::numeric_limits<size_t>::max();
|
||||||
|
#endif
|
||||||
|
|
||||||
namespace Experimental {
|
namespace Experimental {
|
||||||
|
|
||||||
constexpr ptrdiff_t dynamic_extent = -1;
|
template <size_t... ExtentSpecs>
|
||||||
|
|
||||||
template <ptrdiff_t... ExtentSpecs>
|
|
||||||
struct Extents {
|
struct Extents {
|
||||||
/* TODO @enhancement flesh this out more */
|
/* TODO @enhancement flesh this out more */
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class Exts, ptrdiff_t NewExtent>
|
template <class Exts, size_t NewExtent>
|
||||||
struct PrependExtent;
|
struct PrependExtent;
|
||||||
|
|
||||||
template <ptrdiff_t... Exts, ptrdiff_t NewExtent>
|
template <size_t... Exts, size_t NewExtent>
|
||||||
struct PrependExtent<Extents<Exts...>, NewExtent> {
|
struct PrependExtent<Extents<Exts...>, NewExtent> {
|
||||||
using type = Extents<NewExtent, Exts...>;
|
using type = Extents<NewExtent, Exts...>;
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class Exts, ptrdiff_t NewExtent>
|
template <class Exts, size_t NewExtent>
|
||||||
struct AppendExtent;
|
struct AppendExtent;
|
||||||
|
|
||||||
template <ptrdiff_t... Exts, ptrdiff_t NewExtent>
|
template <size_t... Exts, size_t NewExtent>
|
||||||
struct AppendExtent<Extents<Exts...>, NewExtent> {
|
struct AppendExtent<Extents<Exts...>, NewExtent> {
|
||||||
using type = Extents<Exts..., NewExtent>;
|
using type = Extents<Exts..., NewExtent>;
|
||||||
};
|
};
|
||||||
|
|
||||||
} // end namespace Experimental
|
} // end namespace Experimental
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
@ -75,33 +82,32 @@ struct _parse_impl {
|
|||||||
|
|
||||||
// We have to treat the case of int**[x] specially, since it *doesn't* go
|
// We have to treat the case of int**[x] specially, since it *doesn't* go
|
||||||
// backwards
|
// backwards
|
||||||
template <class T, ptrdiff_t... ExtentSpec>
|
template <class T, size_t... ExtentSpec>
|
||||||
struct _parse_impl<T*, Kokkos::Experimental::Extents<ExtentSpec...>,
|
struct _parse_impl<T*, Kokkos::Experimental::Extents<ExtentSpec...>,
|
||||||
std::enable_if_t<_all_remaining_extents_dynamic<T>::value>>
|
std::enable_if_t<_all_remaining_extents_dynamic<T>::value>>
|
||||||
: _parse_impl<T, Kokkos::Experimental::Extents<
|
: _parse_impl<T, Kokkos::Experimental::Extents<Kokkos::dynamic_extent,
|
||||||
Kokkos::Experimental::dynamic_extent, ExtentSpec...>> {
|
ExtentSpec...>> {};
|
||||||
};
|
|
||||||
|
|
||||||
// int*(*[x])[y] should still work also (meaning int[][x][][y])
|
// int*(*[x])[y] should still work also (meaning int[][x][][y])
|
||||||
template <class T, ptrdiff_t... ExtentSpec>
|
template <class T, size_t... ExtentSpec>
|
||||||
struct _parse_impl<
|
struct _parse_impl<
|
||||||
T*, Kokkos::Experimental::Extents<ExtentSpec...>,
|
T*, Kokkos::Experimental::Extents<ExtentSpec...>,
|
||||||
std::enable_if_t<!_all_remaining_extents_dynamic<T>::value>> {
|
std::enable_if_t<!_all_remaining_extents_dynamic<T>::value>> {
|
||||||
using _next = Kokkos::Experimental::AppendExtent<
|
using _next = Kokkos::Experimental::AppendExtent<
|
||||||
typename _parse_impl<T, Kokkos::Experimental::Extents<ExtentSpec...>,
|
typename _parse_impl<T, Kokkos::Experimental::Extents<ExtentSpec...>,
|
||||||
void>::type,
|
void>::type,
|
||||||
Kokkos::Experimental::dynamic_extent>;
|
Kokkos::dynamic_extent>;
|
||||||
using type = typename _next::type;
|
using type = typename _next::type;
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class T, ptrdiff_t... ExtentSpec, unsigned N>
|
template <class T, size_t... ExtentSpec, unsigned N>
|
||||||
struct _parse_impl<T[N], Kokkos::Experimental::Extents<ExtentSpec...>, void>
|
struct _parse_impl<T[N], Kokkos::Experimental::Extents<ExtentSpec...>, void>
|
||||||
: _parse_impl<
|
: _parse_impl<T,
|
||||||
T, Kokkos::Experimental::Extents<ExtentSpec...,
|
Kokkos::Experimental::Extents<ExtentSpec...,
|
||||||
ptrdiff_t(N)> // TODO @pedantic this
|
size_t(N)> // TODO @pedantic
|
||||||
// could be a
|
// this could be a
|
||||||
// narrowing cast
|
// narrowing cast
|
||||||
> {};
|
> {};
|
||||||
|
|
||||||
} // end namespace _parse_view_extents_impl
|
} // end namespace _parse_view_extents_impl
|
||||||
|
|
||||||
@ -111,38 +117,34 @@ struct ParseViewExtents {
|
|||||||
DataType, Kokkos::Experimental::Extents<>>::type;
|
DataType, Kokkos::Experimental::Extents<>>::type;
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType, ptrdiff_t Ext>
|
template <class ValueType, size_t Ext>
|
||||||
struct ApplyExtent {
|
struct ApplyExtent {
|
||||||
using type = ValueType[Ext];
|
using type = ValueType[Ext];
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType>
|
template <class ValueType>
|
||||||
struct ApplyExtent<ValueType, Kokkos::Experimental::dynamic_extent> {
|
struct ApplyExtent<ValueType, Kokkos::dynamic_extent> {
|
||||||
using type = ValueType*;
|
using type = ValueType*;
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType, unsigned N, ptrdiff_t Ext>
|
template <class ValueType, unsigned N, size_t Ext>
|
||||||
struct ApplyExtent<ValueType[N], Ext> {
|
struct ApplyExtent<ValueType[N], Ext> {
|
||||||
using type = typename ApplyExtent<ValueType, Ext>::type[N];
|
using type = typename ApplyExtent<ValueType, Ext>::type[N];
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType, ptrdiff_t Ext>
|
template <class ValueType, size_t Ext>
|
||||||
struct ApplyExtent<ValueType*, Ext> {
|
struct ApplyExtent<ValueType*, Ext> {
|
||||||
using type = ValueType * [Ext];
|
using type = ValueType * [Ext];
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType>
|
template <class ValueType>
|
||||||
struct ApplyExtent<ValueType*, Kokkos::Experimental::dynamic_extent> {
|
struct ApplyExtent<ValueType*, dynamic_extent> {
|
||||||
using type =
|
using type = typename ApplyExtent<ValueType, dynamic_extent>::type*;
|
||||||
typename ApplyExtent<ValueType,
|
|
||||||
Kokkos::Experimental::dynamic_extent>::type*;
|
|
||||||
};
|
};
|
||||||
|
|
||||||
template <class ValueType, unsigned N>
|
template <class ValueType, unsigned N>
|
||||||
struct ApplyExtent<ValueType[N], Kokkos::Experimental::dynamic_extent> {
|
struct ApplyExtent<ValueType[N], dynamic_extent> {
|
||||||
using type =
|
using type = typename ApplyExtent<ValueType, dynamic_extent>::type[N];
|
||||||
typename ApplyExtent<ValueType,
|
|
||||||
Kokkos::Experimental::dynamic_extent>::type[N];
|
|
||||||
};
|
};
|
||||||
|
|
||||||
} // end namespace Impl
|
} // end namespace Impl
|
||||||
|
|||||||
@ -167,6 +167,9 @@ Graph<ExecutionSpace> create_graph(Closure&& arg_closure) {
|
|||||||
#include <HIP/Kokkos_HIP_Graph_Impl.hpp>
|
#include <HIP/Kokkos_HIP_Graph_Impl.hpp>
|
||||||
#endif
|
#endif
|
||||||
#endif
|
#endif
|
||||||
|
#ifdef SYCL_EXT_ONEAPI_GRAPH
|
||||||
|
#include <SYCL/Kokkos_SYCL_Graph_Impl.hpp>
|
||||||
|
#endif
|
||||||
#ifdef KOKKOS_IMPL_PUBLIC_INCLUDE_NOTDEFINED_GRAPH
|
#ifdef KOKKOS_IMPL_PUBLIC_INCLUDE_NOTDEFINED_GRAPH
|
||||||
#undef KOKKOS_IMPL_PUBLIC_INCLUDE
|
#undef KOKKOS_IMPL_PUBLIC_INCLUDE
|
||||||
#undef KOKKOS_IMPL_PUBLIC_INCLUDE_NOTDEFINED_GRAPH
|
#undef KOKKOS_IMPL_PUBLIC_INCLUDE_NOTDEFINED_GRAPH
|
||||||
|
|||||||
@ -113,7 +113,6 @@ class HostSpace {
|
|||||||
const size_t arg_alloc_size,
|
const size_t arg_alloc_size,
|
||||||
const size_t arg_logical_size = 0) const;
|
const size_t arg_logical_size = 0) const;
|
||||||
|
|
||||||
private:
|
|
||||||
void* impl_allocate(const char* arg_label, const size_t arg_alloc_size,
|
void* impl_allocate(const char* arg_label, const size_t arg_alloc_size,
|
||||||
const size_t arg_logical_size = 0,
|
const size_t arg_logical_size = 0,
|
||||||
const Kokkos::Tools::SpaceHandle =
|
const Kokkos::Tools::SpaceHandle =
|
||||||
@ -124,7 +123,6 @@ class HostSpace {
|
|||||||
const Kokkos::Tools::SpaceHandle =
|
const Kokkos::Tools::SpaceHandle =
|
||||||
Kokkos::Tools::make_space_handle(name())) const;
|
Kokkos::Tools::make_space_handle(name())) const;
|
||||||
|
|
||||||
public:
|
|
||||||
/**\brief Return Name of the MemorySpace */
|
/**\brief Return Name of the MemorySpace */
|
||||||
static constexpr const char* name() { return m_name; }
|
static constexpr const char* name() { return m_name; }
|
||||||
|
|
||||||
|
|||||||
@ -217,81 +217,12 @@ enum class Iterate {
|
|||||||
Right // Right indices stride fastest
|
Right // Right indices stride fastest
|
||||||
};
|
};
|
||||||
|
|
||||||
// To check for LayoutTiled
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
// This is to hide extra compile-time 'identifier' info within the LayoutTiled
|
template <typename Layout, class Enable = void>
|
||||||
// class by not relying on template specialization to include the ArgN*'s
|
struct KOKKOS_DEPRECATED is_layouttiled : std::false_type {};
|
||||||
template <typename LayoutTiledCheck, class Enable = void>
|
#endif
|
||||||
struct is_layouttiled : std::false_type {};
|
|
||||||
|
|
||||||
template <typename LayoutTiledCheck>
|
|
||||||
struct is_layouttiled<LayoutTiledCheck,
|
|
||||||
std::enable_if_t<LayoutTiledCheck::is_array_layout_tiled>>
|
|
||||||
: std::true_type {};
|
|
||||||
|
|
||||||
namespace Experimental {
|
|
||||||
|
|
||||||
/// LayoutTiled
|
|
||||||
// Must have Rank >= 2
|
|
||||||
template <
|
|
||||||
Kokkos::Iterate OuterP, Kokkos::Iterate InnerP, unsigned ArgN0,
|
|
||||||
unsigned ArgN1, unsigned ArgN2 = 0, unsigned ArgN3 = 0, unsigned ArgN4 = 0,
|
|
||||||
unsigned ArgN5 = 0, unsigned ArgN6 = 0, unsigned ArgN7 = 0,
|
|
||||||
bool IsPowerOfTwo =
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN0) &&
|
|
||||||
Kokkos::Impl::is_integral_power_of_two(ArgN1) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN2) || (ArgN2 == 0)) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN3) || (ArgN3 == 0)) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN4) || (ArgN4 == 0)) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN5) || (ArgN5 == 0)) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN6) || (ArgN6 == 0)) &&
|
|
||||||
(Kokkos::Impl::is_integral_power_of_two(ArgN7) || (ArgN7 == 0)))>
|
|
||||||
struct LayoutTiled {
|
|
||||||
static_assert(IsPowerOfTwo,
|
|
||||||
"LayoutTiled must be given power-of-two tile dimensions");
|
|
||||||
|
|
||||||
using array_layout = LayoutTiled<OuterP, InnerP, ArgN0, ArgN1, ArgN2, ArgN3,
|
|
||||||
ArgN4, ArgN5, ArgN6, ArgN7, IsPowerOfTwo>;
|
|
||||||
static constexpr Iterate outer_pattern = OuterP;
|
|
||||||
static constexpr Iterate inner_pattern = InnerP;
|
|
||||||
|
|
||||||
enum { N0 = ArgN0 };
|
|
||||||
enum { N1 = ArgN1 };
|
|
||||||
enum { N2 = ArgN2 };
|
|
||||||
enum { N3 = ArgN3 };
|
|
||||||
enum { N4 = ArgN4 };
|
|
||||||
enum { N5 = ArgN5 };
|
|
||||||
enum { N6 = ArgN6 };
|
|
||||||
enum { N7 = ArgN7 };
|
|
||||||
|
|
||||||
size_t dimension[ARRAY_LAYOUT_MAX_RANK];
|
|
||||||
|
|
||||||
enum : bool { is_extent_constructible = true };
|
|
||||||
|
|
||||||
LayoutTiled(LayoutTiled const&) = default;
|
|
||||||
LayoutTiled(LayoutTiled&&) = default;
|
|
||||||
LayoutTiled& operator=(LayoutTiled const&) = default;
|
|
||||||
LayoutTiled& operator=(LayoutTiled&&) = default;
|
|
||||||
|
|
||||||
KOKKOS_INLINE_FUNCTION
|
|
||||||
explicit constexpr LayoutTiled(size_t argN0 = 0, size_t argN1 = 0,
|
|
||||||
size_t argN2 = 0, size_t argN3 = 0,
|
|
||||||
size_t argN4 = 0, size_t argN5 = 0,
|
|
||||||
size_t argN6 = 0, size_t argN7 = 0)
|
|
||||||
: dimension{argN0, argN1, argN2, argN3, argN4, argN5, argN6, argN7} {}
|
|
||||||
|
|
||||||
friend bool operator==(const LayoutTiled& left, const LayoutTiled& right) {
|
|
||||||
for (unsigned int rank = 0; rank < ARRAY_LAYOUT_MAX_RANK; ++rank)
|
|
||||||
if (left.dimension[rank] != right.dimension[rank]) return false;
|
|
||||||
return true;
|
|
||||||
}
|
|
||||||
|
|
||||||
friend bool operator!=(const LayoutTiled& left, const LayoutTiled& right) {
|
|
||||||
return !(left == right);
|
|
||||||
}
|
|
||||||
};
|
|
||||||
|
|
||||||
} // namespace Experimental
|
|
||||||
|
|
||||||
|
namespace Impl {
|
||||||
// For use with view_copy
|
// For use with view_copy
|
||||||
template <typename... Layout>
|
template <typename... Layout>
|
||||||
struct layout_iterate_type_selector {
|
struct layout_iterate_type_selector {
|
||||||
@ -320,42 +251,13 @@ struct layout_iterate_type_selector<Kokkos::LayoutStride> {
|
|||||||
static const Kokkos::Iterate inner_iteration_pattern =
|
static const Kokkos::Iterate inner_iteration_pattern =
|
||||||
Kokkos::Iterate::Default;
|
Kokkos::Iterate::Default;
|
||||||
};
|
};
|
||||||
|
} // namespace Impl
|
||||||
|
|
||||||
template <unsigned ArgN0, unsigned ArgN1, unsigned ArgN2, unsigned ArgN3,
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
unsigned ArgN4, unsigned ArgN5, unsigned ArgN6, unsigned ArgN7>
|
template <typename... Layout>
|
||||||
struct layout_iterate_type_selector<Kokkos::Experimental::LayoutTiled<
|
using layout_iterate_type_selector KOKKOS_DEPRECATED =
|
||||||
Kokkos::Iterate::Left, Kokkos::Iterate::Left, ArgN0, ArgN1, ArgN2, ArgN3,
|
Impl::layout_iterate_type_selector<Layout...>;
|
||||||
ArgN4, ArgN5, ArgN6, ArgN7, true>> {
|
#endif
|
||||||
static const Kokkos::Iterate outer_iteration_pattern = Kokkos::Iterate::Left;
|
|
||||||
static const Kokkos::Iterate inner_iteration_pattern = Kokkos::Iterate::Left;
|
|
||||||
};
|
|
||||||
|
|
||||||
template <unsigned ArgN0, unsigned ArgN1, unsigned ArgN2, unsigned ArgN3,
|
|
||||||
unsigned ArgN4, unsigned ArgN5, unsigned ArgN6, unsigned ArgN7>
|
|
||||||
struct layout_iterate_type_selector<Kokkos::Experimental::LayoutTiled<
|
|
||||||
Kokkos::Iterate::Right, Kokkos::Iterate::Left, ArgN0, ArgN1, ArgN2, ArgN3,
|
|
||||||
ArgN4, ArgN5, ArgN6, ArgN7, true>> {
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern = Kokkos::Iterate::Right;
|
|
||||||
static const Kokkos::Iterate inner_iteration_pattern = Kokkos::Iterate::Left;
|
|
||||||
};
|
|
||||||
|
|
||||||
template <unsigned ArgN0, unsigned ArgN1, unsigned ArgN2, unsigned ArgN3,
|
|
||||||
unsigned ArgN4, unsigned ArgN5, unsigned ArgN6, unsigned ArgN7>
|
|
||||||
struct layout_iterate_type_selector<Kokkos::Experimental::LayoutTiled<
|
|
||||||
Kokkos::Iterate::Left, Kokkos::Iterate::Right, ArgN0, ArgN1, ArgN2, ArgN3,
|
|
||||||
ArgN4, ArgN5, ArgN6, ArgN7, true>> {
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern = Kokkos::Iterate::Left;
|
|
||||||
static const Kokkos::Iterate inner_iteration_pattern = Kokkos::Iterate::Right;
|
|
||||||
};
|
|
||||||
|
|
||||||
template <unsigned ArgN0, unsigned ArgN1, unsigned ArgN2, unsigned ArgN3,
|
|
||||||
unsigned ArgN4, unsigned ArgN5, unsigned ArgN6, unsigned ArgN7>
|
|
||||||
struct layout_iterate_type_selector<Kokkos::Experimental::LayoutTiled<
|
|
||||||
Kokkos::Iterate::Right, Kokkos::Iterate::Right, ArgN0, ArgN1, ArgN2, ArgN3,
|
|
||||||
ArgN4, ArgN5, ArgN6, ArgN7, true>> {
|
|
||||||
static const Kokkos::Iterate outer_iteration_pattern = Kokkos::Iterate::Right;
|
|
||||||
static const Kokkos::Iterate inner_iteration_pattern = Kokkos::Iterate::Right;
|
|
||||||
};
|
|
||||||
|
|
||||||
} // namespace Kokkos
|
} // namespace Kokkos
|
||||||
|
|
||||||
|
|||||||
@ -55,9 +55,22 @@
|
|||||||
|
|
||||||
#ifndef KOKKOS_DONT_INCLUDE_CORE_CONFIG_H
|
#ifndef KOKKOS_DONT_INCLUDE_CORE_CONFIG_H
|
||||||
#include <KokkosCore_config.h>
|
#include <KokkosCore_config.h>
|
||||||
|
#include <impl/Kokkos_DesulAtomicsConfig.hpp>
|
||||||
#include <impl/Kokkos_NvidiaGpuArchitectures.hpp>
|
#include <impl/Kokkos_NvidiaGpuArchitectures.hpp>
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
#if !defined(KOKKOS_ENABLE_CXX17)
|
||||||
|
#if __has_include(<version>)
|
||||||
|
#include <version>
|
||||||
|
#else
|
||||||
|
#include <ciso646>
|
||||||
|
#endif
|
||||||
|
#if defined(_GLIBCXX_RELEASE) && _GLIBCXX_RELEASE < 10
|
||||||
|
#error \
|
||||||
|
"Compiling with support for C++20 or later requires a libstdc++ version later than 9"
|
||||||
|
#endif
|
||||||
|
#endif
|
||||||
|
|
||||||
//----------------------------------------------------------------------------
|
//----------------------------------------------------------------------------
|
||||||
/** Pick up compiler specific #define macros:
|
/** Pick up compiler specific #define macros:
|
||||||
*
|
*
|
||||||
@ -332,6 +345,10 @@
|
|||||||
#define KOKKOS_DEFAULTED_FUNCTION
|
#define KOKKOS_DEFAULTED_FUNCTION
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
#if !defined(KOKKOS_DEDUCTION_GUIDE)
|
||||||
|
#define KOKKOS_DEDUCTION_GUIDE
|
||||||
|
#endif
|
||||||
|
|
||||||
#if !defined(KOKKOS_IMPL_HOST_FUNCTION)
|
#if !defined(KOKKOS_IMPL_HOST_FUNCTION)
|
||||||
#define KOKKOS_IMPL_HOST_FUNCTION
|
#define KOKKOS_IMPL_HOST_FUNCTION
|
||||||
#endif
|
#endif
|
||||||
@ -562,8 +579,44 @@ static constexpr bool kokkos_omp_on_host() { return false; }
|
|||||||
#define KOKKOS_IMPL_WARNING(desc) KOKKOS_IMPL_DO_PRAGMA(message(#desc))
|
#define KOKKOS_IMPL_WARNING(desc) KOKKOS_IMPL_DO_PRAGMA(message(#desc))
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
// clang-format off
|
||||||
|
#if defined(__NVCOMPILER)
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH() \
|
||||||
|
_Pragma("diag_suppress 1216")
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP() \
|
||||||
|
_Pragma("diag_default 1216")
|
||||||
|
#elif defined(__EDG__)
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH() \
|
||||||
|
_Pragma("warning push") \
|
||||||
|
_Pragma("warning disable 1478")
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP() \
|
||||||
|
_Pragma("warning pop")
|
||||||
|
#elif defined(__GNUC__) || defined(__clang__)
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH() \
|
||||||
|
_Pragma("GCC diagnostic push") \
|
||||||
|
_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP() \
|
||||||
|
_Pragma("GCC diagnostic pop")
|
||||||
|
#elif defined(_MSC_VER)
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH() \
|
||||||
|
_Pragma("warning(push)") \
|
||||||
|
_Pragma("warning(disable: 4996)")
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP() \
|
||||||
|
_Pragma("warning(pop)")
|
||||||
|
#else
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH()
|
||||||
|
#define KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP()
|
||||||
|
#endif
|
||||||
|
// clang-format on
|
||||||
|
|
||||||
#define KOKKOS_ATTRIBUTE_NODISCARD [[nodiscard]]
|
#define KOKKOS_ATTRIBUTE_NODISCARD [[nodiscard]]
|
||||||
|
|
||||||
|
#ifndef KOKKOS_ENABLE_CXX17
|
||||||
|
#define KOKKOS_IMPL_ATTRIBUTE_UNLIKELY [[unlikely]]
|
||||||
|
#else
|
||||||
|
#define KOKKOS_IMPL_ATTRIBUTE_UNLIKELY
|
||||||
|
#endif
|
||||||
|
|
||||||
#if (defined(KOKKOS_COMPILER_GNU) || defined(KOKKOS_COMPILER_CLANG) || \
|
#if (defined(KOKKOS_COMPILER_GNU) || defined(KOKKOS_COMPILER_CLANG) || \
|
||||||
defined(KOKKOS_COMPILER_INTEL) || defined(KOKKOS_COMPILER_INTEL_LLVM) || \
|
defined(KOKKOS_COMPILER_INTEL) || defined(KOKKOS_COMPILER_INTEL_LLVM) || \
|
||||||
defined(KOKKOS_COMPILER_NVHPC)) && \
|
defined(KOKKOS_COMPILER_NVHPC)) && \
|
||||||
|
|||||||
@ -277,12 +277,20 @@ KOKKOS_INLINE_FUNCTION long long abs(long long n) {
|
|||||||
#endif
|
#endif
|
||||||
}
|
}
|
||||||
KOKKOS_INLINE_FUNCTION float abs(float x) {
|
KOKKOS_INLINE_FUNCTION float abs(float x) {
|
||||||
|
#ifdef KOKKOS_ENABLE_SYCL
|
||||||
|
return sycl::fabs(x); // sycl::abs is only provided for integral types
|
||||||
|
#else
|
||||||
using KOKKOS_IMPL_MATH_FUNCTIONS_NAMESPACE::abs;
|
using KOKKOS_IMPL_MATH_FUNCTIONS_NAMESPACE::abs;
|
||||||
return abs(x);
|
return abs(x);
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
KOKKOS_INLINE_FUNCTION double abs(double x) {
|
KOKKOS_INLINE_FUNCTION double abs(double x) {
|
||||||
|
#ifdef KOKKOS_ENABLE_SYCL
|
||||||
|
return sycl::fabs(x); // sycl::abs is only provided for integral types
|
||||||
|
#else
|
||||||
using KOKKOS_IMPL_MATH_FUNCTIONS_NAMESPACE::abs;
|
using KOKKOS_IMPL_MATH_FUNCTIONS_NAMESPACE::abs;
|
||||||
return abs(x);
|
return abs(x);
|
||||||
|
#endif
|
||||||
}
|
}
|
||||||
inline long double abs(long double x) {
|
inline long double abs(long double x) {
|
||||||
using std::abs;
|
using std::abs;
|
||||||
|
|||||||
@ -413,12 +413,13 @@ KOKKOS_FORCEINLINE_FUNCTION pair<T1&, T2&> tie(T1& x, T2& y) {
|
|||||||
return (pair<T1&, T2&>(x, y));
|
return (pair<T1&, T2&>(x, y));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
//
|
//
|
||||||
// Specialization of Kokkos::pair for a \c void second argument. This
|
// Specialization of Kokkos::pair for a \c void second argument. This
|
||||||
// is not actually a "pair"; it only contains one element, the first.
|
// is not actually a "pair"; it only contains one element, the first.
|
||||||
//
|
//
|
||||||
template <class T1>
|
template <class T1>
|
||||||
struct pair<T1, void> {
|
struct KOKKOS_DEPRECATED pair<T1, void> {
|
||||||
using first_type = T1;
|
using first_type = T1;
|
||||||
using second_type = void;
|
using second_type = void;
|
||||||
|
|
||||||
@ -448,41 +449,48 @@ struct pair<T1, void> {
|
|||||||
// Specialization of relational operators for Kokkos::pair<T1,void>.
|
// Specialization of relational operators for Kokkos::pair<T1,void>.
|
||||||
//
|
//
|
||||||
|
|
||||||
|
#if defined(KOKKOS_COMPILER_GNU) && (KOKKOS_COMPILER_GNU < 1110)
|
||||||
|
KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_PUSH()
|
||||||
|
#endif
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator==(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator==(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return lhs.first == rhs.first;
|
return lhs.first == rhs.first;
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator!=(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator!=(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return !(lhs == rhs);
|
return !(lhs == rhs);
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator<(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator<(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return lhs.first < rhs.first;
|
return lhs.first < rhs.first;
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator<=(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator<=(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return !(rhs < lhs);
|
return !(rhs < lhs);
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator>(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator>(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return rhs < lhs;
|
return rhs < lhs;
|
||||||
}
|
}
|
||||||
|
|
||||||
template <class T1>
|
template <class T1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator>=(
|
KOKKOS_DEPRECATED KOKKOS_FORCEINLINE_FUNCTION constexpr bool operator>=(
|
||||||
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
const pair<T1, void>& lhs, const pair<T1, void>& rhs) {
|
||||||
return !(lhs < rhs);
|
return !(lhs < rhs);
|
||||||
}
|
}
|
||||||
|
#if defined(KOKKOS_COMPILER_GNU) && (KOKKOS_COMPILER_GNU < 1110)
|
||||||
|
KOKKOS_IMPL_DISABLE_DEPRECATED_WARNINGS_POP()
|
||||||
|
#endif
|
||||||
|
#endif
|
||||||
|
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
template <class T>
|
template <class T>
|
||||||
|
|||||||
@ -137,9 +137,9 @@ inline void parallel_for(const std::string& str, const ExecPolicy& policy,
|
|||||||
ExecPolicy inner_policy = policy;
|
ExecPolicy inner_policy = policy;
|
||||||
Kokkos::Tools::Impl::begin_parallel_for(inner_policy, functor, str, kpID);
|
Kokkos::Tools::Impl::begin_parallel_for(inner_policy, functor, str, kpID);
|
||||||
|
|
||||||
Kokkos::Impl::shared_allocation_tracking_disable();
|
auto closure =
|
||||||
Impl::ParallelFor<FunctorType, ExecPolicy> closure(functor, inner_policy);
|
Kokkos::Impl::construct_with_shared_allocation_tracking_disabled<
|
||||||
Kokkos::Impl::shared_allocation_tracking_enable();
|
Impl::ParallelFor<FunctorType, ExecPolicy>>(functor, inner_policy);
|
||||||
|
|
||||||
closure.execute();
|
closure.execute();
|
||||||
|
|
||||||
@ -352,10 +352,10 @@ inline void parallel_scan(const std::string& str, const ExecutionPolicy& policy,
|
|||||||
ExecutionPolicy inner_policy = policy;
|
ExecutionPolicy inner_policy = policy;
|
||||||
Kokkos::Tools::Impl::begin_parallel_scan(inner_policy, functor, str, kpID);
|
Kokkos::Tools::Impl::begin_parallel_scan(inner_policy, functor, str, kpID);
|
||||||
|
|
||||||
Kokkos::Impl::shared_allocation_tracking_disable();
|
auto closure =
|
||||||
Impl::ParallelScan<FunctorType, ExecutionPolicy> closure(functor,
|
Kokkos::Impl::construct_with_shared_allocation_tracking_disabled<
|
||||||
inner_policy);
|
Impl::ParallelScan<FunctorType, ExecutionPolicy>>(functor,
|
||||||
Kokkos::Impl::shared_allocation_tracking_enable();
|
inner_policy);
|
||||||
|
|
||||||
closure.execute();
|
closure.execute();
|
||||||
|
|
||||||
@ -398,18 +398,19 @@ inline void parallel_scan(const std::string& str, const ExecutionPolicy& policy,
|
|||||||
Kokkos::Tools::Impl::begin_parallel_scan(inner_policy, functor, str, kpID);
|
Kokkos::Tools::Impl::begin_parallel_scan(inner_policy, functor, str, kpID);
|
||||||
|
|
||||||
if constexpr (Kokkos::is_view<ReturnType>::value) {
|
if constexpr (Kokkos::is_view<ReturnType>::value) {
|
||||||
Kokkos::Impl::shared_allocation_tracking_disable();
|
auto closure =
|
||||||
Impl::ParallelScanWithTotal<FunctorType, ExecutionPolicy,
|
Kokkos::Impl::construct_with_shared_allocation_tracking_disabled<
|
||||||
typename ReturnType::value_type>
|
Impl::ParallelScanWithTotal<FunctorType, ExecutionPolicy,
|
||||||
closure(functor, inner_policy, return_value);
|
typename ReturnType::value_type>>(
|
||||||
Kokkos::Impl::shared_allocation_tracking_enable();
|
functor, inner_policy, return_value);
|
||||||
closure.execute();
|
closure.execute();
|
||||||
} else {
|
} else {
|
||||||
Kokkos::Impl::shared_allocation_tracking_disable();
|
|
||||||
Kokkos::View<ReturnType, Kokkos::HostSpace> view(&return_value);
|
Kokkos::View<ReturnType, Kokkos::HostSpace> view(&return_value);
|
||||||
Impl::ParallelScanWithTotal<FunctorType, ExecutionPolicy, ReturnType>
|
auto closure =
|
||||||
closure(functor, inner_policy, view);
|
Kokkos::Impl::construct_with_shared_allocation_tracking_disabled<
|
||||||
Kokkos::Impl::shared_allocation_tracking_enable();
|
Impl::ParallelScanWithTotal<FunctorType, ExecutionPolicy,
|
||||||
|
ReturnType>>(functor, inner_policy,
|
||||||
|
view);
|
||||||
closure.execute();
|
closure.execute();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@ -72,7 +72,7 @@ struct Sum {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
Sum(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE Sum(View<Scalar, Properties...> const&)
|
||||||
->Sum<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->Sum<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -117,7 +117,7 @@ struct Prod {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
Prod(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE Prod(View<Scalar, Properties...> const&)
|
||||||
->Prod<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->Prod<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -164,7 +164,7 @@ struct Min {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
Min(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE Min(View<Scalar, Properties...> const&)
|
||||||
->Min<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->Min<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -212,7 +212,7 @@ struct Max {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
Max(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE Max(View<Scalar, Properties...> const&)
|
||||||
->Max<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->Max<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -258,7 +258,7 @@ struct LAnd {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
LAnd(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE LAnd(View<Scalar, Properties...> const&)
|
||||||
->LAnd<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->LAnd<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -305,7 +305,7 @@ struct LOr {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
LOr(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE LOr(View<Scalar, Properties...> const&)
|
||||||
->LOr<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->LOr<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -352,7 +352,7 @@ struct BAnd {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
BAnd(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE BAnd(View<Scalar, Properties...> const&)
|
||||||
->BAnd<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->BAnd<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Space>
|
template <class Scalar, class Space>
|
||||||
@ -399,7 +399,7 @@ struct BOr {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
BOr(View<Scalar, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE BOr(View<Scalar, Properties...> const&)
|
||||||
->BOr<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
->BOr<Scalar, typename View<Scalar, Properties...>::memory_space>;
|
||||||
|
|
||||||
template <class Scalar, class Index>
|
template <class Scalar, class Index>
|
||||||
@ -458,7 +458,8 @@ struct MinLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MinLoc(View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MinLoc(
|
||||||
|
View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MinLoc<Scalar, Index,
|
->MinLoc<Scalar, Index,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -513,7 +514,8 @@ struct MaxLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MaxLoc(View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MaxLoc(
|
||||||
|
View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MaxLoc<Scalar, Index,
|
->MaxLoc<Scalar, Index,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -577,7 +579,7 @@ struct MinMax {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename... Properties>
|
template <typename Scalar, typename... Properties>
|
||||||
MinMax(View<MinMaxScalar<Scalar>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MinMax(View<MinMaxScalar<Scalar>, Properties...> const&)
|
||||||
->MinMax<Scalar,
|
->MinMax<Scalar,
|
||||||
typename View<MinMaxScalar<Scalar>, Properties...>::memory_space>;
|
typename View<MinMaxScalar<Scalar>, Properties...>::memory_space>;
|
||||||
|
|
||||||
@ -646,7 +648,8 @@ struct MinMaxLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MinMaxLoc(View<MinMaxLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MinMaxLoc(
|
||||||
|
View<MinMaxLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MinMaxLoc<Scalar, Index,
|
->MinMaxLoc<Scalar, Index,
|
||||||
typename View<MinMaxLocScalar<Scalar, Index>,
|
typename View<MinMaxLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -713,7 +716,8 @@ struct MaxFirstLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MaxFirstLoc(View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MaxFirstLoc(
|
||||||
|
View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MaxFirstLoc<Scalar, Index,
|
->MaxFirstLoc<Scalar, Index,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -782,7 +786,7 @@ struct MaxFirstLocCustomComparator {
|
|||||||
|
|
||||||
template <typename Scalar, typename Index, typename ComparatorType,
|
template <typename Scalar, typename Index, typename ComparatorType,
|
||||||
typename... Properties>
|
typename... Properties>
|
||||||
MaxFirstLocCustomComparator(
|
KOKKOS_DEDUCTION_GUIDE MaxFirstLocCustomComparator(
|
||||||
View<ValLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
View<ValLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
||||||
->MaxFirstLocCustomComparator<Scalar, Index, ComparatorType,
|
->MaxFirstLocCustomComparator<Scalar, Index, ComparatorType,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
@ -846,7 +850,8 @@ struct MinFirstLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MinFirstLoc(View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MinFirstLoc(
|
||||||
|
View<ValLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MinFirstLoc<Scalar, Index,
|
->MinFirstLoc<Scalar, Index,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -915,7 +920,7 @@ struct MinFirstLocCustomComparator {
|
|||||||
|
|
||||||
template <typename Scalar, typename Index, typename ComparatorType,
|
template <typename Scalar, typename Index, typename ComparatorType,
|
||||||
typename... Properties>
|
typename... Properties>
|
||||||
MinFirstLocCustomComparator(
|
KOKKOS_DEDUCTION_GUIDE MinFirstLocCustomComparator(
|
||||||
View<ValLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
View<ValLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
||||||
->MinFirstLocCustomComparator<Scalar, Index, ComparatorType,
|
->MinFirstLocCustomComparator<Scalar, Index, ComparatorType,
|
||||||
typename View<ValLocScalar<Scalar, Index>,
|
typename View<ValLocScalar<Scalar, Index>,
|
||||||
@ -990,7 +995,8 @@ struct MinMaxFirstLastLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Scalar, typename Index, typename... Properties>
|
template <typename Scalar, typename Index, typename... Properties>
|
||||||
MinMaxFirstLastLoc(View<MinMaxLocScalar<Scalar, Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE MinMaxFirstLastLoc(
|
||||||
|
View<MinMaxLocScalar<Scalar, Index>, Properties...> const&)
|
||||||
->MinMaxFirstLastLoc<Scalar, Index,
|
->MinMaxFirstLastLoc<Scalar, Index,
|
||||||
typename View<MinMaxLocScalar<Scalar, Index>,
|
typename View<MinMaxLocScalar<Scalar, Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
@ -1069,7 +1075,7 @@ struct MinMaxFirstLastLocCustomComparator {
|
|||||||
|
|
||||||
template <typename Scalar, typename Index, typename ComparatorType,
|
template <typename Scalar, typename Index, typename ComparatorType,
|
||||||
typename... Properties>
|
typename... Properties>
|
||||||
MinMaxFirstLastLocCustomComparator(
|
KOKKOS_DEDUCTION_GUIDE MinMaxFirstLastLocCustomComparator(
|
||||||
View<MinMaxLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
View<MinMaxLocScalar<Scalar, Index>, Properties...> const&, ComparatorType)
|
||||||
->MinMaxFirstLastLocCustomComparator<
|
->MinMaxFirstLastLocCustomComparator<
|
||||||
Scalar, Index, ComparatorType,
|
Scalar, Index, ComparatorType,
|
||||||
@ -1133,7 +1139,8 @@ struct FirstLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Index, typename... Properties>
|
template <typename Index, typename... Properties>
|
||||||
FirstLoc(View<FirstLocScalar<Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE FirstLoc(
|
||||||
|
View<FirstLocScalar<Index>, Properties...> const&)
|
||||||
->FirstLoc<Index, typename View<FirstLocScalar<Index>,
|
->FirstLoc<Index, typename View<FirstLocScalar<Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
|
|
||||||
@ -1194,7 +1201,7 @@ struct LastLoc {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Index, typename... Properties>
|
template <typename Index, typename... Properties>
|
||||||
LastLoc(View<LastLocScalar<Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE LastLoc(View<LastLocScalar<Index>, Properties...> const&)
|
||||||
->LastLoc<Index,
|
->LastLoc<Index,
|
||||||
typename View<LastLocScalar<Index>, Properties...>::memory_space>;
|
typename View<LastLocScalar<Index>, Properties...>::memory_space>;
|
||||||
|
|
||||||
@ -1261,7 +1268,8 @@ struct StdIsPartitioned {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Index, typename... Properties>
|
template <typename Index, typename... Properties>
|
||||||
StdIsPartitioned(View<StdIsPartScalar<Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE StdIsPartitioned(
|
||||||
|
View<StdIsPartScalar<Index>, Properties...> const&)
|
||||||
->StdIsPartitioned<Index, typename View<StdIsPartScalar<Index>,
|
->StdIsPartitioned<Index, typename View<StdIsPartScalar<Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
|
|
||||||
@ -1323,7 +1331,8 @@ struct StdPartitionPoint {
|
|||||||
};
|
};
|
||||||
|
|
||||||
template <typename Index, typename... Properties>
|
template <typename Index, typename... Properties>
|
||||||
StdPartitionPoint(View<StdPartPointScalar<Index>, Properties...> const&)
|
KOKKOS_DEDUCTION_GUIDE StdPartitionPoint(
|
||||||
|
View<StdPartPointScalar<Index>, Properties...> const&)
|
||||||
->StdPartitionPoint<Index, typename View<StdPartPointScalar<Index>,
|
->StdPartitionPoint<Index, typename View<StdPartPointScalar<Index>,
|
||||||
Properties...>::memory_space>;
|
Properties...>::memory_space>;
|
||||||
|
|
||||||
@ -1502,18 +1511,18 @@ struct ParallelReduceAdaptor {
|
|||||||
using Analysis = FunctorAnalysis<FunctorPatternInterface::REDUCE,
|
using Analysis = FunctorAnalysis<FunctorPatternInterface::REDUCE,
|
||||||
PolicyType, typename ReducerSelector::type,
|
PolicyType, typename ReducerSelector::type,
|
||||||
typename return_value_adapter::value_type>;
|
typename return_value_adapter::value_type>;
|
||||||
Kokkos::Impl::shared_allocation_tracking_disable();
|
|
||||||
CombinedFunctorReducer functor_reducer(
|
|
||||||
functor, typename Analysis::Reducer(
|
|
||||||
ReducerSelector::select(functor, return_value)));
|
|
||||||
|
|
||||||
// FIXME Remove "Wrapper" once all backends implement the new interface
|
using CombinedFunctorReducerType =
|
||||||
Impl::ParallelReduce<decltype(functor_reducer), PolicyType,
|
CombinedFunctorReducer<FunctorType, typename Analysis::Reducer>;
|
||||||
typename Impl::FunctorPolicyExecutionSpace<
|
auto closure = construct_with_shared_allocation_tracking_disabled<
|
||||||
FunctorType, PolicyType>::execution_space>
|
Impl::ParallelReduce<CombinedFunctorReducerType, PolicyType,
|
||||||
closure(functor_reducer, inner_policy,
|
typename Impl::FunctorPolicyExecutionSpace<
|
||||||
return_value_adapter::return_value(return_value, functor));
|
FunctorType, PolicyType>::execution_space>>(
|
||||||
Kokkos::Impl::shared_allocation_tracking_enable();
|
CombinedFunctorReducerType(
|
||||||
|
functor, typename Analysis::Reducer(
|
||||||
|
ReducerSelector::select(functor, return_value))),
|
||||||
|
inner_policy,
|
||||||
|
return_value_adapter::return_value(return_value, functor));
|
||||||
closure.execute();
|
closure.execute();
|
||||||
|
|
||||||
Kokkos::Tools::Impl::end_parallel_reduce<PassedReducerType>(
|
Kokkos::Tools::Impl::end_parallel_reduce<PassedReducerType>(
|
||||||
|
|||||||
@ -38,6 +38,8 @@ static_assert(false,
|
|||||||
|
|
||||||
#ifdef KOKKOS_ENABLE_IMPL_MDSPAN
|
#ifdef KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
#include <View/MDSpan/Kokkos_MDSpan_Extents.hpp>
|
#include <View/MDSpan/Kokkos_MDSpan_Extents.hpp>
|
||||||
|
#include <View/MDSpan/Kokkos_MDSpan_Layout.hpp>
|
||||||
|
#include <View/MDSpan/Kokkos_MDSpan_Accessor.hpp>
|
||||||
#endif
|
#endif
|
||||||
#include <Kokkos_MinMax.hpp>
|
#include <Kokkos_MinMax.hpp>
|
||||||
|
|
||||||
@ -372,6 +374,35 @@ struct ViewTraits {
|
|||||||
//------------------------------------
|
//------------------------------------
|
||||||
};
|
};
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
namespace Impl {
|
||||||
|
struct UnsupportedKokkosArrayLayout;
|
||||||
|
|
||||||
|
template <class Traits, class Enabled = void>
|
||||||
|
struct MDSpanViewTraits {
|
||||||
|
using mdspan_type = UnsupportedKokkosArrayLayout;
|
||||||
|
};
|
||||||
|
|
||||||
|
// "Natural" mdspan for a view if the View's ArrayLayout is supported.
|
||||||
|
template <class Traits>
|
||||||
|
struct MDSpanViewTraits<Traits,
|
||||||
|
std::void_t<typename Impl::LayoutFromArrayLayout<
|
||||||
|
typename Traits::array_layout>::type>> {
|
||||||
|
using index_type = std::size_t;
|
||||||
|
using extents_type =
|
||||||
|
typename Impl::ExtentsFromDataType<index_type,
|
||||||
|
typename Traits::data_type>::type;
|
||||||
|
using mdspan_layout_type =
|
||||||
|
typename Impl::LayoutFromArrayLayout<typename Traits::array_layout>::type;
|
||||||
|
using accessor_type = Impl::SpaceAwareAccessor<
|
||||||
|
typename Traits::memory_space,
|
||||||
|
Kokkos::default_accessor<typename Traits::value_type>>;
|
||||||
|
using mdspan_type = mdspan<typename Traits::value_type, extents_type,
|
||||||
|
mdspan_layout_type, accessor_type>;
|
||||||
|
};
|
||||||
|
} // namespace Impl
|
||||||
|
#endif // KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
|
||||||
/** \class View
|
/** \class View
|
||||||
* \brief View to an array of data.
|
* \brief View to an array of data.
|
||||||
*
|
*
|
||||||
@ -522,7 +553,6 @@ constexpr bool is_assignable(const Kokkos::View<ViewTDst...>& dst,
|
|||||||
//----------------------------------------------------------------------------
|
//----------------------------------------------------------------------------
|
||||||
|
|
||||||
#include <impl/Kokkos_ViewMapping.hpp>
|
#include <impl/Kokkos_ViewMapping.hpp>
|
||||||
#include <impl/Kokkos_ViewArray.hpp>
|
|
||||||
|
|
||||||
//----------------------------------------------------------------------------
|
//----------------------------------------------------------------------------
|
||||||
//----------------------------------------------------------------------------
|
//----------------------------------------------------------------------------
|
||||||
@ -923,57 +953,30 @@ class View : public ViewTraits<DataType, Properties...> {
|
|||||||
template <typename I0, typename I1>
|
template <typename I0, typename I1>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
||||||
(Kokkos::Impl::always_true<I0, I1>::value && //
|
(Kokkos::Impl::always_true<I0, I1>::value && //
|
||||||
(2 == rank) && is_default_map && is_layout_left && (rank_dynamic == 0)),
|
(2 == rank) && is_default_map &&
|
||||||
|
(is_layout_left || is_layout_right || is_layout_stride)),
|
||||||
reference_type>
|
reference_type>
|
||||||
operator()(I0 i0, I1 i1) const {
|
operator()(I0 i0, I1 i1) const {
|
||||||
check_operator_parens_valid_args(i0, i1);
|
check_operator_parens_valid_args(i0, i1);
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
||||||
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_dim.N0 * i1];
|
if constexpr (is_layout_left) {
|
||||||
}
|
if constexpr (rank_dynamic == 0)
|
||||||
|
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_dim.N0 * i1];
|
||||||
template <typename I0, typename I1>
|
else
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_stride * i1];
|
||||||
(Kokkos::Impl::always_true<I0, I1>::value && //
|
} else if constexpr (is_layout_right) {
|
||||||
(2 == rank) && is_default_map && is_layout_left && (rank_dynamic != 0)),
|
if constexpr (rank_dynamic == 0)
|
||||||
reference_type>
|
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_dim.N1 * i0];
|
||||||
operator()(I0 i0, I1 i1) const {
|
else
|
||||||
check_operator_parens_valid_args(i0, i1);
|
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_stride * i0];
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
} else {
|
||||||
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_stride * i1];
|
static_assert(is_layout_stride);
|
||||||
}
|
return m_map.m_impl_handle[i0 * m_map.m_impl_offset.m_stride.S0 +
|
||||||
|
i1 * m_map.m_impl_offset.m_stride.S1];
|
||||||
template <typename I0, typename I1>
|
}
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
#if defined KOKKOS_COMPILER_INTEL
|
||||||
(Kokkos::Impl::always_true<I0, I1>::value && //
|
__builtin_unreachable();
|
||||||
(2 == rank) && is_default_map && is_layout_right && (rank_dynamic == 0)),
|
#endif
|
||||||
reference_type>
|
|
||||||
operator()(I0 i0, I1 i1) const {
|
|
||||||
check_operator_parens_valid_args(i0, i1);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
|
||||||
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_dim.N1 * i0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
|
||||||
(Kokkos::Impl::always_true<I0, I1>::value && //
|
|
||||||
(2 == rank) && is_default_map && is_layout_right && (rank_dynamic != 0)),
|
|
||||||
reference_type>
|
|
||||||
operator()(I0 i0, I1 i1) const {
|
|
||||||
check_operator_parens_valid_args(i0, i1);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
|
||||||
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_stride * i0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION
|
|
||||||
std::enable_if_t<(Kokkos::Impl::always_true<I0, I1>::value && //
|
|
||||||
(2 == rank) && is_default_map && is_layout_stride),
|
|
||||||
reference_type>
|
|
||||||
operator()(I0 i0, I1 i1) const {
|
|
||||||
check_operator_parens_valid_args(i0, i1);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1)
|
|
||||||
return m_map.m_impl_handle[i0 * m_map.m_impl_offset.m_stride.S0 +
|
|
||||||
i1 * m_map.m_impl_offset.m_stride.S1];
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Rank 0 -> 8 operator() except for rank-1 and rank-2 with default map which
|
// Rank 0 -> 8 operator() except for rank-1 and rank-2 with default map which
|
||||||
@ -1066,57 +1069,30 @@ class View : public ViewTraits<DataType, Properties...> {
|
|||||||
template <typename I0, typename I1, typename... Is>
|
template <typename I0, typename I1, typename... Is>
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
||||||
(Kokkos::Impl::always_true<I0, I1, Is...>::value && (2 == rank) &&
|
(Kokkos::Impl::always_true<I0, I1, Is...>::value && (2 == rank) &&
|
||||||
is_default_map && is_layout_left && (rank_dynamic == 0)),
|
is_default_map &&
|
||||||
|
(is_layout_left || is_layout_right || is_layout_stride)),
|
||||||
reference_type>
|
reference_type>
|
||||||
access(I0 i0, I1 i1, Is... extra) const {
|
access(I0 i0, I1 i1, Is... extra) const {
|
||||||
check_access_member_function_valid_args(i0, i1, extra...);
|
check_access_member_function_valid_args(i0, i1, extra...);
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
||||||
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_dim.N0 * i1];
|
if constexpr (is_layout_left) {
|
||||||
}
|
if constexpr (rank_dynamic == 0)
|
||||||
|
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_dim.N0 * i1];
|
||||||
template <typename I0, typename I1, typename... Is>
|
else
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_stride * i1];
|
||||||
(Kokkos::Impl::always_true<I0, I1, Is...>::value && (2 == rank) &&
|
} else if constexpr (is_layout_right) {
|
||||||
is_default_map && is_layout_left && (rank_dynamic != 0)),
|
if constexpr (rank_dynamic == 0)
|
||||||
reference_type>
|
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_dim.N1 * i0];
|
||||||
access(I0 i0, I1 i1, Is... extra) const {
|
else
|
||||||
check_access_member_function_valid_args(i0, i1, extra...);
|
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_stride * i0];
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
} else {
|
||||||
return m_map.m_impl_handle[i0 + m_map.m_impl_offset.m_stride * i1];
|
static_assert(is_layout_stride);
|
||||||
}
|
return m_map.m_impl_handle[i0 * m_map.m_impl_offset.m_stride.S0 +
|
||||||
|
i1 * m_map.m_impl_offset.m_stride.S1];
|
||||||
template <typename I0, typename I1, typename... Is>
|
}
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
#if defined KOKKOS_COMPILER_INTEL
|
||||||
(Kokkos::Impl::always_true<I0, I1, Is...>::value && (2 == rank) &&
|
__builtin_unreachable();
|
||||||
is_default_map && is_layout_right && (rank_dynamic == 0)),
|
#endif
|
||||||
reference_type>
|
|
||||||
access(I0 i0, I1 i1, Is... extra) const {
|
|
||||||
check_access_member_function_valid_args(i0, i1, extra...);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
|
||||||
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_dim.N1 * i0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1, typename... Is>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION std::enable_if_t<
|
|
||||||
(Kokkos::Impl::always_true<I0, I1, Is...>::value && (2 == rank) &&
|
|
||||||
is_default_map && is_layout_right && (rank_dynamic != 0)),
|
|
||||||
reference_type>
|
|
||||||
access(I0 i0, I1 i1, Is... extra) const {
|
|
||||||
check_access_member_function_valid_args(i0, i1, extra...);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
|
||||||
return m_map.m_impl_handle[i1 + m_map.m_impl_offset.m_stride * i0];
|
|
||||||
}
|
|
||||||
|
|
||||||
template <typename I0, typename I1, typename... Is>
|
|
||||||
KOKKOS_FORCEINLINE_FUNCTION
|
|
||||||
std::enable_if_t<(Kokkos::Impl::always_true<I0, I1, Is...>::value &&
|
|
||||||
(2 == rank) && is_default_map && is_layout_stride),
|
|
||||||
reference_type>
|
|
||||||
access(I0 i0, I1 i1, Is... extra) const {
|
|
||||||
check_access_member_function_valid_args(i0, i1, extra...);
|
|
||||||
KOKKOS_IMPL_VIEW_OPERATOR_VERIFY(m_track, m_map, i0, i1, extra...)
|
|
||||||
return m_map.m_impl_handle[i0 * m_map.m_impl_offset.m_stride.S0 +
|
|
||||||
i1 * m_map.m_impl_offset.m_stride.S1];
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//------------------------------
|
//------------------------------
|
||||||
@ -1442,8 +1418,7 @@ class View : public ViewTraits<DataType, Properties...> {
|
|||||||
std::is_same_v<typename traits::array_layout,
|
std::is_same_v<typename traits::array_layout,
|
||||||
Kokkos::LayoutRight> ||
|
Kokkos::LayoutRight> ||
|
||||||
std::is_same_v<typename traits::array_layout,
|
std::is_same_v<typename traits::array_layout,
|
||||||
Kokkos::LayoutStride> ||
|
Kokkos::LayoutStride>) {
|
||||||
is_layouttiled<typename traits::array_layout>::value) {
|
|
||||||
size_t i0 = arg_layout.dimension[0];
|
size_t i0 = arg_layout.dimension[0];
|
||||||
size_t i1 = arg_layout.dimension[1];
|
size_t i1 = arg_layout.dimension[1];
|
||||||
size_t i2 = arg_layout.dimension[2];
|
size_t i2 = arg_layout.dimension[2];
|
||||||
@ -1495,8 +1470,7 @@ class View : public ViewTraits<DataType, Properties...> {
|
|||||||
std::is_same_v<typename traits::array_layout,
|
std::is_same_v<typename traits::array_layout,
|
||||||
Kokkos::LayoutRight> ||
|
Kokkos::LayoutRight> ||
|
||||||
std::is_same_v<typename traits::array_layout,
|
std::is_same_v<typename traits::array_layout,
|
||||||
Kokkos::LayoutStride> ||
|
Kokkos::LayoutStride>) {
|
||||||
is_layouttiled<typename traits::array_layout>::value) {
|
|
||||||
size_t i0 = arg_layout.dimension[0];
|
size_t i0 = arg_layout.dimension[0];
|
||||||
size_t i1 = arg_layout.dimension[1];
|
size_t i1 = arg_layout.dimension[1];
|
||||||
size_t i2 = arg_layout.dimension[2];
|
size_t i2 = arg_layout.dimension[2];
|
||||||
@ -1725,6 +1699,79 @@ class View : public ViewTraits<DataType, Properties...> {
|
|||||||
"Layout is not constructible from extent arguments. Use "
|
"Layout is not constructible from extent arguments. Use "
|
||||||
"overload taking a layout object instead.");
|
"overload taking a layout object instead.");
|
||||||
}
|
}
|
||||||
|
|
||||||
|
//----------------------------------------
|
||||||
|
// MDSpan converting constructors
|
||||||
|
#ifdef KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
|
template <typename U = typename Impl::MDSpanViewTraits<traits>::mdspan_type>
|
||||||
|
KOKKOS_INLINE_FUNCTION
|
||||||
|
#ifndef KOKKOS_ENABLE_CXX17
|
||||||
|
explicit(traits::is_managed)
|
||||||
|
#endif
|
||||||
|
View(const typename Impl::MDSpanViewTraits<traits>::mdspan_type& mds,
|
||||||
|
std::enable_if_t<
|
||||||
|
!std::is_same_v<Impl::UnsupportedKokkosArrayLayout, U>>* =
|
||||||
|
nullptr)
|
||||||
|
: View(mds.data_handle(),
|
||||||
|
Impl::array_layout_from_mapping<
|
||||||
|
typename traits::array_layout,
|
||||||
|
typename Impl::MDSpanViewTraits<traits>::mdspan_type>(
|
||||||
|
mds.mapping())) {
|
||||||
|
}
|
||||||
|
|
||||||
|
template <class ElementType, class ExtentsType, class LayoutType,
|
||||||
|
class AccessorType>
|
||||||
|
KOKKOS_INLINE_FUNCTION
|
||||||
|
#ifndef KOKKOS_ENABLE_CXX17
|
||||||
|
explicit(!std::is_convertible_v<
|
||||||
|
Kokkos::mdspan<ElementType, ExtentsType, LayoutType,
|
||||||
|
AccessorType>,
|
||||||
|
typename Impl::MDSpanViewTraits<traits>::mdspan_type>)
|
||||||
|
#endif
|
||||||
|
View(const Kokkos::mdspan<ElementType, ExtentsType, LayoutType,
|
||||||
|
AccessorType>& mds)
|
||||||
|
: View(typename Impl::MDSpanViewTraits<traits>::mdspan_type(mds)) {
|
||||||
|
}
|
||||||
|
|
||||||
|
//----------------------------------------
|
||||||
|
// Conversion to MDSpan
|
||||||
|
template <class OtherElementType, class OtherExtents, class OtherLayoutPolicy,
|
||||||
|
class OtherAccessor,
|
||||||
|
class ImplNaturalMDSpanType =
|
||||||
|
typename Impl::MDSpanViewTraits<traits>::mdspan_type,
|
||||||
|
typename = std::enable_if_t<std::conditional_t<
|
||||||
|
std::is_same_v<Impl::UnsupportedKokkosArrayLayout,
|
||||||
|
ImplNaturalMDSpanType>,
|
||||||
|
std::false_type,
|
||||||
|
std::is_assignable<mdspan<OtherElementType, OtherExtents,
|
||||||
|
OtherLayoutPolicy, OtherAccessor>,
|
||||||
|
ImplNaturalMDSpanType>>::value>>
|
||||||
|
KOKKOS_INLINE_FUNCTION constexpr operator mdspan<
|
||||||
|
OtherElementType, OtherExtents, OtherLayoutPolicy, OtherAccessor>() {
|
||||||
|
using mdspan_type = typename Impl::MDSpanViewTraits<traits>::mdspan_type;
|
||||||
|
return mdspan_type{data(),
|
||||||
|
Impl::mapping_from_view_mapping<mdspan_type>(m_map)};
|
||||||
|
}
|
||||||
|
|
||||||
|
template <class OtherAccessorType = Impl::SpaceAwareAccessor<
|
||||||
|
typename traits::memory_space,
|
||||||
|
Kokkos::default_accessor<typename traits::value_type>>,
|
||||||
|
typename = std::enable_if_t<std::is_assignable_v<
|
||||||
|
typename traits::value_type*&,
|
||||||
|
typename OtherAccessorType::data_handle_type>>>
|
||||||
|
KOKKOS_INLINE_FUNCTION constexpr auto to_mdspan(
|
||||||
|
const OtherAccessorType& other_accessor =
|
||||||
|
typename Impl::MDSpanViewTraits<traits>::accessor_type()) {
|
||||||
|
using mdspan_type = typename Impl::MDSpanViewTraits<traits>::mdspan_type;
|
||||||
|
using ret_mdspan_type =
|
||||||
|
mdspan<typename mdspan_type::element_type,
|
||||||
|
typename mdspan_type::extents_type,
|
||||||
|
typename mdspan_type::layout_type, OtherAccessorType>;
|
||||||
|
return ret_mdspan_type{data(),
|
||||||
|
Impl::mapping_from_view_mapping<mdspan_type>(m_map),
|
||||||
|
other_accessor};
|
||||||
|
}
|
||||||
|
#endif // KOKKOS_ENABLE_IMPL_MDSPAN
|
||||||
};
|
};
|
||||||
|
|
||||||
template <typename D, class... P>
|
template <typename D, class... P>
|
||||||
@ -1878,23 +1925,6 @@ KOKKOS_INLINE_FUNCTION bool operator!=(const View<LT, LP...>& lhs,
|
|||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
inline void shared_allocation_tracking_disable() {
|
|
||||||
Kokkos::Impl::SharedAllocationRecord<void, void>::tracking_disable();
|
|
||||||
}
|
|
||||||
|
|
||||||
inline void shared_allocation_tracking_enable() {
|
|
||||||
Kokkos::Impl::SharedAllocationRecord<void, void>::tracking_enable();
|
|
||||||
}
|
|
||||||
|
|
||||||
} /* namespace Impl */
|
|
||||||
} /* namespace Kokkos */
|
|
||||||
|
|
||||||
//----------------------------------------------------------------------------
|
|
||||||
//----------------------------------------------------------------------------
|
|
||||||
|
|
||||||
namespace Kokkos {
|
|
||||||
namespace Impl {
|
|
||||||
|
|
||||||
template <class Specialize, typename A, typename B>
|
template <class Specialize, typename A, typename B>
|
||||||
struct CommonViewValueType;
|
struct CommonViewValueType;
|
||||||
|
|
||||||
|
|||||||
@ -67,16 +67,7 @@ void *Kokkos::Experimental::OpenACCSpace::impl_allocate(
|
|||||||
ptr = acc_malloc(arg_alloc_size);
|
ptr = acc_malloc(arg_alloc_size);
|
||||||
|
|
||||||
if (!ptr) {
|
if (!ptr) {
|
||||||
size_t alignment = 1; // OpenACC does not handle alignment
|
Kokkos::Impl::throw_bad_alloc(name(), arg_alloc_size, arg_label);
|
||||||
using Kokkos::Experimental::RawMemoryAllocationFailure;
|
|
||||||
auto failure_mode =
|
|
||||||
arg_alloc_size > 0
|
|
||||||
? RawMemoryAllocationFailure::FailureMode::OutOfMemoryError
|
|
||||||
: RawMemoryAllocationFailure::FailureMode::InvalidAllocationSize;
|
|
||||||
auto alloc_mechanism =
|
|
||||||
RawMemoryAllocationFailure::AllocationMechanism::OpenACCMalloc;
|
|
||||||
throw RawMemoryAllocationFailure(arg_alloc_size, alignment, failure_mode,
|
|
||||||
alloc_mechanism);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
if (Kokkos::Profiling::profileLibraryLoaded()) {
|
||||||
|
|||||||
@ -44,10 +44,12 @@ class Kokkos::Impl::ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
auto team_size = m_policy.team_size();
|
auto team_size = m_policy.team_size();
|
||||||
auto vector_length = m_policy.impl_vector_length();
|
auto vector_length = m_policy.impl_vector_length();
|
||||||
|
|
||||||
|
int const async_arg = m_policy.space().acc_async_queue();
|
||||||
|
|
||||||
auto const a_functor(m_functor);
|
auto const a_functor(m_functor);
|
||||||
|
|
||||||
#pragma acc parallel loop gang vector num_gangs(league_size) \
|
#pragma acc parallel loop gang vector num_gangs(league_size) \
|
||||||
vector_length(team_size* vector_length) copyin(a_functor)
|
vector_length(team_size* vector_length) copyin(a_functor) async(async_arg)
|
||||||
for (int i = 0; i < league_size * team_size * vector_length; i++) {
|
for (int i = 0; i < league_size * team_size * vector_length; i++) {
|
||||||
int league_id = i / (team_size * vector_length);
|
int league_id = i / (team_size * vector_length);
|
||||||
typename Policy::member_type team(league_id, league_size, team_size,
|
typename Policy::member_type team(league_id, league_size, team_size,
|
||||||
@ -145,10 +147,12 @@ class Kokkos::Impl::ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
auto team_size = m_policy.team_size();
|
auto team_size = m_policy.team_size();
|
||||||
auto vector_length = m_policy.impl_vector_length();
|
auto vector_length = m_policy.impl_vector_length();
|
||||||
|
|
||||||
|
int const async_arg = m_policy.space().acc_async_queue();
|
||||||
|
|
||||||
auto const a_functor(m_functor);
|
auto const a_functor(m_functor);
|
||||||
|
|
||||||
#pragma acc parallel loop gang num_gangs(league_size) num_workers(team_size) \
|
#pragma acc parallel loop gang num_gangs(league_size) num_workers(team_size) \
|
||||||
vector_length(vector_length) copyin(a_functor)
|
vector_length(vector_length) copyin(a_functor) async(async_arg)
|
||||||
for (int i = 0; i < league_size; i++) {
|
for (int i = 0; i < league_size; i++) {
|
||||||
int league_id = i;
|
int league_id = i;
|
||||||
typename Policy::member_type team(league_id, league_size, team_size,
|
typename Policy::member_type team(league_id, league_size, team_size,
|
||||||
|
|||||||
@ -72,9 +72,28 @@ int OpenMP::concurrency(OpenMP const &instance) {
|
|||||||
int OpenMP::concurrency() const { return impl_thread_pool_size(); }
|
int OpenMP::concurrency() const { return impl_thread_pool_size(); }
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
void OpenMP::impl_static_fence(std::string const &name) {
|
||||||
|
Kokkos::Tools::Experimental::Impl::profile_fence_event<Kokkos::OpenMP>(
|
||||||
|
name,
|
||||||
|
Kokkos::Tools::Experimental::SpecialSynchronizationCases::
|
||||||
|
GlobalDeviceSynchronization,
|
||||||
|
[]() {
|
||||||
|
std::lock_guard<std::mutex> lock_all_instances(
|
||||||
|
Impl::OpenMPInternal::all_instances_mutex);
|
||||||
|
for (auto *instance_ptr : Impl::OpenMPInternal::all_instances) {
|
||||||
|
std::lock_guard<std::mutex> lock_instance(
|
||||||
|
instance_ptr->m_instance_mutex);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
void OpenMP::fence(const std::string &name) const {
|
void OpenMP::fence(const std::string &name) const {
|
||||||
Kokkos::Tools::Experimental::Impl::profile_fence_event<Kokkos::OpenMP>(
|
Kokkos::Tools::Experimental::Impl::profile_fence_event<Kokkos::OpenMP>(
|
||||||
name, Kokkos::Tools::Experimental::Impl::DirectFenceIDHandle{1}, []() {});
|
name, Kokkos::Tools::Experimental::Impl::DirectFenceIDHandle{1},
|
||||||
|
[this]() {
|
||||||
|
auto *internal_instance = this->impl_internal_space_instance();
|
||||||
|
std::lock_guard<std::mutex> lock(internal_instance->m_instance_mutex);
|
||||||
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
bool OpenMP::impl_is_initialized() noexcept {
|
bool OpenMP::impl_is_initialized() noexcept {
|
||||||
|
|||||||
@ -67,7 +67,15 @@ class OpenMP {
|
|||||||
|
|
||||||
OpenMP();
|
OpenMP();
|
||||||
|
|
||||||
OpenMP(int pool_size);
|
explicit OpenMP(int pool_size);
|
||||||
|
|
||||||
|
#ifdef KOKKOS_ENABLE_DEPRECATED_CODE_4
|
||||||
|
template <typename T = void>
|
||||||
|
KOKKOS_DEPRECATED_WITH_COMMENT(
|
||||||
|
"OpenMP execution space should be constructed explicitly.")
|
||||||
|
OpenMP(int pool_size)
|
||||||
|
: OpenMP(pool_size) {}
|
||||||
|
#endif
|
||||||
|
|
||||||
/// \brief Print configuration information to the given output stream.
|
/// \brief Print configuration information to the given output stream.
|
||||||
void print_configuration(std::ostream& os, bool verbose = false) const;
|
void print_configuration(std::ostream& os, bool verbose = false) const;
|
||||||
@ -146,14 +154,6 @@ inline int OpenMP::impl_thread_pool_rank() noexcept {
|
|||||||
KOKKOS_IF_ON_DEVICE((return -1;))
|
KOKKOS_IF_ON_DEVICE((return -1;))
|
||||||
}
|
}
|
||||||
|
|
||||||
inline void OpenMP::impl_static_fence(std::string const& name) {
|
|
||||||
Kokkos::Tools::Experimental::Impl::profile_fence_event<Kokkos::OpenMP>(
|
|
||||||
name,
|
|
||||||
Kokkos::Tools::Experimental::SpecialSynchronizationCases::
|
|
||||||
GlobalDeviceSynchronization,
|
|
||||||
[]() {});
|
|
||||||
}
|
|
||||||
|
|
||||||
inline bool OpenMP::is_asynchronous(OpenMP const& /*instance*/) noexcept {
|
inline bool OpenMP::is_asynchronous(OpenMP const& /*instance*/) noexcept {
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|||||||
@ -34,18 +34,8 @@
|
|||||||
namespace Kokkos {
|
namespace Kokkos {
|
||||||
namespace Impl {
|
namespace Impl {
|
||||||
|
|
||||||
void OpenMPInternal::acquire_lock() {
|
std::vector<OpenMPInternal *> OpenMPInternal::all_instances;
|
||||||
while (1 == desul::atomic_compare_exchange(&m_pool_mutex, 0, 1,
|
std::mutex OpenMPInternal::all_instances_mutex;
|
||||||
desul::MemoryOrderAcquire(),
|
|
||||||
desul::MemoryScopeDevice())) {
|
|
||||||
// do nothing
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
void OpenMPInternal::release_lock() {
|
|
||||||
desul::atomic_store(&m_pool_mutex, 0, desul::MemoryOrderRelease(),
|
|
||||||
desul::MemoryScopeDevice());
|
|
||||||
}
|
|
||||||
|
|
||||||
void OpenMPInternal::clear_thread_data() {
|
void OpenMPInternal::clear_thread_data() {
|
||||||
const size_t member_bytes =
|
const size_t member_bytes =
|
||||||
@ -123,17 +113,11 @@ void OpenMPInternal::resize_thread_data(size_t pool_reduce_bytes,
|
|||||||
if (nullptr != m_pool[rank]) {
|
if (nullptr != m_pool[rank]) {
|
||||||
m_pool[rank]->disband_pool();
|
m_pool[rank]->disband_pool();
|
||||||
|
|
||||||
space.deallocate(m_pool[rank], old_alloc_bytes);
|
// impl_deallocate to not fence here
|
||||||
|
space.impl_deallocate("[unlabeled]", m_pool[rank], old_alloc_bytes);
|
||||||
}
|
}
|
||||||
|
|
||||||
void *ptr = nullptr;
|
void *ptr = space.allocate("Kokkos::OpenMP::scratch_mem", alloc_bytes);
|
||||||
try {
|
|
||||||
ptr = space.allocate(alloc_bytes);
|
|
||||||
} catch (
|
|
||||||
Kokkos::Experimental::RawMemoryAllocationFailure const &failure) {
|
|
||||||
// For now, just rethrow the error message the existing way
|
|
||||||
Kokkos::Impl::throw_runtime_exception(failure.get_error_message());
|
|
||||||
}
|
|
||||||
|
|
||||||
m_pool[rank] = new (ptr) HostThreadTeamData();
|
m_pool[rank] = new (ptr) HostThreadTeamData();
|
||||||
|
|
||||||
@ -304,6 +288,18 @@ void OpenMPInternal::finalize() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
m_initialized = false;
|
m_initialized = false;
|
||||||
|
|
||||||
|
// guard erasing from all_instances
|
||||||
|
{
|
||||||
|
std::scoped_lock lock(all_instances_mutex);
|
||||||
|
|
||||||
|
auto it = std::find(all_instances.begin(), all_instances.end(), this);
|
||||||
|
if (it == all_instances.end())
|
||||||
|
Kokkos::abort(
|
||||||
|
"Execution space instance to be removed couldn't be found!");
|
||||||
|
*it = all_instances.back();
|
||||||
|
all_instances.pop_back();
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
void OpenMPInternal::print_configuration(std::ostream &s) const {
|
void OpenMPInternal::print_configuration(std::ostream &s) const {
|
||||||
|
|||||||
@ -56,7 +56,13 @@ struct OpenMPTraits {
|
|||||||
class OpenMPInternal {
|
class OpenMPInternal {
|
||||||
private:
|
private:
|
||||||
OpenMPInternal(int arg_pool_size)
|
OpenMPInternal(int arg_pool_size)
|
||||||
: m_pool_size{arg_pool_size}, m_level{omp_get_level()}, m_pool() {}
|
: m_pool_size{arg_pool_size}, m_level{omp_get_level()}, m_pool() {
|
||||||
|
// guard pushing to all_instances
|
||||||
|
{
|
||||||
|
std::scoped_lock lock(all_instances_mutex);
|
||||||
|
all_instances.push_back(this);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
~OpenMPInternal() { clear_thread_data(); }
|
~OpenMPInternal() { clear_thread_data(); }
|
||||||
|
|
||||||
@ -66,7 +72,6 @@ class OpenMPInternal {
|
|||||||
|
|
||||||
int m_pool_size;
|
int m_pool_size;
|
||||||
int m_level;
|
int m_level;
|
||||||
int m_pool_mutex = 0;
|
|
||||||
|
|
||||||
HostThreadTeamData* m_pool[OpenMPTraits::MAX_THREAD_COUNT];
|
HostThreadTeamData* m_pool[OpenMPTraits::MAX_THREAD_COUNT];
|
||||||
|
|
||||||
@ -83,12 +88,6 @@ class OpenMPInternal {
|
|||||||
|
|
||||||
int thread_pool_size() const { return m_pool_size; }
|
int thread_pool_size() const { return m_pool_size; }
|
||||||
|
|
||||||
// Acquire lock used to protect access to m_pool
|
|
||||||
void acquire_lock();
|
|
||||||
|
|
||||||
// Release lock used to protect access to m_pool
|
|
||||||
void release_lock();
|
|
||||||
|
|
||||||
void resize_thread_data(size_t pool_reduce_bytes, size_t team_reduce_bytes,
|
void resize_thread_data(size_t pool_reduce_bytes, size_t team_reduce_bytes,
|
||||||
size_t team_shared_bytes, size_t thread_local_bytes);
|
size_t team_shared_bytes, size_t thread_local_bytes);
|
||||||
|
|
||||||
@ -107,6 +106,11 @@ class OpenMPInternal {
|
|||||||
bool verify_is_initialized(const char* const label) const;
|
bool verify_is_initialized(const char* const label) const;
|
||||||
|
|
||||||
void print_configuration(std::ostream& s) const;
|
void print_configuration(std::ostream& s) const;
|
||||||
|
|
||||||
|
std::mutex m_instance_mutex;
|
||||||
|
|
||||||
|
static std::vector<OpenMPInternal*> all_instances;
|
||||||
|
static std::mutex all_instances_mutex;
|
||||||
};
|
};
|
||||||
|
|
||||||
inline bool execute_in_serial(OpenMP const& space = OpenMP()) {
|
inline bool execute_in_serial(OpenMP const& space = OpenMP()) {
|
||||||
@ -157,7 +161,7 @@ inline std::vector<OpenMP> create_OpenMP_instances(
|
|||||||
"Kokkos::abort: Partition not enough resources left to create the last "
|
"Kokkos::abort: Partition not enough resources left to create the last "
|
||||||
"instance.");
|
"instance.");
|
||||||
}
|
}
|
||||||
instances[weights.size() - 1] = resources_left;
|
instances[weights.size() - 1] = OpenMP(resources_left);
|
||||||
|
|
||||||
return instances;
|
return instances;
|
||||||
}
|
}
|
||||||
|
|||||||
@ -108,6 +108,8 @@ class ParallelFor<FunctorType, Kokkos::RangePolicy<Traits...>, Kokkos::OpenMP> {
|
|||||||
|
|
||||||
public:
|
public:
|
||||||
inline void execute() const {
|
inline void execute() const {
|
||||||
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
if (execute_in_serial(m_policy.space())) {
|
if (execute_in_serial(m_policy.space())) {
|
||||||
exec_range(m_functor, m_policy.begin(), m_policy.end());
|
exec_range(m_functor, m_policy.begin(), m_policy.end());
|
||||||
return;
|
return;
|
||||||
@ -202,6 +204,9 @@ class ParallelFor<FunctorType, Kokkos::MDRangePolicy<Traits...>,
|
|||||||
|
|
||||||
public:
|
public:
|
||||||
inline void execute() const {
|
inline void execute() const {
|
||||||
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
|
|
||||||
#ifndef KOKKOS_COMPILER_INTEL
|
#ifndef KOKKOS_COMPILER_INTEL
|
||||||
if (execute_in_serial(m_iter.m_rp.space())) {
|
if (execute_in_serial(m_iter.m_rp.space())) {
|
||||||
exec_range(0, m_iter.m_rp.m_num_tiles);
|
exec_range(0, m_iter.m_rp.m_num_tiles);
|
||||||
@ -333,7 +338,8 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
const size_t team_shared_size = m_shmem_size;
|
const size_t team_shared_size = m_shmem_size;
|
||||||
const size_t thread_local_size = 0; // Never shrinks
|
const size_t thread_local_size = 0; // Never shrinks
|
||||||
|
|
||||||
m_instance->acquire_lock();
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
|
|
||||||
m_instance->resize_thread_data(pool_reduce_size, team_reduce_size,
|
m_instance->resize_thread_data(pool_reduce_size, team_reduce_size,
|
||||||
team_shared_size, thread_local_size);
|
team_shared_size, thread_local_size);
|
||||||
@ -343,8 +349,6 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
m_functor, *(m_instance->get_thread_data()), 0,
|
m_functor, *(m_instance->get_thread_data()), 0,
|
||||||
m_policy.league_size(), m_policy.league_size());
|
m_policy.league_size(), m_policy.league_size());
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -383,8 +387,6 @@ class ParallelFor<FunctorType, Kokkos::TeamPolicy<Properties...>,
|
|||||||
|
|
||||||
data.disband_team();
|
data.disband_team();
|
||||||
}
|
}
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
inline ParallelFor(const FunctorType& arg_functor, const Policy& arg_policy)
|
inline ParallelFor(const FunctorType& arg_functor, const Policy& arg_policy)
|
||||||
|
|||||||
@ -83,7 +83,8 @@ class ParallelReduce<CombinedFunctorReducerType, Kokkos::RangePolicy<Traits...>,
|
|||||||
|
|
||||||
const size_t pool_reduce_bytes = reducer.value_size();
|
const size_t pool_reduce_bytes = reducer.value_size();
|
||||||
|
|
||||||
m_instance->acquire_lock();
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
|
|
||||||
m_instance->resize_thread_data(pool_reduce_bytes, 0 // team_reduce_bytes
|
m_instance->resize_thread_data(pool_reduce_bytes, 0 // team_reduce_bytes
|
||||||
,
|
,
|
||||||
@ -106,6 +107,7 @@ class ParallelReduce<CombinedFunctorReducerType, Kokkos::RangePolicy<Traits...>,
|
|||||||
update);
|
update);
|
||||||
|
|
||||||
reducer.final(ptr);
|
reducer.final(ptr);
|
||||||
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
const int pool_size = m_instance->thread_pool_size();
|
const int pool_size = m_instance->thread_pool_size();
|
||||||
@ -157,8 +159,6 @@ class ParallelReduce<CombinedFunctorReducerType, Kokkos::RangePolicy<Traits...>,
|
|||||||
m_result_ptr[j] = ptr[j];
|
m_result_ptr[j] = ptr[j];
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//----------------------------------------
|
//----------------------------------------
|
||||||
@ -218,7 +218,8 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
const ReducerType& reducer = m_iter.m_func.get_reducer();
|
const ReducerType& reducer = m_iter.m_func.get_reducer();
|
||||||
const size_t pool_reduce_bytes = reducer.value_size();
|
const size_t pool_reduce_bytes = reducer.value_size();
|
||||||
|
|
||||||
m_instance->acquire_lock();
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
|
|
||||||
m_instance->resize_thread_data(pool_reduce_bytes, 0 // team_reduce_bytes
|
m_instance->resize_thread_data(pool_reduce_bytes, 0 // team_reduce_bytes
|
||||||
,
|
,
|
||||||
@ -241,8 +242,6 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
|
|
||||||
reducer.final(ptr);
|
reducer.final(ptr);
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
#endif
|
#endif
|
||||||
@ -299,8 +298,6 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
m_result_ptr[j] = ptr[j];
|
m_result_ptr[j] = ptr[j];
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//----------------------------------------
|
//----------------------------------------
|
||||||
@ -415,7 +412,8 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
const size_t team_shared_size = m_shmem_size + m_policy.scratch_size(1);
|
const size_t team_shared_size = m_shmem_size + m_policy.scratch_size(1);
|
||||||
const size_t thread_local_size = 0; // Never shrinks
|
const size_t thread_local_size = 0; // Never shrinks
|
||||||
|
|
||||||
m_instance->acquire_lock();
|
// Serialize kernels on the same execution space instance
|
||||||
|
std::lock_guard<std::mutex> lock(m_instance->m_instance_mutex);
|
||||||
|
|
||||||
m_instance->resize_thread_data(pool_reduce_size, team_reduce_size,
|
m_instance->resize_thread_data(pool_reduce_size, team_reduce_size,
|
||||||
team_shared_size, thread_local_size);
|
team_shared_size, thread_local_size);
|
||||||
@ -433,8 +431,6 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
|
|
||||||
reducer.final(ptr);
|
reducer.final(ptr);
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@ -510,8 +506,6 @@ class ParallelReduce<CombinedFunctorReducerType,
|
|||||||
m_result_ptr[j] = ptr[j];
|
m_result_ptr[j] = ptr[j];
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
m_instance->release_lock();
|
|
||||||
}
|
}
|
||||||
|
|
||||||
//----------------------------------------
|
//----------------------------------------
|
||||||
|
|||||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user