Intrepid2
|
Implementation of a general sum factorization algorithm, abstracted from the algorithm described by Mora and Demkowicz, for integration. Uses hierarchical parallelism.
#include <Intrepid2_IntegrationToolsDef.hpp>
Public Member Functions | |
F_Integrate (Data< Scalar, DeviceType > integralData, TensorData< Scalar, DeviceType > leftComponent, Data< Scalar, DeviceType > composedTransform, TensorData< Scalar, DeviceType > rightComponent, TensorData< Scalar, DeviceType > cellMeasures, int a_offset, int b_offset, int leftFieldOrdinalOffset, int rightFieldOrdinalOffset, bool forceNonSpecialized) | |
template<size_t maxComponents, size_t numComponents = maxComponents> | |
KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
KOKKOS_INLINE_FUNCTION int | incrementArgument (Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
Runtime-sized variant of incrementArgument; used by the approximate flop count. | |
template<size_t maxComponents, size_t numComponents = maxComponents> | |
KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds) const |
KOKKOS_INLINE_FUNCTION int | nextIncrementResult (const Kokkos::Array< int, Parameters::MaxTensorComponents > &arguments, const Kokkos::Array< int, Parameters::MaxTensorComponents > &bounds, const int &numComponents) const |
Runtime-sized variant of nextIncrementResult; used by the approximate flop count. | |
template<size_t maxComponents, size_t numComponents = maxComponents> | |
KOKKOS_INLINE_FUNCTION int | relativeEnumerationIndex (const Kokkos::Array< int, maxComponents > &arguments, const Kokkos::Array< int, maxComponents > &bounds, const int startIndex) const |
KOKKOS_INLINE_FUNCTION void | runSpecialized3 (const TeamMember &teamMember) const |
runSpecialized implementations are hand-coded variants of run() for a particular number of components. To allow comparison with the generic implementation (for both performance and verification), the member variable forceNonSpecialized_ determines whether runSpecialized is selected when a specialized implementation is available. | |
template<size_t numTensorComponents> | |
KOKKOS_INLINE_FUNCTION void | run (const TeamMember &teamMember) const |
KOKKOS_INLINE_FUNCTION void | operator() (const TeamMember &teamMember) const |
long | approximateFlopCountPerCell () const |
Returns an estimate of the number of floating-point operations per cell (additions, subtractions, multiplications, and divisions each count as one operation). | |
int | teamSize (const int &maxTeamSizeFromKokkos) const |
Returns the team size to pass to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism available. | |
size_t | team_shmem_size (int team_size) const |
Provide the shared memory capacity. | |
Private Types | |
using | ExecutionSpace = typename DeviceType::execution_space |
using | TeamPolicy = Kokkos::TeamPolicy< ExecutionSpace > |
using | TeamMember = typename TeamPolicy::member_type |
using | IntegralViewType = Kokkos::View< typename RankExpander< Scalar, integralViewRank >::value_type, DeviceType > |
Private Attributes | |
IntegralViewType | integralView_ |
TensorData< Scalar, DeviceType > | leftComponent_ |
Data< Scalar, DeviceType > | composedTransform_ |
TensorData< Scalar, DeviceType > | rightComponent_ |
TensorData< Scalar, DeviceType > | cellMeasures_ |
int | a_offset_ |
int | b_offset_ |
int | leftComponentSpan_ |
int | rightComponentSpan_ |
int | numTensorComponents_ |
int | leftFieldOrdinalOffset_ |
int | rightFieldOrdinalOffset_ |
bool | forceNonSpecialized_ |
size_t | fad_size_output_ = 0 |
Kokkos::Array< int, 7 > | offsetsForComponentOrdinal_ |
Kokkos::Array< int, Parameters::MaxTensorComponents > | leftFieldBounds_ |
Kokkos::Array< int, Parameters::MaxTensorComponents > | rightFieldBounds_ |
Kokkos::Array< int, Parameters::MaxTensorComponents > | pointBounds_ |
Kokkos::Array< int, Parameters::MaxTensorComponents > | leftFieldRelativeEnumerationSpans_ |
Kokkos::Array< int, Parameters::MaxTensorComponents > | rightFieldRelativeEnumerationSpans_ |
int | maxFieldsLeft_ |
int | maxFieldsRight_ |
int | maxPointCount_ |
Implementation of a general sum factorization algorithm, abstracted from the algorithm described by Mora and Demkowicz, for integration. Uses hierarchical parallelism.
Definition at line 65 of file Intrepid2_IntegrationToolsDef.hpp.
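The idea behind sum factorization can be illustrated with a minimal, standalone sketch. It is not the Intrepid2 implementation: it uses plain `std::vector` instead of `Data`/`TensorData`, integrates a single tensor-product basis against a weight array rather than a left and right basis with a composed transform, and runs serially. All names here (`integrateNaive`, `integrateFactorized`, `makeSample`, `maxAbsDifference`) are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Naive evaluation of
//   F[i1][i2] = sum_{q1,q2} phi1[i1][q1] * phi2[i2][q2] * w[q1][q2].
// Cost: O(N1 * N2 * Q1 * Q2) multiply-adds.
Matrix integrateNaive(const Matrix& phi1, const Matrix& phi2, const Matrix& w) {
  const std::size_t N1 = phi1.size(), N2 = phi2.size();
  const std::size_t Q1 = w.size(), Q2 = w.empty() ? 0 : w[0].size();
  Matrix F(N1, std::vector<double>(N2, 0.0));
  for (std::size_t i1 = 0; i1 < N1; ++i1)
    for (std::size_t i2 = 0; i2 < N2; ++i2)
      for (std::size_t q1 = 0; q1 < Q1; ++q1)
        for (std::size_t q2 = 0; q2 < Q2; ++q2)
          F[i1][i2] += phi1[i1][q1] * phi2[i2][q2] * w[q1][q2];
  return F;
}

// Sum-factorized evaluation of the same quantity in two stages.
Matrix integrateFactorized(const Matrix& phi1, const Matrix& phi2, const Matrix& w) {
  const std::size_t N1 = phi1.size(), N2 = phi2.size();
  const std::size_t Q1 = w.size(), Q2 = w.empty() ? 0 : w[0].size();
  assert(phi1.empty() || phi1[0].size() == Q1);
  assert(phi2.empty() || phi2[0].size() == Q2);

  // Stage 1: contract the second direction.
  //   G[i2][q1] = sum_{q2} phi2[i2][q2] * w[q1][q2]      cost O(N2 * Q1 * Q2)
  Matrix G(N2, std::vector<double>(Q1, 0.0));
  for (std::size_t i2 = 0; i2 < N2; ++i2)
    for (std::size_t q1 = 0; q1 < Q1; ++q1)
      for (std::size_t q2 = 0; q2 < Q2; ++q2)
        G[i2][q1] += phi2[i2][q2] * w[q1][q2];

  // Stage 2: contract the first direction.
  //   F[i1][i2] = sum_{q1} phi1[i1][q1] * G[i2][q1]      cost O(N1 * N2 * Q1)
  Matrix F(N1, std::vector<double>(N2, 0.0));
  for (std::size_t i1 = 0; i1 < N1; ++i1)
    for (std::size_t i2 = 0; i2 < N2; ++i2)
      for (std::size_t q1 = 0; q1 < Q1; ++q1)
        F[i1][i2] += phi1[i1][q1] * G[i2][q1];
  return F;
}

// Largest entrywise difference between two equally sized matrices.
double maxAbsDifference(const Matrix& a, const Matrix& b) {
  double diff = 0.0;
  for (std::size_t r = 0; r < a.size(); ++r)
    for (std::size_t c = 0; c < a[r].size(); ++c)
      diff = std::fmax(diff, std::fabs(a[r][c] - b[r][c]));
  return diff;
}

// Arbitrary deterministic sample data (illustration only).
Matrix makeSample(std::size_t rows, std::size_t cols, double seed) {
  Matrix m(rows, std::vector<double>(cols));
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      m[r][c] = std::sin(seed + 1.3 * static_cast<double>(r) + 0.7 * static_cast<double>(c));
  return m;
}
```

Both functions compute the same values up to rounding; the factorized form reuses the intermediate `G` across all `i1`, which is where the savings come from as the number of tensor components grows.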
|
private |
Definition at line 67 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 71 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 69 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 68 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Definition at line 103 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Returns an estimate of the number of floating-point operations per cell (additions, subtractions, multiplications, and divisions each count as one operation).
Definition at line 872 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::getTensorComponent(), and Intrepid2::TensorData< Scalar, DeviceType >::numTensorComponents().
|
inline |
Definition at line 188 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Runtime-sized variant of incrementArgument; used by the approximate flop count.
Definition at line 211 of file Intrepid2_IntegrationToolsDef.hpp.
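As a rough illustration of what a runtime-sized argument increment might look like, here is an odometer-style sketch over `std::array`. The conventions are assumptions, not taken from the Intrepid2 source: the last component is treated as the fastest-varying one, and the return value is the index of the component that was incremented, or -1 when every component rolled over. The name `incrementArgumentSketch` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Advance a multi-index like an odometer (assumed convention: last component
// varies fastest). Returns the index of the component that was incremented,
// or -1 if all components wrapped back to zero.
template <std::size_t N>
int incrementArgumentSketch(std::array<int, N>& arguments,
                            const std::array<int, N>& bounds,
                            int numComponents) {
  assert(numComponents >= 0 && static_cast<std::size_t>(numComponents) <= N);
  int r = numComponents - 1;
  // Reset every trailing component that is already at its last valid value.
  while (r >= 0 &&
         arguments[static_cast<std::size_t>(r)] + 1 >= bounds[static_cast<std::size_t>(r)]) {
    arguments[static_cast<std::size_t>(r)] = 0;
    --r;
  }
  if (r >= 0) ++arguments[static_cast<std::size_t>(r)];
  return r;
}
```

Calling this in a loop until it returns -1 visits every multi-index within `bounds` exactly once, which is the kind of traversal a flop-count estimate needs.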
|
inline |
Definition at line 229 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Runtime-sized variant of nextIncrementResult; used by the approximate flop count.
Definition at line 250 of file Intrepid2_IntegrationToolsDef.hpp.
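Judging by the signature (the arguments are passed by const reference), nextIncrementResult reports what an increment would return without modifying the arguments. A minimal sketch under that assumption, using the same assumed convention of last-component-fastest ordering and a -1 return on full wrap-around; `nextIncrementResultSketch` is a hypothetical name:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Report which component an odometer-style increment would advance,
// without changing the multi-index. Returns -1 if the increment would
// wrap every component back to zero.
template <std::size_t N>
int nextIncrementResultSketch(const std::array<int, N>& arguments,
                              const std::array<int, N>& bounds,
                              int numComponents) {
  assert(numComponents >= 0 && static_cast<std::size_t>(numComponents) <= N);
  int r = numComponents - 1;
  // Skip trailing components that are already at their last valid value.
  while (r >= 0 &&
         arguments[static_cast<std::size_t>(r)] + 1 >= bounds[static_cast<std::size_t>(r)]) {
    --r;
  }
  return r;
}
```

A caller can use such a lookahead to decide, before advancing, how many trailing components are about to roll over.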
|
inline |
Definition at line 848 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Definition at line 266 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
Definition at line 582 of file Intrepid2_IntegrationToolsDef.hpp.
|
inline |
runSpecialized implementations are hand-coded variants of run() for a particular number of components. To allow comparison with the generic implementation (for both performance and verification), the member variable forceNonSpecialized_ determines whether runSpecialized is selected when a specialized implementation is available.
Definition at line 299 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int(), Intrepid2::TensorData< Scalar, DeviceType >::getTensorComponent(), Intrepid2::Data< DataScalar, DeviceType >::getUnderlyingView4(), and Intrepid2::Data< DataScalar, DeviceType >::underlyingMatchesLogical().
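The selection between a hand-coded path and the generic one can be sketched as follows. This is a toy functor, not the Intrepid2 dispatch logic: it multiplies a list of factors rather than integrating, and the type `DispatchSketch` and its members are hypothetical. Only the role of the `forceNonSpecialized` flag mirrors the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DispatchSketch {
  std::vector<double> factors;  // stand-in for the tensor components
  bool forceNonSpecialized;     // true: always take the generic path

  // Generic path: works for any number of components.
  double runGeneric() const {
    double product = 1.0;
    for (std::size_t i = 0; i < factors.size(); ++i) product *= factors[i];
    return product;
  }

  // Hand-coded path for exactly three components.
  double runSpecialized3() const {
    assert(factors.size() == 3);
    return factors[0] * factors[1] * factors[2];
  }

  // Use the specialized variant only when one exists for this component
  // count and the caller has not forced the generic implementation.
  double operator()() const {
    const bool specializedAvailable = (factors.size() == 3);
    if (specializedAvailable && !forceNonSpecialized) return runSpecialized3();
    return runGeneric();
  }
};
```

Running the same input with the flag set and unset gives a direct check that the specialized variant agrees with the generic one, and timing the two gives the performance comparison.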
|
inline |
Provide the shared memory capacity.
Definition at line 959 of file Intrepid2_IntegrationToolsDef.hpp.
References Intrepid2::Data< DataScalar, DeviceType >::extent_int().
|
inline |
Returns the team size to pass to the policy constructor, based on the Kokkos maximum and the amount of thread parallelism available.
Definition at line 951 of file Intrepid2_IntegrationToolsDef.hpp.
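One plausible reading of this description is that the team size is capped both by what Kokkos allows and by how much thread-level work the kernel exposes. The sketch below encodes that reading; it is an assumption, not the documented formula, and `teamSizeSketch` and `availableThreadParallelism` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Pick a team size no larger than the Kokkos-reported maximum and no larger
// than the number of threads the kernel can actually keep busy.
int teamSizeSketch(int maxTeamSizeFromKokkos, int availableThreadParallelism) {
  assert(maxTeamSizeFromKokkos > 0 && availableThreadParallelism > 0);
  return std::min(maxTeamSizeFromKokkos, availableThreadParallelism);
}
```

Requesting more threads per team than there is work for would leave threads idle, while requesting more than the Kokkos maximum is not permitted, so taking the smaller of the two is the natural choice under these assumptions.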
|
private |
Definition at line 77 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 78 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 76 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 74 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 86 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 84 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 72 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 73 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 79 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 92 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 82 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 96 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 99 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 100 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 101 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 81 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 88 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 94 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 75 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 80 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 93 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 83 of file Intrepid2_IntegrationToolsDef.hpp.
|
private |
Definition at line 97 of file Intrepid2_IntegrationToolsDef.hpp.