elsa storage

The storage module of elsa is an implementation detail of the DataContainer and the functionals. It contains data types for owning and non-owning storage, plus a set of algorithms. It serves as a generic implementation module and should be used with caution: the implementation is based on the thrust library and may therefore depend on CUDA internals. Exposing thrust in headers can thus introduce a dependency on the Nvidia compiler. This is not always desirable, so please be careful when using functionality from this module.

Algorithms

The most important algorithms are reductions and (unary and binary) transformations.

Reductions

Reductions are algorithms which reduce a list of values to a single value. Given a rank-r tensor, a reduction lowers the rank to an arbitrary smaller value. So far, however, only reductions to rank-0 tensors (i.e. scalars) are implemented.

Several common mathematical reductions and functionals are implemented. These include binary reductions such as the dot (scalar) product, as well as many unary reductions and norms, such as the $\ell^1$ and $\ell^2$ norms.

Functions

template<class InputIter1, class InputIter2, class data_t = std::common_type_t<thrust::iterator_value_t<InputIter1>, thrust::iterator_value_t<InputIter2>>>
auto dot(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst) -> std::common_type_t<thrust::iterator_value_t<InputIter1>, thrust::iterator_value_t<InputIter2>>

Compute the dot product between two vectors.

Compute the sum of the products of corresponding entries of the two vectors, i.e. $\sum_i x_i * y_i$. If either vector is complex, the dot product is conjugate linear in the first argument and linear in the second, i.e. $\sum_i \bar{x}_i * y_i$, as in Eigen and NumPy.

The return type is determined from the value types of the two iterators. If either is a complex type, the return type is also a complex type.
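To make the conjugation convention concrete, here is a minimal standard-library sketch. It is not elsa's thrust-based implementation: the name dot_sketch is hypothetical, and the sketch only covers the case where both ranges hold std::complex<double>.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of elsa): dot product that is conjugate
// linear in the first argument, i.e. sum_i conj(x_i) * y_i.
std::complex<double> dot_sketch(const std::vector<std::complex<double>>& x,
                                const std::vector<std::complex<double>>& y)
{
    assert(x.size() == y.size()); // both ranges must have the same length
    std::complex<double> acc{0.0, 0.0};
    for (std::size_t i = 0; i < x.size(); ++i) {
        acc += std::conj(x[i]) * y[i]; // conjugate the FIRST operand only
    }
    return acc;
}
```

With this convention the dot product of a vector with itself is real and non-negative, which is why the first argument is the one that gets conjugated.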

template<class InputIter>
auto minElement(InputIter first, InputIter last) -> thrust::iterator_value_t<InputIter>

Compute the minimum element of the given vector.

The minimum is determined via the operator< of the iterator's value type. If the vector is empty, a default constructed value of that type is returned.

template<class InputIter>
auto maxElement(InputIter first, InputIter last) -> thrust::iterator_value_t<InputIter>

Compute the maximum element of the given vector.

The maximum is determined via the operator< of the iterator's value type. If the vector is empty, a default constructed value of that type is returned.
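The empty-range behaviour described for minElement and maxElement can be sketched with the standard library as follows. The function names are hypothetical and std::vector stands in for the iterator pairs; this is an illustration of the documented semantics, not the thrust-based code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: smallest element according to operator<, or a
// default constructed T if the range is empty.
template <class T>
T min_element_sketch(const std::vector<T>& v)
{
    if (v.empty()) {
        return T{};
    }
    return *std::min_element(v.begin(), v.end());
}

// Hypothetical helper: largest element according to operator<, or a
// default constructed T if the range is empty.
template <class T>
T max_element_sketch(const std::vector<T>& v)
{
    if (v.empty()) {
        return T{};
    }
    return *std::max_element(v.begin(), v.end());
}
```

Note that under this convention an empty range and a range whose extremum equals the default value (e.g. 0) yield the same result, so callers that care about the difference need to check for emptiness themselves.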

template<class InputIter>
std::ptrdiff_t l0PseudoNorm(InputIter first, InputIter last)

Compute the l0 pseudo-norm, i.e. the number of non-zero elements.

template<class InputIter>
auto l1Norm(InputIter first, InputIter last) -> typename value_type_of<thrust::iterator_value_t<InputIter>>::type

Compute the l1 norm, i.e. the sum of absolute values, of the vector.

template<class InputIter>
auto squaredL2Norm(InputIter first, InputIter last) -> value_type_of_t<thrust::iterator_value_t<InputIter>>

Compute the squared l2 norm, i.e. the sum of squares ($\sum_i x_i * x_i$).

template<class InputIter>
auto l2Norm(InputIter first, InputIter last) -> value_type_of_t<thrust::iterator_value_t<InputIter>>

Compute the l2 norm, i.e. the square root of the sum of squares ($\sqrt{\sum_i x_i * x_i}$).

template<class InputIter>
auto lInf(InputIter first, InputIter last) -> value_type_of_t<thrust::iterator_value_t<InputIter>>

Compute the l-infinity norm, i.e. the maximum absolute value of the vector ($\sup_i |x_i|$).
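The norm-like reductions above can be compared side by side in a small standard-library sketch. NormsSketch and norms_sketch are hypothetical names. For complex input the sketch assumes that the squared l2 norm sums the squared moduli, which is suggested by the real-valued return type but not stated explicitly in this documentation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical result bundle for the five reductions.
struct NormsSketch {
    std::ptrdiff_t l0; // number of non-zero elements
    double l1;         // sum of absolute values
    double squaredL2;  // sum of squared moduli (assumption for complex input)
    double l2;         // square root of squaredL2
    double lInf;       // largest absolute value
};

NormsSketch norms_sketch(const std::vector<std::complex<double>>& x)
{
    NormsSketch n{0, 0.0, 0.0, 0.0, 0.0};
    for (const auto& xi : x) {
        const double a = std::abs(xi);
        if (xi != std::complex<double>{0.0, 0.0}) {
            ++n.l0;
        }
        n.l1 += a;
        n.squaredL2 += std::norm(xi); // std::norm is the squared modulus
        n.lInf = std::max(n.lInf, a);
    }
    n.l2 = std::sqrt(n.squaredL2);
    return n;
}
```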

template<class InputIter>
auto sum(InputIter first, InputIter last) -> thrust::iterator_value_t<InputIter>

Compute the sum of the vector ($\sum_i x_i$).

Transforms

Transforms are algorithms which take a list as input and return a list. Transformations do not change the rank of a given tensor.

Currently, common binary transformations such as component wise (in place) addition, subtraction, multiplication, and division are implemented, as well as many mathematical component wise unary transformations such as square, square root and logarithm.

Functions

template<class InputIter, class OutIter>
void cwiseAbs(InputIter first, InputIter last, OutIter out)

Compute the coefficient wise absolute value of the input range.

template<class InputIter1, class InputIter2, class OutIter>
void add(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the component wise addition of two vectors.

template<class data_t, class InputIter, class OutIter>
void addScalar(InputIter first, InputIter last, const data_t &scalar, OutIter out)

Compute the component wise addition of a vector and a scalar.

template<class data_t, class InputIter, class OutIter>
void addScalar(const data_t &scalar, InputIter first, InputIter last, OutIter out)

Compute the component wise addition of a scalar and a vector.

template<class InputIter, class data_t>
void fill(InputIter first, InputIter last, const data_t &scalar)

Fill given range with scalar value.

template<class InputIter, class OutputIter>
void assign(InputIter first, InputIter last, OutputIter out)

Copy input range to the output range.

template<class InputIter, class OutIter>
void bessel_log_0(InputIter first, InputIter last, OutIter out)

Compute the log of modified Bessel function of the first kind of order zero for each element of the input range.

template<class InputIter, class OutIter>
void bessel_1_0(InputIter first, InputIter last, OutIter out)

Compute the modified Bessel function of the first kind of order one divided by that of order zero for each element of the input range.

template<class InputIter, class OutIter>
void cast(InputIter first, InputIter last, OutIter out)

Cast each element of the input range to the value type of the output range.

template<class Iter, class OutIter, class T = thrust::iterator_value_t<Iter>, class U = thrust::iterator_value_t<Iter>>
void clip(Iter first, Iter last, OutIter out, const T &minval, const U &maxval)

Clip the input range to the interval [minval, maxval].

template<class Iter, class OutIter, class T = thrust::iterator_value_t<Iter>>
void clip(Iter first, Iter last, OutIter out, const T &maxval)

Clip the input range to the interval [0, maxval].
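As an illustration of the two clip overloads, here is a minimal sketch on std::vector<float>. The name clip_sketch is hypothetical and the element type is fixed for brevity; the documented functions operate on iterator ranges instead.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: clamp every element into [minval, maxval].
std::vector<float> clip_sketch(const std::vector<float>& in, float minval, float maxval)
{
    assert(minval <= maxval); // the interval must not be empty
    std::vector<float> out(in.size());
    std::transform(in.begin(), in.end(), out.begin(), [minval, maxval](float v) {
        return std::min(std::max(v, minval), maxval);
    });
    return out;
}

// Hypothetical single-bound overload: the lower bound is fixed at 0.
std::vector<float> clip_sketch(const std::vector<float>& in, float maxval)
{
    return clip_sketch(in, 0.0f, maxval);
}
```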

template<class InputIter1, class InputIter2, class OutIter>
void div(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the component wise division of two vectors.

template<class data_t, class InputIter, class OutIter>
void divScalar(InputIter first, InputIter last, const data_t &scalar, OutIter out)

Compute the component wise division of a vector and a scalar.

template<class data_t, class InputIter, class OutIter>
void divScalar(const data_t &scalar, InputIter first, InputIter last, OutIter out)

Compute the component wise division of a scalar and a vector.

template<class InputIter, class OutIter>
void exp(InputIter first, InputIter last, OutIter out)

Apply the exponential function to each element in the range.

template<class InputIter1, class InputIter2, class OutIter>
void cwiseMax(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the coefficient wise maximum of two input ranges. For complex inputs, the absolute value of the complex number is used.

template<class InputIter1, class InputIter2, class OutIter>
void cwiseMin(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the coefficient wise minimum of two input ranges. For complex inputs, the absolute value of the complex number is used.

template<class InputIter1, class Scalar, class OutIter>
void minimum(InputIter1 xfirst, InputIter1 xlast, Scalar scalar, OutIter out)

For each element in the vector set the element to the minimum of the element and the given scalar.

template<class InputIter1, class Scalar, class OutIter>
void maximum(InputIter1 xfirst, InputIter1 xlast, Scalar scalar, OutIter out)

For each element in the vector set the element to the maximum of the element and the given scalar.

template<class InputIter, class OutIter>
void imag(InputIter first, InputIter last, OutIter out)

Extract the imaginary part of a range. If the input range is not complex, it is treated as a range of complex numbers with an imaginary part of 0.

template<class InOutIter, class InputIter>
void inplaceAdd(InOutIter xfirst, InOutIter xlast, InputIter yfirst)

Add the two ranges coefficient wise; the first range also serves as the output range.
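The in-place variants all follow the same pattern: the first range is read and overwritten. A minimal sketch of that pattern using std::transform (which permits the output iterator to equal the first input iterator) might look as follows. The name inplace_add_sketch is hypothetical, and elsa's own implementation goes through thrust rather than the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: x[i] = x[i] + y[i] for every i in [xfirst, xlast).
template <class InOutIter, class InputIter>
void inplace_add_sketch(InOutIter xfirst, InOutIter xlast, InputIter yfirst)
{
    // The output iterator is the beginning of the first input range.
    std::transform(xfirst, xlast, yfirst, xfirst,
                   [](const auto& a, const auto& b) { return a + b; });
}
```

The second range must contain at least as many elements as the first; the sketch does not check this.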

template<class InOutIter, class Scalar>
void inplaceAddScalar(InOutIter xfirst, InOutIter xlast, const Scalar &scalar)

Add a scalar to each element of a range; the given range is also the output range.

template<class InOutIter, class InputIter>
void inplaceDiv(InOutIter xfirst, InOutIter xlast, InputIter yfirst)

Divide the first range by the second coefficient wise; the first range is also the output range.

template<class InOutIter, class Scalar>
void inplaceDivScalar(InOutIter xfirst, InOutIter xlast, Scalar scalar)

Divide the range coefficient wise by a scalar; the given range is also the output range.

template<class InOutIter, class InputIter>
void inplaceMul(InOutIter xfirst, InOutIter xlast, InputIter yfirst)

Multiply the two ranges coefficient wise; the first range is also the output range.

template<class InOutIter, class Scalar>
void inplaceMulScalar(InOutIter xfirst, InOutIter xlast, const Scalar &scalar)

Multiply each element of a range by a scalar; the given range is also the output range.

template<class InOutIter, class InputIter>
void inplaceSub(InOutIter xfirst, InOutIter xlast, InputIter yfirst)

Subtract the second range from the first coefficient wise; the first range is also the output range.

template<class InOutIter, class Scalar>
void inplaceSubScalar(InOutIter xfirst, InOutIter xlast, const Scalar &scalar)

Subtract a scalar from each element of a range; the given range is also the output range.

template<class data_t, class InputIter1, class InputIter2, class OutIter>
void lincomb(data_t a, InputIter1 first1, InputIter1 last1, data_t b, InputIter2 first2, OutIter out)

Compute the linear combination $a * x + b * y$, where $x$ and $y$ are vectors given as iterators, and write the result to the output iterator.
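The linear combination can be illustrated with a binary std::transform. This is a sketch under simplifying assumptions: lincomb_sketch is a hypothetical name, the element type is fixed to double, and the result is returned as a new vector instead of being written through an output iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: out[i] = a * x[i] + b * y[i].
std::vector<double> lincomb_sketch(double a, const std::vector<double>& x,
                                   double b, const std::vector<double>& y)
{
    assert(x.size() == y.size()); // both vectors must have the same length
    std::vector<double> out(x.size());
    std::transform(x.begin(), x.end(), y.begin(), out.begin(),
                   [a, b](double xi, double yi) { return a * xi + b * yi; });
    return out;
}
```

Fusing both scalings and the addition into one pass avoids the temporaries that separate mulScalar and add calls would need.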

template<class InputIter, class OutIter>
void log(InputIter first, InputIter last, OutIter out)

Apply the log function to each element in the range.

template<class InputIter1, class InputIter2, class OutIter>
void mul(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the component wise multiplication of two vectors.

template<class data_t, class InputIter, class OutIter>
void mulScalar(InputIter first, InputIter last, const data_t &scalar, OutIter out)

Compute the component wise multiplication of a vector and a scalar.

template<class data_t, class InputIter, class OutIter>
void mulScalar(const data_t &scalar, InputIter first, InputIter last, OutIter out)

Compute the component wise multiplication of a scalar and a vector.

template<class InputIter, class OutIter>
void real(InputIter first, InputIter last, OutIter out)

Extract the real part of a range. If the input range is not complex, it is equivalent to a copy.

template<class InputIter, class OutIter>
void sign(InputIter first, InputIter last, OutIter out)

Apply the sign function to each element in the range.

template<class InputIter, class OutIter>
void sqrt(InputIter first, InputIter last, OutIter out)

Compute the square root for each element of the input range.

template<class InputIter, class OutIter>
void square(InputIter first, InputIter last, OutIter out)

Compute the square for each element of the input range.

template<class InputIter1, class InputIter2, class OutIter>
void sub(InputIter1 xfirst, InputIter1 xlast, InputIter2 yfirst, OutIter out)

Compute the component wise subtraction of two vectors.

template<class data_t, class InputIter, class OutIter>
void subScalar(InputIter first, InputIter last, const data_t &scalar, OutIter out)

Compute the component wise subtraction of a vector and a scalar.

template<class data_t, class InputIter, class OutIter>
void subScalar(const data_t &scalar, InputIter first, InputIter last, OutIter out)

Compute the component wise subtraction of a scalar and a vector.

Data Structures

The data structures comprise ContiguousStorage, the internal wrapper for the contents of the DataContainer, and NdView, a helper wrapper class that provides constant views on data and allows fast, cheap manipulation of the viewing shape.

ContiguousStorage

template<class T>
class ContiguousStorage : public elsa::mr::ContiguousVector<T, mr::type_tags::uninitialized, thrust::universal_ptr, thrust::universal_ptr>

Represents the internal storage type used by the DataContainer. It uses the contiguous vector with the uninitialized data tag in order to prevent unnecessary initializations. Iterator and pointer types are mapped to thrust::universal_ptr. It inherits from the contiguous vector so that explicit constructors for ContiguousStorage can exist; with a plain alias of the form using ContiguousStorage = mr::ContiguousVector…, the ContiguousVector constructors would have to be used instead.

Template Parameters
  • T: type of the data stored in the container.

NdView

template<class data_t, mr::StorageType tag>
class elsa::NdViewTagged

Represents a non-owning view. It can handle arbitrary strides (including negative). Supports the creation of subviews. Supports iteration in canonical order with a thrust::device compatible iterator, provided the storage type is device accessible. Upon deletion, if no other NdView has a reference to the data, signals to the owner of the data that it may be deleted via a destructor that is passed as constructor parameter. Additionally, NdViewTagged provides elementwise unary and binary operations and filtered assignments.

See

is_canonical()

Template Parameters
  • data_t: type of the data that the NdView points to

  • tag: storage type/location of the data; unlike DataContainer, NdView supports non-universal memory if compiled with CUDA

Public Functions

NdViewTagged(data_t *raw_data, const IndexVector_t &shape, const IndexVector_t &strides, const std::shared_ptr<Cleanup> &cleanup)

Create a view on raw (possibly non-contiguous) data.

Parameters
  • cleanup: shared encapsulated destructor, to be called once this NdView and all of its parent views and subviews have been deleted

NdViewTagged(data_t *raw_data, const IndexVector_t &shape, const IndexVector_t &strides, std::function<void()> destructor)

Create a view on raw (possibly non-contiguous) data.

Parameters
  • destructor: destructor to be called once this NdView and all of its parent views and subviews have been deleted

NdViewTagged()

Create an empty view.

bool is_contiguous() const

Return

true iff the raw data is laid out contiguously

bool is_canonical() const

Return

true iff the raw data follows the canonical layout. Canonical layout is defined as follows: strides[0] = 1 and strides[i] = strides[i - 1] * shape[i - 1]. It is tempting to call this column-major layout, but there is one caveat: in elsa, the first index refers to the column (i.e. the x-coordinate) and the second refers to the row (i.e. the y-coordinate).
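The recurrence for the canonical strides translates directly into code. The sketch below uses std::vector<std::ptrdiff_t> in place of IndexVector_t, and both function names are hypothetical. Whether the real is_canonical() treats dimensions of extent 1 specially is not stated here, so the sketch simply compares strides exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: strides of the canonical layout for a given shape,
// i.e. strides[0] = 1 and strides[i] = strides[i - 1] * shape[i - 1].
std::vector<std::ptrdiff_t> canonical_strides_sketch(const std::vector<std::ptrdiff_t>& shape)
{
    std::vector<std::ptrdiff_t> strides(shape.size());
    std::ptrdiff_t next = 1;
    for (std::size_t i = 0; i < shape.size(); ++i) {
        strides[i] = next;
        next *= shape[i];
    }
    return strides;
}

// Hypothetical check: do the given strides match the canonical ones exactly?
bool is_canonical_sketch(const std::vector<std::ptrdiff_t>& shape,
                         const std::vector<std::ptrdiff_t>& strides)
{
    return strides == canonical_strides_sketch(shape);
}
```

For a shape of (4, 3), i.e. 4 columns and 3 rows, the canonical strides are (1, 4): stepping along x moves by one element, stepping along y moves by a full row of 4 elements.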

bool is_empty() const

Return

true iff the view does not contain any data

IteratorRange<pointer_type> contiguous_range()

Iterate over the raw data in any order, if the data is contiguous (the iterators are pointers). Useful for reductions or transformations where the order of the data is not relevant (strides are ignored).

IteratorRange<pointer_type> canonical_range()

Iterate over the raw data in canonical order, if the data is laid out in canonical layout (the iterators are pointers).

See

is_canonical()

StridedRange<pointer_type> range()

Iterate over the data in canonical order, regardless of its real layout.

template<class ...Indices>
data_t &operator()(Indices... index)

Extract a single element. Indices for all dimensions must be supplied.

self_type fix(size_t dim, size_t where)

The returned NdView has its dimensionality reduced by one, by binding one dimension to a fixed index.

Parameters
  • dim: dimension to fix

  • where: the index of the slice along the dimension

self_type slice(size_t dim, size_t where_begin, size_t where_end)

The returned NdView has the same dimensionality, but its extent along the given dimension is less than or equal to that of the original.

Parameters
  • dim: dimension along which to take a sub-range

  • where_begin: lowest index along dimension dim (inclusive)

  • where_end: highest index along dimension dim (exclusive); where_begin <= where_end must hold!
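To illustrate why fix and slice are cheap, the following sketch models a view as an offset into a flat buffer plus shape and strides. Both operations then only adjust this metadata and never touch the data. ViewSketch, fix_sketch and slice_sketch are hypothetical names, and this is not how NdViewTagged is necessarily implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified view description: no data pointer, only metadata.
struct ViewSketch {
    std::ptrdiff_t offset;               // buffer index of element (0, ..., 0)
    std::vector<std::ptrdiff_t> shape;   // extent per dimension
    std::vector<std::ptrdiff_t> strides; // step per dimension, in elements
};

// Bind dimension `dim` to index `where`; the result has one dimension less.
ViewSketch fix_sketch(ViewSketch v, std::size_t dim, std::size_t where)
{
    assert(dim < v.shape.size());
    assert(static_cast<std::ptrdiff_t>(where) < v.shape[dim]);
    v.offset += static_cast<std::ptrdiff_t>(where) * v.strides[dim];
    v.shape.erase(v.shape.begin() + static_cast<std::ptrdiff_t>(dim));
    v.strides.erase(v.strides.begin() + static_cast<std::ptrdiff_t>(dim));
    return v;
}

// Restrict dimension `dim` to [where_begin, where_end); dimensionality is kept.
ViewSketch slice_sketch(ViewSketch v, std::size_t dim, std::size_t where_begin,
                        std::size_t where_end)
{
    assert(dim < v.shape.size());
    assert(where_begin <= where_end);
    assert(static_cast<std::ptrdiff_t>(where_end) <= v.shape[dim]);
    v.offset += static_cast<std::ptrdiff_t>(where_begin) * v.strides[dim];
    v.shape[dim] = static_cast<std::ptrdiff_t>(where_end - where_begin);
    return v;
}
```

For a canonical 4 x 3 view, fixing the second dimension at index 2 yields a one-dimensional view of 4 elements starting at buffer index 8, and slicing the first dimension to [1, 3) yields a 2 x 3 view starting at buffer index 1 with unchanged strides.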

const IndexVector_t &shape() const

Return

the shape of this NdView

const IndexVector_t &strides() const

Return

the strides of this NdView

const ContiguousStorage<dim_data> &layout_data() const

Return

the shape and strides of this NdView, stored in memory of type mr::sysStorageType, i.e. they are device accessible if compiled with CUDA

std::shared_ptr<Cleanup> getCleanup()

Return

the cleanup sentinel; once its last reference is dropped, the destructor is called
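The sentinel mechanism can be pictured as a small object that is held through std::shared_ptr by every view and runs the user-supplied destructor when it is itself destroyed. The sketch below is an assumption about the general pattern, with the hypothetical name CleanupSketch; the actual Cleanup struct may differ.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for the cleanup sentinel: runs the stored callback
// exactly once, when the last shared_ptr owning the sentinel is released.
struct CleanupSketch {
    std::function<void()> destructor;

    explicit CleanupSketch(std::function<void()> d) : destructor(std::move(d)) {}

    // Non-copyable, so the callback cannot run more than once.
    CleanupSketch(const CleanupSketch&) = delete;
    CleanupSketch& operator=(const CleanupSketch&) = delete;

    ~CleanupSketch()
    {
        if (destructor) {
            destructor();
        }
    }
};
```

A parent view and its subviews would each hold a copy of the same std::shared_ptr<CleanupSketch>, so the owner of the data is notified only after the last of them has gone out of scope.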

size_t size() const

Return

the number of elements in this view

template<typename Functor>
auto with_canonical_range(Functor &&f) -> decltype(std::declval<Functor>()(this->canonical_range()))

Calls the functor with the lowest overhead iterator range that guarantees canonical iteration order. I.e. if the data is naturally laid out in canonical order, the iterators will be pointers.

Parameters
  • f: functor to call with an iterator range object with methods .begin() and .end()

template<typename Functor>
auto with_canonical_crange(Functor &&f) const -> decltype(std::declval<Functor>()(this->canonical_range()))

Const version of with_canonical_range()

See

with_canonical_range()

template<typename Functor>
auto with_unordered_range(Functor f) -> decltype(std::declval<Functor>(this->contiguous_range()))

Calls the functor with the lowest overhead iterator range available. I.e. if the data is naturally laid out contiguously, the iterators will be pointers.

Parameters
  • f: functor to call with an iterator range object with methods .begin() and .end()

template<mr::StorageType other_tag>
BoolIndexedView<data_t, tag, other_tag> operator[](const NdViewTagged<bool, other_tag> &index)

Creates a left hand side to be used for filtered assignments. The index parameter must be an NdView of equal dimensions to this. The returned object can be assigned to, replacing all entries in this view whose corresponding index element is true. Entries whose filter element is false remain unchanged. If the index tensor does not have the correct dimensions, no exception is thrown until the actual assignment occurs.

Example:

template<typename T>
void zero_value_range(NdViewTagged<float, T> &x, float lb, float ub) {
    x[x >= lb && x < ub] = 0.0f;
}

This example function sets all elements in the value range lb <= x < ub to zero.
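The effect of a filtered assignment can be reproduced on flat vectors, which may help when reasoning about what x[mask] = value does. The sketch below is only an analogy: masked_assign_sketch and zero_value_range_sketch are hypothetical names, std::vector replaces the views, and the size check stands in for the dimension check that the documentation says happens at assignment time.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper mirroring `x[mask] = value` on flat vectors.
void masked_assign_sketch(std::vector<float>& x, const std::vector<bool>& mask, float value)
{
    if (x.size() != mask.size()) {
        throw std::invalid_argument("mask and data must have the same size");
    }
    for (std::size_t i = 0; i < x.size(); ++i) {
        if (mask[i]) {
            x[i] = value; // entries with a false mask element stay unchanged
        }
    }
}

// Flat-vector analogue of the zero_value_range example above.
void zero_value_range_sketch(std::vector<float>& x, float lb, float ub)
{
    std::vector<bool> mask(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        mask[i] = (x[i] >= lb && x[i] < ub);
    }
    masked_assign_sketch(x, mask, 0.0f);
}
```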

See

BoolIndexedView

Private Functions

template<mr::StorageType other_tag, typename Functor>
NdViewTagged<decltype(std::declval<Functor>()(std::declval<data_t>(), std::declval<data_t>())), mr::sysStorageType> binop(const NdViewTagged<data_t, other_tag> &other, Functor functor) const

Performs an element-wise binary operation. The result is returned in a view, which owns a newly allocated buffer for the output. The strides of the output may not match the strides of either input.

template<typename Functor>
NdViewTagged<decltype(std::declval<Functor>()(std::declval<data_t>())), mr::sysStorageType> unop(Functor functor) const

Performs an element-wise unary operation. The result is returned in a view, which owns a newly allocated buffer for the output. The strides of the output may not match the strides of the input.

struct Cleanup
struct Container
template<class ItType>
struct IteratorRange