elsa solvers

Solver Interface

template<typename data_t = real_t>
class elsa::Solver : public elsa::Cloneable<Solver<data_t>>

Base class representing a solver for an optimization problem.

This class represents abstract (typically iterative) solvers acting on optimization problems.

Author

  • Matthias Wieczorek - initial code

  • Maximilian Hornung - modularization

  • Tobias Lasser - rewrite, modernization

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Subclassed by elsa::LinearizedADMM< data_t >, elsa::PDHG< data_t >

Public Types

using Scalar = data_t

Scalar alias.

Public Functions

~Solver() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt)

Solve the optimization problem, starting either from zero or from the given initial solution. This sets up the iterative algorithm and then runs it for the specified number of iterations (assuming the algorithm does not converge earlier). A short usage sketch follows the parameter list.

Return

the current solution (after solving)

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional initial solution, initial solution set to zero if not present
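
A minimal usage sketch (assuming a concrete solver such as CGLS and an already constructed operator A and data container b; the names and iteration counts are illustrative):

// construct a concrete solver, e.g. CGLS (A and b assumed to exist)
CGLS<real_t> solver(A, b);

// run 20 iterations starting from the zero vector
auto x = solver.solve(20);

// run 10 further iterations, warm-started from the previous estimate
auto x2 = solver.solve(10, x);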

DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x)

Setup and reset the solver. This function should set all values, vectors and temporaries necessary to run the iterative reconstruction algorithm.

Return

initial solution

Parameters
  • [in] x: optional initial solution; if the optional is empty, some default (e.g. zero-filled) initial solution should be returned

DataContainer<data_t> step(DataContainer<data_t> x)

Perform a single step of the iterative reconstruction algorithm.

Return

the updated solution estimate

Parameters
  • [in] x: the current solution

DataContainer<data_t> run(index_t iterations, const DataContainer<data_t> &x0, bool show = false)

Run the iterative reconstruction algorithm for the given number of iterations, starting from the given initial point.

Return

the current solution estimate

Parameters
  • [in] iterations: the number of iterations to execute

  • [in] x0: the initial starting point of the run

  • [in] show: whether the algorithm should print progress output

bool shouldStop() const

Function to determine when to stop. This function should implement algorithm-specific stopping criteria.

Return

true if the algorithm should stop
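
Taken together, setup, step and shouldStop are the extension points that solve and run drive. A rough conceptual sketch of that control flow (illustrative only, not the actual implementation):

// conceptual main loop of an iterative solver (illustrative only)
auto x = solver.setup(std::nullopt);          // default (zero-filled) initial solution
for (index_t i = 0; i < iterations; ++i) {
    if (solver.shouldStop())                  // algorithm-specific stopping criterion
        break;
    x = solver.step(x);                       // one update of the current estimate
}
// x now holds the current solution estimate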

std::string formatHeader() const

Format the header string for the iterative reconstruction algorithm.

void printHeader() const

Print the header of the iterative reconstruction algorithm.

std::string formatStep(const DataContainer<data_t> &x) const

Format the step string for the iterative reconstruction algorithm.

void printStep(const DataContainer<data_t> &x, index_t curiter, std::chrono::duration<double> steptime, std::chrono::duration<double> elapsed) const

Print a step of the iterative reconstruction algorithm.

void setMaxiters(index_t maxiters)

Set the maximum number of iterations.

void setCallback(const std::function<void(const DataContainer<data_t>&)> &callback)

Set the callback function.

An example usage to later plot the loss (assuming some ground truth data):

auto phantom = getSomePhantom();
auto solver = SomeSolver(...);

// record the squared reconstruction error for each iteration
std::vector<float> msre;
solver.setCallback([&phantom, &msre](const DataContainer<real_t>& x) {
    msre.push_back(square(phantom - x).sum());
});

Similarly from Python:

phantom = getSomePhantom()
solver = SomeSolver(...)

# record the squared reconstruction error for each iteration
msre = []
solver.setCallback(lambda x: msre.append(square(phantom - x).sum()))

void printEvery(index_t printEvery)

Set variable to print only every n steps of the iterative reconstruction algorithm.

Protected Attributes

bool configured_ = false

Is the solver already configured (no need to call setup)

std::string name_ = "Solver"

Logging name of the iterative reconstruction algorithm.

index_t curiter_ = 0

Current iteration.

Private Members

index_t maxiters_ = 10000

Max iteration to run in total.

std::function<void(const DataContainer<data_t>&)> callback_ = [](auto) {}

Callback function called each iteration; by default, it does nothing.

Iterative

Iterative solvers to solve $A x = b$.

CGLS

template<typename data_t = real_t>
class elsa::CGLS : public elsa::Solver<data_t>

Conjugate Gradient for Least Squares Problems.

CGLS minimizes $ \frac{1}{2} \| Ax - b \|_2^2 + \epsilon^2 \| x \|_2^2 $, where $A$ is an operator that does not need to be square, symmetric, or positive definite, $b$ is the measured quantity, and $\epsilon$ is a dampening factor.

If the dampening factor $\epsilon$ is non-zero, CGLS solves a Tikhonov (damped least squares) problem.

CGLS is equivalent to applying CG to the normal equations $A^T A x = A^T b$; however, it does not need to actually form the normal equations. This primarily improves runtime performance, and it further improves the numerical stability of the algorithm.
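
A minimal usage sketch (assuming a LinearOperator A and a DataContainer b are already set up; the damping value is illustrative):

// plain least squares problem
CGLS<real_t> cgls(A, b);
auto x = cgls.solve(50);

// Tikhonov-damped variant with an illustrative damping factor of 0.1
CGLS<real_t> damped(A, b, 0.1f);
auto xDamped = damped.solve(50);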

References:

Author

David Frank

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

CGLS(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> eps = 0, SelfType_t<data_t> tol = 1e-4)

Construct the necessary form of CGLS using some linear operator and the measured data.

Parameters
  • A: linear operator for the problem

  • b: the measured data

  • eps: the dampening factor

  • tol: stopping tolerance

CGLS(const CGLS<data_t>&) = delete

make copy constructor deletion explicit

~CGLS() override = default

default destructor

Private Functions

CGLS<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

CGNE

template<typename data_t = real_t>
class elsa::CGNE : public elsa::Solver<data_t>

Conjugate Gradient via the Normal equation.

CG solves the system of equations $ Ax = b $, where $A$ is a symmetric positive definite operator and $b$ is the measured quantity.

In our implementation, we always assume $A$ is non-symmetric and not positive definite. Hence, we compute the solution to the normal equations $ A^T A x = A^T b $.

References:

  • An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, by Shewchuk

Author

David Frank

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

CGNE(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> tol = 1e-4)

Construct the necessary form of CGNE using some linear operator and the measured data.

Parameters
  • A: linear operator for the problem

  • b: the measured data

  • tol: stopping tolerance

CGNE(const CGNE<data_t>&) = delete

make copy constructor deletion explicit

~CGNE() override = default

default destructor

Private Functions

CGNE<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Landweber iteration

Landweber

template<typename data_t = real_t>
class elsa::Landweber : public LandweberIteration<data_t>

Implementation of the Landweber iterations with both $ T $ and $ M $ being the identity matrix. This reduces the update rule to:

  • $ x_{k+1} = x_{k} + \lambda A^T (b - A x_{k})$

This is essentially a special case of gradient descent.
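
A minimal usage sketch (assuming A and b exist; the step size is illustrative):

// classical Landweber iteration with a fixed, illustrative step size
Landweber<real_t> landweber(A, b, 0.01f);
auto x = landweber.solve(100);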

Author

David Frank

See

LandweberIteration

Public Types

using Scalar = typename LandweberIteration<data_t>::Scalar

Scalar alias.

Public Functions

Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)

Constructor for classical Landweber, accepting an operator, a measurement vector and a step size.

Parameters
  • [in] A: linear operator to solve the problem with

  • [in] b: measurement vector of the problem

  • [in] stepSize: the fixed step size to be used while solving

Landweber(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)

Constructor for classical Landweber, accepting an operator and a measurement vector.

Parameters
  • [in] A: linear operator to solve the problem with

  • [in] b: measurement vector of the problem

Landweber(const Landweber<data_t>&) = delete

make copy constructor deletion explicit

~Landweber() override = default

default destructor

Private Functions

Landweber<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

SIRT

template<typename data_t = real_t>
class elsa::SIRT : public LandweberIteration<data_t>

Implementation of the Simultaneous Iterative Reconstruction Technique (SIRT). For SIRT, $ T = \text{diag}(\frac{1}{\sum_i A_{ij}}) $ (inverse column sums) and $ M = \text{diag}(\frac{1}{\sum_j A_{ij}}) $ (inverse row sums).

Outside of the Computed Tomography community, this algorithm is also often known as the Simultaneous Algebraic Reconstruction Technique (SART).

Author

David Frank

See

LandweberIteration

Public Types

using Scalar = typename LandweberIteration<data_t>::Scalar

Scalar alias.

Public Functions

SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, SelfType_t<data_t> stepSize)

Constructor for SIRT, accepting an operator, a measurement vector and a step size.

Parameters
  • [in] A: linear operator to solve the problem with

  • [in] b: measurement vector of the problem

  • [in] stepSize: the fixed step size to be used while solving

SIRT(const LinearOperator<data_t> &A, const DataContainer<data_t> &b)

Constructor for SIRT, accepting an operator and a measurement vector.

Parameters
  • [in] A: linear operator to solve the problem with

  • [in] b: measurement vector of the problem

SIRT(const SIRT<data_t>&) = delete

make copy constructor deletion explicit

~SIRT() override = default

default destructor

Private Functions

SIRT<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

GMRES

AB_GMRES

template<typename data_t = real_t>
class elsa::AB_GMRES : public elsa::Solver<data_t>

Class implementing the AB_GMRES method for solving with an unmatched Back Projector.

GMRES is an iterative method for solving an indefinite nonsymmetric linear system $ Ax = b $. It solves the system by minimizing the least squares problem $ J(y) = \left\| \beta {e_{1}} - \bar{H}y \right\| $, where $ \beta = \left\| b - Ax \right\| $, $ {e_{1}} $ is the first column of the $ (k + 1) \times (k + 1) $ identity matrix, and $ \bar{H} $ is a $ (k + 1) \times k $ matrix whose only nonzero entries are generated by the algorithm.

We implement an expanded AB_GMRES method for solving large sparse least squares problems. Here we define $ A^T = B $ for our matrix $ A \in \mathbb{R}^{m \times n} $ with $ B \in \mathbb{R}^{n \times m} $. We solve the so-called unmatched normal equations $ BAx = Bb $ and $ ABy = b $ where $ x = By $:

AB_GMRES solves $ \min_{y \in \mathbb{R}^m} \left\| b - ABy \right\| $, $ x = By $.

$ A $ can be imagined as a discretization of the forward projector, while $ B $ represents a matched or unmatched backprojector or preconditioner. The algorithm can reach semi-convergence even when $ A^T \neq B $, as long as valid models for the projector/backprojector pair are chosen.

Convergence is considered reached when $ \|r\| \leq \epsilon $.

References: http://epubs.siam.org/doi/10.1137/0907058 http://epubs.siam.org/doi/10.1137/070696313 http://arxiv.org/abs/2110.01481
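
A minimal usage sketch (assuming a projector A, an unmatched backprojector B and a sinogram b already exist):

// matched case: the backprojector is the adjoint of the projector
AB_GMRES<real_t> matched(A, b);
auto x = matched.solve(30);

// unmatched case: supply a separate backprojector B explicitly
AB_GMRES<real_t> unmatched(A, B, b);
auto y = unmatched.solveAndRestart(10, 3);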

Author

Daniel Klitzner - initial code

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

AB_GMRES(const LinearOperator<data_t> &projector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for AB_GMRES, accepting a forward projector A, a sinogram b and an optional value for epsilon; the backprojector B will be the adjoint of projector A.

Parameters
  • [in] projector: Discretization of a forward projector

  • [in] sinogram: sinogram of measured data obtained by applying the phantom to the projector

  • [in] epsilon: affects the stopping condition

AB_GMRES(const LinearOperator<data_t> &projector, const LinearOperator<data_t> &backprojector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for AB_GMRES, accepting a forward projector A, an unmatched backprojector B, a sinogram b and an optional value for epsilon.

Parameters
  • [in] projector: Discretization of a forward projector A

  • [in] backprojector: An optional unmatched/matched backprojector B

  • [in] sinogram: sinogram b of measured data obtained by applying the phantom to the projector

  • [in] epsilon: affects the stopping condition

AB_GMRES(const AB_GMRES<data_t>&) = delete

make copy constructor deletion explicit

~AB_GMRES() override = default

default destructor

DataContainer<data_t> solveAndRestart(index_t iterations, index_t restarts, std::optional<DataContainer<data_t>> x0 = std::nullopt)

Solves the given linear system using AB-GMRES, i.e. applies i iterations with r restarts of AB-GMRES.

Return

an approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] restarts: number of restarts to execute after iterations

  • [in] x0: optional approximate starting positions (used for preconditioning)

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solves the given linear system using AB-GMRES, i.e. applies i iterations of AB-GMRES.

Return

an approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional approximate starting positions (used for preconditioning)

Private Functions

AB_GMRES<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<LinearOperator<data_t>> _A

Projector A.

std::unique_ptr<LinearOperator<data_t>> _B

Unmatched Backprojector B, used for Preconditioning.

DataContainer<data_t> _b

sinogram b

data_t _epsilon

variable affecting the stopping condition

BA_GMRES

template<typename data_t = real_t>
class elsa::BA_GMRES : public elsa::Solver<data_t>

Class implementing the BA_GMRES method for solving with an unmatched Back Projector.

GMRES is an iterative method for solving an indefinite nonsymmetric linear system $ Ax = b $. It solves the system by minimizing the least squares problem $ J(y) = \left\| \beta {e_{1}} - \bar{H}y \right\| $, where $ \beta = \left\| b - Ax \right\| $, $ {e_{1}} $ is the first column of the $ (k + 1) \times (k + 1) $ identity matrix, and $ \bar{H} $ is a $ (k + 1) \times k $ matrix whose only nonzero entries are generated by the algorithm.

We implement an expanded BA_GMRES method for solving large sparse least squares problems. Here we define $ A^T = B $ for our matrix $ A \in \mathbb{R}^{m \times n} $ with $ B \in \mathbb{R}^{n \times m} $. We solve the so-called unmatched normal equations $ BAx = Bb $ and $ ABy = b $ where $ x = By $:

BA_GMRES solves $ \min_{x \in \mathbb{R}^n} \left\| Bb - BAx \right\| $.

$ A $ can be imagined as a discretization of the forward projector, while $ B $ represents a matched or unmatched backprojector or preconditioner. The algorithm can reach semi-convergence even when $ A^T \neq B $, as long as valid models for the projector/backprojector pair are chosen.

Convergence is considered reached when $ \|r\| \leq \epsilon $.

References: http://epubs.siam.org/doi/10.1137/0907058 http://epubs.siam.org/doi/10.1137/070696313 http://arxiv.org/abs/2110.01481

Author

Daniel Klitzner - initial code

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

BA_GMRES(const LinearOperator<data_t> &projector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for BA_GMRES, accepting a forward projector A, a sinogram b and an optional value for epsilon; the backprojector B will be the adjoint of projector A.

Parameters
  • [in] projector: Discretization of a forward projector

  • [in] sinogram: sinogram of measured data obtained by applying the phantom to the projector

  • [in] epsilon: affects the stopping condition

BA_GMRES(const LinearOperator<data_t> &projector, const LinearOperator<data_t> &backprojector, const DataContainer<data_t> &sinogram, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for BA_GMRES, accepting a forward projector A, an unmatched backprojector B, a sinogram b and an optional value for epsilon.

Parameters
  • [in] projector: Discretization of a forward projector A

  • [in] backprojector: An optional unmatched/matched backprojector B

  • [in] sinogram: sinogram b of measured data obtained by applying the phantom to the projector

  • [in] epsilon: affects the stopping condition

BA_GMRES(const BA_GMRES<data_t>&) = delete

make copy constructor deletion explicit

~BA_GMRES() override = default

default destructor

DataContainer<data_t> solveAndRestart(index_t iterations, index_t restarts, std::optional<DataContainer<data_t>> x0 = std::nullopt)

Solves the given linear system using BA-GMRES, i.e. applies i iterations with r restarts of BA-GMRES.

Return

an approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] restarts: number of restarts to execute after iterations

  • [in] x0: optional approximate starting positions (used for preconditioning)

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solves the given linear system using BA-GMRES, i.e. applies i iterations of BA-GMRES.

Return

an approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional approximate starting positions (used for preconditioning)

Private Functions

BA_GMRES<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<LinearOperator<data_t>> _A

Projector A.

std::unique_ptr<LinearOperator<data_t>> _B

Unmatched Backprojector B, used for Preconditioning.

DataContainer<data_t> _b

sinogram b

data_t _epsilon

variable affecting the stopping condition

Smooth

Optimization algorithms for smooth problem formulations.

First-order optimization algorithms

First-order algorithms solve problems of the form

\[\min_{x \in \mathbb{R}^d} f(x)\]

with two assumptions:

  • $f: \mathbb{R}^d \to \mathbb{R}$ is a convex, continuously differentiable function with Lipschitz continuous gradient, i.e. $f \in C_{L}^{1, 1}(\mathbb{R}^d)$ (where $L > 0$ is the Lipschitz constant)

  • The problem is solvable, i.e. there exists an optimal $x^{*}$

Intuition for Momentum

A nice analogy is a ball in hilly terrain. The ball starts at a random position with zero initial velocity. The algorithm determines the gradient of the potential energy, which is the force acting on the ball and which, in our case, is exactly the (negative) gradient of $f$. The algorithm then updates the velocity, which in turn updates the position of the ball. This is in contrast to vanilla gradient descent, where the position is integrated directly instead of the velocity.

Phrased differently, the velocity defines a look-ahead position to which the gradient computed at the current solution is applied. Nesterov's algorithm improves on this by computing the gradient at the look-ahead position instead of at the current solution's position.
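
To make the difference concrete, here is a small self-contained toy sketch (plain C++ on a one-dimensional quadratic, not elsa code) contrasting vanilla gradient descent, momentum and Nesterov's look-ahead variant:

// toy comparison on f(x) = 0.5 * x * x, whose gradient is simply x
int main() {
    auto gradient = [](double x) { return x; };
    const double stepSize = 0.1, beta = 0.9;

    // vanilla gradient descent: integrate the position directly
    double x1 = 5.0;
    for (int k = 0; k < 100; ++k)
        x1 -= stepSize * gradient(x1);

    // momentum ("heavy ball"): update a velocity, the velocity updates the position
    double x2 = 5.0, v2 = 0.0;
    for (int k = 0; k < 100; ++k) {
        v2 = beta * v2 - stepSize * gradient(x2);
        x2 += v2;
    }

    // Nesterov: evaluate the gradient at the look-ahead position x + beta * v
    double x3 = 5.0, v3 = 0.0;
    for (int k = 0; k < 100; ++k) {
        v3 = beta * v3 - stepSize * gradient(x3 + beta * v3);
        x3 += v3;
    }
}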

GradientDescent

template<typename data_t = real_t>
class elsa::GradientDescent : public elsa::Solver<data_t>

Class representing a simple gradient descent solver with a fixed, given step size.

This class implements a simple gradient descent iterative solver with a fixed, given step size. No particular stopping rule is currently implemented (only a fixed number of iterations, defaulting to 100).
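
A minimal usage sketch (assuming a LinearOperator A and DataContainer b, and that a LeastSquares functional is used as the differentiable problem; the step size is illustrative):

// least squares objective 0.5 * ||Ax - b||^2 as the problem to minimize
LeastSquares<real_t> ls(A, b);

// gradient descent with a fixed, illustrative step size
GradientDescent<real_t> gd(ls, 0.001f);
auto x = gd.solve(100);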

Author

  • Tobias Lasser - initial code

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

GradientDescent(const Functional<data_t> &problem, data_t stepSize)

Constructor for gradient descent, accepting a problem and a fixed step size.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] stepSize: the fixed step size to be used while solving

GradientDescent(const Functional<data_t> &problem)

Constructor for gradient descent, accepting a problem. The step size will be computed as $ 1 \over L $ with $ L $ being the Lipschitz constant of the function.

Parameters
  • [in] problem: the problem that is supposed to be solved

GradientDescent(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod)

Constructor for gradient descent, accepting a problem and a line search method.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] lineSearchMethod: the line search method to find the step size at each iteration

GradientDescent(const GradientDescent<data_t>&) = delete

make copy constructor deletion explicit

~GradientDescent() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

Private Functions

GradientDescent<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem

the differentiable optimization problem

std::unique_ptr<LineSearchMethod<data_t>> _lineSearchMethod

the line search method

Nesterov’s Fast Gradient Method

template<typename data_t = real_t>
class elsa::FGM : public elsa::Solver<data_t>

Class implementing Nesterov’s Fast Gradient Method.

This class implements Nesterov’s Fast Gradient Method. FGM is a first order method to efficiently optimize convex functions with Lipschitz-Continuous gradients.

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

FGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] epsilon: affects the stopping condition

FGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] preconditionerInverse: the inverse of the preconditioner

  • [in] epsilon: affects the stopping condition

FGM(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem, a line search method and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] lineSearchMethod: the line search method to find the step size at each iteration

  • [in] epsilon: affects the stopping condition

FGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for FGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] preconditionerInverse: the inverse of the preconditioner

  • [in] lineSearchMethod: the line search method to find the step size at each iteration

  • [in] epsilon: affects the stopping condition

FGM(const FGM<data_t>&) = delete

make copy constructor deletion explicit

~FGM() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

FGM<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem

the differentiable optimization problem

data_t _epsilon

variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}

the inverse of the preconditioner (if supplied)

std::unique_ptr<LineSearchMethod<data_t>> _lineSearchMethod

the line search method

Optimized Gradient Method

template<typename data_t = real_t>
class elsa::OGM : public elsa::Solver<data_t>

Class representing the Optimized Gradient Method.

This class implements the Optimized Gradient Method as introduced by Kim and Fessler in 2016. OGM is a first order method to efficiently optimize convex functions with Lipschitz-Continuous gradients. It can be seen as an improvement on Nesterov’s Fast Gradient Method.

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

OGM(const Functional<data_t> &problem, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] epsilon: affects the stopping condition

OGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] preconditionerInverse: the inverse of the preconditioner

  • [in] epsilon: affects the stopping condition

OGM(const Functional<data_t> &problem, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem, a line search method and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] lineSearchMethod: the line search method to find the step size at each iteration

  • [in] epsilon: affects the stopping condition

OGM(const Functional<data_t> &problem, const LinearOperator<data_t> &preconditionerInverse, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Constructor for OGM, accepting an optimization problem, the inverse of a preconditioner and, optionally, a value for epsilon.

Parameters
  • [in] problem: the problem that is supposed to be solved

  • [in] preconditionerInverse: the inverse of the preconditioner

  • [in] lineSearchMethod: the line search method to find the step size at each iteration

  • [in] epsilon: affects the stopping condition

OGM(const OGM<data_t>&) = delete

make copy constructor deletion explicit

~OGM() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

OGM<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> _problem

the differentiable optimization problem

data_t _epsilon

variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> _preconditionerInverse = {}

the inverse of the preconditioner (if supplied)

std::unique_ptr<LineSearchMethod<data_t>> _lineSearchMethod

the line search method

SQS Ordered Subsets

template<typename data_t = real_t>
class elsa::SQS : public elsa::Solver<data_t>

Class representing an SQS Solver.

This class implements an SQS solver with multiple options for momentum acceleration and ordered subsets.

No particular stopping rule is currently implemented (only a fixed number of iterations, defaulting to 100).

Currently, this is limited to least squares problems.

References: https://doi.org/10.1109/TMI.2014.2350962

Author

Michael Loipführer - initial code

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

SQS(const SQS<data_t>&) = delete

make copy constructor deletion explicit

~SQS() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of gradient descent.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

SQS<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<LeastSquares<data_t>> fullProblem_

The optimization problem to solve. TODO: This might be a nice variant?

data_t epsilon_

variable affecting the stopping condition

std::unique_ptr<LinearOperator<data_t>> preconditioner_ = {}

the preconditioner (if supplied)

bool momentumAcceleration_

whether to enable momentum acceleration

bool subsetMode_

whether to operate in subset based mode

Non-Smooth

Optimization algorithms for non-smooth problem formulations. These problem formulations usually contain at least one non-differentiable term, such as the L1-Norm.

Proximal Gradient Methods

Proximal gradient methods solve problems of the form

\[\min_{x \in \mathbb{R}^d} g(x) + h(x)\]

where $g$ is a smooth function, and $h$ is simple (meaning the proximal operator is easy to compute).

Proximal Gradient Descent

template<typename data_t = real_t>
class elsa::PGD : public elsa::Solver<data_t>

Proximal Gradient Descent (PGD)

PGD minimizes functions of the form:

\[ \min_x g(x) + h(x) \]

where $g: \mathbb{R}^n \to \mathbb{R}$ is convex and differentiable, and $h: \mathbb{R}^n \to \mathbb{R} \cup \{-\infty, \infty\}$ is closed convex. Importantly, $h$ need not be differentiable, but it needs a proximal operator. Usually, the proximal operator is assumed to be simple and to have an analytical solution.

This class currently implements the special case of $ g(x) = \frac{1}{2} ||A x - b||_2^2$. However, $h$ can be chosen freely.

Given $g$ defined as above and a convex set $\mathcal{C}$, one can define a constrained optimization problem:

\[ \min_{x \in \mathcal{C}} g(x) \]
Such constraints can take the form of non-negativity or box constraints. This can be reformulated as an unconstrained problem:
\[ \min_{x} g(x) + \mathcal{I}_{\mathcal{C}}(x) \]
where $\mathcal{I}_{\mathcal{C}}(x)$ is the indicator function of the convex set $\mathcal{C}$, defined as:

\[ \mathcal{I}_{\mathcal{C}}(x) = \begin{cases} 0, & \text{if } x \in \mathcal{C} \\ \infty, & \text{if } x \notin \mathcal{C} \end{cases} \]

References:

Note

PGD has a worst-case complexity result of $ O(1/k) $.

Note

A special class of optimization is of the form:

\[ \min_{x} \frac{1}{2} || A x - b ||_2^2 + ||x||_1 \]
often referred to as $\ell_1$-regularization. In this case, the proximal operator for the $\ell_1$-regularization is the soft-thresholding operator (ProximalL1). This can also be extended with constraints, such as non-negativity constraints.
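
A minimal usage sketch of this l1-regularized case (assuming A and b exist; L1Norm is assumed to be available as a prox-friendly Functional, and the step length is derived via the power method as described in the constructor notes below):

// estimate the Lipschitz constant of the least squares gradient
auto L = powerIterations(adjoint(A) * A);

// l1 regularizer as the prox-friendly term h
L1Norm<real_t> h(A.getDomainDescriptor());

// PGD with the illustrative step length 1 / L
PGD<real_t> pgd(A, b, h, 1 / L);
auto x = pgd.solve(100);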

See

An accelerated version of proximal gradient descent is APGD.

Author

  • Andi Braimllari - initial code

  • David Frank - generalization

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • mu: the step length

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • mu: the step length

PGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a given data fidelity term

Parameters
  • g: differentiable function

  • h: prox-friendly function

  • mu: the step length

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

PGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

PGD(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct PGD with a given data fidelity term

Parameters
  • g: differentiable function

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

PGD(const PGD<data_t>&) = delete

make copy constructor deletion explicit

~PGD() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of PGD.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

  • [in] x0: optional initial solution, initial solution set to zero if not present

Protected Functions

auto cloneImpl() const -> PGD<data_t>* override

implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_

differentiable function of problem formulation

std::unique_ptr<Functional<data_t>> h_

prox-friendly function of problem formulation

data_t epsilon_

variable affecting the stopping condition

std::unique_ptr<LineSearchMethod<data_t>> lineSearchMethod_

the line search method

Accelerated Proximal Gradient Descent

template<typename data_t = real_t>
class elsa::APGD : public elsa::Solver<data_t>

Accelerated Proximal Gradient Descent (APGD)

APGD minimizes functions of the same form as PGD. See the documentation there.

This class represents a APGD solver with the following steps:

  • $ x_{k} = prox_h(y_k - \mu * A^T (Ay_k - b)) $

  • $ t_{k+1} = \frac{1 + \sqrt{1 + 4 * t_{k}^2}}{2} $

  • $ y_{k+1} = x_{k} + (\frac{t_{k} - 1}{t_{k+1}}) * (x_{k} - x_{k - 1}) $

APGD has a worst-case complexity result of $ O(1/k^2) $.

References: http://www.cs.cmu.edu/afs/cs/Web/People/airg/readings/2012_02_21_a_fast_iterative_shrinkage-thresholding.pdf https://arxiv.org/pdf/2008.02683.pdf

See

For a more detailed discussion of the type of problem for this solver, see PGD.

Author

  • Andi Braimllari - initial code

  • David Frank - generalization to APGD

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • mu: the step length

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • mu: the step length

APGD(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a given data fidelity term

Parameters
  • g: differentiable function of the LASSO problem

  • h: prox friendly functional of the LASSO problem

  • mu: the step length

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

APGD(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

APGD(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct APGD with a given data fidelity term

Parameters
  • g: differentiable function of the LASSO problem

  • h: prox friendly functional of the LASSO problem

  • lineSearchMethod: the line search method to find the step size at each iteration

~APGD() override = default

default destructor

Protected Functions

auto cloneImpl() const -> APGD<data_t>* override

implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_

Differentiable part of the problem formulation.

std::unique_ptr<Functional<data_t>> h_

Prox-friendly part of the problem formulation.

std::unique_ptr<LineSearchMethod<data_t>> lineSearchMethod_

the line search method

data_t epsilon_

variable affecting the stopping condition

Proximal Optimized Gradient Method

template<typename data_t = real_t>
class elsa::POGM : public elsa::Solver<data_t>

Proximal Optimized Gradient Method (POGM)

POGM minimizes functions of the same form as PGD. See the documentation there.

The update steps for POGM are:

\[ \begin{split} \theta_k &= \begin{cases} \frac{1}{2}(1 + \sqrt{4 \theta_{k-1}^2 + 1}) & \text{ if } 2 \leq k < N \\ \frac{1}{2}(1 + \sqrt{8 \theta_{k-1}^2 + 1}) & \text{ if } k = N \end{cases} \\ \gamma_k &= \frac{1}{L} \frac{2\theta_{k-1} + \theta_k - 1}{\theta_k} \\ \omega_k &= x_{k-1} - \frac{1}{L} \nabla f(x_{k-1}) \\ z_k &= \omega_k + \underset{\text{Nesterov}}{\underbrace{\frac{\theta_{k-1} - 1}{\theta_k}(\omega_k - \omega_{k-1})}} + \underset{\text{OGM}}{\underbrace{\frac{\theta_{k-1}}{\theta_k}(\omega_k - x_{k-1})}} + \underset{\text{POGM}}{\underbrace{\frac{\theta_{k-1} - 1}{L \gamma_{k-1}\theta_k}(z_{k-1} - x_{k-1})}} \\ x_k &= \operatorname{prox}_{\gamma_k g}(z_k) \end{split} \]

References:

See

For a more detailed discussion of the type of problem for this solver, see PGD.

Author

  • David Frank - Initial implementation

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • mu: the step length

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, std::optional<data_t> mu = std::nullopt, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used. If mu is not given, the step length is chosen this way by default; note that the power method computation might be expensive.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • mu: the step length

POGM(const Functional<data_t> &g, const Functional<data_t> &h, data_t mu, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a given data fidelity term.

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

POGM(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const DataContainer<data_t> &W, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a weighted least squares data fidelity term

Note

The step length for least squares can be chosen based on the Lipschitz constant $ L $, which can be computed using powerIterations(adjoint(A) * A). Depending on the literature, both $ \frac{2}{L} $ and $ \frac{1}{L} $ are used.

Parameters
  • A: the operator for the least squares data term

  • b: the measured data for the least squares data term

  • W: the weights (usually counts / I0)

  • h: prox-friendly function

  • lineSearchMethod: the line search method to find the step size at each iteration

POGM(const Functional<data_t> &g, const Functional<data_t> &h, const LineSearchMethod<data_t> &lineSearchMethod, data_t epsilon = std::numeric_limits<data_t>::epsilon())

Construct POGM with a given data fidelity term.

~POGM() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the optimization problem, i.e. apply iterations number of iterations of POGM.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute

Protected Functions

auto cloneImpl() const -> POGM<data_t>* override

implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<Functional<data_t>> g_

Differentiable part of the problem formulation.

std::unique_ptr<LineSearchMethod<data_t>> lineSearchMethod_

the line search method

data_t epsilon_

variable affecting the stopping condition

Linearized Bregman

template<typename data_t = real_t>
class elsa::LB : public elsa::Solver<data_t>

Linearized Bregman solves problems of the form:

\[ \min_{x} \frac{1}{2\mu} || A x - b ||_2^2 + |x| \]

by iteratively solving:

\[ \begin{aligned} x^{k+1} & =\mu \cdot \operatorname{shrink}\left(v^k, 1\right) \\ v^{k+1} & =v^k+\beta A^{\top}\left(b-A x^{k+1}\right) \end{aligned} \]
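
A minimal usage sketch (assuming A and b exist, and that ProximalL1 provides the shrink/soft-thresholding operator used above; mu and the iteration count are illustrative):

// soft-thresholding as the shrink operator
ProximalL1<real_t> shrink;

// Linearized Bregman with an illustrative mu, default beta and tolerance
LB<real_t> lb(A, b, shrink, 5);
auto x = lb.solve(200);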

References:

Public Functions

LB(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const ProximalOperator<data_t> &prox, data_t mu = 5, std::optional<data_t> beta = std::nullopt, data_t epsilon = 1e-5)

Construct Linearized Bregman.

LB(const LB<data_t>&) = delete

make copy constructor deletion explicit

~LB() override = default

default destructor

Protected Functions

auto cloneImpl() const -> LB<data_t>* override

implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<LinearOperator<data_t>> A_

The linear operator $A$.

data_t mu_

weight of the fidelity term

data_t beta_

step size

data_t epsilon_

variable affecting the stopping condition

template<typename data_t = real_t>
class elsa::ALB : public elsa::Solver<data_t>

Accelerated Linearized Bregman converges faster than Linearized Bregman and solves problems of the form:

\[ \min_{x} \frac{1}{2\mu} || A x - b ||_2^2 + |x| \]

by iteratively solving:

\[ \begin{aligned} x^{k+1} & =\mu \cdot \operatorname{prox}\left(\tilde{v}^k, 1\right) \\ v^{k+1} & =\tilde{v}^k+\beta A^T\left(b-A x^{k+1}\right) \\ \tilde{v}^{k+1} & :=\alpha_k v^{k+1}+\left(1-\alpha_k\right) v^k \end{aligned} \]

where $ \alpha_k = \frac{2k + 3}{k + 3} $

References:

Public Functions

ALB(const LinearOperator<data_t> &A, const DataContainer<data_t> &b, const ProximalOperator<data_t> prox, data_t mu = 5, std::optional<data_t> beta = std::nullopt, data_t epsilon = 1e-5)

Construct Accelerated Linearized Bregman.

ALB(const ALB<data_t>&) = delete

make copy constructor deletion explicit

~ALB() override = default

default destructor

DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x) override

Setup variables to default values.

DataContainer<data_t> step(DataContainer<data_t> x) override

Perform single step of the algorithm.

bool shouldStop() const override

Check if the L2 distance between the residual and b is below a certain threshold.

Protected Functions

auto cloneImpl() const -> ALB<data_t>* override

implement the polymorphic clone operation

auto isEqual(const Solver<data_t> &other) const -> bool override

implement the polymorphic comparison operation

Private Members

std::unique_ptr<LinearOperator<data_t>> A_

The linear operator $A$.

data_t mu_

weight of the fidelity term

data_t beta_

step size

data_t epsilon_

variable affecting the stopping condition

Alternating Direction Method of Multipliers

The Alternating Direction Method of Multipliers (ADMM) solves problems of the form:

\[\min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c\]

With $x \in \mathbb{R}^{n}$, $z \in \mathbb{R}^{m}$, $c \in \mathbb{R}^{p}$, $A \in \mathbb{R}^{p \times n}$ and $B \in \mathbb{R}^{p \times m}$. Usually, one assumes $f$ and $g$ to be convex at least.

This problem in general is quite hard to solve for many interesting applications such as X-ray CT. However, certain special cases are quite interesting and documented below.

ADMM with L2 data fidelity term

template<typename data_t = real_t>
class elsa::ADMML2 : public elsa::Solver<data_t>

Class representing an Alternating Direction Method of Multipliers solver for a specific subset of constraints.

The general form of ADMM solves the following optimization problem:

\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]
with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$

This specific version solves the problem of the form:

\[ \min \frac{1}{2} || Op x - b ||_2^2 + g(z) \\ \text{s.t. } Ax = z \]
with $B = -Id$ and $c = 0$. Further, $f(x) = \frac{1}{2} \| Op\, x - b \|_2^2$.

This version of ADMM is useful, as the proximal operator is not known for the least squares functional, and this class specifically implements an optimization of the first update step of ADMM. In this implementation, this is done via CGLS.

The update steps for ADMM are:

\[ x_{k+1} = \argmin_{x} \frac{1}{2}||Op x - b||_2^2 + \frac{1}{2\tau} ||Ax - z_k + u_k||_2^2 \\ z_{k+1} = \prox_{\tau g}(Ax_{k+1} + u_{k}) \\ u_{k+1} = u_k + Ax_{k+1} - z_{k+1} \]

This is further useful to solve problems such as TV regularization, by setting $A = \nabla$ and $ g = \| \cdot \|_1$.

References:

  • Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.

  • Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

~ADMML2() override = default

default destructor

Protected Functions

ADMML2<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic equality operation

Private Members

data_t tau_ = {1}

$ \tau $ from the problem definition

Linearized ADMM

template<class data_t>
class elsa::LinearizedADMM : public elsa::Solver<data_t>

Class representing the linearized version of ADMM.

The general form of ADMM solves the following optimization problem:

\[ \min f(x) + g(z) \\ \text{s.t. } Ax + Bz = c \]
with $x \in \mathbb{R}^n$, $z \in \mathbb{R}^m$, $A \in \mathbb{R}^{p\times n}$, $B \in \mathbb{R}^{p\times m}$ and $c \in \mathbb{R}^p$

The linearized version assumes $A = K$ to be any linear operator, $B = - Id$ and $c = 0$. Hence, the optimization problem reduces to:

\[ \min f(x) + g(Ax) \]

The main benefit of this is that in many applications both $f$ and $g$ can be simple functionals, for which proximal operators are easy to evaluate. For example, the usual least squares formulation without regularization is achieved with $f(x) = 0$ and $g(z) = \| z - b \|_2^2 $, with $ z = Kx$. For both, the proximal operator is analytically known.

Many other problems can be converted to this form in a similar fashion by “stacking” the operator $K$. For example, $L_1$ regularization can be written as $f(x) = 0$ and $g(z) = \| z_1 - b \|_2^2 + \| z_2 \|_1$, with $ K = \begin{bmatrix} A \\ Id \end{bmatrix}$ and $z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$. Further, constraints can be added easily via the function $f$, by setting it to some indicator function (e.g. non-negativity or similar).

References:

  • Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, by Boyd et al.

  • Chapter 5.3 of “An introduction to continuous optimization for imaging”, by Chambolle and Pock

TODO:

  • Once implemented, take functionals which know their respective proximal instead of handing proximal operators directly

Public Functions

LinearizedADMM(const LinearOperator<data_t> &K, ProximalOperator<data_t> proxf, ProximalOperator<data_t> proxg, SelfType_t<data_t> sigma, std::optional<data_t> tau = std::nullopt, bool computeKNorm = true)

Construct the linearized version of ADMM given two proximal operators for $f$ and $g$ respectively, the potentially stacked operator $K$ and the two step length parameters $\sigma$ and $\tau$.

To ensure convergence, it is checked that $0 < \tau < \frac{\sigma}{\|K\|_2^2}$. Here $\|K\|_2^2$ is the operator norm, which is approximated using a couple of iterations of the power method to approximate the largest eigenvalue of the operator. If computeKNorm is false, the computation is not performed, and it is assumed that the above inequality holds.

~LinearizedADMM() override = default

default destructor

DataContainer<data_t> setup(std::optional<DataContainer<data_t>> x) override

Setup and reset the solver. This function should set all values, vectors and temporaries necessary to run the iterative reconstruction algorithm.

Return

initial solution

Parameters
  • [in] x: optional initial solution; if the optional is empty, some default (e.g. zero-filled) initial solution should be returned

DataContainer<data_t> step(DataContainer<data_t> x) override

Perform a single step of the iterative reconstruction algorithm.

Return

the updated solution estimate

Parameters
  • [in] x: the current solution

bool shouldStop() const override

Function to determine when to stop. This function should implement algorithm-specific stopping criteria.

Return

true if the algorithm should stop

std::string formatHeader() const override

Format the header string for the iterative reconstruction algorithm.

std::string formatStep(const DataContainer<data_t> &x) const override

Format the step string for the iterative reconstruction algorithm.

Private Members

std::unique_ptr<LinearOperator<data_t>> K_

The stacked linear operator $K$.

ProximalOperator<data_t> proxf_

The proximal operator for $f$.

ProximalOperator<data_t> proxg_

The proximal operator for $g$.

data_t sigma_

Step length $\sigma$.

data_t tau_

Step length $\tau$.

Orthogonal Matching Pursuit

template<typename data_t = real_t>
class elsa::OrthogonalMatchingPursuit : public elsa::Solver<data_t>

Class representing the Orthogonal Matching Pursuit.

Orthogonal Matching Pursuit is a greedy algorithm to find a sparse representation. It starts with the 0-vector and adds one non-zero entry per iteration. The algorithm works in the following manner:

  1. Find the next atom that should be used in the representation. This is done by finding the atom that is most correlated with the current residual.

  2. Construct a dictionary that only contains the atoms that are being used, defined as $ D_S $ (dictionary restricted to the support).

  3. The representation is the solution to the least squares problem $ \min_x \| y - D_S x \| $.
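
A minimal usage sketch (assuming a Dictionary D and a signal y exist; the iteration count equals the desired sparsity level):

// sparse representation with at most 5 non-zero coefficients (illustrative)
OrthogonalMatchingPursuit<real_t> omp(D, y, 1e-6f);
auto x = omp.solve(5);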

Author

Jonas Buerger - initial code

Template Parameters
  • data_t: data type for the domain and range of the problem, defaulting to real_t

Public Types

using Scalar = typename Solver<data_t>::Scalar

Scalar alias.

Public Functions

OrthogonalMatchingPursuit(const Dictionary<data_t> &D, const DataContainer<data_t> &y, data_t epsilon)

Constructor for OrthogonalMatchingPursuit, accepting a dictionary, the signal to be sparsely represented, and a value for epsilon.

Parameters
  • [in] D: dictionary operator

  • [in] y: signal that should be sparsely represented

  • [in] epsilon: affects the stopping condition

OrthogonalMatchingPursuit(const OrthogonalMatchingPursuit<data_t>&) = delete

make copy constructor deletion explicit

~OrthogonalMatchingPursuit() override = default

default destructor

DataContainer<data_t> solve(index_t iterations, std::optional<DataContainer<data_t>> x0 = std::nullopt) override

Solve the representation problem, i.e. apply iterations number of iterations of matching pursuit.

Return

the approximated solution

Parameters
  • [in] iterations: number of iterations to execute. As OrthogonalMatchingPursuit is a greedy algorithm, this corresponds to the desired sparsity level

  • [in] x0: optional initial solution, initial solution set to zero if not present

Private Functions

index_t mostCorrelatedAtom(const Dictionary<data_t> &dict, const DataContainer<data_t> &evaluatedResidual)

helper method to find the index of the atom that is most correlated with the residual

OrthogonalMatchingPursuit<data_t> *cloneImpl() const override

implement the polymorphic clone operation

bool isEqual(const Solver<data_t> &other) const override

implement the polymorphic comparison operation

Private Members

data_t epsilon_

variable affecting the stopping condition