LossFunction

class LossFunction

Least squares problems are easily influenced by outliers in the input: a few completely bogus measurements can derail the convergence of the entire optimization. In such cases a loss function can be used to reduce their influence.

Consider a structure from motion problem. The unknowns are the 3D points and the camera parameters, and the measurements are image coordinates describing the expected reprojected position of a point. For example, we want to model the geometry of a street scene with fire hydrants and cars, observed by a moving camera with unknown parameters (both intrinsic and extrinsic), and the only 3D points we care about are the pinnacles of the fire hydrants. The image processing algorithm responsible for producing the measurements that are input to Ceres has found and matched all such pinnacles across the image frames, except that in one frame it mistook a car's headlight for a hydrant. If we do nothing special, the residual of the erroneous measurement will pull the entire solution away from the optimum in order to reduce the large error attributed to the wrong measurement.

Using a robust loss function reduces the cost of large residuals. In the example above, this causes the outlier term to be down-weighted so that it does not overly influence the final solution.

class LossFunction {
 public:
  virtual void Evaluate(double s, double out[3]) const = 0;
};

The key method is LossFunction::Evaluate, which, given a non-negative scalar $s$, computes the value of the robust loss function together with its first and second derivatives:

$$\text{out} = [\tau(s),\ \tau'(s),\ \tau''(s)]$$

When a loss function is used, the contribution of a residual block to the least squares objective becomes $\frac{1}{2}\tau(s)$, where $s = \|f_i\|^2$. The most sensible choices of $\tau$ satisfy:

$$
\begin{aligned}
\tau(0) &= 0 \\
\tau'(0) &= 1 \\
\tau'(s) &< 1 \ \text{in the outlier region} \\
\tau''(s) &< 0 \ \text{in the outlier region}
\end{aligned}
$$

Given a robust loss function $\tau(s)$, a scale factor $a > 0$ can be introduced to change the residual magnitude at which outlier behavior sets in: $\tau(s, a) = a^2 \tau(s / a^2)$, whose first and second derivatives with respect to $s$ are $\tau'(s / a^2)$ and $(1/a^2)\tau''(s / a^2)$ respectively. The reason for the appearance of squaring is that $a$ is in the units of the residual vector norm, whereas $s$ is a squared norm. For applications it is more convenient to specify $a$ than its square.
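To make the Evaluate contract concrete, here is a minimal sketch of a hand-written loss implementing the scaled Cauchy function $\tau(s, a) = a^2 \log(1 + s/a^2)$. Ceres already ships a CauchyLoss, so this class is purely illustrative:

#include <cmath>
#include "ceres/loss_function.h"

// Illustrative only: a scaled Cauchy loss, tau(s, a) = a^2 log(1 + s / a^2).
// Evaluate must fill out = [tau(s), tau'(s), tau''(s)] for s >= 0.
class MyCauchyLoss : public ceres::LossFunction {
 public:
  explicit MyCauchyLoss(double a) : b_(a * a), c_(1.0 / (a * a)) {}
  void Evaluate(double s, double out[3]) const override {
    const double sum = 1.0 + s * c_;   // 1 + s / a^2 >= 1
    out[0] = b_ * std::log(sum);       // tau(s), zero at s = 0
    out[1] = 1.0 / sum;                // tau'(s), equals 1 at s = 0
    out[2] = -c_ * out[1] * out[1];    // tau''(s) < 0: outliers are down-weighted
  }
 private:
  const double b_, c_;  // a^2 and 1 / a^2
};

Note how the derivatives satisfy the conditions above: $\tau'(0) = 1$, $\tau'(s) < 1$ for $s > 0$, and $\tau''(s) < 0$ everywhere.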

Instances

Ceres provides a number of predefined robust loss functions. For simplicity, only their unscaled versions are described here (the upstream documentation illustrates their graphs); more details can be found in include/ceres/loss_function.h.

class TrivialLoss
$$\tau(s) = s$$
class HuberLoss
$$\tau(s) = \begin{cases} s & s \le 1 \\ 2\sqrt{s} - 1 & s > 1 \end{cases}$$
class SoftLOneLoss
$$\tau(s) = 2\left(\sqrt{1 + s} - 1\right)$$
class CauchyLoss
$$\tau(s) = \log(1 + s)$$
class ArctanLoss
$$\tau(s) = \arctan(s)$$
class TolerantLoss
$$\tau(s, a, b) = b \log\left(1 + e^{(s - a)/b}\right) - b \log\left(1 + e^{-a/b}\right)$$
class TukeyLoss
$$\tau(s) = \begin{cases} \frac{1}{3}\left(1 - (1 - s)^3\right) & s \le 1 \\ \frac{1}{3} & s > 1 \end{cases}$$
class ComposedLoss

Given two loss functions $f$ and $g$, implements their composition $h(s) = f(g(s))$.
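A sketch of the constructor usage (the ownership flags tell ComposedLoss to delete the wrapped loss functions):

// h(s) = f(g(s)) with f = HuberLoss and g = CauchyLoss.
ceres::LossFunction* h = new ceres::ComposedLoss(
    new ceres::HuberLoss(1.0), ceres::TAKE_OWNERSHIP,
    new ceres::CauchyLoss(1.0), ceres::TAKE_OWNERSHIP);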

class ScaledLoss

Sometimes you simply want to scale the output of a robust loss function. For example, you may want different error terms to carry different weights (e.g., weighting pixel reprojection errors differently from other terms). Given a loss function $\tau(s)$ and a scalar $a$, ScaledLoss implements $a\,\tau(s)$.
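For example, a sketch that halves the weight of a Huber-robustified term relative to the other residual blocks (cost_function, parameters, and the Problem are assumed to exist already):

// a * tau(s) with tau = HuberLoss and a = 0.5; ScaledLoss owns the HuberLoss.
ceres::LossFunction* weighted_loss = new ceres::ScaledLoss(
    new ceres::HuberLoss(1.0), 0.5, ceres::TAKE_OWNERSHIP);
problem.AddResidualBlock(cost_function, weighted_loss, parameters);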

class LossFunctionWrapper

Sometimes, after the optimization problem has been constructed, we want to change the scale of the loss function. For example, when estimating from data heavily contaminated with outliers, one can first optimize the problem with a large scale and then reduce the scale, which tends to converge better than starting with a small-scale loss function from the outset. This templated class allows the user to implement a loss function whose scale can be changed after the problem has been constructed, for example:

Problem problem;

// Add parameter blocks

auto* cost_function =
    new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(feature_x, feature_y);

LossFunctionWrapper* loss_function =
    new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
problem.AddResidualBlock(cost_function, loss_function, parameters);

Solver::Options options;
Solver::Summary summary;
Solve(options, &problem, &summary);

loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
Solve(options, &problem, &summary);

Theory

Consider an optimization problem with a single residual block:

$$\min_x \frac{1}{2}\, \tau\!\left(f^2(x)\right)$$

Then the gradient $g(x)$ of the robustified cost and its Gauss-Newton Hessian $H(x)$ are:

$$
\begin{aligned}
g(x) &= \tau' J^\top(x) f(x) \\
H(x) &= J^\top(x) \left( \tau' + 2 \tau'' f(x) f^\top(x) \right) J(x)
\end{aligned}
$$

where the terms involving the second derivatives of $f(x)$ have been ignored. Note that $H(x)$ is indefinite if $\tau'' f(x)^\top f(x) + \frac{1}{2}\tau' < 0$. If this is not the case, then it is possible to re-weight the residual and the Jacobian matrix such that the robustified Gauss-Newton step corresponds to an ordinary linear least squares problem.

Let $\alpha$ be a root of

$$\frac{1}{2}\alpha^2 - \alpha - \frac{\tau''}{\tau'}\,\|f(x)\|^2 = 0.$$

Then, define the rescaled residual and Jacobian as

$$
\begin{aligned}
\tilde{f}(x) &= \frac{\sqrt{\tau'}}{1 - \alpha}\, f(x) \\
\tilde{J}(x) &= \sqrt{\tau'} \left( 1 - \alpha\, \frac{f(x) f^\top(x)}{\|f(x)\|^2} \right) J(x)
\end{aligned}
$$

With this simple rescaling, one can apply any Jacobian based non-linear least squares algorithm to robustified non-linear least squares problems.
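As a minimal sketch of this rescaling for a single residual block (plain Eigen, not Ceres' actual corrector.cc): the root of the quadratic above with $\alpha < 1$ is $\alpha = 1 - \sqrt{1 + 2 s\, \tau''/\tau'}$, and the clamping discussed in the notes below is reduced here to a simple epsilon bound.

#include <algorithm>
#include <cmath>
#include <Eigen/Dense>

// Rescale a residual f and Jacobian J so that an ordinary Gauss-Newton step
// on (f_tilde, J_tilde) matches the robustified step. rho1 = tau'(s) and
// rho2 = tau''(s) are evaluated at s = ||f||^2, with rho1 > 0 assumed.
void Robustify(double rho1, double rho2,
               Eigen::VectorXd& f, Eigen::MatrixXd& J) {
  const double s = f.squaredNorm();
  const double sqrt_rho1 = std::sqrt(rho1);
  // Root of 0.5 * alpha^2 - alpha - (rho2 / rho1) * s = 0 with alpha < 1,
  // kept away from 1 so the residual rescaling below stays finite.
  const double D = std::max(1.0 + 2.0 * s * rho2 / rho1, 0.0);
  const double alpha = std::min(1.0 - std::sqrt(D), 1.0 - 1e-10);
  // J_tilde = sqrt(rho1) * (I - alpha * f f^T / ||f||^2) * J
  if (s > 0.0) {
    const Eigen::RowVectorXd ftJ = f.transpose() * J;  // evaluate once
    J = sqrt_rho1 * (J - (alpha / s) * f * ftJ);
  } else {
    J *= sqrt_rho1;  // f = 0 gives alpha = 0: no rank-1 correction needed
  }
  // f_tilde = sqrt(rho1) / (1 - alpha) * f
  f *= sqrt_rho1 / (1.0 - alpha);
}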


In the case $2\tau''\|f(x)\|^2 + \tau' \lesssim 0$, we limit $\alpha \le 1 - \epsilon$ for some small $\epsilon$. For more details see Triggs et al.

While the theory described above is elegant, in practice we observe that using the Triggs correction when $\tau'' < 0$ leads to poor performance, so we upper bound it by zero. For more details see corrector.cc.
