LossFunction

class LossFunction

In least squares problems, outliers in the input can easily dominate the optimization and derail its convergence. A loss function can be used to reduce their influence.

Consider a structure from motion problem. The 3D points and camera parameters are the unknown variables to be optimized, and the measurements are image coordinates describing the expected reprojected position of a point. For example, suppose we want to model the geometry of a street scene with fire hydrants and cars, observed by a moving camera with unknown parameters (both intrinsics and extrinsics), and the only 3D point we care about is the tip of a fire hydrant. Our image processing algorithm, which produces the measurements fed to Ceres, has found and matched the tip in all image frames, except that in one frame it mistook a car headlight for the hydrant. If we do nothing special, the residual of the erroneous measurement will pull the solution away from the optimum in order to reduce the large error produced by the bad measurement.

Using a robust loss function reduces the influence of large residuals. In the example above, this down-weights the outlier term so that it does not overly affect the final solution.
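For example, here is a minimal sketch of attaching a robust loss when adding a residual block (assuming #include <ceres/ceres.h> and using namespace ceres; ReprojectionError, observed_x, observed_y, camera and point are hypothetical placeholders, while AutoDiffCostFunction, CauchyLoss and AddResidualBlock are the standard Ceres API):

// Sketch: down-weighting outlier reprojection errors with a robust loss.
Problem problem;
CostFunction* cost_function =
    new AutoDiffCostFunction<ReprojectionError, 2, 9, 3>(
        new ReprojectionError(observed_x, observed_y));
// Passing nullptr as the loss gives the plain squared error; CauchyLoss(0.5)
// suppresses residuals whose norm is much larger than 0.5.
problem.AddResidualBlock(cost_function, new CauchyLoss(0.5), camera, point);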

class LossFunction {
 public:
  virtual void Evaluate(double s, double out[3]) const = 0;
};

The key method is LossFunction::Evaluate, which, given a non-negative scalar $s$, computes the value of the robust loss function together with its first and second derivatives:

$$\text{out} = [\tau(s),\ \tau'(s),\ \tau''(s)]$$

With a loss function in place, each residual term contributes $\frac{1}{2}\tau(s)$ to the cost function, where $s = \|f_i\|^2$. The most sensible choices of $\tau$ satisfy

$$\begin{aligned} \tau(0) &= 0 \\ \tau'(0) &= 1 \\ \tau'(s) &< 1 \ \text{in the outlier region} \\ \tau''(s) &< 0 \ \text{in the outlier region} \end{aligned}$$
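As an illustration of the interface, here is a minimal hand-written subclass (a sketch; the class name is hypothetical, and the implementation simply reproduces the TrivialLoss listed below):

class MyTrivialLoss : public LossFunction {
 public:
  void Evaluate(double s, double out[3]) const override {
    out[0] = s;    // tau(s)
    out[1] = 1.0;  // tau'(s)
    out[2] = 0.0;  // tau''(s)
  }
};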

Given a robust loss $\tau(s)$, a scale factor $a > 0$ changes the residual magnitude at which robustification takes effect: $\tau(s, a) = a^2 \tau(s/a^2)$, whose first and second derivatives with respect to $s$ are $\tau'(s/a^2)$ and $(1/a^2)\tau''(s/a^2)$ respectively. The reason for the appearance of squaring is that $a$ is in the units of the residual vector norm, whereas $s$ is a squared norm. For applications it is more convenient to specify $a$ than its square.
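A quick way to see the scale parameter in action is to probe a predefined loss through Evaluate (a sketch; the constructor argument of HuberLoss is exactly this $a$):

HuberLoss loss(2.0);      // a = 2.0, so a^2 = 4.0
double out[3];            // out = [tau(s), tau'(s), tau''(s)]
loss.Evaluate(1.0, out);  // s <= a^2: quadratic region, out[0] == s
loss.Evaluate(9.0, out);  // s > a^2: linear region, out[1] = a/sqrt(s) < 1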

Instances

Ceres ships with a number of predefined loss functions. For simplicity, only their unscaled versions are described here; their exact definitions, and more details, can be found in include/ceres/loss_function.h.

class TrivialLoss
$$\tau(s) = s$$
class HuberLoss
$$\tau(s) = \begin{cases} s & s \le 1 \\ 2\sqrt{s} - 1 & s > 1 \end{cases}$$
class SoftLOneLoss
$$\tau(s) = 2(\sqrt{1 + s} - 1)$$
class CauchyLoss
$$\tau(s) = \log(1 + s)$$
class ArctanLoss
$$\tau(s) = \arctan(s)$$
class TolerantLoss
$$\tau(s, a, b) = b \log\left(1 + e^{(s - a)/b}\right) - b \log\left(1 + e^{-a/b}\right)$$
class TukeyLoss
$$\tau(s) = \begin{cases} \frac{1}{3}\left(1 - (1 - s)^3\right) & s \le 1 \\ \frac{1}{3} & s > 1 \end{cases}$$
class ComposedLoss

Given two loss functions $f$ and $g$, ComposedLoss implements their composition $h(s) = f(g(s))$; see the sketch after ScaledLoss below.

class ScaledLoss

Sometimes you may simply want to scale the output of a loss function, for example to weight different error terms differently (e.g., weighting pixel reprojection errors differently). Given a loss function $\tau(s)$ and a scale factor $a$, ScaledLoss implements $a\,\tau(s)$.
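A sketch combining the two (reusing the hypothetical problem, cost_function, camera and point from the earlier snippet; the ScaledLoss and ComposedLoss constructors are the ones declared in include/ceres/loss_function.h):

// Weight a Cauchy loss by 0.2, then feed its output through a Huber loss.
LossFunction* scaled =
    new ScaledLoss(new CauchyLoss(1.0), 0.2, TAKE_OWNERSHIP);
// ComposedLoss evaluates h(s) = f(g(s)), here Huber(Scaled(s)).
LossFunction* composed = new ComposedLoss(
    new HuberLoss(1.0), TAKE_OWNERSHIP, scaled, TAKE_OWNERSHIP);
problem.AddResidualBlock(cost_function, composed, camera, point);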

class LossFunctionWrapper

Sometimes, after the optimization problem has been constructed, we wish to mutate the scale of the loss function. For example, when performing estimation on data with many outliers, one can start with a large scale, optimize the problem, and then reduce the scale; this can converge better than using a small-scale loss function from the start. The LossFunctionWrapper class allows the user to change the loss function's scale after the problem has been constructed, e.g.

Problem problem;

// Add parameter blocks.

CostFunction* cost_function =
    new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
        new UW_Camera_Mapper(feature_x, feature_y));

// Wrap the loss so it can be swapped out after the problem is built.
LossFunctionWrapper* loss_function =
    new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
problem.AddResidualBlock(cost_function, loss_function, parameters);

Solver::Options options;
Solver::Summary summary;
Solve(options, &problem, &summary);

// Swap in a new loss (typically with a different scale) and re-solve
// without rebuilding the problem.
loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
Solve(options, &problem, &summary);

Theory

Let us consider an optimization problem with a single residual block:

$$\min_x \frac{1}{2}\tau\left(f^2(x)\right)$$

Then the gradient $g(x)$ and the Gauss-Newton Hessian $H(x)$ of the robustified problem are

$$\begin{aligned} g(x) &= \tau' J^\top(x) f(x) \\ H(x) &= J^\top(x)\left(\tau' + 2\tau'' f(x) f^\top(x)\right) J(x) \end{aligned}$$

where the terms involving the second derivatives of $f(x)$ have been ignored. Note that $H(x)$ is indefinite if $\tau'' f(x)^\top f(x) + \frac{1}{2}\tau' < 0$. If this is not the case, then it is possible to re-weight the residual and the Jacobian matrix such that the robustified Gauss-Newton step corresponds to an ordinary linear least squares problem.

α\alpha 为下面方程的根

$$\frac{1}{2}\alpha^2 - \alpha - \frac{\tau''}{\tau'}\|f(x)\|^2 = 0.$$
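Solving this quadratic explicitly (a step spelled out here for clarity) and keeping the root with $\alpha \le 1$ gives

$$\alpha = 1 - \sqrt{1 + 2\frac{\tau''}{\tau'}\|f(x)\|^2},$$

so $\alpha = 0$ (no correction) when $\tau'' = 0$, and $\alpha$ approaches $1$ as the loss flattens out.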

Then, define the rescaled residual and Jacobian as

$$\begin{aligned} \tilde{f}(x) &= \frac{\sqrt{\tau'}}{1 - \alpha} f(x) \\ \tilde{J}(x) &= \sqrt{\tau'}\left(1 - \alpha \frac{f(x) f^\top(x)}{\|f(x)\|^2}\right) J(x) \end{aligned}$$

In the case $2\tau''\|f(x)\|^2 + \tau' \lesssim 0$, we limit $\alpha \le 1 - \epsilon$ for some small $\epsilon$. For more details see Triggs et al.

With this simple rescaling, one can apply any Jacobian based non-linear least squares algorithm to robustified non-linear least squares problems.
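A minimal sketch of that rescaling for one residual block, mirroring the logic of Ceres' corrector.cc (assuming Eigen; the function name Correct and the in-place signature are choices of this sketch, not the Ceres API):

#include <cmath>
#include <Eigen/Dense>

// rho = [tau(s), tau'(s), tau''(s)] as returned by LossFunction::Evaluate,
// with s = f.squaredNorm(). Rescales f and J in place so that an ordinary
// linear least squares step on (J, f) is the robustified Gauss-Newton step.
void Correct(const double rho[3], Eigen::VectorXd& f, Eigen::MatrixXd& J) {
  const double s = f.squaredNorm();
  const double sqrt_rho1 = std::sqrt(rho[1]);
  if (rho[2] <= 0.0 || s == 0.0) {
    // Curvature term vanishes (or is clamped): plain sqrt(tau') scaling.
    f *= sqrt_rho1;
    J *= sqrt_rho1;
    return;
  }
  const double D = 1.0 + 2.0 * s * rho[2] / rho[1];
  const double alpha = 1.0 - std::sqrt(D);  // the root with alpha <= 1
  const Eigen::RowVectorXd fTJ = f.transpose() * J;  // f^T J
  J = sqrt_rho1 * (J - (alpha / s) * f * fTJ);       // J_tilde
  f *= sqrt_rho1 / (1.0 - alpha);                    // f_tilde
}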

While the theory described above is elegant, in practice we observe that using the Triggs correction when $\tau'' < 0$ leads to poor performance, so we upper bound $\tau''$ by zero. For more details see corrector.cc.
