So, to compute A(n,1) for increasing values of n we move from left to right, computing one column at a time. Assuming that the primary cost here is the evaluation of the function f(x), the cost of computing a new column of the above tableau is two function evaluations, since evaluating A(1,n) requires evaluating the central difference formula for step size 2^{1-n} h.
Applying this method to f(x) = e^x / (sin x - x^2) at x = 1.0, starting with a fairly large step size h = 0.01, we get a tableau of increasingly accurate estimates.
Compared to the correct value Df(1.0) = 140.73773557129658, A(5,1) has a relative error of 10^{-13}. For comparison, the relative error for the central difference formula with the same step size (0.01/2^4 = 0.000625) is 10^{-5}.
The above tableau is the basis of Ridders' method for numeric differentiation. The full implementation is an adaptive scheme that tracks its own estimation error and stops automatically when the desired precision is reached. Of course, it is more expensive than the forward and central difference formulae, but it is also significantly more robust and accurate.
Using Ridders' method instead of forward or central differences in Ceres is again a simple matter of changing a template argument to NumericDiffCostFunction as follows:
CostFunction* cost_function =
new NumericDiffCostFunction<Rat43CostFunctor, RIDDERS, 1, 4>(
new Rat43CostFunctor(x, y));
The following graph shows the relative error of the three methods as a function of the absolute step size. For Ridders' method we assume that the step size for evaluating A(n,1) is 2^{1-n} h.
Using the 10 function evaluations needed to compute A(5,1) we are able to approximate Df(1.0) about 1000 times better than the best central difference estimate. To put these numbers in perspective, machine epsilon for double precision arithmetic is ≈ 2.22 × 10^{-16}.
Going back to Rat43, let us also look at the runtime cost of the various methods for computing numeric derivatives.
CostFunction                 Time (ns)
Rat43Analytic                      255
Rat43AnalyticOptimized              92
Rat43NumericDiffForward            262
Rat43NumericDiffCentral            517
Rat43NumericDiffRidders           3760
As expected, Central Differences is about twice as expensive as Forward Differences, and the remarkable accuracy improvements of Ridders' method come at an order of magnitude increase in runtime.
Recommendations
Numeric differentiation should be used when you cannot compute the derivatives either analytically or using automatic differentiation. This is usually the case when you are calling an external library or function whose analytic form you do not know, or, even if you do, when you are not in a position to rewrite it in the manner required to use Automatic Derivatives.
When using numeric differentiation, use at least Central Differences, and if execution time is not a concern or the objective function is such that determining a good static relative step size is hard, Ridders’ method is recommended.
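When tuning the step-size strategy, the relevant knobs live in the NumericDiffOptions struct, which can be passed to the NumericDiffCostFunction constructor. The sketch below is an assumption-laden illustration: the field names are taken from ceres/numeric_diff_options.h and the constructor signature may vary between Ceres versions, so check the headers of your installation.

```
#include "ceres/numeric_diff_cost_function.h"

// Field names below are from ceres/numeric_diff_options.h; verify against
// your Ceres version before relying on them.
ceres::NumericDiffOptions options;
options.ridders_relative_initial_step_size = 1e-2;  // Start with a large step.
options.max_num_ridders_extrapolations = 10;        // Cap the tableau size.

ceres::CostFunction* cost_function =
    new ceres::NumericDiffCostFunction<Rat43CostFunctor, ceres::RIDDERS, 1, 4>(
        new Rat43CostFunctor(x, y),
        ceres::TAKE_OWNERSHIP,
        /* num_residuals = */ 1,
        options);
```

A larger initial step size lets the extrapolation cancel more truncation error before roundoff dominates, which is why Ridders' method prefers to start coarse and refine.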