Tags: floating-point, theory, numerical-methods, derivative, approximation

What value of h should I use to approximate the gradient of a function in practice?


For some function f(x), I want to find f'(x) and f''(x) using the following approximation:

[f(x+h) - f(x)] / h

Which value of h should I choose in each scenario? I know that for f'(x) it should be h = sqrt(epsilon), where epsilon is the machine epsilon, but do I have to handle the h value differently for f''(x)? My guess is that the truncation and rounding errors roughly balance each other, and that this balance is what leads to this value.
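
For concreteness, here is a minimal Python sketch of the forward-difference approximation I mean (sin is just a stand-in test function):

    import math
    import sys

    def forward_diff(f, x):
        # Forward difference for f'(x) with h = sqrt(machine epsilon)
        eps = sys.float_info.epsilon   # ~2.22e-16 in double precision
        h = math.sqrt(eps)             # ~1.49e-8
        return (f(x + h) - f(x)) / h

    # stand-in test: f = sin, exact derivative is cos
    print(forward_diff(math.sin, 1.0), math.cos(1.0))  # agree to roughly 8 digits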

How should I estimate the resulting error, taking double precision as an example?


Solution

  • You should use something like h = sqrt(epsilon)*(1 + abs(x)), since for large x a fixed step h = sqrt(epsilon) becomes relatively tiny and x + h loses precision (or even rounds back to x).

    In general, the difference formula for the k-th derivative has an evaluation (rounding) error proportional to epsilon/h^k, which adds to the truncation error of order h^p of the approximation formula. The total error is minimized when both contributions are about equal, i.e. epsilon/h^k ≈ h^p, giving h = epsilon^(1/(k+p)); this then needs to be scaled with the size of x.

    Let x be of scale 1 in the following examples, so that this scaling factor can be ignored. In your one-sided first-derivative formula k = p = 1, so h = epsilon^(1/2). If you take the symmetric difference formula instead, then k = 1, p = 2, and the optimal h is about epsilon^(1/3). If you approximate the second derivative with the symmetric second-order difference formula, you get k = p = 2, so h = epsilon^(1/4) is optimal, etc.; a short Python sketch of these three cases follows below.
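
    To make the exponents concrete, here is a minimal Python sketch of the three cases, with sin as an arbitrary test function; the (1 + abs(x)) scaling follows the remark above, and the exact errors you observe will vary with f and x:

        import math
        import sys

        EPS = sys.float_info.epsilon  # ~2.22e-16 for IEEE double precision

        def d1_forward(f, x):
            # one-sided first derivative: k = p = 1  ->  h ~ eps^(1/2)
            h = math.sqrt(EPS) * (1 + abs(x))
            return (f(x + h) - f(x)) / h

        def d1_central(f, x):
            # symmetric first derivative: k = 1, p = 2  ->  h ~ eps^(1/3)
            h = EPS ** (1.0 / 3.0) * (1 + abs(x))
            return (f(x + h) - f(x - h)) / (2 * h)

        def d2_central(f, x):
            # symmetric second derivative: k = p = 2  ->  h ~ eps^(1/4)
            h = EPS ** 0.25 * (1 + abs(x))
            return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

        # test function sin is only an example; exact values are cos(x) and -sin(x)
        x = 1.0
        print(abs(d1_forward(math.sin, x) - math.cos(x)))   # error on the order of 1e-8
        print(abs(d1_central(math.sin, x) - math.cos(x)))   # error on the order of 1e-11
        print(abs(d2_central(math.sin, x) + math.sin(x)))   # error on the order of 1e-8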