The documentation of `minimize(method='Nelder-Mead')` reads: "Absolute error in xopt between iterations that is acceptable for convergence." What does that mean exactly? Are there examples showing how it could be used?
Short answer: it's how accurate you want the result to be, in terms of absolute error. If `xatol` is 0.01 and the method returns the location of the minimum as `[1.23, 4.56]`, then there is hope (but no certainty) that the actual minimum has coordinates somewhere within 1.22-1.24 and 4.55-4.57.
Long answer. The Nelder-Mead method operates with a simplex (a triangle in two dimensions, a tetrahedron in 3D, etc.). The Wikipedia page illustrates how this simplex moves toward a minimum while changing size and shape (it becomes smaller near the minimum). The search is considered to be successful if two conditions are satisfied:

- the simplex is small enough, which is controlled by `xatol` (the option `xtol` is deprecated for this method; `xatol` is recommended);
- the function values at its vertices are close enough, which is controlled by `fatol`.

Informally, this means the simplex became small and the values of the objective function at its vertices are almost the same. Formally, this is the stopping condition:
if (numpy.max(numpy.ravel(numpy.abs(sim[1:] - sim[0]))) <= xatol and
        numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol):
    break
Here `sim[0]` is the first vertex of the simplex and `sim[1:]` are the rest of the vertices. The condition requires every coordinate of every vertex to be within `xatol` of the corresponding coordinate of `sim[0]`. The array `fsim` holds the function values at these vertices; here the requirement is that `|fsim[k] - fsim[0]| <= fatol` for all `k`.
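To see the stopping condition in isolation, here is a small sketch that applies the same test to a hypothetical 2-D simplex (the arrays `sim` and `fsim` below are made-up stand-ins for SciPy's internal state, not its actual variables):

```python
import numpy as np

# Hypothetical 2-D simplex: three vertices, sim[0] is the first one.
sim = np.array([[0.0, 0.0],
                [1e-5, 0.0],
                [0.0, 2e-5]])
# Function values at the vertices, here for f(x, y) = x**2 + y**2.
fsim = np.array([(v**2).sum() for v in sim])

xatol, fatol = 1e-4, 1e-4

# The same two checks SciPy performs before leaving the main loop:
simplex_small = np.max(np.ravel(np.abs(sim[1:] - sim[0]))) <= xatol
values_close = np.max(np.abs(fsim[0] - fsim[1:])) <= fatol
converged = simplex_small and values_close
print(converged)
```

For this tiny simplex both conditions hold, so the search would stop.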
The default value of `xatol` is 0.0001. When the search is successful, the final simplex covers a point of minimum; thus, the size of the simplex is the precision to which we know the location of the minimum. A smaller `xatol` can be used to pinpoint the minimum more precisely, at the expense of a longer running time.
As an example, look for the minimum of x^2 + y^2, which of course is at the point (0, 0). With the default settings the answer is off by about 3e-5:
>>> from scipy.optimize import minimize
>>> minimize(lambda x: x[0]**2+x[1]**2, [1, 2], method='Nelder-Mead').x
array([ -3.62769110e-05, -3.03662006e-05])
With a smaller `xatol` (1e-6 instead of the default 1e-4), the result is about 100 times more accurate, with the error about 3e-7:
>>> minimize(lambda x: x[0]**2+x[1]**2, [1, 2], method='Nelder-Mead', options={'xatol': 1e-6}).x
array([ 3.12645001e-07, -2.53507540e-07])
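The `fatol` option can be tightened in the same `options` dict. A minimal sketch (the specific tolerance values here are illustrative, not recommendations):

```python
from scipy.optimize import minimize

# Tighten both tolerances: xatol bounds the simplex size,
# fatol bounds the spread of function values at its vertices.
res = minimize(lambda x: x[0]**2 + x[1]**2, [1, 2],
               method='Nelder-Mead',
               options={'xatol': 1e-8, 'fatol': 1e-8})
print(res.x)    # coordinates near the true minimum (0, 0)
print(res.fun)  # function value near 0
```

Note that both conditions must hold for the search to stop, so loosening only one of them does not by itself end the iteration earlier.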