
Interpreting the amax parameter in scipy's line_search


In SciPy 1.0.0, according to its documentation, scipy.optimize.line_search has an optional parameter amax which determines the "Maximum step size".

As such, I would expect that the first return value, alpha, would always be less than the given amax, but this is not the case, as the following example illustrates:

import numpy as np
from scipy.optimize import line_search

def f(x):
    return x[0]**2 + x[1]**2

def df(x):
    return 2*x

x = np.array([4, 5])
line_search(f, df, x, -df(x), amax=0.001)
# Returns (0.5, 3, 1, 0.0, 41.0, array([ 0.,  0.]))

Here the returned alpha is 0.5, even though amax is 0.001.

Another interpretation might be that amax limits the distance between the output values x_new and x0, but this is also not the case.
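A quick check with the numbers from the example above confirms this: the move has length alpha * ||pk||, far above amax.

```python
import numpy as np

# Values from the example above: alpha = 0.5, pk = -df(x) = [-8, -10].
alpha = 0.5
pk = np.array([-8.0, -10.0])
# ||x_new - x0|| = alpha * ||pk||, which amax clearly does not bound:
print(alpha * np.linalg.norm(pk))  # ~6.40, nowhere near 0.001
```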

Am I misinterpreting the documentation, or is this a bug in SciPy? And, if it is working as intended, what is the correct interpretation of amax, and is there some other way to limit the range of alphas in which to perform the line search?


Solution

  • As sascha said, the current implementation of line_search (SciPy 1.0.0) does not handle amax correctly. Specifically, the function scalar_search_wolfe2 breaks out of the loop for i in xrange(maxiter) if a suitable alpha is found, while the check for amax appears later in the loop and is not reached.
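    The faulty control flow can be illustrated with a toy sketch. This is not SciPy's actual source; names and structure are simplified to show only the ordering problem:

    ```python
    # Hypothetical sketch of the control flow in scalar_search_wolfe2
    # (simplified for illustration; not the actual SciPy code).
    def toy_search(amax, wolfe_ok_at):
        alpha = 0.25
        for i in range(10):
            alpha *= 2
            if alpha >= wolfe_ok_at:
                # Wolfe conditions satisfied: break out immediately...
                return alpha
            if amax is not None and alpha >= amax:
                # ...so this amax check is never reached for successful steps.
                return None
        return None

    print(toy_search(amax=0.001, wolfe_ok_at=0.5))  # 0.5, exceeding amax
    ```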

    Workarounds:

    1. Use the older routine line_search_wolfe1, which wraps MINPACK's line search. It enforces not only amax but also amin, the minimal step size.
    from scipy.optimize.linesearch import line_search_wolfe1
    line_search_wolfe1(f, df, x, -df(x), amax=0.001)
    

    This returns (None, 1, 0, 41.0, 41.0, array([ 8, 10])), where None means that no suitable step was found. This is the correct answer: no step of size at most 0.001 satisfies the Wolfe conditions.
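    One can verify this by hand: with the default c2 = 0.9, the strong Wolfe curvature condition |phi'(a)| <= c2 * |phi'(0)| fails for every step a <= 0.001 on this quadratic. The slope values below follow directly from f(x) = x[0]**2 + x[1]**2 at x = [4, 5]:

    ```python
    import numpy as np

    # For f(x) = x[0]**2 + x[1]**2 at x = [4, 5], search direction -df(x):
    g = np.array([8.0, 10.0])            # df(x)
    slope0 = -(g @ g)                    # phi'(0) = -164
    a = 0.001
    slope_a = slope0 + 2 * (g @ g) * a   # phi'(a) = -164 + 328*a

    # Strong Wolfe curvature condition with the default c2 = 0.9:
    print(abs(slope_a) <= 0.9 * abs(slope0))  # False: no step this small qualifies
    ```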

    2. Pass the requirement on the step as the extra_condition argument to the line_search routine.
    line_search(f, df, x, -df(x), extra_condition=lambda a, x, f, g: a <= 0.001)
    

    This returns (None, 13, 1, None, 41.0, None) which again is correct: there is no suitable step size.
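    For comparison, if the bound is loose enough to admit a Wolfe point, extra_condition does return a step. The threshold 1.0 below is an illustrative choice that the unconstrained Wolfe step for this problem already satisfies:

    ```python
    import numpy as np
    from scipy.optimize import line_search

    def f(x):
        return x[0]**2 + x[1]**2

    def df(x):
        return 2 * x

    x = np.array([4.0, 5.0])
    # A bound the unconstrained Wolfe step satisfies, so a step is found:
    res = line_search(f, df, x, -df(x), extra_condition=lambda a, x, f, g: a <= 1.0)
    print(res[0])  # a step size within the bound, rather than None
    ```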

    3. Perhaps you don't really mean to impose the Wolfe conditions? Getting None as a (correct) answer isn't very helpful, after all. If so, use Armijo backtracking. The parameters are different: it does not ask for the df function, but it does ask for the gradient and the function value at the initial point of the search. The parameter alpha0 is the initial and maximal step size; the algorithm will only reduce it if necessary.
    from scipy.optimize.linesearch import line_search_armijo
    line_search_armijo(f, x, -df(x), df(x), f(x), alpha0=0.001)
    

    returns (0.001, 1, 40.836164000000004), meaning the step size is 0.001, there was 1 function evaluation, and the function value after the step is about 40.84, which is smaller than the initial value of 41.

    I don't see documentation for line_search_armijo on the SciPy website, but its docstring can be read on GitHub.
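    Conversely, a larger alpha0 shows the backtracking at work; alpha0 = 10.0 below is an arbitrary illustrative choice, and the try/except import covers the module having been renamed in later SciPy versions:

    ```python
    import numpy as np
    try:
        from scipy.optimize._linesearch import line_search_armijo  # newer SciPy
    except ImportError:
        from scipy.optimize.linesearch import line_search_armijo   # SciPy 1.0.0

    def f(x):
        return x[0]**2 + x[1]**2

    x = np.array([4.0, 5.0])
    g = 2 * x  # gradient of f at x

    # With a large alpha0, backtracking shrinks the step until the Armijo
    # sufficient-decrease condition holds; the result never exceeds alpha0.
    alpha, n_feval, new_fval = line_search_armijo(f, x, -g, g, f(x), alpha0=10.0)
    print(alpha <= 10.0 and new_fval < f(x))  # True
    ```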