I am using scipy.optimize's minimize function, and I would like to terminate the search as soon as the function value drops below some threshold. I have tried using a callback that returns True when the above condition is met, but in my code the search just continues.
I also have a more "fundamental" issue with the callback structure the documentation requires: my function is pretty expensive to evaluate, and with the callback I end up evaluating it twice for the same set of parameters (once inside the callback and once for the actual iteration), so being spared the extra computational cost would also be nice.
Below is my code:
import scipy.optimize

class MinimizeStopper(object):
    def __init__(self, maximal_non_overlap=0.05):
        self.max_non_overlap = maximal_non_overlap

    def __call__(self, xk):
        res = fit_min(xk)
        return (res <= self.max_non_overlap)

my_cb = MinimizeStopper(0.1)
print(scipy.optimize.minimize(fit_min, ansatz_params[1], callback=my_cb.__call__, method='COBYLA'))
I guess scipy.optimize.minimize's documentation is not 100% clear regarding callbacks. AFAIK, only the 'trust-constr' method is terminated once the callback returns True. For all the remaining methods the callback can only be used for logging purposes.
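If you cannot switch methods, a common workaround (plain Python, not part of the scipy API) is to raise an exception from a wrapper around the objective and catch it around minimize; this stops the search at the threshold and avoids evaluating the function a second time in a callback. A minimal sketch, reusing your fit_min and ansatz_params and using Nelder-Mead (a pure-Python method, so the exception is guaranteed to propagate; with COBYLA's compiled backend I can't promise it unwinds as cleanly):

import numpy as np
import scipy.optimize

class EarlyStop(Exception):
    """Raised to abort the optimization once the threshold is reached."""

class ThresholdWrapper:
    """Wraps the objective, remembers the best point, and aborts below a threshold."""
    def __init__(self, fun, threshold=0.1):
        self.fun = fun
        self.threshold = threshold
        self.best_x = None
        self.best_f = np.inf

    def __call__(self, x):
        f = self.fun(x)
        if f < self.best_f:                 # keep track of the best point seen so far
            self.best_f, self.best_x = f, np.copy(x)
        if f <= self.threshold:
            raise EarlyStop                 # unwinds out of minimize
        return f

wrapped = ThresholdWrapper(fit_min, threshold=0.1)
try:
    res = scipy.optimize.minimize(wrapped, ansatz_params[1], method='Nelder-Mead')
    best_x, best_f = res.x, res.fun
except EarlyStop:
    best_x, best_f = wrapped.best_x, wrapped.best_f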
Regarding the second part of your question, I'll quote the docs:
For ‘trust-constr’ it is a callable with the signature: callback(xk, OptimizeResult state) -> bool where xk is the current parameter vector and state is an OptimizeResult object, with the same fields as the ones from the return.
Thus, assuming you're open to 'trust-constr', you don't need to evaluate your objective function again, since you can directly access the objective value of the current iteration via state.fun:
from scipy.optimize import minimize

def cb(xk, state, threshold_value=0.1):
    return state.fun <= threshold_value

res = minimize(your_fun, x0=x0, callback=cb, method="trust-constr")
Since your objective function is expensive to evaluate, it's highly recommended to pass the gradient and Hessian as well, provided your objective is twice continuously differentiable and both are known. Otherwise, both get approximated by finite differences, which will be quite slow.
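A short sketch of how the pieces fit together, using the Rosenbrock function as a stand-in for your objective because its exact gradient and Hessian ship with scipy.optimize (you'd pass your own derivatives of fit_min instead, if you have them):

import numpy as np
from scipy.optimize import minimize, rosen, rosen_der, rosen_hess

def cb(xk, state, threshold_value=0.1):
    # state.fun is the objective value at the current iterate, so no extra evaluation
    return state.fun <= threshold_value

x0 = np.array([1.3, 0.7, 0.8, 1.9, 1.2])
res = minimize(rosen, x0, jac=rosen_der, hess=rosen_hess,
               callback=cb, method="trust-constr")
print(res.fun, res.nit)   # stops once the objective drops below the threshold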