Tags: python, scikit-learn, gaussian-process

How to change max_iter in optimize function used by sklearn gaussian process regression?


I am using sklearn's GPR library, but occasionally run into this annoying warning:

ConvergenceWarning: lbfgs failed to converge (status=2):
ABNORMAL_TERMINATION_IN_LNSRCH.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
  _check_optimize_result("lbfgs", opt_res)

Not only can I find almost no documentation on this warning, but max_iter is not a parameter of sklearn's GPR model at all. I tried to rescale the data as suggested, but it didn't help, and frankly I didn't understand the suggestion (do I also need to scale the output? Again, there is little documentation).
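
(For reference, the kind of scaling the warning's link describes would look roughly like the sketch below, using StandardScaler; X_train and y_train are just placeholders for my data, and whether the target should be scaled too is exactly the part I'm unsure about.)

from sklearn.preprocessing import StandardScaler

# Scale the inputs to zero mean and unit variance.
x_scaler = StandardScaler()
X_scaled = x_scaler.fit_transform(X_train)   # X_train: placeholder for my training inputs

# Possibly the target as well -- this is the part that is unclear to me.
y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1)).ravel()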

Increasing the maximum number of iterations in the optimization makes sense, but sklearn does not appear to expose a way to do it, which is frustrating given that the warning itself suggests it.

Looking at the GPR source code, this is how sklearn calls the optimizer:

def _constrained_optimization(self, obj_func, initial_theta, bounds):
    if self.optimizer == "fmin_l_bfgs_b":
        opt_res = scipy.optimize.minimize(
            obj_func, initial_theta, method="L-BFGS-B", jac=True,
            bounds=bounds)
        _check_optimize_result("lbfgs", opt_res)
        theta_opt, func_min = opt_res.x, opt_res.fun
    elif callable(self.optimizer):
        theta_opt, func_min = \
            self.optimizer(obj_func, initial_theta, bounds=bounds)
    else:
        raise ValueError("Unknown optimizer %s." % self.optimizer)

    return theta_opt, func_min

where scipy.optimize.minimize() has default values of

scipy.optimize.minimize(fun, x0, args=(), method='L-BFGS-B', jac=None, bounds=None, 
tol=None, callback=None, options={'disp': None, 'maxcor': 10, 'ftol': 2.220446049250313e-09,
'gtol': 1e-05, 'eps': 1e-08, 'maxfun': 15000, 'maxiter': 15000, 'iprint': -1, 'maxls': 20})

according to the scipy docs.
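
For reference (my own sketch, not part of the sklearn source), this is how the iteration limit would be raised when calling scipy.optimize.minimize directly: for L-BFGS-B it goes inside the options dict rather than being a top-level keyword. Here with a toy objective:

import numpy as np
import scipy.optimize

# Toy objective (Rosenbrock) just to show where maxiter/maxfun go for L-BFGS-B.
x0 = np.zeros(5)
res = scipy.optimize.minimize(
    scipy.optimize.rosen, x0, method="L-BFGS-B",
    jac=scipy.optimize.rosen_der,
    options={"maxiter": 50000, "maxfun": 50000},
)
print(res.status, res.nit)  # status 0 means convergence; nit is the iteration count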

I would like to use exactly the optimizer call shown above in the GPR source code, but with maxiter raised to a higher value. In other words, I don't want to change the optimizer's behavior in any way other than increasing the maximum number of iterations.

The challenge there is that other parameters like obj_func, initial_theta, bounds are set within the GPR source code and are not accessible from the GPR object.

This is how I'm calling GPR; note that these are mostly default parameters, with the exception of n_restarts_optimizer and the kernel.

for kernel in kernels:
    gp = gaussian_process.GaussianProcessRegressor(
                    kernel              = kernel,
                    alpha               = 1e-10,
                    copy_X_train        = True,
                    optimizer           = "fmin_l_bfgs_b",
                    n_restarts_optimizer= 25,
                    normalize_y         = False,
                    random_state        = None)

Solution

  • You want to extend and/or modify the behavior of an existing Python object, which sounds like a good use case for inheritance.

    A solution could be to inherit from the scikit-learn implementation, and ensure that the usual optimizer is called with the arguments you'd like. Here's a sketch, but note that this is not tested.

    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.utils.optimize import _check_optimize_result  # private helper, used by GPR itself
    import scipy.optimize

    class MyGPR(GaussianProcessRegressor):
        def __init__(self, *args, max_iter=15000, **kwargs):
            super().__init__(*args, **kwargs)
            self._max_iter = max_iter

        def _constrained_optimization(self, obj_func, initial_theta, bounds):
            def new_optimizer(obj_func, initial_theta, bounds):
                # Same call as in the sklearn source, except that the iteration
                # limit is passed via the options dict (scipy.optimize.minimize
                # has no top-level max_iter keyword).
                opt_res = scipy.optimize.minimize(
                    obj_func,
                    initial_theta,
                    method="L-BFGS-B",
                    jac=True,
                    bounds=bounds,
                    options={"maxiter": self._max_iter},
                )
                # Keep sklearn's own convergence check.
                _check_optimize_result("lbfgs", opt_res)
                # The callable-optimizer branch in the parent class expects a
                # (theta_opt, func_min) tuple, not the full OptimizeResult.
                return opt_res.x, opt_res.fun

            self.optimizer = new_optimizer
            return super()._constrained_optimization(obj_func, initial_theta, bounds)
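
    Also untested, but usage would then mirror the original loop, simply passing the extra max_iter keyword (the value 50000 below is an arbitrary example):

    for kernel in kernels:
        gp = MyGPR(
            kernel=kernel,
            n_restarts_optimizer=25,
            max_iter=50000,
        )
        gp.fit(X_train, y_train)  # X_train / y_train stand in for your training data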