
I want to apply regression to my dataset. I am trying to optimize an XGBRegressor using BayesianOptimization, but every time I run it I get the same error. I am not very familiar with machine learning, so I would really appreciate any help. Here is the code:

optimizer = BayesianOptimization(f=xgboostcv,
                                 domain=params,
                                 model_type='GP',
                                 acquisition_type='EI',
                                 acquisition_jitter=0.05,
                                 exact_feval=True,
                                 maximize=True,
                                 verbosity=True)
optimizer.run_optimization(max_iter=20, verbosity=True)

And here is the error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-27-921c3885d33d> in <module>
----> 1 optimizer = BayesianOptimization(f=xgboostcv,
  2                                  domain=params,
  3                                  model_type='GP',
  4                                  acquisition_type='EI',
  5                                  acquisition_jitter=0.05,

~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\methods\bayesian_optimization.py in __init__(self, f, domain, constraints, cost_withGradients, model_type, X, Y, initial_design_numdata, initial_design_type, acquisition_type, normalize_Y, exact_feval, acquisition_optimizer_type, model_update_interval, evaluator_type, batch_size, num_cores, verbosity, verbosity_model, maximize, de_duplication, **kwargs)
     92         self.constraints = constraints
     93         self.domain = domain
---> 94         self.space = Design_space(self.domain, self.constraints)
     95 
     96         # --- CHOOSE objective function

~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\core\task\space.py in __init__(self, space, constraints, store_noncontinuous)
     69 
     70         ## --- Transform input config space into the objects used to run the optimization
---> 71         self._translate_space(self.config_space)
     72         self._expand_space()
     73         self._compute_variables_indices()

~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\core\task\space.py in _translate_space(self, space)
    177         for i, d in enumerate(space):
    178             descriptor = deepcopy(d)
--> 179             descriptor['name'] = descriptor.get('name', 'var_' + str(i))
    180             descriptor['type'] = descriptor.get('type', 'continuous')
    181             if 'domain' not in descriptor:

AttributeError: 'str' object has no attribute 'get'

Here is the definition of the "params" object:

params = {'max_depth': (2, 5),
          'learning_rate': (0.01, 0.3),
          'n_estimators': (1000, 2500),
          'gamma': (1., 0.01),
          'min_child_weight': (1, 10),
          'max_delta_step': (0, 0.1),
          'subsample': (0.5, 0.8),
          'colsample_bytree': (0.1, 0.99),
          'reg_alpha': (0.1, 0.5),
          'reg_lambda': (0.1, 0.9)
}

Solution

  • The GPyOpt package specifies the hyperparameter space more verbosely (and so perhaps more flexibly) than other popular search libraries: each variable gets its own dict with a name, a type, and a domain. There are examples in the documentation: https://gpyopt.readthedocs.io/en/latest/GPyOpt.core.task.html#GPyOpt.core.task.space.Design_space

    space = [
        {'name': 'max_depth', 'type': 'discrete', 'domain': (2,3,4,5)},
        {'name': 'learning_rate', 'type': 'continuous', 'domain': (0.01, 0.3)},
        ...
    ]
    

    The key point here is that the domain must be a list of dicts, one per variable, not a single dict. In the traceback you can see that _translate_space calls enumerate on the domain. Enumerating a dict yields only its keys, so each item is a string, hence the complaint that a 'str' object has no attribute 'get' where a dict was expected.
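
As a rough sketch of how the question's dict could be converted, the helper below builds the list-of-dicts format from a {name: (low, high)} mapping. Note that to_gpyopt_domain and its integer_params argument are hypothetical names made up for this illustration, not part of GPyOpt. It also assumes that integer-valued parameters should become 'discrete' variables and that reversed bounds such as gamma's (1., 0.01) are meant as (0.01, 1.0).

```python
# Hypothetical helper (not part of GPyOpt): convert a {name: (low, high)} dict
# into the list-of-dicts format that Design_space iterates over.
def to_gpyopt_domain(bounds, integer_params=()):
    domain = []
    for name, (low, high) in bounds.items():
        # Assumption: reversed bounds are a typo, so order them as (low, high).
        low, high = min(low, high), max(low, high)
        if name in integer_params:
            # Discrete variables list every allowed value explicitly.
            domain.append({'name': name, 'type': 'discrete',
                           'domain': tuple(range(int(low), int(high) + 1))})
        else:
            domain.append({'name': name, 'type': 'continuous',
                           'domain': (float(low), float(high))})
    return domain


# A subset of the bounds from the question.
params = {
    'max_depth': (2, 5),
    'learning_rate': (0.01, 0.3),
    'gamma': (1., 0.01),
}

# Iterating a dict yields its keys (strings), which is what _translate_space
# received instead of the dicts it calls .get() on.
first_item = next(iter(params))

space = to_gpyopt_domain(params, integer_params=('max_depth',))
```

With space built this way, domain=space would replace domain=params in the BayesianOptimization call. One further thing to check, as far as I know: GPyOpt calls f with a single 2-D NumPy array (one row per point) rather than keyword arguments, so xgboostcv may need to unpack the columns by position.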