I want to apply regression to my dataset. I am currently trying to optimize an XGBRegressor using GPyOpt's BayesianOptimization, but every time I run it I get the same error. I am not very familiar with machine learning, so I would really appreciate any help I can get. Here is the code:
optimizer = BayesianOptimization(f=xgboostcv,
                                 domain=params,
                                 model_type='GP',
                                 acquisition_type='EI',
                                 acquisition_jitter=0.05,
                                 exact_feval=True,
                                 maximize=True,
                                 verbosity=True)
optimizer.run_optimization(max_iter=20, verbosity=True)
And here is the error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-27-921c3885d33d> in <module>
----> 1 optimizer = BayesianOptimization(f=xgboostcv,
2 domain=params,
3 model_type='GP',
4 acquisition_type='EI',
5 acquisition_jitter=0.05,
~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\methods\bayesian_optimization.py in __init__(self, f, domain, constraints, cost_withGradients, model_type, X, Y, initial_design_numdata, initial_design_type, acquisition_type, normalize_Y, exact_feval, acquisition_optimizer_type, model_update_interval, evaluator_type, batch_size, num_cores, verbosity, verbosity_model, maximize, de_duplication, **kwargs)
92 self.constraints = constraints
93 self.domain = domain
---> 94 self.space = Design_space(self.domain, self.constraints)
95
96 # --- CHOOSE objective function
~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\core\task\space.py in __init__(self, space, constraints, store_noncontinuous)
69
70 ## --- Transform input config space into the objects used to run the optimization
---> 71 self._translate_space(self.config_space)
72 self._expand_space()
73 self._compute_variables_indices()
~\anaconda3\envs\sklearn\lib\site-packages\GPyOpt\core\task\space.py in _translate_space(self, space)
177 for i, d in enumerate(space):
178 descriptor = deepcopy(d)
--> 179 descriptor['name'] = descriptor.get('name', 'var_' + str(i))
180 descriptor['type'] = descriptor.get('type', 'continuous')
181 if 'domain' not in descriptor:
AttributeError: 'str' object has no attribute 'get'
Here is how the "params" object is defined:
params = {'max_depth': (2, 5),
          'learning_rate': (0.01, 0.3),
          'n_estimators': (1000, 2500),
          'gamma': (1., 0.01),
          'min_child_weight': (1, 10),
          'max_delta_step': (0, 0.1),
          'subsample': (0.5, 0.8),
          'colsample_bytree': (0.1, 0.99),
          'reg_alpha': (0.1, 0.5),
          'reg_lambda': (0.1, 0.9)
          }
The GPyOpt package expects the hyperparameter space in a more verbose (and so arguably more flexible) format than other popular search packages. There are examples in the documentation: https://gpyopt.readthedocs.io/en/latest/GPyOpt.core.task.html#GPyOpt.core.task.space.Design_space
space = [
    {'name': 'max_depth', 'type': 'discrete', 'domain': (2, 3, 4, 5)},
    {'name': 'learning_rate', 'type': 'continuous', 'domain': (0.01, 0.3)},
    ...
]
The important point here is that the domain must be a list of dicts, not a single dict. In the traceback you can see that the code calls enumerate on the domain, and enumerating a dict only yields its keys, hence the complaint about a string where a dict is expected.
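As a rough sketch, the existing dict could be converted into that list-of-dicts layout along these lines. Note that to_gpyopt_domain and its integer_params argument are hypothetical names made up for this illustration and are not part of GPyOpt; which parameters to treat as discrete is also an assumption.

```python
# Hypothetical helper (not part of GPyOpt): turns a {name: (low, high)} dict
# into the list-of-dicts layout described above.
def to_gpyopt_domain(bounds, integer_params=()):
    domain = []
    for name, (low, high) in bounds.items():
        # Tolerate reversed bounds such as 'gamma': (1., 0.01).
        low, high = min(low, high), max(low, high)
        if name in integer_params:
            # Discrete variables list every allowed value explicitly.
            values = tuple(range(int(low), int(high) + 1))
            domain.append({'name': name, 'type': 'discrete', 'domain': values})
        else:
            domain.append({'name': name, 'type': 'continuous', 'domain': (low, high)})
    return domain


params = {'max_depth': (2, 5),
          'learning_rate': (0.01, 0.3),
          'n_estimators': (1000, 2500),
          'gamma': (1., 0.01),
          'min_child_weight': (1, 10),
          'max_delta_step': (0, 0.1),
          'subsample': (0.5, 0.8),
          'colsample_bytree': (0.1, 0.99),
          'reg_alpha': (0.1, 0.5),
          'reg_lambda': (0.1, 0.9)}

# Why the original call fails: iterating over a dict yields its keys (strings).
first_item = next(iter(params))
print(type(first_item).__name__, first_item)

# Assumption: only max_depth and min_child_weight are treated as discrete here;
# n_estimators is left continuous and would need rounding inside the objective.
domain = to_gpyopt_domain(params, integer_params=('max_depth', 'min_child_weight'))
for entry in domain:
    print(entry)
```

The resulting list would then be passed as domain=domain in place of domain=params. As far as I know, GPyOpt also calls the objective with a 2-D array of parameter values rather than keyword arguments, so xgboostcv may need adjusting as well.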