Tags: python-3.x, machine-learning, scikit-learn, classification, hyperparameters

Assigning the best grid-searched hyperparameters to the final model in a Python BaggingClassifier


I am training a logistic regression model with bagging, and I want to use GridSearchCV to find the best hyperparameters. I used the '__' separator to denote a hyperparameter of the base estimator:

from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import GridSearchCV

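param_grid = {
    # C is the inverse of the regularization strength (smaller C = stronger penalty).
    # On scikit-learn >= 1.2 the prefix is 'estimator__' instead of 'base_estimator__'.
    'base_estimator__C': [1e-15, 1e-10, 1e-8, 1e-4, 1e-3, 1e-2, 1, 5, 10, 20, 50, 100, 1000],
    'max_samples': [0.05, 0.1, 0.2, 0.5],  # fraction of samples drawn for each base estimator
    'max_features': [0.3, 0.5, 0.7, 0.9],  # fraction of features drawn for each base estimator
}

# x, y, and cv are defined elsewhere (training data and cross-validation splitter)
clf = GridSearchCV(
    BaggingClassifier(LogisticRegression(penalty='l2'), n_estimators=100),
    param_grid, cv=cv, scoring='f1', return_train_score=True)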
clf.fit(x,y)
best_hyperparams = clf.best_params_
best_hyperparams

Results:
{'base_estimator__C': 10, 'max_features': 0.3, 'max_samples': 0.1}

Now that I have the best parameters, how do I pass them to the bagging classifier again? Unpacking **best_hyperparams into the constructor does not work, because BaggingClassifier does not recognize that base_estimator__C should be routed to the base estimator, LogisticRegression:

best_clf = BaggingClassifier(LogisticRegression(penalty='l2'), n_estimators = 100, **best_hyperparams) # train model with best hyperparams

Solution

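  • You can use set_params() after initializing the bagging classifier. Unlike the constructor, set_params() understands the '__' syntax and forwards base_estimator__C to the nested LogisticRegression.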

    best_clf = BaggingClassifier(LogisticRegression(penalty='l2'), n_estimators = 100)
    best_clf.set_params(**best_hyperparams)
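The approach above can be checked end to end with a minimal sketch. It makes a few assumptions that are not in the original post: the data is a synthetic stand-in produced by make_classification, best_hyperparams is written out by hand rather than taken from clf.best_params_, n_estimators is reduced to 10 to keep it quick, and the nested-parameter prefix is detected at runtime because it is 'base_estimator' on older scikit-learn releases and 'estimator' from 1.2 onward.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the asker's x, y (assumption: binary classification).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The nested-parameter prefix depends on the installed scikit-learn version:
# 'estimator' from 1.2 onward, 'base_estimator' before that.
default_params = BaggingClassifier().get_params()
prefix = "estimator" if "estimator" in default_params else "base_estimator"

# Hand-written stand-in for clf.best_params_ from the question.
best_hyperparams = {f"{prefix}__C": 10, "max_features": 0.3, "max_samples": 0.1}

# Build the ensemble first, then let set_params() route each key:
# plain keys go to the BaggingClassifier, '<prefix>__C' goes to LogisticRegression.
best_clf = BaggingClassifier(LogisticRegression(penalty="l2"),
                             n_estimators=10, random_state=0)
best_clf.set_params(**best_hyperparams)
best_clf.fit(X, y)

# Read the nested value back to confirm it reached the inner estimator.
inner_C = best_clf.get_params()[f"{prefix}__C"]
predictions = best_clf.predict(X)
```

If GridSearchCV was created with refit=True (the default), clf.best_estimator_ already holds a BaggingClassifier refit on the full data with the best parameters, so rebuilding the model by hand is only needed when you want a fresh, unfitted copy.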