I am trying to run a RandomForestClassifier using a Pipeline, GridSearchCV, and cross-validation.
I get an error when I fit the data, and I am not sure how to fix it. I found a similar question with a solution (https://stackoverflow.com/a/34890246/9592484), but it didn't work for me.
I would appreciate any help with this.
My code is:
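from sklearn.compose import make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# One-hot encode the categorical column, pass the other columns through unchanged
column_trans = make_column_transformer((OneHotEncoder(), ['CategoricalData']),
                                       remainder='passthrough')
RF = RandomForestClassifier()
pipe = make_pipeline(column_trans, RF)
# Set grid search params
grid_params = [{'randomforestclassifier_criterion': ['gini', 'entropy'],
                'randomforestclassifier_min_samples_leaf': [5, 10, 20, 30, 50, 80, 100],
                'randomforestclassifier_max_depth': [3, 4, 6, 8, 10],
                'randomforestclassifier_min_samples_split': [2, 4, 6, 8, 10]}]
# Construct grid search
gs = GridSearchCV(estimator=pipe,
                  param_grid=grid_params,
                  scoring='accuracy',
                  cv=5)
gs.fit(train_features, train_target)  # <-- this is where I get the error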
ValueError: Invalid parameter randomforestclassifier_criterion for estimator Pipeline(steps=[('columntransformer',
ColumnTransformer(remainder='passthrough',
transformers=[('onehotencoder',
OneHotEncoder(),
['saleschanneltypeid'])])),
('randomforestclassifier', RandomForestClassifier())]). Check the list of available parameters with `estimator.get_params().keys()`.
The `make_pipeline` utility function derives step names from transformer/estimator class names. For example, the `RandomForestClassifier` is mapped to the `randomforestclassifier` step.
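To see which names were generated, the pipeline can be inspected directly. This is a minimal sketch; `StandardScaler` is only a stand-in first step chosen for illustration and is not part of the question's pipeline.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# StandardScaler is an illustrative stand-in for the first step.
pipe = make_pipeline(StandardScaler(), RandomForestClassifier())

# Step names are the lowercased class names of each step.
step_names = [name for name, _ in pipe.steps]
print(step_names)  # ['standardscaler', 'randomforestclassifier']

# Nested parameters are exposed as <step name>__<parameter name>.
rf_params = sorted(
    key for key in pipe.get_params() if key.startswith("randomforestclassifier__")
)
print(rf_params)
```

Any key printed by the last line is a valid entry for `param_grid`.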
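Please adjust your grid search parameter prefixes accordingly (i.e. from `RF` to `randomforestclassifier`), and join the step name to the parameter name with a double underscore. For example, `RF__criterion` should become `randomforestclassifier__criterion`. A single underscore, as in the `randomforestclassifier_criterion` key from the error message, is not recognized.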
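Put together, a corrected grid search might look like the sketch below. The toy DataFrame, the `numeric` column, and the reduced grid are assumptions made so the example runs on its own; they stand in for the question's `train_features` and `train_target`, which are not shown.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for train_features / train_target.
rng = np.random.default_rng(0)
train_features = pd.DataFrame({
    "CategoricalData": rng.choice(["a", "b", "c"], size=60),
    "numeric": rng.normal(size=60),
})
train_target = (train_features["numeric"] > 0).astype(int)

column_trans = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["CategoricalData"]),
    remainder="passthrough",
)
pipe = make_pipeline(
    column_trans,
    RandomForestClassifier(n_estimators=10, random_state=0),
)

# Note the double underscore between the step name and the parameter name.
grid_params = {
    "randomforestclassifier__criterion": ["gini", "entropy"],
    "randomforestclassifier__max_depth": [3, 4],
}

gs = GridSearchCV(estimator=pipe, param_grid=grid_params,
                  scoring="accuracy", cv=3)
gs.fit(train_features, train_target)
print(gs.best_params_)
```

The grid is kept small here only to make the sketch quick to run; the full lists of values from the question can be used with the same key names.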