When I try to run a grid search over a RandomForestRegressor wrapped in a Pipeline, using a param_grid:
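from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Column lists (the numerical ones are as shown in the error message below)
numerical_columns = ['lotSize', 'age', 'landValue', 'livingArea', 'pctCollege',
                     'bedrooms', 'fireplaces', 'bathrooms', 'rooms']
nominal_columns = ['heating', 'fuel', 'sewer', 'waterfront', 'newConstruction', 'centralAir']

numerical_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')),
                               ('scaler', StandardScaler())])
nominal_pipeline = Pipeline([('imputer', SimpleImputer(strategy='most_frequent')),
                             ('encoder', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer([
    ('numerical_transformer', numerical_pipeline, numerical_columns),
    ('nominal_transformer', nominal_pipeline, nominal_columns),
])
pipeline = Pipeline([
    ('preprocessor', preprocessor),
    ('regressor', RandomForestRegressor(random_state=0))
])
model = pipeline.fit(X_train, y_train)
param_grid = [
    {'imputer__strategy': ['mean', 'median'],
     'regressor__n_estimators': [3, 10, 30],
     'regressor__max_features': [2, 4, 6]},
    {'imputer__strategy': ['mean', 'median'],
     'regressor__bootstrap': [False],
     'regressor__n_estimators': [3, 10],
     'regressor__max_features': [2, 3, 4]},
]
gridSearch = GridSearchCV(model, param_grid, cv=3,
                          scoring='neg_mean_squared_error',
                          return_train_score=True)
gridSearch.fit(X_train, y_train)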
I get this error:
ValueError: Invalid parameter imputer for estimator Pipeline(steps=[('preprocessor',
ColumnTransformer(transformers=[('numerical_transformer',
Pipeline(steps=[('imputer',
SimpleImputer()),
('scaler',
StandardScaler())]),
['lotSize', 'age',
'landValue', 'livingArea',
'pctCollege', 'bedrooms',
'fireplaces', 'bathrooms',
'rooms']),
('nominal_transformer',
Pipeline(steps=[('imputer',
SimpleImputer(strategy='most_frequent')),
('encoder',
OneHotEncoder(handle_unknown='ignore'))]),
['heating', 'fuel', 'sewer',
'waterfront',
'newConstruction',
'centralAir'])])),
('regressor', RandomForestRegressor(random_state=0))]). Check the list of available parameters with `estimator.get_params().keys()`.
I've been reading the documentation for the past hour and still haven't managed to find a solution. Is there a problem with my preprocessor? I tried changing the strategy of the nominal imputer from most_frequent to mean, but then I get a "could not convert string to float" error instead.
You've misspecified one of the hyperparameters, imputer__strategy. Your model is a pipeline containing a column transformer that in turn contains pipelines, so the parameter name has to include the step name at each level of nesting, joined by double underscores. I believe you need
preprocessor__numerical_transformer__imputer__strategy
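
To make that concrete, here is a minimal sketch of how the nested name could be checked and then used in the grid. It reuses the step names from the question, but the data is a tiny synthetic stand-in (only three of the original columns, random values), and the grid is trimmed so it runs quickly. Treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the housing data (hypothetical values, 30 rows).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    'lotSize': rng.random(30),
    'age': rng.integers(1, 100, 30).astype(float),
    'heating': rng.choice(['hot air', 'electric'], 30),
})
y = rng.random(30)

numerical_columns = ['lotSize', 'age']
nominal_columns = ['heating']

numerical_pipeline = Pipeline([('imputer', SimpleImputer(strategy='mean')),
                               ('scaler', StandardScaler())])
nominal_pipeline = Pipeline([('imputer', SimpleImputer(strategy='most_frequent')),
                             ('encoder', OneHotEncoder(handle_unknown='ignore'))])
preprocessor = ColumnTransformer([
    ('numerical_transformer', numerical_pipeline, numerical_columns),
    ('nominal_transformer', nominal_pipeline, nominal_columns),
])
pipeline = Pipeline([('preprocessor', preprocessor),
                     ('regressor', RandomForestRegressor(random_state=0))])

# One name per nesting level: outer step, column-transformer entry,
# inner step, and finally the parameter itself.
imputer_key = 'preprocessor__numerical_transformer__imputer__strategy'
valid_keys = pipeline.get_params().keys()
print(imputer_key in valid_keys)          # the fully qualified name is accepted
print('imputer__strategy' in valid_keys)  # the short name from the question is not

param_grid = {
    imputer_key: ['mean', 'median'],
    'regressor__n_estimators': [3, 10],
    'regressor__max_features': [2, 3],
}
grid_search = GridSearchCV(pipeline, param_grid, cv=3,
                           scoring='neg_mean_squared_error')
grid_search.fit(X, y)
print(grid_search.best_params_)
```

Printing `sorted(pipeline.get_params().keys())` is the quickest way to see every valid name, which is what the error message is hinting at. Only the numerical imputer is tuned here, because 'mean' and 'median' do not apply to the string-valued nominal columns.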