I'm using Python 3.7.6, and I'm trying to tune some hyperparameters using GridSearchCV. I created a pipeline with the following steps: scaling -> feature selection -> model. But I'm getting an error about the C parameter of the feature selection step.
steps = [('scaler', StandardScaler()),
('FeatureSelection', SelectFromModel(LogisticRegression(penalty='l1', solver='liblinear'))),
('SVM', SVC())]
pipeline = Pipeline(steps) # define the pipeline object.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=30, stratify=y)
parameteres = {'SVM__C': [0.001, 0.1, 10, 100, 10e5],
'SVM__gamma':[0.1,0.01],
'FeatureSelection__C':['0','0.5']}
grid = GridSearchCV(pipeline, param_grid=parameteres, cv=5, n_jobs=-1)
grid.fit(X_train, y_train)
print("pipeline score: ", grid.score(X_test, y_test))
I'm getting the following error:
ValueError: Invalid parameter C for estimator SelectFromModel(estimator=LogisticRegression(C=1.0, class_weight=None,
dual=False, fit_intercept=True,
intercept_scaling=1, l1_ratio=None,
max_iter=100, multi_class='auto',
n_jobs=None, penalty='l1',
random_state=None,
solver='liblinear', tol=0.0001,
verbose=0, warm_start=False),
max_features=None, norm_order=1, prefit=False, threshold=None). Check the list of available parameters with `estimator.get_params().keys()`.
What is wrong, and how can I fix it?
As is, the pipeline looks for a parameter C in SelectFromModel, can't find one (unsurprisingly, since that class does not have such a parameter), and raises an error. Since you actually want the parameter C of the LogisticRegression wrapped inside it, you should go a level deeper: change FeatureSelection__C to FeatureSelection__estimator__C in your parameter grid and you will be fine.
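For reference, here is a minimal sketch of the corrected setup. Since the question does not show X and y, it uses a synthetic dataset from make_classification as a stand-in, and the grid is trimmed so it runs quickly; both are assumptions for illustration. It also passes the C values as positive floats rather than the strings '0' and '0.5', because LogisticRegression expects C to be a positive float, so those values would likely cause a separate error once the parameter name is fixed.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption: the question's X and y are not shown).
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=30)

steps = [('scaler', StandardScaler()),
         ('FeatureSelection', SelectFromModel(
             LogisticRegression(penalty='l1', solver='liblinear'))),
         ('SVM', SVC())]
pipeline = Pipeline(steps)

# List the tunable names that end in "C": the selector's C lives one
# level down, under its `estimator` attribute.
c_params = sorted(name for name in pipeline.get_params() if name.endswith('C'))
print(c_params)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=30, stratify=y)

# Trimmed grid (assumption) with the nested parameter name and float values.
param_grid = {'SVM__C': [0.1, 10, 100],
              'SVM__gamma': [0.1, 0.01],
              'FeatureSelection__estimator__C': [0.1, 0.5]}

grid = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)
print("best params: ", grid.best_params_)
print("pipeline score: ", grid.score(X_test, y_test))
```

The general rule is that each double underscore descends one level: step name, then attribute of that step, then parameter of the nested estimator. Printing pipeline.get_params().keys(), as the error message suggests, shows every valid name.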