python scikit-learn pipeline logistic-regression grid-search

ValueError: Invalid parameter C for estimator SelectFromModel when using GridSearchCV


I'm using Python 3.7.6 and trying to tune some hyperparameters with GridSearchCV.

I created a pipeline with the following steps: scaling -> feature selection -> model.

But I'm getting an error about the C parameter of the feature selection step.

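    from sklearn.feature_selection import SelectFromModel
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    steps = [('scaler', StandardScaler()),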
             ('FeatureSelection', SelectFromModel(LogisticRegression(penalty='l1', solver='liblinear'))),
             ('SVM', SVC())]
    pipeline = Pipeline(steps)  # define the pipeline object.

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=30, stratify=y)

    parameteres = {'SVM__C': [0.001, 0.1, 10, 100, 10e5],
                   'SVM__gamma':[0.1,0.01],
                   'FeatureSelection__C':['0','0.5']}
    grid = GridSearchCV(pipeline, param_grid=parameteres, cv=5, n_jobs=-1)
    grid.fit(X_train, y_train)
    print("pipeline score: ", grid.score(X_test, y_test))

I'm getting the following error:

ValueError: Invalid parameter C for estimator SelectFromModel(estimator=LogisticRegression(C=1.0, class_weight=None,
                                             dual=False, fit_intercept=True,
                                             intercept_scaling=1, l1_ratio=None,
                                             max_iter=100, multi_class='auto',
                                             n_jobs=None, penalty='l1',
                                             random_state=None,
                                             solver='liblinear', tol=0.0001,
                                             verbose=0, warm_start=False),
                max_features=None, norm_order=1, prefit=False, threshold=None). Check the list of available parameters with `estimator.get_params().keys()`.

What is wrong, and how can I fix it?


Solution

  • As written, the grid search looks for a parameter named C on SelectFromModel itself, can't find one (unsurprisingly, since SelectFromModel has no such parameter), and raises the error. The C you actually want belongs to the LogisticRegression that SelectFromModel wraps as its estimator, so you need to go one level deeper: change FeatureSelection__C to FeatureSelection__estimator__C in your parameter grid and the error will go away.
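
To make the nesting concrete, here is a minimal sketch of the corrected grid. The question does not show X and y, so the synthetic data from make_classification is an assumption used only to make the example runnable, and the grid is shortened so it finishes quickly. The get_params() check mirrors the hint at the end of the error message.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; the question's X and y are not shown).
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=30, stratify=y)

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('FeatureSelection',
     SelectFromModel(LogisticRegression(penalty='l1', solver='liblinear'))),
    ('SVM', SVC()),
])

# Every key listed here is a name that GridSearchCV will accept.
valid_names = pipeline.get_params().keys()
print('FeatureSelection__C' in valid_names)             # False
print('FeatureSelection__estimator__C' in valid_names)  # True

# step name -> SelectFromModel's `estimator` attribute -> its C parameter
param_grid = {
    'SVM__C': [0.1, 10],
    'SVM__gamma': [0.1, 0.01],
    'FeatureSelection__estimator__C': [0.5, 1.0],
}
grid = GridSearchCV(pipeline, param_grid=param_grid, cv=5)
grid.fit(X_train, y_train)
print("best params:", grid.best_params_)
print("pipeline score:", grid.score(X_test, y_test))
```

Note that the values for C here are positive numbers rather than the strings '0' and '0.5' from the question. LogisticRegression documents C as a positive float, so those original values would likely cause a separate failure once the parameter name is fixed.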