Tags: scikit-learn, cross-validation, valueerror

cross_val_score with ElasticNet raises ValueError for parameters


I was trying to implement a simple Elastic Net regression in Python using the cross_val_score() function and nested cross-validation, but it won't accept my parameters. It keeps raising a ValueError about an invalid parameter for l1_ratio, which I don't understand, given that every value I pass is between 0 and 1.

ValueError: Invalid parameter l1_ratio for estimator Pipeline(steps=[('preprocessor',
                 ColumnTransformer(remainder='passthrough',
                                   transformers=[('cat', OneHotEncoder(), [0]),
                                                 ('num', StandardScaler(),
                                                  slice(1, 37, None))])),
                ('model', ElasticNet(random_state=42))]).
Check the list of available parameters with `estimator.get_params().keys()`.

My code:

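from sklearn.compose import ColumnTransformer
# LREN is assumed to be an alias for ElasticNet (the error message shows ElasticNet)
from sklearn.linear_model import ElasticNet as LREN
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

cv_outer = KFold(10, shuffle=True)
cv_inner = KFold(10, shuffle=True)
models_params = {
    'en': (LREN(random_state=42),  # Elastic Net
           {'l1_ratio': [0, 0.25, 0.5, 0.75, 1],
            'alpha': [1e-2, 1e-1, 1, 1e1]})
}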
# My first column is Categorical, the other 36 are numerical
preprocessor = ColumnTransformer(
                    transformers=[
                        ('cat', OneHotEncoder(), [0])
                        ,('num', StandardScaler(), slice(1,37))
                    ]
                    ,remainder = 'passthrough')
# Store Results
average_scores = dict()
for name, (model, params) in models_params.items():
    mymodel = Pipeline(steps = [('preprocessor', preprocessor),
                                ('model', model)
                                ])
    # this object is a regressor that also happens to choose
    # its hyperparameters automatically using `cv_inner`
    optimize_hparams = GridSearchCV(
        estimator=mymodel, param_grid=params, n_jobs=-1,
        cv=cv_inner, scoring='neg_mean_absolute_error')
    # estimate generalization error on the outer-fold splits of the data
    outer_folds_scores = cross_val_score(
        optimize_hparams,
        X, y, cv=cv_outer, scoring='neg_mean_absolute_error')

Solution

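  • The grid search runs on the Pipeline, not on the ElasticNet directly, so each parameter name needs the name of the pipeline step it belongs to as a prefix, followed by a double underscore. Your step is called 'model', so you could try defining your grid like this:

    models_params = {
        'en': (LREN(random_state=42),  # Elastic Net
               {'model__l1_ratio': [0, 0.25, 0.5, 0.75, 1],
                'model__alpha': [1e-2, 1e-1, 1, 1e1]})
    }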
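
    To check this end to end, here is a minimal sketch of the nested cross-validation with the prefixed grid. The data is synthetic and only mimics the layout from the question (one categorical column followed by 36 numeric ones). The fold counts, the reduced grid, the random seeds, and handle_unknown='ignore' are my own assumptions to keep it small and fast, not something taken from the original code.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (an assumption): column 0 holds a category code,
# columns 1..36 are numeric, mirroring the layout described in the question.
rng = np.random.default_rng(0)
n_samples = 120
X = np.column_stack([
    rng.integers(0, 3, size=n_samples),
    rng.normal(size=(n_samples, 36)),
])
y = X[:, 1] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=n_samples)

preprocessor = ColumnTransformer(
    transformers=[
        # handle_unknown='ignore' is an added safeguard, not in the question
        ("cat", OneHotEncoder(handle_unknown="ignore"), [0]),
        ("num", StandardScaler(), slice(1, 37)),
    ],
    remainder="passthrough",
)
pipeline = Pipeline(steps=[
    ("preprocessor", preprocessor),
    ("model", ElasticNet(random_state=42)),
])

# The valid grid keys are exactly the names listed by get_params().
available = pipeline.get_params().keys()
print("model__l1_ratio" in available)  # prefixed name is known to the pipeline
print("l1_ratio" in available)         # bare name is not

# Reduced grid (assumption) so the sketch runs quickly.
param_grid = {
    "model__l1_ratio": [0.25, 0.5, 0.75, 1.0],
    "model__alpha": [1e-2, 1e-1, 1.0],
}

cv_inner = KFold(3, shuffle=True, random_state=0)
cv_outer = KFold(3, shuffle=True, random_state=1)

# Inner loop picks hyperparameters; outer loop estimates generalization error.
search = GridSearchCV(
    estimator=pipeline, param_grid=param_grid,
    cv=cv_inner, scoring="neg_mean_absolute_error",
)
outer_scores = cross_val_score(
    search, X, y, cv=cv_outer, scoring="neg_mean_absolute_error",
)
print(outer_scores.mean())
```

    The same prefix rule applies to the preprocessing steps, so anything that get_params().keys() lists for the pipeline can go into the grid under that exact name.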