I've got a multiclass classification problem and I need to find the best parameters. I cannot change max_iter, solver, and tol (they are given), but I'd like to check which penalty is better. However, GridSearchCV always returns the first penalty in the grid as the best one.
Example:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, GridSearchCV, StratifiedKFold
cv = StratifiedKFold(n_splits=5, random_state=0, shuffle=True)
fixed_params = {
'random_state': 42,
'multi_class': 'multinomial',
'solver': 'saga',
'tol': 1e-3,
'max_iter': 500
}
parameters = [
{'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000], 'penalty': ['l1', 'l2', None]},
{'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000], 'penalty': ['elasticnet'], 'l1_ratio': np.arange(0.0, 1.0, 0.1)}
]
model = GridSearchCV(LogisticRegression(**fixed_params), parameters, n_jobs=-1, verbose=10, scoring='f1_macro', cv=cv)
model.fit(X_train, y_train)
print(model.best_score_)
# 0.6836409100287101
print(model.best_params_)
# {'C': 0.1, 'penalty': 'l2'}
If I change the order of the parameters rows, the result is the opposite:
from sklearn.model_selection import cross_val_score, GridSearchCV, StratifiedKFold
cv = StratifiedKFold(n_splits=5, random_state=0, shuffle=True)
fixed_params = {
'random_state': 42,
'multi_class': 'multinomial',
'solver': 'saga',
'tol': 1e-3,
'max_iter': 500
}
parameters = [
{'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000], 'penalty': ['elasticnet'], 'l1_ratio': np.arange(0.0, 1.0, 0.1)},
{'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000], 'penalty': ['l1', 'l2', None]}
]
model = GridSearchCV(LogisticRegression(**fixed_params), parameters, n_jobs=-1, verbose=10, scoring='f1_macro', cv=cv)
model.fit(X_train, y_train)
print(model.best_score_)
# 0.6836409100287101
print(model.best_params_)
# {'C': 0.1, 'l1_ratio': 0.0, 'penalty': 'elasticnet'}
So, the best_score_ is the same for both options, but the best_params_ is not.
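For reference, the per-candidate scores can be inspected through cv_results_ (a sketch, reusing the fitted model object from above; the columns are the standard ones GridSearchCV reports):
import pandas as pd

# List every candidate's mean cross-validated score and its rank, so ties are visible
results = pd.DataFrame(model.cv_results_)
cols = ['param_C', 'param_penalty', 'param_l1_ratio', 'mean_test_score', 'rank_test_score']
print(results[cols].sort_values('rank_test_score').head(10))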
Could you please tell me what is wrong?
Edit:
GridSearchCV gives a worse result than a baseline with default parameters.
Baseline:
from sklearn.metrics import f1_score

baseline_model = LogisticRegression(multi_class='multinomial', solver='saga', tol=1e-3, max_iter=500)
baseline_model.fit(X_train, y_train)
train_pred_baseline = baseline_model.predict(X_train)
print(f1_score(y_train, train_pred_baseline, average='micro'))
The baseline model's parameters:
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, intercept_scaling=1, l1_ratio=None, max_iter=500, multi_class='multinomial', n_jobs=None, penalty='l2', random_state=None, solver='saga', tol=0.001, verbose=0, warm_start=False)
The baseline gives me an f1_micro that is better than GridSearchCV's result:
0.7522768670309654
Edit 2:
So, according to the best f1_score performance, C = 1 is the best choice for my model, but GridSearchCV returns C = 0.1.
I think I'm missing something...
The baseline's f1_macro is also better than GridSearchCV's:
train_pred_baseline = baseline_model.predict(X_train)
print(f1_score(y_train, train_pred_baseline, average='macro'))
# 0.7441968750050458
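For reference, the baseline can also be scored with the same folds and metric as the grid search (a sketch, reusing the cv, X_train, and y_train objects defined above):
from sklearn.model_selection import cross_val_score

# Cross-validated f1_macro for the baseline, using the same splitter as GridSearchCV
baseline_cv_scores = cross_val_score(baseline_model, X_train, y_train, scoring='f1_macro', cv=cv)
print(baseline_cv_scores.mean())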
Actually, there's nothing wrong. Here's the thing: elasticnet uses both the L1 and L2 penalty terms. However, if your l1_ratio is 0, you're essentially applying pure L2 regularization, so only the L2 penalty term is used. As stated in the docs:
Setting l1_ratio=0 is equivalent to using penalty='l2', while setting l1_ratio=1 is equivalent to using penalty='l1'. For 0 < l1_ratio < 1, the penalty is a combination of L1 and L2.
Since your second result has l1_ratio equal to 0, it's equivalent to using the L2 penalty.
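A quick way to see this in your own setup is to fit the two reported winners directly and compare the fitted coefficients (a sketch, assuming your X_train and y_train; the atol is just a loose bound to absorb the saga solver's tolerance):
import numpy as np
from sklearn.linear_model import LogisticRegression

# The shared, fixed settings from the question, with the winning C
common = dict(C=0.1, random_state=42, multi_class='multinomial',
              solver='saga', tol=1e-3, max_iter=500)

# The two "different" best candidates: plain L2 versus elasticnet with l1_ratio=0
m_l2 = LogisticRegression(penalty='l2', **common).fit(X_train, y_train)
m_en = LogisticRegression(penalty='elasticnet', l1_ratio=0.0, **common).fit(X_train, y_train)

# The fitted coefficients should agree up to the solver's tolerance
print(np.allclose(m_l2.coef_, m_en.coef_, atol=1e-3))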