
Can't get LogisticRegressionCV to converge for any Cs other than 1


I'm trying to build a logistic regression model with an array of hyperparameter values such as:

lambdas = [0.001, 0.01, 0.05, 0.1, 1., 100.]

However, the model won't converge unless I set Cs = 1. Here is my code:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegressionCV

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
lambdas = [0.001, 0.01, 0.05, 0.1, 1., 100.]

ridge_cv = LogisticRegressionCV(Cs=lambdas, penalty="l2", cv=10, solver="saga", max_iter=1000)
ridge_cv.fit(X_train, y_train)

Does anyone know how to solve this?

I tried changing the solver, increasing max_iter, changing the number of cross-validation folds, and different scalings of the data. The data looks as follows before applying a standard scaler: [screenshot of data head]


Solution

  • Cs = 1 does not mean C = 1: when Cs is an integer, scikit-learn generates that many values on a log scale between 1e-4 and 1e4, so Cs = 1 tries only the single value C = 0.0001. See https://github.com/scikit-learn/scikit-learn/blob/98cf537f5c538fdbc9d27b851cf03ce7611b8a48/sklearn/linear_model/_logistic.py#L266

    It seems your data needs stronger regularization, i.e. smaller C. You may try a grid search over even smaller values, such as [0.0001, 0.00001, 0.000001].
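To see why this happens, note that an integer Cs is expanded by scikit-learn to `np.logspace(-4, 4, Cs)`, so `Cs=1` yields only C = 0.0001 (very strong regularization). A minimal sketch on synthetic data (the dataset and the exact C grid here are illustrative, not from the question):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV

# An integer Cs is expanded to np.logspace(-4, 4, Cs),
# so Cs=1 means the only value tried is 0.0001:
print(np.logspace(-4, 4, 1))  # [0.0001]

# Instead, pass an explicit grid of small C values
# (smaller C = stronger L2 regularization):
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
clf = LogisticRegressionCV(Cs=[1e-6, 1e-5, 1e-4, 1e-3],
                           penalty="l2", cv=5, solver="saga", max_iter=5000)
clf.fit(X, y)
print(clf.C_)  # best C found per class
```

If the saga solver still warns about convergence, standardizing the features first (e.g. with `StandardScaler`) usually helps it far more than raising max_iter.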