python, machine-learning, scikit-learn, logistic-regression, lasso-regression

Is there a parameter to set a penalty threshold in sklearn?


I am fitting an sklearn.linear_model.LogisticRegression model to my data with an L1 penalty as part of a feature selection process. It is my understanding that using penalty='l1' means the optimization process will minimize a cost function subject to the sum of the absolute values of all coefficients being less than a given threshold (as explained here).

Is there a parameter to declare a threshold for the sum of the absolute value of the coefficients?

Here's my classifier:

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(penalty='l1', dual=False, tol=0.01, C=1.0,
                         fit_intercept=True, intercept_scaling=1,
                         random_state=0, solver='saga', max_iter=500,
                         multi_class='auto', n_jobs=-1)

Perhaps none of the solver options optimize the problem with a threshold, but honestly, I am only familiar with the algorithm in its basic form, so I don't know if that's the case or not.


Solution

  • What you are looking for is the C parameter, which is essentially the inverse of lambda in the

    min: 1/n * ||y - X * beta||^2 + lambda * ||beta||_1

    equation from the wiki page (the link that you have provided). Note that this is the lasso (linear regression) form; for logistic regression the squared-error term is replaced by the log-loss, but the L1 penalty term works the same way.

    There is no parameter that sets the constraint threshold directly: sklearn solves the penalized (Lagrangian) form of the problem, which for a suitable choice of C is equivalent to the constrained form, but the corresponding threshold depends on both C and the data.

    Decreasing C has the same effect as increasing lambda in the above equation: both lead to more regularization, i.e. more coefficients shrunk toward (and, with L1, exactly to) zero.

    tol is used as a stopping criterion for the optimization algorithm, not for regularization.
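    A minimal sketch of the effect of C on sparsity, using a synthetic dataset from make_classification (the dataset and parameter values here are illustrative assumptions, not from the original post):

    ```python
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Toy data with a handful of informative features.
    X, y = make_classification(n_samples=200, n_features=20,
                               n_informative=5, random_state=0)

    # Smaller C -> stronger L1 penalty -> more coefficients driven to exactly zero.
    for C in (1.0, 0.1, 0.01):
        clf = LogisticRegression(penalty='l1', C=C, solver='saga',
                                 max_iter=5000, random_state=0)
        clf.fit(X, y)
        n_nonzero = np.count_nonzero(clf.coef_)
        print(f"C={C}: {n_nonzero} nonzero coefficients out of {clf.coef_.size}")
    ```

    You would then tune C (e.g. with LogisticRegressionCV or a grid search) until the model is as sparse as your feature-selection step requires, rather than specifying the coefficient-sum threshold directly.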