
What does sklearn "RidgeClassifier" do?


I'm trying to understand the difference between RidgeClassifier and LogisticRegression in sklearn.linear_model. I couldn't find it in the documentation.

I think I understand quite well what LogisticRegression does. It computes the coefficients and intercept that minimise half the sum of squares of the coefficients plus C times the binary cross-entropy loss, where C is the inverse regularisation strength. I checked against a naive implementation from scratch, and the results coincide.

The results of RidgeClassifier differ, and I couldn't figure out how the coefficients and intercept are computed there. Looking at the GitHub code, I'm not experienced enough to untangle it.

The reason I'm asking is that I like the RidgeClassifier results: it generalises a bit better on my problem. But before I use it, I would like to at least have an idea of where it comes from.

Thanks for any help.


Solution

  • RidgeClassifier() works differently compared to LogisticRegression() with l2 penalty. The loss function for RidgeClassifier() is not cross entropy.

    RidgeClassifier() uses the Ridge() regression model in the following way to create a classifier:

    Let us consider binary classification for simplicity.

    1. Convert the target variable into +1 or -1 depending on the class it belongs to.

    2. Build a Ridge() model (which is a regression model) to predict this +1/-1 target. The loss function is the squared error plus an l2 penalty on the coefficients, weighted by alpha.

    3. If the Ridge() regression's predicted value (the value returned by decision_function()) is greater than 0, predict the positive class; otherwise predict the negative class.
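
The three steps above can be sketched as follows. This is a minimal check rather than the library's actual code path: the synthetic dataset, `alpha=1.0`, and the variable names are assumptions made for illustration. It fits RidgeClassifier() and a plain Ridge() on +1/-1 targets, then compares coefficients and predictions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier

# Synthetic binary problem (an assumption, any binary dataset would do)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# Step 1: map the two classes to -1 / +1
y_pm = np.where(y == clf.classes_[1], 1.0, -1.0)

# Step 2: ordinary ridge regression on the +1/-1 target
reg = Ridge(alpha=1.0).fit(X, y_pm)

# Step 3: threshold the regression output at 0
manual_pred = np.where(reg.predict(X) > 0, clf.classes_[1], clf.classes_[0])

coef_match = np.allclose(clf.coef_.ravel(), reg.coef_)
intercept_match = np.allclose(clf.intercept_, reg.intercept_)
pred_match = np.array_equal(manual_pred, clf.predict(X))
print(coef_match, intercept_match, pred_match)
```

If all three comparisons hold, the classifier is behaving like a ridge regression on the recoded labels, which is why its coefficients differ from those of LogisticRegression().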

    For multi-class classification:

    1. Use LabelBinarizer() to create a multi-output regression scenario, and then train independent Ridge() regression models, one for each class (One-Vs-Rest modelling).

    2. Get the prediction from each class's Ridge() regression model (a real number for each class), then use argmax over these values to predict the class.
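
The multi-class procedure can be sketched in the same way. Again this is only an illustration under assumptions: the iris dataset, `alpha=1.0`, and the variable names are chosen here, and LabelBinarizer() is configured with `neg_label=-1` so that the targets are +1/-1 as in the binary case.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Ridge, RidgeClassifier
from sklearn.preprocessing import LabelBinarizer

# Three-class example dataset (an assumption for illustration)
X, y = load_iris(return_X_y=True)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# Step 1: one +1/-1 target column per class, then a multi-output Ridge()
lb = LabelBinarizer(pos_label=1, neg_label=-1)
Y = lb.fit_transform(y)  # shape (n_samples, n_classes)
reg = Ridge(alpha=1.0).fit(X, Y)

# Step 2: one real-valued score per class, argmax picks the label
scores = reg.predict(X)
manual_pred = lb.classes_[np.argmax(scores, axis=1)]

multi_coef_match = np.allclose(clf.coef_, reg.coef_)
multi_pred_match = np.array_equal(manual_pred, clf.predict(X))
print(multi_coef_match, multi_pred_match)
```

Each column of `Y` is fitted independently of the others, which is what makes this a One-Vs-Rest scheme.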