
Clarification needed for the 'penalty' argument in svm.LinearSVC


In relation to this post, the accepted answer explains the penalty and the loss in the regularisation problem of the SVM. However, at the end, the terms 'L1 loss' and 'L2 loss' are used.

As I understand it, the objective function in the regularisation problem is the sum of the loss function, e.g. the hinge loss:

\sum_i [1 - y_i f_i]_+

and the penalty term:

\frac{\lambda}{2} \|\beta\|^2
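To make the two terms concrete, here is a minimal NumPy sketch of this objective, assuming a linear model f_i = x_i^T beta; the function and variable names are illustrative, not from scikit-learn:

    import numpy as np

    def svm_objective(beta, X, y, lam):
        # lam is the regularisation strength (lambda in the equations above)
        margins = 1 - y * (X @ beta)               # 1 - y_i * f_i
        hinge = np.maximum(0.0, margins).sum()     # [.]_+ summed over samples
        penalty = 0.5 * lam * np.dot(beta, beta)   # (lambda / 2) * ||beta||^2
        return hinge + penalty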

By saying 'L1 hinge loss', can I interpret it as the L1 norm specified in the 'penalty' argument applying to both the loss and the penalty terms?

In the regularisation problem below, from The Elements of Statistical Learning (Hastie et al.), is it the L1 loss that is being used?

[image: the regularisation problem from The Elements of Statistical Learning, with a squared-norm penalty on the coefficients]


Solution

  • No. The 'L2' indicates what sort of penalty is applied, while 'hinge' describes the nature of the loss term. Selecting L1 or L2 makes no change to the hinge loss; it only affects the penalty term.

    If you refer to the equation for the default loss term of LinearSVC here: https://scikit-learn.org/stable/modules/svm.html#linearsvc, the left part is the penalty term, which is by default the L2 penalty applied to the weights, whilst the right part of the equation is the hinge loss (written out after this answer).

    Checking the description of the penalty parameter here: https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

    Specifies the norm used in the penalization.

    In the example you provide above, an L2 penalty is being used. An L1 penalty would be the sum of the absolute values of the beta terms; what you have above is the sum of their squared values. A usage sketch follows below.
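    For reference, the equation linked above has, in its hinge-loss form, the shape

        \min_{w, b} \; \frac{1}{2} w^\top w + C \sum_{i=1}^{n} \max\left(0,\, 1 - y_i (w^\top x_i + b)\right)

    where the first term is the L2 penalty on the weights, the second is the hinge loss, and C plays the role of 1/\lambda in the question's notation.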
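    And here is a minimal runnable sketch of how the two arguments combine in scikit-learn. Note that LinearSVC restricts the valid combinations: penalty='l1' is only accepted together with loss='squared_hinge' and dual=False, while loss='hinge' requires the dual formulation.

        from sklearn.datasets import make_classification
        from sklearn.svm import LinearSVC

        X, y = make_classification(n_samples=200, n_features=20, random_state=0)

        # L2 penalty on the weights with the hinge loss;
        # loss='hinge' requires the dual formulation.
        clf_l2 = LinearSVC(penalty="l2", loss="hinge", dual=True, C=1.0,
                           max_iter=10000).fit(X, y)

        # Switching the penalty to L1 changes only the penalty term, not the
        # nature of the loss; LinearSVC additionally requires
        # loss='squared_hinge' and dual=False with an L1 penalty.
        clf_l1 = LinearSVC(penalty="l1", loss="squared_hinge", dual=False,
                           C=1.0, max_iter=10000).fit(X, y)

        # A visible effect of the L1 penalty: some weights are driven to exactly zero.
        print("weights zeroed by the L1 penalty:", (clf_l1.coef_ == 0).sum())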