I have a very simple question. When we perform gradient descent with $L_1$ and/or $L_2$ regularization terms, i.e. when we expand the loss function $L$ to
$$ L_r = L + l_1 \sum_i |\pi_i| + l_2 \sum_i \pi_i^2 $$
why do we not also update the coefficients $l_1$ and $l_2$ in the gradient descent update rule, the way we update the weights $\pi_i$?
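For concreteness, here is a minimal NumPy sketch of what I mean by the update rule (the names `sgd_step` and `grad_L` are just illustrative): $l_1$ and $l_2$ scale the penalty gradients in the weight update, but have no update step of their own.

```python
import numpy as np

def sgd_step(w, grad_L, lr, l1, l2):
    """One gradient-descent step on the regularized loss L_r.

    l1 and l2 are fixed hyperparameters: they scale the penalty
    gradients but are never themselves updated.
    """
    # d/dw of l1 * sum|w_i| is l1 * sign(w_i) (subgradient at 0)
    # d/dw of l2 * sum w_i^2 is 2 * l2 * w_i
    grad_Lr = grad_L + l1 * np.sign(w) + 2 * l2 * w
    return w - lr * grad_Lr
```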
They are hyperparameters; you cannot optimize the weights and these coefficients simultaneously. If you did optimize them jointly with the weights against the loss on the training set, the coefficients would be driven to 0, zeroing out the penalty term. A complex model can easily overfit the dataset and predict the training labels perfectly; at that point, the best remaining way for the optimizer to reduce $L_r$ is to shrink the penalty coefficients to zero. So the parameters that were designed to prevent overfitting would do nothing useful.
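To make this concrete: if $l_1$ and $l_2$ were treated as trainable parameters, their gradients would be

$$ \frac{\partial L_r}{\partial l_1} = \sum_i |\pi_i| \ge 0, \qquad \frac{\partial L_r}{\partial l_2} = \sum_i \pi_i^2 \ge 0, $$

so every gradient step $l_1 \leftarrow l_1 - \eta \sum_i |\pi_i|$ can only decrease them, pushing both coefficients toward zero regardless of whether the model is overfitting.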
But you can tune $l_1$ and $l_2$ with a grid search, scoring each candidate on a held-out validation set (or via cross-validation) rather than on the training loss.
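For example, here is a minimal sketch using scikit-learn's `ElasticNet` (which parameterizes the two penalties through `alpha` and `l1_ratio`) together with `GridSearchCV`; the toy data is just for illustration:

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

# Toy regression data, just for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

# ElasticNet combines the L1 and L2 penalties; alpha sets their
# overall strength and l1_ratio the mix between them.
search = GridSearchCV(
    ElasticNet(max_iter=10_000),
    param_grid={"alpha": [0.001, 0.01, 0.1, 1.0],
                "l1_ratio": [0.1, 0.5, 0.9]},
    cv=5,  # scored on held-out folds, not the training loss
)
search.fit(X, y)
print(search.best_params_)
```

The key point is that each `(alpha, l1_ratio)` candidate is scored on held-out folds, so shrinking the penalty to zero no longer automatically improves the score.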