Must L2 regularization be added to the cost function when using Linear Regression?
I'm not adding L2 or taking it into account when computing the cost. Is that wrong?
The code snippet below should be sufficient:
def gradient(self, X, Y, alpha, minibatch_size):
    for batch in self.iterate_minibatches(X, Y, minibatch_size, shuffle=True):
        x, y = batch
        predictions = x.dot(self.theta)
        for it in range(self.theta.size):
            # column of features corresponding to theta[it], as a column vector
            temp = x[:, it].reshape(y.size, 1)
            errors_x1 = (predictions - y) * temp
            # gradient step: MSE gradient plus the L2 penalty term for theta[it]
            self.theta[it] = self.theta[it] - alpha * ((1.0 / y.size) * errors_x1.sum() + self.lambda_l2 * self.theta[it])
    print(self.cost(X, Y, self.theta))
def cost(self, X, Y, theta, store=True):
    # mean squared error of the current predictions (no regularization term)
    from sklearn.metrics import mean_squared_error
    predictions = X.dot(theta)
    cost = mean_squared_error(Y, predictions)
    if store:
        self.cost_history.append(cost)
    return cost
It is not necessary to add L2 (or L1) regularization to your Linear Regression (LR) implementation.
However, adding an L2 regularization term to the cost function does have an advantage over plain LR without regularization: most importantly, the regularization term helps reduce overfitting and improves the generalization of your model. LR with L2 regularization is commonly known as "Ridge Regression".
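If you do want the penalty, it has to show up in both places, the cost and the gradient. Here is a minimal, self-contained sketch (plain NumPy, standalone functions rather than your class; names like lambda_l2 and alpha are just illustrative) of how the L2 term is usually wired in, with the bias column left unpenalized:

import numpy as np

def ridge_cost(X, y, theta, lambda_l2):
    # MSE term plus the L2 penalty; theta[0] (the bias) is conventionally not penalized
    m = y.size
    errors = X.dot(theta) - y
    return (errors ** 2).sum() / (2.0 * m) + (lambda_l2 / (2.0 * m)) * (theta[1:] ** 2).sum()

def ridge_gradient_step(X, y, theta, alpha, lambda_l2):
    # one batch gradient-descent step on the regularized cost
    m = y.size
    grad = X.T.dot(X.dot(theta) - y) / m
    grad = grad + (lambda_l2 / m) * np.r_[0.0, theta[1:]]  # derivative of the penalty, bias excluded
    return theta - alpha * grad

# toy usage: 5 samples, a bias column plus 2 features
rng = np.random.RandomState(0)
X = np.hstack([np.ones((5, 1)), rng.rand(5, 2)])
y = rng.rand(5)
theta = np.zeros(3)
for _ in range(200):
    theta = ridge_gradient_step(X, y, theta, alpha=0.1, lambda_l2=0.5)
print(ridge_cost(X, y, theta, lambda_l2=0.5))

Whether you also report the penalty inside cost() is mostly a monitoring choice, but if you tune lambda_l2 against that cost it should be included so the numbers are comparable.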
In addition to Ridge Regression, LR with L1 regularization is known as Lasso Regression. If you build regression models using Lasso, your models will be sparse. Hence, Lasso can be used for feature selection as well.
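As a quick illustration of that sparsity effect (synthetic data, and using scikit-learn's ready-made estimators rather than a hand-rolled loop):

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
# only the first two features actually matter
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

print(Ridge(alpha=1.0).fit(X, y).coef_)   # all coefficients shrunk, but typically non-zero
print(Lasso(alpha=0.1).fit(X, y).coef_)   # most of the irrelevant coefficients are exactly 0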
Good Luck!