I understand that regularization normally adds k*w^2 to the loss to penalize large weights. But in Keras there are two regularizer parameters: weight_regularizer and activity_regularizer. What is the difference?
The difference is that activity_regularizer is applied to the output of a layer rather than to its weights. It penalizes large layer outputs (activations), whereas the weight regularizer penalizes large weights.
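To make the distinction concrete, here is a minimal sketch in plain Python. It is not Keras's implementation. It assumes a toy linear layer y = W x with an L2 penalty of strength k, and the helper names `dense`, `l2_weight_penalty`, and `l2_activity_penalty` are made up for illustration.

```python
# Toy dense layer y = W x (no bias, linear activation), written in plain
# Python so the two penalty terms are easy to compare.
# All function names below are illustrative, not part of Keras.

def dense(W, x):
    """Return the layer output y = W x for weight matrix W and input vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def l2_weight_penalty(W, k):
    """Weight regularization: k * sum of squared weights (ignores the input)."""
    return k * sum(w * w for row in W for w in row)

def l2_activity_penalty(y, k):
    """Activity regularization: k * sum of squared layer outputs."""
    return k * sum(v * v for v in y)

W = [[1.0, -2.0],
     [0.5, 0.0]]
k = 0.01

small_x = [0.1, 0.1]
large_x = [10.0, 10.0]

# The weight penalty is the same whatever input is fed to the layer.
weight_pen = l2_weight_penalty(W, k)

# The activity penalty changes with the input, because it is computed
# on the layer's output rather than on its parameters.
act_pen_small = l2_activity_penalty(dense(W, small_x), k)
act_pen_large = l2_activity_penalty(dense(W, large_x), k)

print("weight penalty:", weight_pen)
print("activity penalty (small input):", act_pen_small)
print("activity penalty (large input):", act_pen_large)
```

The weight penalty depends only on W, while the activity penalty grows with the size of the layer's output for a given input. In current Keras releases, layers such as `Dense` take these as the `kernel_regularizer` and `activity_regularizer` arguments.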