Suppose I have:
output = Dense(units=12, activation='sigmoid', activity_regularizer=L1(1e-2))(input)
The Keras documentation says an activity regularizer will "apply a penalty on the layer's output", but it does not specify whether "output" means the output of the dense operation alone or that of the entire layer, including the activation.
For my problem I need the activity regularizer to apply after activation. In case Keras implements it the other way around, how can I fix it?
Keras applies activity regularization to the output of the entire layer, including the activation.
If you scroll to the end of the Dense layer's call method, you will see that the activation, if defined, is applied to the output before it is returned.
The activity regularization is applied after this call function returns, in the Layer base class.
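
If you would rather check this empirically than read the source, here is a minimal sketch. It assumes TensorFlow's bundled Keras is installed; the all-ones kernel, the 4-feature input, and the variable names are my own choices for illustration, not part of the question. A batch of one sample is used on purpose, because some Keras versions divide the activity penalty by the batch size and others do not.

```python
import numpy as np
from tensorflow import keras

# Same layer as in the question, but with a fixed all-ones kernel
# (an assumption made here so the result is deterministic).
layer = keras.layers.Dense(
    units=12,
    activation="sigmoid",
    kernel_initializer="ones",
    activity_regularizer=keras.regularizers.L1(1e-2),
)

# Batch of exactly one sample, so any per-batch averaging has no effect.
x = np.ones((1, 4), dtype="float32")
y = np.asarray(layer(x))  # output after the sigmoid

# The activity penalty Keras recorded for this call.
recorded = float(np.asarray(layer.losses[0]))

# Penalty computed by hand on the post-activation output ...
penalty_post = 1e-2 * float(np.abs(y).sum())
# ... and on the pre-activation output (x @ W, bias is zero by default).
pre_activation = x @ np.ones((4, 12), dtype="float32")
penalty_pre = 1e-2 * float(np.abs(pre_activation).sum())

print("recorded by Keras:  ", recorded)
print("L1 after activation:", penalty_post)
print("L1 before activation:", penalty_pre)
```

The recorded value should match the after-activation penalty and clearly differ from the before-activation one, which is consistent with the regularizer seeing the sigmoid output.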