What is the difference between keras.activations.softmax and keras.layers.Softmax? Why are there two definitions of the same activation function?

keras.activations.softmax: https://keras.io/activations/
keras.layers.Softmax: https://keras.io/layers/advanced-activations/
They are equivalent in terms of what they do. In fact, the Softmax layer calls activations.softmax under the hood:

def call(self, inputs):
    return activations.softmax(inputs, axis=self.axis)
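To make the wrapper relationship concrete, here is a toy pure-Python sketch (not Keras code, and the names are illustrative): a "layer" object that simply delegates to a standalone softmax function, mirroring how the real Softmax layer delegates to activations.softmax.

```python
import math

def softmax(values):
    """Plain softmax over a list of floats (a stand-in for activations.softmax)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

class SoftmaxLayer:
    """Toy stand-in for the Softmax layer: a callable that delegates to the
    standalone function, like the call method quoted above."""
    def __call__(self, inputs):
        return softmax(inputs)

# The "layer" and the bare function produce identical results:
layer_out = SoftmaxLayer()([1.0, 2.0, 3.0])
func_out = softmax([1.0, 2.0, 3.0])
```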
However, they differ in how they are used: the Softmax layer can be used directly as a layer:

from keras.layers import Softmax
soft_out = Softmax()(input_tensor)
activations.softmax, on the other hand, cannot be used directly as a layer. Instead, you can pass it as the activation function of another layer through its activation argument:

from keras import activations
from keras.layers import Dense

dense_out = Dense(n_units, activation=activations.softmax)
Further, note that one advantage of using the Softmax layer is that it takes an axis argument, so you can compute the softmax over an axis of the input other than the last one (which is the default):

soft_out = Softmax(axis=desired_axis)(input_tensor)
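To see what the axis argument controls, here is a minimal pure-Python sketch (not Keras code) of softmax over a 2-D input: with axis=-1 each row is normalized to sum to 1, while with axis=0 each column is.

```python
import math

def softmax2d(rows, axis):
    """Softmax over a 2-D list of floats along the given axis
    (axis=-1 or 1: row-wise; axis=0: column-wise)."""
    if axis in (1, -1):
        out = []
        for row in rows:
            m = max(row)  # subtract the max for numerical stability
            exps = [math.exp(v - m) for v in row]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out
    # axis=0: transpose, apply row-wise, transpose back
    transposed = [list(col) for col in zip(*rows)]
    return [list(col) for col in zip(*softmax2d(transposed, axis=-1))]

x = [[1.0, 2.0, 3.0],
     [1.0, 2.0, 3.0]]

row_wise = softmax2d(x, axis=-1)  # each row sums to 1 (the default behavior)
col_wise = softmax2d(x, axis=0)   # each column sums to 1 instead
```

Because the two rows of x are identical, the column-wise softmax splits each column evenly into 0.5 and 0.5, while the row-wise softmax gives each row the familiar softmax of [1, 2, 3].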