
In Keras, how to apply softmax function on each row of the weight matrix?


from keras.models import Model
from keras.layers import Input, Dense

a = Input(shape=(3,))
b = Dense(2, use_bias=False)(a)
model = Model(inputs=a, outputs=b)

Suppose the weight matrix of the Dense layer in the above code is [[2, 3], [3, 1], [-1, 1]]. If we give [[2, 1, 3]] as input to the model, then the output will be:

[[4, 10]]

But I want the softmax function to be applied to each row of the Dense layer's weight matrix first, so that the output will be (approximately):

[[1.7763, 4.2237]]
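
In plain numpy, the computation I'm after would look like this:

import numpy as np

w = np.array([[2., 3.], [3., 1.], [-1., 1.]])
x = np.array([[2., 1., 3.]])

# Softmax over each row of the weight matrix, then the usual matmul.
soft_w = np.exp(w) / np.sum(np.exp(w), axis=-1, keepdims=True)
print(np.dot(x, w))       # [[ 4. 10.]]       -- plain Dense
print(np.dot(x, soft_w))  # [[1.7763 4.2237]] -- with row-wise softmax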

How can I do this?


Solution

  • One way to achieve what you are looking for is to define a custom layer by subclassing the Dense layer and overriding its call method:

    from keras import backend as K
    from keras.layers import Dense

    class CustomDense(Dense):
        def call(self, inputs):
            # Apply softmax over each row of the kernel (axis=-1, i.e. across
            # the units), then multiply as a regular Dense layer would.
            output = K.dot(inputs, K.softmax(self.kernel, axis=-1))
            if self.use_bias:
                output = K.bias_add(output, self.bias, data_format='channels_last')
            if self.activation is not None:
                output = self.activation(output)
            return output
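
    Note that the softmax is applied on the fly inside call: the stored kernel stays unconstrained and trainable as usual, gradients flow through the softmax, and only the effective weights used in the forward pass are normalized.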
    

    Test to make sure it works:

    from keras.models import Sequential
    import numpy as np

    model = Sequential()
    model.add(CustomDense(2, use_bias=False, input_shape=(3,)))

    model.compile(loss='mse', optimizer='adam')

    w = np.array([[2, 3], [3, 1], [1, -1]])  # note: the third row differs from the weights in the question
    inp = np.array([[2, 1, 3]])

    model.layers[0].set_weights([w])
    print(model.predict(inp))
    
    # output
    [[4.0610714 1.9389288]]
    

    Verify it using numpy:

    # Row-wise softmax of the weights, followed by the usual matmul.
    soft_w = np.exp(w) / np.sum(np.exp(w), axis=-1, keepdims=True)
    print(np.dot(inp, soft_w))
    
    [[4.06107115 1.93892885]]
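
    If you are using tf.keras (TensorFlow 2.x) rather than standalone Keras, the same subclassing idea carries over. A minimal sketch, assuming the TF 2.x API (the class name SoftmaxKernelDense is my own):

    import numpy as np
    import tensorflow as tf

    class SoftmaxKernelDense(tf.keras.layers.Dense):
        def call(self, inputs):
            # Same trick: row-wise softmax over the kernel before the matmul.
            output = tf.matmul(inputs, tf.nn.softmax(self.kernel, axis=-1))
            if self.use_bias:
                output = tf.nn.bias_add(output, self.bias)
            if self.activation is not None:
                output = self.activation(output)
            return output

    model = tf.keras.Sequential([SoftmaxKernelDense(2, use_bias=False, input_shape=(3,))])
    model.layers[0].set_weights([np.array([[2., 3.], [3., 1.], [1., -1.]])])
    print(model.predict(np.array([[2., 1., 3.]])))  # ~[[4.061 1.939]]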