Tags: python, machine-learning, neural-network, keras, softmax

One hot input to softmax output in keras


I have a neural network whose input is a one-hot m*n matrix, with rows representing categories and columns representing positions.

I want to train the network to output another (stochastic) matrix of the same m*n shape, with the probabilities in each column summing to one. The idea would be to use a softmax final layer, but do I need to build each column separately and concatenate them like here, or is it possible to do this more simply (e.g. as a one-liner) in Keras?


Solution

  • If your model has an output shape of (None, m, n) and you want to compute the softmax over the second axis, you can simply use the softmax activation function and pass the axis argument to it (in your case it must be axis=1):

    from keras import activations
    from keras.layers import Dense
    
    def the_softmax(axis):
        def my_softmax(x):
            return activations.softmax(x, axis=axis)
        return my_softmax
    
    # sequential model
    model.add(Dense(..., activation=the_softmax(1)))
    
    # functional model
    output = Dense(..., activation=the_softmax(1))(prev_layer_output)
    

    Alternatively, if you would like to use it as an independent layer, you can use a Lambda layer and the backend softmax function:

    from keras import backend as K
    from keras.layers import Lambda
    
    def the_softmax(axis):
        def my_softmax(x):
            return K.softmax(x, axis=axis)
        return my_softmax
    
    # sequential model
    model.add(Lambda(the_softmax(1)))
    
    # functional model
    output = Lambda(the_softmax(1))(prev_layer_output)
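    As a sanity check (a standalone NumPy sketch, not part of the Keras API), you can verify what softmax over axis 1 of a (batch, m, n) tensor does: every column of the m*n output sums to one, exactly the stochastic-matrix property the question asks for.

    ```python
    import numpy as np

    def softmax(x, axis):
        # subtract the max along `axis` for numerical stability,
        # then exponentiate and normalize along that same axis
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    batch, m, n = 2, 4, 3
    logits = np.random.randn(batch, m, n)  # stand-in for a (None, m, n) model output
    probs = softmax(logits, axis=1)        # normalize over the category (row) axis

    # each of the n columns now sums to 1.0 (up to float error)
    print(probs.sum(axis=1))
    ```

    Note that axis=1 is the row/category axis only because of the leading batch dimension; for a single (m, n) array without a batch axis, the equivalent call would be axis=0.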