
Apply softmax on a subset of neurons


I'm building a convolutional net in Keras that assigns multiple classes to an image. The image has 9 points of interest, each of which can be classified in one of three ways, so I wanted to add 27 output neurons with a softmax activation that computes a probability distribution over each consecutive triple of neurons.

Is it possible to do that? I know I can simply add one big softmax layer, but this would produce a single probability distribution over all 27 output neurons, which is too broad for my application.


Solution

  • In the most naive implementation, you can reshape your data and you'll get exactly what you described: "probability for each consecutive triplet".

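    Take the output with 27 units, shaped (batch_size, 27), and reshape it to (batch_size, 9, 3). The softmax activation is applied along the last axis by default, so each triple gets its own distribution: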

    model.add(Reshape((9,3)))
    model.add(Activation('softmax'))
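
To make concrete what the reshaped softmax computes, here is a minimal NumPy sketch (not Keras code). The helper name `grouped_softmax` and the random logits are assumptions for illustration; the function only mirrors the arithmetic of a softmax taken over the last axis of a (batch_size, 9, 3) tensor.

```python
import numpy as np

def grouped_softmax(logits, groups=9, classes=3):
    """Softmax over each consecutive block of `classes` values (hypothetical helper)."""
    # (batch, 27) -> (batch, 9, 3): each row of the last axis is one point of interest
    x = logits.reshape(-1, groups, classes)
    # subtract the per-group maximum for numerical stability
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    # normalize within each group of three, not across all 27 values
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 27))   # made-up pre-activation outputs for 4 images
probs = grouped_softmax(logits)     # shape (4, 9, 3)
```

Each `probs[b, k]` is a separate distribution over the three classes of point `k`, so a whole row of 27 values sums to 9 rather than 1.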
    

    Take care to reshape your y_true data to (batch_size, 9, 3) as well, or add one more reshape to the model to restore the original form:

    model.add(Reshape((27,)))
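
As a sketch of how the targets could be laid out for either variant, assume the labels are stored as one integer class index (0, 1 or 2) per point of interest. That storage format and the example values are assumptions, not something stated in the question.

```python
import numpy as np

# Hypothetical labels: 2 images, 9 points each, class index in {0, 1, 2}
labels = np.array([[0, 1, 2, 0, 0, 1, 2, 2, 1],
                   [2, 2, 0, 1, 0, 1, 0, 2, 1]])

# One-hot encode by indexing an identity matrix: (batch, 9) -> (batch, 9, 3)
y_true_grouped = np.eye(3, dtype="float32")[labels]

# Flat layout for a model that ends with Reshape((27,)): (batch, 27)
y_true_flat = y_true_grouped.reshape(len(labels), 27)
```

`y_true_grouped` matches a model that ends at the (9, 3) softmax, while `y_true_flat` matches one that reshapes back to 27 values. In both layouts, triple `k` holds the one-hot class of point `k`.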
    

    In a more elaborate solution, you'd probably separate the 9 points of interest according to their locations (if their locations are roughly static) and build parallel paths. For instance, suppose your 9 locations are evenly spaced rectangles, and you want to use the same net and classes for all of them:

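    from keras.layers import Input, Conv2D, Flatten, Dense, Lambda, Concatenate
    from keras.models import Model

    inputImage = Input((height,width,channels))
    
    #supposing the width and height are multiples of 3, for simplicity in this example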
    recHeight = height//3
    recWidth = width//3
    
    #create layers here without calling them
    someConv1 = Conv2D(...)
    someConv2 = Conv2D(...)
    flatten = Flatten()
    classificator = Dense(..., activation='softmax')
    
    outputs = []
    for i in range(3):
        for j in range(3):
            fromH = i*recHeight
            toH = fromH + recHeight
            fromW = j*recWidth
            toW = fromW + recWidth
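            #bind the current bounds as default arguments; a plain closure would
            #see only the last loop values if the function is re-run later
            imagePart = Lambda(
                               lambda x, h0=fromH, h1=toH, w0=fromW, w1=toW: x[:, h0:h1, w0:w1, :],
                               output_shape=(recHeight,recWidth,channels)
                              )(inputImage)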
    
            #using the same net and classes for all segments
            #if this is not true, create new layers here instead of using the same
            output = someConv1(imagePart)
            output = someConv2(output)
            output = flatten(output)
            output = classificator(output)
            outputs.append(output)
    
    outputs = Concatenate()(outputs)
    
    model = Model(inputImage,outputs)
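
The slicing arithmetic in the loop above can be checked in isolation with plain NumPy. This is only a sketch with made-up image sizes (the variable names are hypothetical too); it cuts a batch of one image into the same 3x3 grid and stitches the patches back together to confirm they tile the image without gaps or overlap.

```python
import numpy as np

height, width, channels = 6, 9, 3   # hypothetical sizes, both multiples of 3
image = np.arange(height * width * channels).reshape(1, height, width, channels)

rec_height, rec_width = height // 3, width // 3

patches = []
for i in range(3):          # grid row
    for j in range(3):      # grid column
        from_h, from_w = i * rec_height, j * rec_width
        patches.append(image[:, from_h:from_h + rec_height,
                             from_w:from_w + rec_width, :])

# Stitch each grid row along the width axis, then stack the rows along the height axis
rows = [np.concatenate(patches[3 * i:3 * i + 3], axis=2) for i in range(3)]
rebuilt = np.concatenate(rows, axis=1)
```

The patches are produced in row-major order, so patch `3*i + j` covers grid row `i` and column `j`. The concatenated model output follows the same order, which is the order the label triples need to follow as well.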