Tags: numpy, theano, array-broadcasting, softmax

Masked softmax in Theano


I am wondering whether it is possible to apply a mask before performing theano.tensor.nnet.softmax.

This is the behavior I am looking for:

>>> a = np.array([[1, 2, 3, 4]])
>>> m = np.array([[1, 0, 1, 0]])  # ignore indices 1 and 3
>>> theano.tensor.nnet.softmax(a, m)
array([[ 0.11920292,  0. ,  0.88079708,  0.  ]])

Note that a and m are matrices, so I would like the softmax to work on an entire matrix and perform a row-wise masked softmax.

Also, the output should have the same shape as a, so the solution cannot use advanced indexing, e.g. theano.tensor.softmax(a[0,[0,2]])


Solution

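import theano.tensor as T

def masked_softmax(a, m, axis):
    # Exponentiate every entry, then zero out the masked positions (m == 0).
    e_a = T.exp(a)
    masked_e = e_a * m
    # Sum only the unmasked terms; keepdims=True lets the sum broadcast back.
    sum_masked_e = T.sum(masked_e, axis, keepdims=True)
    return masked_e / sum_masked_e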
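
For checking the numbers without building a Theano graph, the same idea can be sketched in plain NumPy. This is only an illustration, not part of the answer above: the name masked_softmax_np is made up, it assumes a binary (0/1) mask with at least one unmasked entry per row, and it adds a max-subtraction step (not in the original answer) so that large inputs do not overflow in exp.

```python
import numpy as np

def masked_softmax_np(a, m, axis=-1):
    """NumPy sketch of a masked softmax along `axis` (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    keep = np.asarray(m, dtype=bool)  # assumption: the mask is binary (0/1)
    # Send masked entries to -inf so they contribute exp(-inf) = 0.
    logits = np.where(keep, a, -np.inf)
    # Subtract the max over the unmasked entries; the shift cancels in the
    # ratio and keeps exp() from overflowing.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    e = np.exp(shifted)
    # Assumption: each slice along `axis` has at least one unmasked entry,
    # otherwise the result for that slice is NaN.
    return e / np.sum(e, axis=axis, keepdims=True)

a = np.array([[1, 2, 3, 4]])
m = np.array([[1, 0, 1, 0]])  # ignore indices 1 and 3
out = masked_softmax_np(a, m, axis=1)
print(out)  # approximately [[0.1192, 0., 0.8808, 0.]], same shape as a
```

The output keeps the shape of a, and each row sums to 1 over its unmasked entries, which matches the behavior asked for in the question.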