I have an array with some values that are zero and some that are non-zero. When I apply a softmax, I want the non-zero values to add up to 1 and the zeros to stay zero. But after the softmax, all values are non-zero and add up to 1.
Here's what I'm trying to do: I have some values
score[0]
<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[ 2.405819 , 27.748499 , 16.080362 , 8.780167 , 16.615538 ,
19.353844 , 19.497992 , 16.051327 , 5.4946175 , 15.927819 ,
11.512515 , 19.716702 , 15.100697 , 26.370419 , 21.838608 ,
10.650975 , 9.212484 , 17.439907 , 14.322778 , 12.001259 ,
10.433163 , 10.011807 , 15.847178 , 18.343014 , 26.086296 ,
26.723047 , 17.28703 , -0.7059817 , 26.380203 , 21.49808 ,
14.828656 , 13.711437 , 19.565845 , 5.9418716 , 12.614753 ,
29.56828 , 1.1372657 , 25.873251 , 36.031494 , -7.397362 ,
12.691793 , 4.3349338 , 15.1586275 , 14.650254 , 14.632486 ,
18.829857 , 21.885925 , 0.56010276]], dtype=float32)>
and a mask
mask_test[0]
<tf.Tensor: shape=(1, 48), dtype=int32, numpy=
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 1]])>
I multiply the values with the mask
score = tf.multiply(score, tf.cast(mask_test, tf.float32))
score[0]
<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[ 0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , -0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , -0. ,
0. , 0. , 0. , 0. , 0. ,
18.829857 , 21.885925 , 0.56010276]], dtype=float32)>
That works fine. Now I want to apply a softmax so that all non-zero values add up to 1. The zeros should stay 0.
attention_weights = tf.nn.softmax(score, axis=-1)
attention_weights[0]
<tf.Tensor: shape=(1, 48), dtype=float32, numpy=
array([[2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 2.9859784e-10, 2.9859784e-10, 2.9859784e-10,
2.9859784e-10, 4.4956207e-02, 9.5504379e-01, 5.2280064e-10]],
dtype=float32)>
And the result is that all values are non-zero. I guess that comes from the exponential in the softmax. Is there a way to achieve this with the softmax, or is there another way? The mask is not always the same.
Thanks in advance
Softmax does not work that way. Take a look at the formula of softmax: softmax(x)_i = exp(x_i) / Σ_j exp(x_j). Since exp(0) = 1, every zero entry still contributes to the denominator and comes out with a non-zero probability.
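A quick demonstration of the effect (a minimal sketch; the numbers are just for illustration):
import tensorflow as tf

# exp(0) = 1, so zeroed-out entries still receive probability mass
x = tf.constant([0.0, 0.0, 2.0])
print(tf.nn.softmax(x).numpy())  # ~[0.107, 0.107, 0.787], not [0, 0, 1]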
You would need to define a custom function for this.
A simple way of doing this would be:
import numpy as np
import tensorflow as tf

def custom_soft_max(arr):
    arr = np.asarray(arr, dtype=np.float32).copy()  # work on a copy, not in place
    non_zero_indices = np.where(arr != 0)
    logits = arr[non_zero_indices]  # gather only the non-zero entries
    arr[non_zero_indices] = tf.nn.softmax(logits).numpy()
    return arr
This excludes every index whose value is 0 and performs the softmax only on the non-zero entries, so the zeros stay exactly 0.
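If you'd rather stay entirely inside TensorFlow (and keep everything differentiable), a common alternative, used for example in attention layers, is to push the masked positions toward -infinity before the softmax instead of multiplying by the mask. A sketch, assuming score is your original (unmasked) tensor and mask_test is the 0/1 mask from your question:
import tensorflow as tf

def masked_softmax(score, mask, axis=-1):
    # Replace masked positions with a large negative number;
    # exp() then underflows to 0, so they get zero weight
    neg_inf = tf.constant(-1e9, dtype=score.dtype)
    masked_score = tf.where(tf.cast(mask, tf.bool), score, neg_inf)
    return tf.nn.softmax(masked_score, axis=axis)

attention_weights = masked_softmax(score, mask_test)
Unlike the indexing approach above, this also works row by row on batched inputs, since the softmax is still applied along axis=-1.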