python numpy matrix masking minimum

Mask minimum values in matrix rows


I have this 3x3 matrix:

import numpy as np

a = np.array([[ 1, 11,  5],
              [ 3,  9,  9],
              [ 5,  7, -3]])

I need to mask the minimum values in each row in order to calculate the mean of each row discarding the minimum values. Is there a general solution? I have tried with

a_masked=np.ma.masked_where(a==np.ma.min(a,axis=1),a)

This masks the minimum value in the first and third rows, but not in the second row. Why?

I would appreciate any help. Thanks!


Solution

  • The problem is that a == a.min(axis=1) compares every row of a against the whole vector of row minima, so element a[i, j] is checked against the minimum of row j rather than the minimum of its own row i. This happens because a.min(axis=1) returns a 1-D array of shape (3,), and broadcasting treats a 1-D array as a single row (shape 1 x 3), not as a column (shape 3 x 1).

    a == a.min(axis=1)
    
    # array([[ True, False, False],
    #        [False, False, False],
    #        [False, False,  True]], dtype=bool)
    

    One way to fix this is to resize the result of a.min(axis=1) into a column vector (i.e. a 3 x 1 2-D array), so that each row is compared with its own minimum.

    a == np.resize(a.min(axis=1), [a.shape[0],1])
    
    # array([[ True, False, False],
    #        [ True, False, False],
    #        [False, False,  True]], dtype=bool)
    

    Or, more simply, as @ColonelBeuvel has shown, index with None to add the trailing axis:

    a == a.min(axis=1)[:,None]
    

    Applying this to your full line of code:

    a_masked = np.ma.masked_where(a == np.resize(a.min(axis=1),[a.shape[0],1]), a)
    
    # masked_array(data =
    #   [[-- 11 5]
    #   [-- 9 9]
    #   [5 7 --]],
    #        mask =
    #           [[ True False False]
    #            [ True False False]
    #            [False False  True]],
    #           fill_value = 999999)
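
To get from the mask to the row means the question asks for, something like the following should work. It is only a sketch: it uses keepdims=True as another way to keep the minima as a 3 x 1 column (not part of the answer above), and the names row_min and row_means are made up for illustration.

```python
import numpy as np

a = np.array([[1, 11, 5],
              [3, 9, 9],
              [5, 7, -3]])

# keepdims=True keeps the reduced axis, so row_min has shape (3, 1)
# and each row of a is compared with its own minimum.
row_min = a.min(axis=1, keepdims=True)

# Mask every element equal to the minimum of its row.
a_masked = np.ma.masked_where(a == row_min, a)

# Masked entries are skipped by the mean, so each row is averaged
# over its remaining values: the row means are 8, 9 and 6.
row_means = a_masked.mean(axis=1)
print(row_means)
```

Note that if a row contains its minimum more than once, every occurrence is masked, so the mean is taken over fewer values for that row.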