Tags: python-3.x, pandas, numpy, equals-operator

Numpy equal to not working properly


I have a DataFrame whose values (new_df.values) are:

import numpy as np

arr = np.array([[ 0.        ,  0.31652875,  0.05650486,  0.11726623,  0.30987541,
     0.30987541,  0.30987541],
   [ 0.31652875,  0.        ,  0.34982559,  0.33382917,  0.00799828,
     0.00799828,  0.00799828],
   [ 0.05650486,  0.34982559,  0.        ,  0.07718834,  0.34384549,
     0.34384549,  0.34384549],
   [ 0.11726623,  0.33382917,  0.07718834,  0.        ,  0.32917553,
     0.32917553,  0.32917553],
   [ 0.30987541,  0.00799828,  0.34384549,  0.32917553,  0.        ,
     0.        ,  0.        ],
   [ 0.30987541,  0.00799828,  0.34384549,  0.32917553,  0.        ,
     0.        ,  0.        ],
   [ 0.30987541,  0.00799828,  0.34384549,  0.32917553,  0.        ,
     0.        ,  0.        ]])

I then found the minimum, excluding zeros:

# this is what was used to get it: new_df[new_df != 0].min().values

min_arr = np.array([ 0.05650486,  0.00799828,  0.05650486,  0.07718834,  0.00799828,
    0.00799828,  0.00799828])

When I evaluate either arr == min_arr or np.isclose(arr, min_arr), I get:

array([[False, False,  True, False, False, False, False],
       [False, False, False, False,  True,  True,  True],
       [ True, False, False,  True, False, False, False],
       [False, False, False, False, False, False, False],
       [False,  True, False, False, False, False, False],
       [False,  True, False, False, False, False, False],
       [False,  True, False, False, False, False, False]], dtype=bool)

Every row is correct except the fourth. Why is that, and is there a workaround?


Solution

  • It seems you need to add an axis to your minimums so that they broadcast per row inside np.isclose. Without [:, None], I get the same problem in the fourth row.

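    arr[arr == 0] = np.nan                 # replace zeros in place so they are ignored
    mins = np.nanmin(arr, axis=1)          # per-row minimum, shape (7,)
    print(np.isclose(arr, mins[:, None]))  # mins[:, None] has shape (7, 1): one minimum per row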
    
    [[False False  True False False False False]
     [False False False False  True  True  True]
     [ True False False False False False False]
     [False False  True False False False False]
     [False  True False False False False False]
     [False  True False False False False False]
     [False  True False False False False False]]
    

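    Why the error: with just the 1-D mins, broadcasting lines mins up with the last axis, so arr[i, j] is compared with mins[j] instead of mins[i]. The confusing part is that, because your matrix is symmetric, the result is the transpose of the mask you want, so it looks almost right: only the cells [2, 3] and [3, 2] differ.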

    For instance, without expanding to the new axis, comparison of the first row would look like:

    arr[0] == mins
    

    That compares each element of row 0 with a different row's minimum rather than with mins[0], which doesn't seem to be what you want.
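
To make the broadcasting difference concrete, here is a minimal sketch on a small symmetric matrix. The 3x3 data and the names masked, row_mins, row_mask and wrong_mask are made up for illustration and are not from the question. It also builds the NaN-masked copy with np.where, so the original array is left untouched, and uses keepdims=True as an alternative to [:, None].

```python
import numpy as np

# Hypothetical 3x3 symmetric matrix with a zero diagonal (illustration only).
arr = np.array([[0.0, 0.3, 0.1],
                [0.3, 0.0, 0.2],
                [0.1, 0.2, 0.0]])

# Copy with zeros replaced by NaN, so arr itself is not modified.
masked = np.where(arr == 0, np.nan, arr)

# keepdims=True keeps the reduced axis, giving shape (3, 1): one minimum per row.
row_mins = np.nanmin(masked, axis=1, keepdims=True)

# (3, 3) against (3, 1): arr[i, j] is compared with the minimum of row i.
row_mask = np.isclose(arr, row_mins)

# (3, 3) against (3,): arr[i, j] is compared with the minimum of row j instead.
wrong_mask = np.isclose(arr, row_mins.ravel())

print(row_mins.shape, row_mins.ravel().shape)
print(row_mask)
print(wrong_mask)
```

Because this example matrix is symmetric, wrong_mask comes out as the transpose of row_mask, which is the same "almost right" pattern as in the question. Note that np.nanmin emits a RuntimeWarning and returns NaN for any row that contains only zeros, so such rows would need separate handling.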