python arrays numpy mask

How does NumPy masking using arr[mask, ...] work?


I read in the numpy.delete documentation that, given an array arr:

mask = np.ones(len(arr), dtype=bool)
mask[[0,2,4]] = False
result = arr[mask,...]

is equivalent to np.delete(arr, [0,2,4], axis=0), but allows further use of mask.

From this I can see what arr[mask,...] does; I have tested it and am able to use it to mask arrays. But I'm curious: what exactly is this arr[mask,...] syntax, i.e. how do I use it in general?


Solution

  • First, make sure we understand the 1-D case:

    In [106]: arr = np.arange(10)
    In [107]: mask = np.ones(arr.shape, bool)
    In [108]: mask[[0,2,3,7]] = 0
    In [109]: mask
    Out[109]: 
    array([False,  True, False, False,  True,  True,  True, False,  True,
            True])
    In [110]: arr[mask]
    Out[110]: array([1, 4, 5, 6, 8, 9])
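
As a small sketch of what "further use of mask" can look like in the 1-D case: the same mask can be inverted to recover the removed elements, and it selects the same elements as indexing with the integer positions of its True entries. The variable names here (kept, same, dropped) are only illustrative.

```python
import numpy as np

arr = np.arange(10)
mask = np.ones(arr.shape, dtype=bool)
mask[[0, 2, 3, 7]] = False

kept = arr[mask]                    # elements where mask is True
same = arr[np.flatnonzero(mask)]    # integer positions of the True entries
dropped = arr[~mask]                # inverted mask selects what was removed

print(kept)     # [1 4 5 6 8 9]
print(dropped)  # [0 2 3 7]
```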
    

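    The len(arr) and [mask,...] parts add a small complication. len(arr) is the size of the first axis (arr.shape[0]), so mask is a 1-D boolean array over axis 0. The ... (Ellipsis) stands for full slices over all remaining axes, so for a 2-D array arr[mask,...] selects the same rows as arr[mask,:].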
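
To make the role of ... concrete, here is a minimal sketch comparing the equivalent spellings on a 2-D array, plus a 3-D case. The arrays arr3, mask3 and mask_last are made up for illustration and are not from the question.

```python
import numpy as np

arr = np.arange(10).reshape(5, 2)
mask = np.ones(len(arr), dtype=bool)   # len(arr) == arr.shape[0]
mask[[0, 2, 4]] = False

a = arr[mask, ...]   # Ellipsis fills in the remaining axes
b = arr[mask, :]     # explicit full slice for the one remaining axis
c = arr[mask]        # trailing axes are implied when omitted
d = np.delete(arr, [0, 2, 4], axis=0)

print(a)
# [[2 3]
#  [6 7]]

# With three dimensions, ... stands for two full slices.
arr3 = np.arange(24).reshape(4, 3, 2)
mask3 = np.array([True, False, True, False])
e = arr3[mask3, ...]
f = arr3[mask3, :, :]
print(e.shape)  # (2, 3, 2)

# Ellipsis can also come first, so the mask applies to the last axis.
mask_last = np.array([True, False])
g = arr3[..., mask_last]
print(g.shape)  # (4, 3, 1)
```

The Ellipsis mostly matters when the number of dimensions is not known in advance: arr[mask, ...] works for any array with at least one axis, whereas arr[mask, :] requires at least two.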

    The actual code that implements this kind of delete is:

        slobj = [slice(None)]*ndim
        N = arr.shape[axis]
        ...
        keep = ones(N, dtype=bool)
        ...
        keep[obj, ] = False
        slobj[axis] = keep
        new = arr[tuple(slobj)]
    

    So in the example case:

    In [112]: arr = np.arange(10).reshape(5,2)
    In [113]: arr
    Out[113]: 
    array([[0, 1],
           [2, 3],
           [4, 5],
           [6, 7],
           [8, 9]])
    In [114]: slobj = [slice(None), slice(None)]
    In [115]: mask = np.ones(5,bool)
    In [116]: mask[[0,2,4]] = 0
    In [117]: mask
    Out[117]: array([False,  True, False,  True, False])
    In [118]: slobj[0] = mask
    In [119]: slobj
    Out[119]: [array([False,  True, False,  True, False]), slice(None, None, None)]
    In [120]: arr[tuple(slobj)]
    Out[120]: 
    array([[2, 3],
           [6, 7]])
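
The same idea can be wrapped up for an arbitrary axis. The following is a rough sketch, not NumPy's own implementation: delete_with_mask is a hypothetical helper name, and it assumes the index list is converted to a tuple before indexing, which current NumPy versions require.

```python
import numpy as np

def delete_with_mask(arr, obj, axis=0):
    """Hypothetical helper: drop the positions in obj along axis.

    Returns the reduced array together with the boolean keep-mask,
    so the mask stays available for further use.
    """
    arr = np.asarray(arr)
    keep = np.ones(arr.shape[axis], dtype=bool)
    keep[obj] = False
    index = [slice(None)] * arr.ndim   # one full slice per axis
    index[axis] = keep                 # boolean mask on the chosen axis
    return arr[tuple(index)], keep

arr = np.arange(10).reshape(5, 2)
rows, keep_rows = delete_with_mask(arr, [0, 2, 4], axis=0)
cols, keep_cols = delete_with_mask(arr, [0], axis=1)

print(rows)
# [[2 3]
#  [6 7]]
print(cols.ravel())  # [1 3 5 7 9]
```

For axis=0 the tuple built here is (keep, slice(None)), which is exactly what arr[keep, ...] expands to on a 2-D array.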