
I'm using a mask to slice a numpy array, but the output is flattened. How do I retain the number of columns?


Here is what I have so far:

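import numpy as np

# Random 0/1 grid and its running count of ones down each column
arr = np.round(np.random.uniform(0,1,size = (10,10)),decimals = 0)
print(arr)
arr2 = np.cumsum(arr,axis=0)
print(arr2)
# 1 where a cell is one of the first three ones in its column, else 0
mask = np.where((arr == 1)&(arr2<=3),1,0)
print(mask)
population = np.round(np.random.uniform(0,5,size=(10,10)),decimals=0)
print(population)
# Boolean indexing with a 2d mask returns a 1d array
maskedPop = population[mask==1]
print(maskedPop)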

This outputs a flattened array. Is there a way to keep the 10 columns, so that the output is 3x10?


Solution

  • Your code, reduced in scale:

    In [153]: arr = np.round(np.random.uniform(0,1,size = (5,5)),decimals = 0)
         ...: print(arr)
         ...: arr2 = np.cumsum(arr,axis=0)
         ...: print(arr2)
         ...: mask = np.where((arr == 1)&(arr2<=3),1,0)
         ...: print(mask)
         ...: population = np.round(np.random.uniform(0,5,size=(5,5)),decimals=0)
         ...: print(population)
         ...: print(mask==1)
         ...: maskedPop = population[mask==1]
         ...: print(maskedPop)
    

    The printed results (I added the mask==1 line, since that is what does the indexing):

    [[0. 1. 1. 0. 1.]
     [1. 0. 1. 1. 1.]
     [1. 0. 0. 1. 1.]
     [1. 1. 0. 0. 1.]
     [0. 0. 0. 0. 0.]]
    [[0. 1. 1. 0. 1.]
     [1. 1. 2. 1. 2.]
     [2. 1. 2. 2. 3.]
     [3. 2. 2. 2. 4.]
     [3. 2. 2. 2. 4.]]
    [[0 1 1 0 1]
     [1 0 1 1 1]
     [1 0 0 1 1]
     [1 1 0 0 0]
     [0 0 0 0 0]]
    [[0. 5. 2. 2. 2.]
     [1. 4. 2. 4. 0.]
     [2. 3. 3. 2. 2.]
     [4. 4. 3. 1. 3.]
     [4. 2. 2. 1. 5.]]
    [[False  True  True False  True]
     [ True False  True  True  True]
     [ True False False  True  True]
     [ True  True False False False]
     [False False False False False]]
    [5. 2. 2. 1. 2. 4. 0. 2. 2. 2. 4. 4.]
    

    Count the number of True values per row or column. The counts differ, so the selected values cannot fill a rectangular array. Tell us how this could retain some sort of 2d result!
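
A quick sketch of that count, with the mask hard-coded from the printout above (the variable names `per_column` and `per_row` are mine):

```python
import numpy as np

# Boolean mask copied from the printed mask==1 result above
mask = np.array([[0, 1, 1, 0, 1],
                 [1, 0, 1, 1, 1],
                 [1, 0, 0, 1, 1],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 0, 0]]) == 1

per_column = mask.sum(axis=0)  # number of selected values in each column
per_row = mask.sum(axis=1)     # number of selected values in each row

print(per_column)  # [3 2 2 2 3]
print(per_row)     # [3 4 3 2 0]
print(mask.sum())  # 12, the length of the flattened result
```

Neither axis has a constant count, which is why boolean indexing can only return the 12 selected values as a 1d array.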

    ===

    I see you already display mask, so mask==1 is the same as

    In [158]: mask.astype(bool)
    Out[158]: 
    array([[False,  True,  True, False,  True],
           [ True, False,  True,  True,  True],
           [ True, False, False,  True,  True],
           [ True,  True, False, False, False],
           [False, False, False, False, False]])
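
As a side note, the `np.where(..., 1, 0)` step is not needed for indexing, since the condition is already a boolean array. A small check of that equivalence, assuming a seeded generator (the seed and the names `int_mask` and `bool_mask` are only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
arr = np.round(rng.uniform(0, 1, size=(5, 5)), decimals=0)
arr2 = np.cumsum(arr, axis=0)

int_mask = np.where((arr == 1) & (arr2 <= 3), 1, 0)  # 0/1 mask, as in the question
bool_mask = (arr == 1) & (arr2 <= 3)                 # the condition itself

# All three forms select the same elements
print(np.array_equal(int_mask == 1, bool_mask))          # True
print(np.array_equal(int_mask.astype(bool), bool_mask))  # True
```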
    

    There is a MaskedArray class that lets you work with an array with certain values 'masked-out':

    In [161]: np.ma.masked_array(population, mask!=1)
    Out[161]: 
    masked_array(
      data=[[--, 5.0, 2.0, --, 2.0],
            [1.0, --, 2.0, 4.0, 0.0],
            [2.0, --, --, 2.0, 2.0],
            [4.0, 4.0, --, --, --],
            [--, --, --, --, --]],
      mask=[[ True, False, False,  True, False],
            [False,  True, False, False, False],
            [False,  True,  True, False, False],
            [False, False,  True,  True,  True],
            [ True,  True,  True,  True,  True]],
      fill_value=1e+20)
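
What you gain from the masked array depends on what you do next. As a rough sketch with the same example data hard-coded, reductions skip the masked entries while the 2d shape is kept (the name `mpop` here is my own):

```python
import numpy as np

population = np.array([[0., 5., 2., 2., 2.],
                       [1., 4., 2., 4., 0.],
                       [2., 3., 3., 2., 2.],
                       [4., 4., 3., 1., 3.],
                       [4., 2., 2., 1., 5.]])
mask = np.array([[0, 1, 1, 0, 1],
                 [1, 0, 1, 1, 1],
                 [1, 0, 0, 1, 1],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 0, 0]])

# In np.ma, True in the mask means "hide this value", hence mask != 1
mpop = np.ma.masked_array(population, mask != 1)

col_sums = mpop.sum(axis=0)      # per-column sums over the kept values only
col_counts = mpop.count(axis=0)  # number of kept values per column
flat = mpop.compressed()         # 1d array of kept values, like population[mask==1]

print(mpop.shape)  # (5, 5)
print(col_sums)
print(col_counts)
print(flat)
```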
    

    ===

    Another way to keep the 2d shape is to overwrite the unwanted values with a placeholder such as nan:

    In [162]: mpop = population.copy()
    In [163]: mpop[mask!=1] = np.nan
    In [164]: mpop
    Out[164]: 
    array([[nan,  5.,  2., nan,  2.],
           [ 1., nan,  2.,  4.,  0.],
           [ 2., nan, nan,  2.,  2.],
           [ 4.,  4., nan, nan, nan],
           [nan, nan, nan, nan, nan]])
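
If what you want is really a 3-row result, one possible approach (my assumption about the goal, not something the question spells out) is to push the kept values to the top of each column and pad the short columns with nan. A sketch with the same example data, using a stable `np.argsort` and `np.take_along_axis`; the names `keep`, `order`, `packed` and `result` are only illustrative:

```python
import numpy as np

population = np.array([[0., 5., 2., 2., 2.],
                       [1., 4., 2., 4., 0.],
                       [2., 3., 3., 2., 2.],
                       [4., 4., 3., 1., 3.],
                       [4., 2., 2., 1., 5.]])
mask = np.array([[0, 1, 1, 0, 1],
                 [1, 0, 1, 1, 1],
                 [1, 0, 0, 1, 1],
                 [1, 1, 0, 0, 0],
                 [0, 0, 0, 0, 0]])
keep = mask == 1

# Replace the unwanted values with nan, as above
mpop = np.where(keep, population, np.nan)

# A stable sort on ~keep puts the kept rows of each column first,
# preserving their original top-to-bottom order
order = np.argsort(~keep, axis=0, kind="stable")
packed = np.take_along_axis(mpop, order, axis=0)

# The mask keeps at most 3 values per column, so 3 rows hold them all
result = packed[:3]
print(result)
```

Columns with fewer than 3 kept values end in nan, so functions like `np.nansum` or `np.nanmean` are the ones to use on `result`.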