python pandas zero

Getting a different value than zero in a pandas DataFrame when surrounded by other values in a certain radius


For a research project with big datasets, I created a dataset of zeros (0) and ones (1). However, when a 0 is surrounded by 1s in all directions, it should get a value of 2.

I work in a Spyder environment with Python 3.7. Nothing too remarkable. I just can't figure out the code.

import pandas as pd

df = pd.read_excel(r'D:\AW 1920 VU\Research Project\Nieuwe map\Proberen.xlsx')  # just an example Excel sheet
print(df)

# Replace every value from 1 to 19 with 1, so only zeros and ones remain
df2 = df.replace(range(1, 20), 1)
print(df2)


df = 
[{0 0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   1   0   0   0   0   0   0   0}
{0  0   0   1   11  2   1   1   0    0  0   0   0}
{0  0   0   7   13  1   0   0   0   0   0   0   0}
{0  0   0   2   2   7   0   2   1   0   0   0   0}
{0  0   0   3   5   8   8   2   1   0   0   0   0}
{0  0   0   1   6   7   0   0   1   1   0   0   0}
{0  0   0   1   1   0   0   0   2   0   0   0   0}
{0  0   0   1   1   1   1   0   3   4   0   0   0}
{0  0   0   0   0   1   1   1   2   0   0   0   0}
{0  0   0   0   0   0   1   1   1   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}]

df2=
[{0 0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   1   0   0   0   0   0   0   0}
{0  0   0   1   1   1   1   1   0   0   0   0   0}
{0  0   0   1   1   1   0   0   0   0   0   0   0}
{0  0   0   1   1   1   0   1   1   0   0   0   0}
{0  0   0   1   1   1   1   1   1   0   0   0   0}
{0  0   0   1   1   1   0   0   1   1   0   0   0}
{0  0   0   1   1   0   0   0   1   0   0   0   0}
{0  0   0   1   1   1   1   0   1   1   0   0   0}
{0  0   0   0   0   1   1   1   1   0   0   0   0}
{0  0   0   0   0   0   1   1   1   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}]

All fine so far. But as you can see, there is a patch of zeros that is completely surrounded by ones. How can I lock/buffer/highlight that area and give it a "special value" (2)? The result should look something like this:

df3=
[{0 0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   1   0   0   0   0   0   0   0}
{0  0   0   1   1   1   1   1   0   0   0   0   0}
{0  0   0   1   1   1   0   0   0   0   0   0   0}
{0  0   0   1   1   1   0   1   1   0   0   0   0}
{0  0   0   1   1   1   1   1   1   0   0   0   0}
{0  0   0   1   1   1   2   2   1   1   0   0   0}
{0  0   0   1   1   2   2   2   1   0   0   0   0}
{0  0   0   1   1   1   1   2   1   1   0   0   0}
{0  0   0   0   0   1   1   1   1   0   0   0   0}
{0  0   0   0   0   0   1   1   1   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}
{0  0   0   0   0   0   0   0   0   0   0   0   0}]

Hopefully the table is readable. Looking forward to the responses.


Solution

  • Used code:

    import pandas as pd
    import numpy as np
    from scipy import ndimage
    #%%
    df = np.array([
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 1,11, 2, 1, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 7,13, 1, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 2, 2, 7, 0, 2, 1, 0, 0, 0, 0],
        [0, 0, 0, 3, 5, 8, 8, 2, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 6, 7, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0, 2, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 1, 1, 0, 3, 4, 0, 0, 0],
        [0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
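    # Invert the grid so the zeros become the foreground:
    # cells >= 1 turn into 0, everything else turns into 1.
    df4 = np.where(df >= 1, 0, 1)

    # Label the connected regions of zeros. The 3x3 structure of ones
    # makes diagonal neighbours count as connected.
    labeled_array, num_features = ndimage.label(df4, np.ones((3, 3)))
    print(labeled_array, num_features)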
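
The code above stops at labelling the regions of zeros; it does not yet write the value 2 anywhere. One possible way to finish the job is sketched below. It assumes that "surrounded" means "not connected to the outer edge of the grid", and the function name mark_enclosed_zeros and its fill_value parameter are made up for this illustration, not part of the original answer.

```python
import numpy as np
from scipy import ndimage


def mark_enclosed_zeros(grid, fill_value=2):
    """Return a 0/1 copy of grid where zeros cut off from the border get fill_value."""
    grid = np.asarray(grid)
    zeros = grid == 0

    # Label connected regions of zeros; diagonal neighbours count as connected.
    labels, _ = ndimage.label(zeros, structure=np.ones((3, 3)))

    # Assumption: a region of zeros that touches the outer edge is "open".
    edge = np.concatenate([labels[0, :], labels[-1, :], labels[:, 0], labels[:, -1]])
    enclosed = zeros & ~np.isin(labels, np.unique(edge))

    result = (grid != 0).astype(int)  # collapse every nonzero value to 1
    result[enclosed] = fill_value     # mark the enclosed zeros
    return result


df = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 11, 2, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 7, 13, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 2, 7, 0, 2, 1, 0, 0, 0, 0],
    [0, 0, 0, 3, 5, 8, 8, 2, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 6, 7, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 3, 4, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1, 2, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

df3 = mark_enclosed_zeros(df)
print(df3)
```

Because diagonal neighbours count as connected here, a patch of zeros that can "leak" to the outside through a diagonal gap is treated as open and stays 0. If the data lives in a pandas DataFrame, passing its underlying array (for example df.to_numpy()) and wrapping the result back into a DataFrame should work the same way. scipy.ndimage.binary_fill_holes may offer a shorter route, but its default connectivity is different, so the results should be checked on your own data.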