Search code examples
pythonpandasgroup-bycounter

I want to get the 'False' Count for every segment


I have a data set that I have broken into segments based on a certain criteria. In another column I have it where the code returns 'True' if the value is equal to the one before it and 'False' if it is not equal. I am able to get the total count of 'False' values for the entire data set, but I am trying to get the total count of 'False' values per segment.

My code:

df['cols2'] = df['cols1'].diff().eq(0).replace({False : 0, True : 1})

counter_obj = Counter(df['cols2'])

false_count = counter_obj[False]

seg = df.groupby('segID')[cols2 , false_count].sum()
print(seg)

Solution

  • import pandas as pd
    
    df = pd.DataFrame({'cols1': [1, 2, 3, 3, 4, 5, 5, 6, 7]})
    df['cols2'] = df['cols1'].diff().eq(0).replace({False: 0, True: 1})
    
    df['cs'] = df['cols2'].cumsum()
    

    I can suggest creating a 'cs' column with a cumulative sum of 'cols2' in order to divide the dataframe into groups. As far as I understood you, you need to count only zeros in each segment.

    Input

       cols1  cols2  cs
    0      1      0   0
    1      2      0   0
    2      3      0   0
    3      3      1   1
    4      4      0   1
    5      5      0   1
    6      5      1   2
    7      6      0   2
    8      7      0   2
    

    To aggregate a count by group:

    agr = df.groupby('cs').apply(lambda x: x.loc[x['cols2'] == 0, 'cols2'].count())
    

    Output

    cs
    0    3
    1    2
    2    2
    

    and if need data for each row:

    df['count'] = df.groupby('cs')['cols2'].transform(lambda x: x[x==0].count())
    

    Output

       cols1  cols2  cs  count
    0      1      0   0      3
    1      2      0   0      3
    2      3      0   0      3
    3      3      1   1      2
    4      4      0   1      2
    5      5      0   1      2
    6      5      1   2      2
    7      6      0   2      2
    8      7      0   2      2