Search code examples
pythonpandasdataframecumsum

How to count the number of occurences before a particular value in dataframe python?


I have a dataframe like below:

A   B   C
1   1   1
2   0   1
3   0   0
4   1   0
5   0   1
6   0   0
7   1   0

I want the number of occurence of zeroes from df['B'] under the following condition:

if(df['B']<df['C']):
  #count number of zeroes in df['B'] until it sees 1.

expected output:

A   B   C  output
1   1   1   Nan
2   0   1   1
3   0   0   Nan
4   1   0   Nan
5   0   1   1
6   0   1   0
7   1   0   Nan

I dont know how to formulate the count part. Any help is really appreciated


Solution

  • IIUC one approach would be using a custom grouper and aggregating with groupby.cumcount:

    c1 = df.B.lt(df.C)
    g = df.B.eq(1).cumsum()
    df['out'] = c1.groupby(g).cumcount(ascending=False).shift().where(c1).sub(1)
    

    print(df)
    
       A  B  C  out
    0  1  1  1  NaN
    1  2  0  1  1.0
    2  3  0  0  NaN
    3  4  1  0  NaN
    4  5  0  1  1.0
    5  6  0  1  0.0
    6  7  1  0  NaN