Search code examples
pythonpandasdataframelambda

Pandas, create column using previous new column value


I am using Python and have the following Pandas Dataframe:

idx result grouping
1 False
2 True
3 True
4 False
5 True
6 True
7 True
8 False
9 True
10 True
11 True
12 True

What I would like is to do the following logic...

if the result is False then I want grouping to be the idx value.

if the result is True then I want the grouping to be the previous grouping value

So the end result will be:

idx result grouping
1 False 1
2 True 1
3 True 1
4 False 4
5 True 4
6 True 4
7 True 4
8 False 8
9 True 8
10 True 8
11 True 8
12 True 8

I have tried all sorts to get this working from using the Pandas shift() command to using lambda, but I am just not getting it.

I know I could iterate through the dataframe and perform the calculation but there has to be a better method.

examples of what I have tried and failed with are:

  • df['grouping'] = df['idx'] if not df['result'] else df['grouping'].shift(1)

  • df['grouping'] = df.apply(lambda x: x['idx'] if not x['result'] else x['grouping'].shift(1), axis=1)

Many Thanks for any assistance you can provide.


Solution

  • mask true values then forward fill

    df['grouping'] = df['idx'].mask(df['result']).ffill(downcast='infer')
    

        idx  result  grouping
    0     1   False         1
    1     2    True         1
    2     3    True         1
    3     4   False         4
    4     5    True         4
    5     6    True         4
    6     7    True         4
    7     8   False         8
    8     9    True         8
    9    10    True         8
    10   11    True         8
    11   12    True         8