Search code examples
pythonrow

how to calculate value using previous other columns' value in python


I'd like to calculate rate of return from Dataframe belows. Using other columns' previous row, even group by id To be specific ,

From

>>> df = pd.DataFrame({'id': ['Blue', 'Blue','Blue','Red','Red'],
                            'a':[100,200,300,1,2], 
                            'b':[10,20,15,3,2],
                            'c':[1,2,3,4,5]})
>>> df
     id    a   b  c
0  Blue  100  10  1
1  Blue  200  20  2
2  Blue  300  15  3
3   Red    1   3  4
4   Red    2   2  5

I want to make following.

df['new_col'] = a / a(previous row value) + b(previous row value) - c(previous row value)

I think pct_change() doen't help since it works only same column.


>>> df
     id    a   b  c   new_col
0  Blue  100  10  1   -
1  Blue  200  20  2   = 200 / (100 + 10 - 1)
2  Blue  300  15  3   = 300 / (200 + 20 - 2)
3   Red    1   3  4   -
4   Red    2   2  5   = 2 / (1 + 3 - 4)

Solution

  • Try:

    def fn(g):
        out = g['a'] / (g['a'].shift() + g['b'].shift() - g['c'].shift())
        g['new_col'] = out
        return g
    
    df = df.groupby('id', group_keys=False).apply(fn)
    print(df)
    

    Prints:

         id    a   b  c   new_col
    0  Blue  100  10  1       NaN
    1  Blue  200  20  2  1.834862
    2  Blue  300  15  3  1.376147
    3   Red    1   3  4       NaN
    4   Red    2   2  5       inf