Search code examples
pythonpandasfor-loopgroup

Create derived columns using recursive data and separating by category in Python


I have a dataframe with the following (code, inbound, sell) and I want to add (inventory, oh_avg), like that:

idx product inbound sell inventory oh_avg
1   1       6       3     3        1         (3/mean(3))
2   1       4       6     1        0.22      (1/mean(3+6))
3   1       4       0     5        1.67      (5/mean(3+6+0))
4   2       20      10    10       1         (10/mean(10))
5   2       10      5     15       2         (15/mean(10+5))

Inventory should be (inbound - sell + previous inventory)

oh_avg should be (inventory/mean(all previous sales))

Both should be by product

How do I do that?


Solution

  • Try (I've added inventory 2 and oh_avg 2 for comparison to desired result):

    g = df.groupby("product")
    
    df["inventory 2"] = g["inbound"].cumsum() - g["sell"].cumsum()
    df["oh_avg 2"] = df["inventory 2"] / g["sell"].expanding().mean().values
    
    print(df)
    

    Prints:

       idx  product  inbound  sell  inventory  oh_avg  inventory 2  oh_avg 2
    0    1        1        6     3          3    1.00            3  1.000000
    1    2        1        4     6          1    0.22            1  0.222222
    2    3        1        4     0          5    1.67            5  1.666667
    3    4        2       20    10         10    1.00           10  1.000000
    4    5        2       10     5         15    2.00           15  2.000000