Search code examples
pandasgroup-byapply

Groupby and apply function multiple times for producing multiple new columns


I would like to create three new columns in an existing pandas dataframe (df) by using the same lambda function on the same column only with different input. I have succeeded that by the following lines of code. Nevertheless, I would like to know whether there is a quicker way to produce the same output with fewer lines of code:

df["1d"] = (
        df
        .groupby(cols)["ln"]
        .apply(lambda x: x - x.shift(1))
        .values
)
df["5d"] = (
        df
        .groupby(cols)["ln"]
        .apply(lambda x: x - x.shift(5))
        .values
)
df["30d"] = (
        df
        .groupby(cols)["ln"]
        .apply(lambda x: x - x.shift(30))
        .values
)

Solution

  • You have to loop over the shifts, but you could reuse the grouper:

    shifts = [1, 5, 30]
    g = df.groupby(cols)['ln']
    for s in shifts:
        df[f'{s}d'] = df['ln']-g.shift(s)
    

    Also, your approach to use groupby.apply + .values is incorrect. This would only work if the rows are already sorted by group. If you use groupby.shift you will maintain a correct index/alignement.

    If you really need to use a custom function, go with groupby.transform:

    shifts = [1, 5, 30]
    g = df.groupby(cols)['ln']
    for s in shifts:
        df[f'{s}d'] = g.transform(lambda x: x-x.shift(s))
    

    Example with a dummy input and cols = ['A', 'B']:

        A  B  ln    1d    5d  30d
    0   0  0   0   NaN   NaN  NaN
    1   0  0   2   2.0   NaN  NaN
    2   0  0  10   8.0   NaN  NaN
    3   0  1   6   NaN   NaN  NaN
    4   0  1   8   2.0   NaN  NaN
    5   0  1   9   1.0   NaN  NaN
    6   0  1  15   6.0   NaN  NaN
    7   1  0   4   NaN   NaN  NaN
    8   1  0  19  15.0   NaN  NaN
    9   1  1   1   NaN   NaN  NaN
    10  1  1   3   2.0   NaN  NaN
    11  1  1  12   9.0   NaN  NaN
    12  1  1  16   4.0   NaN  NaN
    13  1  1  17   1.0   NaN  NaN
    14  1  1  18   1.0  17.0  NaN
    15  2  0   7   NaN   NaN  NaN
    16  2  0  11   4.0   NaN  NaN
    17  2  0  13   2.0   NaN  NaN
    18  2  1   5   NaN   NaN  NaN
    19  2  1  14   9.0   NaN  NaN