python pandas group-by apply rolling-computation

Apply function to a groupby and rolling group

Given a pandas dataframe, group it by 3 levels and apply a function which takes two columns as arguments on a rolling basis:

Code to create the data frame:

import pandas as pd
import numpy as np
df_index = pd.DataFrame({
    'classes': ['A']*3*3 +  ['B']*3*3 +['C']*3*3,
    'names': ['Alex']*3 + ['Alan']*3 + ['Ash']*3 + ['Bart']*3 + ['Blake']*3 + ['Beth']*3 + ['Charlie']*3 + ['Cristine']*3 + ['Cameron']*3,
    'numbers': [0, 1, 2] * 9
})

df_values = pd.DataFrame(data=np.random.normal(), index=pd.date_range(start='2020-01-01', end='2024-01-01'), columns=['net', 'gross']).rename_axis('date').reset_index(drop=False)

df = pd.DataFrame()
for i in range(len(df_index)):
    _df = df_values.copy()
    _df['classes'] = df_index.iloc[i]['classes']
    _df['names'] = df_index.iloc[i]['names']
    _df['numbers'] = df_index.iloc[i]['numbers']
    df = pd.concat((df, _df), axis=0)

df.set_index(['classes','names','numbers','date'], inplace=True)

Some function:

def fun(net, gross):
    return net.mean() / gross.std()

The following does not work. I am looking to groupby, and apply "fun()" in a rolling basis:

df.groupby(['classes', 'names', 'numbers']).rolling(window=500).apply(
    lambda x: fun(net=x['net'], gross=x['gross'])  
)

Thanks.

PS: the real fun() is much complex than the one here so I cannot calculate ".mean()" and ".std()" directly to the groupby.

Solution

rolling.apply operates per column, so it isn't directly possible to use multiple columns.

Instead you could slice one column and use a side effect by slicing the original DataFrame based on the indices of the window:

(df.groupby(['classes', 'names', 'numbers']).rolling(window=500)['net']
   .apply(lambda x: fun(net=df.loc[x.index, 'net'],
                        gross=df.loc[x.index, 'gross']))
)

But be aware that the operation might be very slow!