Search code examples
pythonpandaspandas-groupbymoving-averagesplit-apply-combine

Pandas - directly add moving average columns from group by to dataframe


I have a dataframe with the following columns:

name, date, day_index, value

I want to add a 4th column to the same dataframe which is the exponentially weighted moving average of the 3rd column (value) for each name, sorted by first date and then day_index. I can generate this as a series using the following code.

df.sort_values(['date','day_index'],inplace=True)

ecw_series = df.groupby('name').apply(lambda x: 
x["value"].ewm(halflife=2).mean())

However, if I try and directly add this to the original dataframe, I get the following error:

df['ecw'] =  df.groupby('name').apply(lambda x: 
x["value"].ewm(halflife=2).mean())



incompatible index of inserted column with frame index

And if I try and merge the series with the dataframe I get the following error:

df['index'] = df.index

df = df.merge(ecw_series, left_on=['name','index'],right_index=True)

can not merge DataFrame with instance of type <class 
'pandas.core.series.Series'

At this point, I was considering converting the series to a dataframe, and then merging. But I am sure there must be a better way.


Solution

  • The following approach works:

    df['ecw'] = model_df.groupby('name')['value'].apply(lambda x: 
     x.ewm(halflife=2).mean())
    

    I am still slightly confused by why you cannot reference the 'value' column within the Lambda function.