I have a dataframe with the following columns:
name, date, day_index, value
I want to add a 4th column to the same dataframe which is the exponentially weighted moving average of the 3rd column (value) for each name, sorted by first date and then day_index. I can generate this as a series using the following code.
df.sort_values(['date','day_index'],inplace=True)
ecw_series = df.groupby('name').apply(lambda x:
x["value"].ewm(halflife=2).mean())
However, if I try and directly add this to the original dataframe, I get the following error:
df['ecw'] = df.groupby('name').apply(lambda x:
x["value"].ewm(halflife=2).mean())
incompatible index of inserted column with frame index
And if I try and merge the series with the dataframe I get the following error:
df['index'] = df.index
df = df.merge(ecw_series, left_on=['name','index'],right_index=True)
can not merge DataFrame with instance of type <class
'pandas.core.series.Series'
At this point, I was considering converting the series to a dataframe, and then merging. But I am sure there must be a better way.
The following approach works:
df['ecw'] = model_df.groupby('name')['value'].apply(lambda x:
x.ewm(halflife=2).mean())
I am still slightly confused by why you cannot reference the 'value' column within the Lambda function.