I have the following Python function:
def compute_average_fg_rating(df, mask=''):
    df = df[['HorseId', 'FGrating']]
    if len(mask) == 0:
        df.loc['cumsum'] = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift(fill_value=0).cumsum())
        return df.loc['cumsum'] / df.groupby('HorseId')['FGrating'].cumcount()
    else:
        return df.loc[mask].groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift().expanding().mean())
When I try to run the code, I get the "A value is trying to be set on a copy of a slice from a DataFrame" warning at the line:
df.loc['cumsum'] = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
    lambda x: x.shift(fill_value=0).cumsum())
I cannot see where the problematic code is. Can you help me?
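The warning comes from the first line of the function: df = df[['HorseId', 'FGrating']] selects columns from the original DataFrame, so pandas cannot tell whether the later assignment writes to a copy or to the original. Adding .copy() to that line (df = df[['HorseId', 'FGrating']].copy()) removes the ambiguity. Separately, df.loc['cumsum'] refers to a row labelled 'cumsum', not a column, so the values do not end up where you expect. In your if statement, change it like this: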
cumsum_df = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
    lambda x: x.shift(fill_value=0).cumsum())
df.loc[cumsum_df.index, 'cumsum'] = cumsum_df
return df['cumsum'] / df.groupby('HorseId')['FGrating'].cumcount()
This should solve your problem.
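For reference, here is a minimal sketch of how the whole function might look with both changes in place: an explicit .copy() on the column selection and a column assignment instead of df.loc['cumsum']. The sample data at the bottom is made up purely for illustration, and I have not checked this against your real data.

```python
import pandas as pd


def compute_average_fg_rating(df, mask=''):
    # Explicit copy: later assignments act on an independent frame,
    # so pandas has no reason to warn about writing to a slice.
    df = df[['HorseId', 'FGrating']].copy()
    if len(mask) == 0:
        # Sum of each horse's earlier ratings (the shift excludes the current row).
        df['cumsum'] = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift(fill_value=0).cumsum())
        # cumcount() is the number of earlier rows for the same horse;
        # a horse's first row yields 0 / 0, which pandas reports as NaN.
        return df['cumsum'] / df.groupby('HorseId')['FGrating'].cumcount()
    else:
        return df.loc[mask].groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift().expanding().mean())


# Hypothetical sample data, for illustration only.
sample = pd.DataFrame({'HorseId': [1, 1, 2, 1, 2],
                       'FGrating': [10, 20, 30, 40, 50]})
result = compute_average_fg_rating(sample)
# Indexed 0..4, result holds: NaN, 10.0, NaN, 15.0, 30.0
```

Because the function works on its own copy, the DataFrame you pass in is left untouched (no 'cumsum' column is added to it).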