Tags: python, pandas, dataframe, warnings

How can I fix the "A value is trying to be set on a copy of a slice from a DataFrame" warning in pandas?


I have the following Python function:

def compute_average_fg_rating(df, mask=''):
    df = df[['HorseId', 'FGrating']]
    if len(mask) == 0:
        df.loc['cumsum'] = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift(fill_value=0).cumsum())
        return df.loc['cumsum'] / df.groupby('HorseId')['FGrating'].cumcount()
    else:
        return df.loc[mask].groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift().expanding().mean())

When I try to run the code, I get the "A value is trying to be set on a copy of a slice from a DataFrame" warning at the line:

        df.loc['cumsum'] = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
            lambda x: x.shift(fill_value=0).cumsum())

I cannot see where the problematic code is. Can you help me?


Solution

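  • df.loc['cumsum'] selects a row labelled 'cumsum', not a column, so the assignment tries to add a new row instead of a new column. The warning itself most likely comes from df = df[['HorseId', 'FGrating']]: pandas treats the result as a slice of the original frame and warns when you later assign into it. Taking an explicit copy, df = df[['HorseId', 'FGrating']].copy(), avoids that. In your if branch, assign to the column like this:

    cumsum_df = df.groupby('HorseId', group_keys=False)['FGrating'].apply(
        lambda x: x.shift(fill_value=0).cumsum())
    df.loc[cumsum_df.index, 'cumsum'] = cumsum_df
    return df['cumsum'] / df.groupby('HorseId')['FGrating'].cumcount()

    This should solve your problem.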
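
    For reference, here is a minimal sketch of the whole function with the column assignment and an explicit copy. It is only one way to write it, under a few assumptions: the sample frame `races` and its values are made up for illustration, and `transform` is swapped in for the original `apply(..., group_keys=False)` because it returns a result aligned to the original index.

```python
import pandas as pd


def compute_average_fg_rating(df, mask=''):
    # Work on an explicit copy so later assignments cannot touch
    # (or warn about) the caller's DataFrame.
    df = df[['HorseId', 'FGrating']].copy()
    ratings = df.groupby('HorseId')['FGrating']
    if len(mask) == 0:
        # Sum of each horse's previous ratings, stored as a new column.
        df['cumsum'] = ratings.transform(
            lambda x: x.shift(fill_value=0).cumsum())
        # Divide by the number of previous rows for the same horse;
        # the first row of each horse is 0 / 0 and becomes NaN.
        return df['cumsum'] / ratings.cumcount()
    # With a mask, average only the previous rows that pass the mask.
    return df.loc[mask].groupby('HorseId')['FGrating'].transform(
        lambda x: x.shift().expanding().mean())


# Hypothetical sample data, not taken from the question.
races = pd.DataFrame({
    'HorseId': [1, 1, 2, 1, 2],
    'FGrating': [10, 20, 30, 40, 50],
})
result = compute_average_fg_rating(races)
print(result)
```

    Because the function works on its own copy, `races` is left without a 'cumsum' column after the call.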