
Change value in pandas after chained loc and iloc


I have the following problem: in a df, I want to select specific rows and a specific column and in this selection take the first n elements and assign a new value to them. Naively, I thought that the following code should do the job:

import seaborn as sns
import pandas as pd

df = sns.load_dataset('tips')
df.loc[df.day=="Sun", "smoker"].iloc[:4] = "Yes"

I expected both loc and iloc to return a view into the df, so the value should be overwritten. However, the dataframe does not change. Why?

I know how to work around it: first create a new object with just the loc selection, then change the values using iloc, and finally write the result back to the original df (as below).

But a) I do not think it's optimal, and b) I would like to know why the top solution does not work. Why does it return a copy and not a view of a view?

The alternative solution:

df = sns.load_dataset('tips')
tmp = df.loc[df.day=="Sun", "smoker"]
tmp.iloc[:4] = "Yes"
df.loc[df.day=="Sun", "smoker"] = tmp

Note: I have read the docs, this really great post and this question, but they don't explain this. Their concern is the difference between df.loc[mask, "z"] and the chained df["z"][mask].


Solution

  • I believe df.loc[].iloc[] is a chained assignment case and pandas doesn't guarantee that you will get a view at the end. From the docs:

    Whether a copy or a reference is returned for a setting operation, may depend on the context. This is sometimes called chained assignment and should be avoided.

    Since you have a filtering condition in loc, pandas will create a new pd.Series and then apply the assignment to that new object, not to df. By contrast, the following can work (in pandas versions without Copy-on-Write enabled), because there you get the same series as df["smoker"]:

    df.loc[:, "smoker"].iloc[:4] = 'Yes'
    

    But you will get a SettingWithCopyWarning.
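
A small sketch of the copy behaviour described above. It uses a made-up frame rather than the tips dataset, so the column values are assumptions for illustration only. The boolean mask makes loc return a new Series, so the chained iloc assignment never reaches df:

```python
import warnings

import pandas as pd

# Hypothetical stand-in for the tips dataset
df = pd.DataFrame({
    "day": ["Sun", "Sat", "Sun", "Sun", "Sat", "Sun", "Sun"],
    "smoker": ["No"] * 7,
})
mask = df["day"] == "Sun"

# Silence the chained-assignment warning pandas may emit here
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # loc with a boolean mask returns a copy; iloc then modifies that copy
    df.loc[mask, "smoker"].iloc[:4] = "Yes"

# The original frame is untouched
unchanged = bool((df["smoker"] == "No").all())
print(unchanged)
```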

    You need to rewrite your code so that pandas handles the selection and the assignment in a single loc call.

    Another possible workaround:

    df.loc[df[df.day=="Sun"].index[:4], "smoker"] = 'Yes'
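
As a rough illustration of the single-loc approach, here is a self-contained sketch on a made-up frame (the data is an assumption, not the tips dataset). The first variant selects by index label and so assumes the index labels are unique. The second builds one boolean mask with cumsum, which should also work when labels repeat:

```python
import pandas as pd

def make_frame():
    # Hypothetical stand-in for the tips dataset
    return pd.DataFrame({
        "day": ["Sun", "Sat", "Sun", "Sun", "Sat", "Sun", "Sun"],
        "smoker": ["No"] * 7,
    })

n = 4

# Variant 1: take the labels of the first n matching rows (needs a unique index)
df1 = make_frame()
first_labels = df1.index[df1["day"] == "Sun"][:n]
df1.loc[first_labels, "smoker"] = "Yes"

# Variant 2: a single boolean mask that is True only for the first n matches
df2 = make_frame()
is_sun = df2["day"] == "Sun"
first_n = is_sun & (is_sun.cumsum() <= n)
df2.loc[first_n, "smoker"] = "Yes"

result1 = df1["smoker"].tolist()
result2 = df2["smoker"].tolist()
print(result1)
```

Both variants perform the assignment through one loc call on df itself, so there is no intermediate object that could turn out to be a copy.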