I have a pandas dataframe:
import pandas as pd
df = pd.DataFrame({'AKey': [1, 9999, 1, 1, 9999, 2, 2, 2],
                   'AnotherKey': [1, 1, 1, 1, 2, 2, 2, 2]})
I want to assign a new value to every element of a specific column that currently holds a specific value.
Let's say I want to assign the new value 8888 to the elements having value 9999.
I tried the following:
df[df["AKey"]==9999]["AKey"]=8888
but it produces the following warning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
So I tried to use loc:
df.loc[df["AKey"]==9999]["AKey"]=8888
which produced the same warning.
I would appreciate some help and an explanation of this message, as I really can't wrap my head around it.
You can use loc in this way:
df.loc[df["AKey"]==9999, "AKey"] = 8888
This produces the following dataframe:
   AKey  AnotherKey
0     1           1
1  8888           1
2     1           1
3     1           1
4  8888           2
5     2           2
6     2           2
7     2           2
With your original code you are first slicing the dataframe with:
df.loc[df["AKey"]==9999]
Then you assign a value to the AKey column of that sliced dataframe:
["AKey"]=8888
In other words, you were updating the slice (a temporary copy), not the dataframe itself.
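To make the two steps visible, here is a minimal sketch that splits the chained expression into separate statements. The name subset is introduced only for illustration, and the warnings filter is there just to keep the output quiet on pandas versions that emit the warning.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
                   "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2]})

# Step 1 of the chained form: boolean indexing builds a new DataFrame.
subset = df.loc[df["AKey"] == 9999]

# Step 2: the assignment targets that new object, not df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the copy warning if it is raised
    subset["AKey"] = 8888

print(subset["AKey"].tolist())  # [8888, 8888]
print(df["AKey"].tolist())      # [1, 9999, 1, 1, 9999, 2, 2, 2]
```

The second print shows that df still holds 9999: only the intermediate object was modified.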
From the pandas documentation:
.loc[] is primarily label based, but may also be used with a boolean array.
Breaking down the code:
df.loc[df["AKey"]==9999, "AKey"]
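df["AKey"]==9999 returns a boolean array identifying the rows, and the string "AKey" identifies the column that will receive the new value. Both are passed to loc in a single indexing operation, so the assignment is applied to df itself rather than to an intermediate slice.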
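As a small sketch of this breakdown, the row selector can be stored in a variable and inspected before it is passed to loc together with the column label. The name mask is chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
                   "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2]})

# Row selector: True where AKey equals 9999.
mask = df["AKey"] == 9999
print(mask.tolist())
# [False, True, False, False, True, False, False, False]

# Row selector and column label in one loc call: df is modified directly.
df.loc[mask, "AKey"] = 8888
print(df["AKey"].tolist())
# [1, 8888, 1, 1, 8888, 2, 2, 2]
```

Because the mask only covers the AKey comparison and the column label is "AKey", the AnotherKey column is left untouched.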