python pandas pandas-loc

Assign a new value to multiple matching elements in a pandas dataframe


I have a pandas dataframe:

import pandas as pd

df = pd.DataFrame({'AKey': [1, 9999, 1, 1, 9999, 2, 2, 2],
                   'AnotherKey': [1, 1, 1, 1, 2, 2, 2, 2]})

I want to assign a new value to every element of a specific column that currently holds a specific value.

Let's say I want to assign the new value 8888 to the elements with value 9999. I tried the following:

df[df["AKey"]==9999]["AKey"]=8888

but it produces the following warning:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

So I tried to use loc:

df.loc[df["AKey"]==9999]["AKey"]=8888

which produced the same message.

I would appreciate some help and an explanation of this message, as I really can't wrap my head around it.


Solution

  • Use loc with the row condition and the column label in a single call:

    df.loc[df["AKey"]==9999, "AKey"] = 8888
    

    After this assignment, the AKey column holds 1, 8888, 1, 1, 8888, 2, 2, 2.

    With your original code you are first slicing the dataframe with:

    df.loc[df["AKey"]==9999]
    

    Then you assign a value to the AKey column of that sliced dataframe:

    ["AKey"]=8888
    

    In other words, you were updating the temporary object returned by the slice, not the dataframe itself.

    From the pandas documentation:

    .loc[] is primarily label based, but may also be used with a boolean array.

    Breaking down the code:

    df.loc[df["AKey"]==9999, "AKey"]
    

    df["AKey"]==9999 returns a boolean array identifying the rows, and the string "AKey" identifies the column that receives the new value. Both are passed to a single loc call, so the assignment acts on df directly and no intermediate slice is created.
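
To make the difference concrete, here is a minimal sketch that runs both approaches on the dataframe from the question. The names mask, subset, values_after_chained and values_after_loc are illustrative and not part of the original post. The two-step version is written out with an intermediate variable, which is assumed to behave like the chained form in the question, and the copy warning is silenced only to keep the demo output clean.

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    "AKey": [1, 9999, 1, 1, 9999, 2, 2, 2],
    "AnotherKey": [1, 1, 1, 1, 2, 2, 2, 2],
})
mask = df["AKey"] == 9999  # boolean Series marking the rows to change

# Two-step version: selecting rows with a boolean mask builds a new object,
# so the assignment lands on that object and df keeps its old values.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the copy warning for this demo
    subset = df.loc[mask]
    subset["AKey"] = 8888
values_after_chained = df["AKey"].tolist()

# Single-step version: row mask and column label in one .loc call,
# so the assignment is applied to df itself.
df.loc[mask, "AKey"] = 8888
values_after_loc = df["AKey"].tolist()

print(values_after_chained)
print(values_after_loc)
```

The first list should still contain 9999, while the second should show 8888 in those positions. AnotherKey is not touched by either step.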