I have code where a function/method accepts a Series (a row from a DataFrame) and is supposed to modify it in-place, so that the changes are reflected in the original df. However, I cannot get the row as a view rather than a copy. Neither the documentation nor a related Stack Overflow question resolves the issue shown in the example below:
import pandas as pd
pd.__version__ # 0.24.2
ROW_NAME = "r1"
COL_NAME = "B"
NEW_VAL = 100.0
# df I would like to modify in-place
df = pd.DataFrame({"A":[[1], [2], [3,4]], "B": [1.0, 2.0, 3.0]}, index=["r1", "r2", "r3"])
# a row (Series reference) is the input param to a function that should modify df in-place
record = df.loc[ROW_NAME]
record.loc[COL_NAME] = NEW_VAL
assert df.loc[ROW_NAME, COL_NAME] == NEW_VAL  # fails: df still holds 1.0
The line starting with record.loc results in the familiar warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
This might make sense, except that record appears to reference df and can be modified in-place under some circumstances. An example of this:
record = df.loc[ROW_NAME]
record.loc["A"].append(NEW_VAL)
assert NEW_VAL in df.loc[ROW_NAME, "A"] # True
My question is: how can I force an in-place modification of the float value at df.loc[ROW_NAME, COL_NAME] from the Series record? Bonus points for clarifying why it is possible to modify column A in-place but not column B in the examples above.
Based on the sources linked in the question and a thorough reading of the documentation, it does not appear possible to force pandas to return a view rather than a copy when a Series is taken from a DataFrame row.
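One way around this, offered only as a sketch and not as the approach the question asked for, is to hand the function the DataFrame and the row label instead of the row itself, and write the cell through the DataFrame. The helper name set_value and its parameters are hypothetical; df.at is the standard pandas scalar accessor.

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

def set_value(frame, row_label, col_label, new_val):
    """Hypothetical helper: write one cell directly into the DataFrame."""
    # .at addresses a single cell by label, so no intermediate row Series
    # (and therefore no copy) is involved.
    frame.at[row_label, col_label] = new_val

set_value(df, "r1", "B", 100.0)
assert df.loc["r1", "B"] == 100.0
```

The cost of this design is that the function must know about the DataFrame, not just the record, but the write is then unambiguous.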
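As @Lilith Schneider points out, the original confusion comes from the fact that record = df.loc["r1"] returns a shallow copy: the Series has its own storage, but object entries such as the list in column A still refer to the same Python objects as df. Mutating that list is therefore visible in df, whereas assigning a new float to column B only changes the copy. This hybrid of copy and view is what leads to the unexpected behavior.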
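The difference between the two columns can be checked directly. The following is a minimal sketch using the df from the question; the variable names shares_list and list_changed are made up for illustration, and the warning filter is only there to keep the output quiet on pandas versions that emit SettingWithCopyWarning.

```python
import warnings
import pandas as pd

df = pd.DataFrame(
    {"A": [[1], [2], [3, 4]], "B": [1.0, 2.0, 3.0]},
    index=["r1", "r2", "r3"],
)

# For a frame with mixed column dtypes, selecting a row builds a new
# object-dtype Series rather than a view onto df.
record = df.loc["r1"]

# The list in column A is the same Python object in both containers ...
shares_list = record["A"] is df.at["r1", "A"]

# ... so mutating it through the Series is visible in the DataFrame.
record["A"].append(100.0)
list_changed = 100.0 in df.at["r1", "A"]

# Assigning to column B rebinds a slot in the Series' own storage,
# which the DataFrame never sees.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    record.loc["B"] = 100.0

print(shares_list, list_changed, df.at["r1", "B"])
```

In other words, column A "works" only because the list is mutated rather than reassigned; doing record.loc["A"] = [1, 100.0] would leave df untouched in the same way as column B.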