For the following df
I wish to change the values in columns A
,B
and C
to those in columns X
,Y
and Z
, taking into account a boolean selection on column B
.
columns = {"A":[1,2,3],
"B":[4,pd.NA,6],
"C":[7,8,9],
"X":[10,20,30],
"Y":[40,50,60],
"Z":[70,80,90]}
df = pd.DataFrame(columns)
df
A B C X Y Z
0 1 4 7 10 40 70
1 2 <NA> 8 20 50 80
2 3 6 9 30 60 90
However when I try to do the value reassignment I end up with NULLS.
df.loc[~(df["B"].isna()), ["A","B","C"]] = df.loc[~(df["B"].isna()), ["X","Y","Z"]]
df
A B C X Y Z
0 NaN NaN NaN 10 40 70
1 2.0 <NA> 8.0 20 50 80
2 NaN NaN NaN 30 60 90
My desired result is:
A B C X Y Z
0 10 40 70 10 40 70
1 2 <NA> 8 20 50 80
2 30 60 90 30 60 90
If I do the reassignment on a single column then I get my expected result:
df.loc[~(df["B"].isna()), "A"] = df.loc[~(df["B"].isna()), "X"]
df
A B C X Y Z
0 10 4 7 10 40 70
1 2 <NA> 8 20 50 80
2 30 6 9 30 60 90
However I expected that I should be able to do this on multiple columns at once. Any ideas what I am missing here?
Thanks
Your columns are not matching on the two sides of the assignment.
Since you use a mask on both sides, don't perform index alignment and directly pass the values:
m = ~df['B'].isna()
df.loc[m, ['A', 'B', 'C']] = df.loc[m, ['X', 'Y', 'Z']].values
For your approach to work, you could also rename
the columns. In this case you would not need the mask on both sides and could benefit from the index alignment:
cols_left = ['A', 'B', 'C']
cols_right = ['X', 'Y', 'Z']
df.loc[~df['B'].isna(), cols_left] = df[cols_right].rename(
columns=dict(zip(cols_right, cols_left))
)
Output:
A B C X Y Z
0 10 40 70 10 40 70
1 2 <NA> 8 20 50 80
2 30 60 90 30 60 90