Search code examples
pythonpandasdataframe

Reassigning Values of Multiple Columns to Values of Multiple Other Columns


For the following df I wish to change the values in columns A,B and C to those in columns X,Y and Z, taking into account a boolean selection on column B.

columns = {"A":[1,2,3],
           "B":[4,pd.NA,6],
           "C":[7,8,9],
           "X":[10,20,30],
           "Y":[40,50,60],
           "Z":[70,80,90]}

df = pd.DataFrame(columns)
df

    A   B      C    X   Y   Z
0   1   4      7    10  40  70
1   2   <NA>   8    20  50  80
2   3   6      9    30  60  90

However when I try to do the value reassignment I end up with NULLS.

df.loc[~(df["B"].isna()), ["A","B","C"]] = df.loc[~(df["B"].isna()), ["X","Y","Z"]]
df

    A   B      C    X   Y   Z
0   NaN NaN    NaN  10  40  70
1   2.0 <NA>   8.0  20  50  80
2   NaN NaN    NaN  30  60  90

My desired result is:

    A   B   C   X   Y   Z
0   10  40  70  10  40  70
1   2   <NA>    8   20  50  80
2   30  60  90  30  60  90

If I do the reassignment on a single column then I get my expected result:

df.loc[~(df["B"].isna()), "A"] = df.loc[~(df["B"].isna()), "X"]
df
    A   B      C    X   Y   Z
0   10  4      7    10  40  70
1   2   <NA>   8    20  50  80
2   30  6      9    30  60  90

However I expected that I should be able to do this on multiple columns at once. Any ideas what I am missing here?

Thanks


Solution

  • Your columns are not matching on the two sides of the assignment.

    Since you use a mask on both sides, don't perform index alignment and directly pass the values:

    m = ~df['B'].isna()
    df.loc[m, ['A', 'B', 'C']] = df.loc[m, ['X', 'Y', 'Z']].values
    

    For your approach to work, you could also rename the columns. In this case you would not need the mask on both sides and could benefit from the index alignment:

    cols_left = ['A', 'B', 'C']
    cols_right = ['X', 'Y', 'Z']
    df.loc[~df['B'].isna(), cols_left] = df[cols_right].rename(
        columns=dict(zip(cols_right, cols_left))
    )
    

    Output:

        A     B   C   X   Y   Z
    0  10    40  70  10  40  70
    1   2  <NA>   8  20  50  80
    2  30    60  90  30  60  90