python pandas dataframe assign

Pandas DataFrame.assign() doesn't work properly for multiple columns


I am trying to reassign multiple columns in a DataFrame with modifications. Below is a simplified example.

import pandas as pd 
d = {'col1':[1,2], 'col2':[3,4]}
df = pd.DataFrame(d)
print(df)
   col1  col2
0     1     3
1     2     4

I use the assign() method to add 1 to both 'col1' and 'col2'. However, the result adds 1 only to 'col2' and copies that result to 'col1'.

df2 = df.assign(**{c: lambda x: x[c] + 1 for c in ['col1','col2']})
print(df2)
   col1  col2
0     4     4
1     5     5

Can someone explain why this is happening, and also suggest a correct way to apply assign() to multiple columns?


Solution

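  • I think the lambda can't be used this way inside the dict comprehension. Every lambda closes over the same loop variable c, and Python looks up c only when the lambda is called, not when it is defined. By the time assign() calls them, the loop has finished and c is 'col2', so both columns are computed as x['col2'] + 1. Passing the computed columns directly avoids the problem: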

    df.assign(**{c: df[c] + 1 for c in ['col1','col2']})
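
If you do want to keep the callables (for example, when assign() sits in the middle of a method chain and there is no df variable to refer to), one common workaround is to bind the current value of c as a default argument of the lambda. The sketch below first shows the late-binding behaviour in plain Python and then applies the default-argument fix to the DataFrame from the question. The names late, bound and df2 are only illustrative.

```python
import pandas as pd

# Late binding: each lambda looks up c when it is called,
# after the comprehension has finished, so all of them see the last value.
late = [lambda: c for c in ['col1', 'col2']]
print([f() for f in late])    # ['col2', 'col2']

# A default argument is evaluated when the lambda is defined,
# so each lambda keeps its own column name.
bound = [lambda c=c: c for c in ['col1', 'col2']]
print([f() for f in bound])   # ['col1', 'col2']

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Same idea inside assign(): c=c freezes the column name per lambda.
df2 = df.assign(**{c: lambda x, c=c: x[c] + 1 for c in ['col1', 'col2']})
print(df2)
#    col1  col2
# 0     2     4
# 1     3     5
```

assign() returns a new DataFrame, so df itself is left unchanged in both versions.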