I have 2 different dataframes with a mix of String & Numerical columns. I want to subtract the numeric columns of these two dataframes into a new one but keep string columns.
df1 = pd.DataFrame({'col1': ['A', 'B', 'C'],
'col2': ['black berry','green apple','red wine'],
'col3':['black', 'green','red' ],
'data1':[2,2,2],
'data2':[2,2,2],
'data3':[5,5,5]
})
df2 = pd.DataFrame({'col1': ['A', 'B', 'C'],
'col2': ['black berry','green apple','red wine'],
'col3':['black', 'green','red' ],
'data1':[1,2,2],
'data2':[2,1,3],
'data3':[3,2,1]
})
df1
col1 col2 col3 data1 data2 data3
A black berry black 2 2 5
B green apple green 2 2 5
C red wine red 2 2 5
df2
col1 col2 col3 data1 data2 data3
A black berry black 1 2 3
B green apple green 2 1 2
C red wine red 2 3 1
I expected df3 = df2-df1
df3
col1 col2 col3 data1 data2 data3
A black berry black 1 0 2
B green apple green 0 1 3
C red wine red 0 -1 4
I tried df3=df2-df1
but I get this error:
unsupported operand type(s) for -: 'str' and 'str' because it's not possible to substract characters columns.
So how to do this?
I think this is all you need
df3 = df1.assign(
data1=df1["data1"] - df2["data1"],
data2=df1["data2"] - df2["data2"],
data3=df1["data3"] - df2["data3"],
)
With this solution you just specify which columns you want to do the subtraction on.