Search code examples
pythonpandasdataframedata-sciencedata-munging

pandas apply subtractions on columns function when indexes are not equal, based on alignment in another columns


I have two dataframes:

df1 =  
    C0   C1. C2.  
4   AB. 1.  2
5   AC. 7   8
6   AD. 9.  9
7   AE. 2.  6
8   AG  8.  9

df2 = 
   C0    C1. C2
8  AB    0. 1
9  AE.   6. 3
10 AD.   1. 2

I want to apply a subtraction between these two dataframes, such that when the value of the columns C0 is the same - I will get the subsraction, and when is not - a bool column will have the value False. notice that current indeics are not aligned. So new df1 should be:

df1 =  
    C0   C1. C2. diff_C1 match  
4   AB.  1.  2.    1.    True
5   AC.  7   8.    0.    False
6   AD.  9.  9.    8.    True
7   AE.  2.  6.    -4.   True
8   AG   8.  9.    0    False

What is the best way to do it?


Solution

  • You can try merging the columns using pandas.DataFrame.merge on column C0 and how as left as shown below

    df1.merge(df2, how='left', on='C0')
       .assign(match=lambda x: x['C1_y'].notna())
       .fillna(0)
    

    Output:

    enter image description here

    then subtract the C1 columns i.e. C1_x and C1_y

    df['C1_diff'] = df['C1_x'] - df['C1_y']
    

    enter image description here