Search code examples
pythonpandasdataframedata-analysis

how to merge two dataframes based on a column in pandas


I have two data frames,

 df1=pd.DataFrame({"Req":["Req 1","Req 2","Req 3"],"Count":[1,2,1]})        
    Req     Count
0   Req 1   1
1   Req 2   2
2   Req 3   1

df2=pd.DataFrame({"Req":["Req 1","Req 2"],"Count":[0,1]})

    Req    Count
0   Req 1   0
1   Req 2   1

I am trying to merge these df's based on "Req" column

My desired output is,

    Req    total  from_1 from_2
    Req 1   1       1     0
    Req 2   3       2     1
    Req 3   1       1     0

I tried pd.merge(df1, df2, on = "Req", ) but it is not giving my desired output, please help, thanks in advance!


Solution

  • You can use merge with left join, replace NaNs, rename columns and last add new column with assign:

    df = (pd.merge(df1, df2, on = "Req", how='left')
            .fillna(0)
            .rename(columns={'Count_x':'from_1','Count_y':'from_2'})
            .assign(total=lambda x: x['from_1'] + x['from_2'])
          )
    print (df)
       from_1    Req  from_2  total
    0       1  Req 1     0.0    1.0
    1       2  Req 2     1.0    3.0
    2       1  Req 3     0.0    1.0