Search code examples
pythonpandasdataframemergefillna

Pandas: How to merge two data frames and fill NaN values using values from the second data frame


I have a pandas dataframe (df1) that looks like this:

No     car          pl.       Value      Expected      
1      Toyota       HK        0.1        0.12      
1      Toyota       NY        0.2        NaN     
2      Saab         LOS       0.3        NaN      
2      Saab         UK        0.4        0.6       
2      Saab         HK        0.5        0.51     
3      Audi         NYU       0.6        NaN      
3      Audi         LOS       0.7        NaN      
4      VW           UK        0.8        NaN   
5      Audi         HK        0.9        NaN    

And I have another dataframe (df2) that looks like this:

No        pl.             Expected              
2         LOS              0.35      
3         NYU              0.62   
3         LOS              0.76    
5         HK               0.91     

I would like my final dataframe to look like this:

No     car          pl.       Value      Expected      
1      Toyota       HK        0.1        0.12      
1      Toyota       NY        0.2        NaN     
2      Saab         LOS       0.3        0.35      
2      Saab         UK        0.4        0.6       
2      Saab         HK        0.5        0.51     
3      Audi         NYU       0.6        0.62      
3      Audi         LOS       0.7        0.76      
4      VW           UK        0.8        NaN   
5      Audi         HK        0.9        0.91    

I tried this:

df = df1.fillna(df1.merge(df2, on=['No','pl.']))

But df1 remains unchanged in the output

The questions that I have seen here have been of dataframes with the same shape. Is there a way to do this when the shapes are different?

Thanks in advance!


Solution

  • Use left join with suffixes parameter and then replace missing values by Series.fillna with DataFrame.pop for use and drop column Expected_:

    df = df1.merge(df2, on=['No','pl.'], how='left', suffixes=('_',''))
    df['Expected'] = df.pop('Expected_').fillna(df['Expected'])
    print (df)
       No     car  pl.  Value  Expected
    0   1  Toyota   HK    0.1      0.12
    1   1  Toyota   NY    0.2       NaN
    2   2    Saab  LOS    0.3      0.35
    3   2    Saab   UK    0.4      0.60
    4   2    Saab   HK    0.5      0.51
    5   3    Audi  NYU    0.6      0.62
    6   3    Audi  LOS    0.7      0.76
    7   4      VW   UK    0.8       NaN
    8   5    Audi   HK    0.9      0.91