Search code examples
pythonpandasdataframemergefillna

Filling na values with merge from another dataframe


I have a column with na values that I want to fill according to values from another data frame according to a key. I was wondering if there is any simple way to do so.

Example: I have a data frame of objects and their colors like this:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball     **NaN**
4  chair   white
5  chair     **NaN**
6   ball    grey

I want to fill na values in the color column with default color from the following data frame:

  object default_color
0  chair         brown
1   ball          blue
2   door          grey

So the result will be this:

  object   color
0  chair   black
1   ball  yellow
2   door   brown
3   ball     **blue**
4  chair   white
5  chair     **brown**
6   ball    grey

Is there any "easy" way to do this?

Thanks :)


Solution

  • First create Series and then replace NaNs:

    s = df1['object'].map(df2.set_index('object')['default_color'])
    print (s)
    0    brown
    1     blue
    2     grey
    3     blue
    4    brown
    5    brown
    6     blue
    Name: object, dtype: object
    
    df1['color']= df1['color'].mask(df1['color'].isnull(), s)
    

    Or:

    df1.loc[df1['color'].isnull(), 'color'] = s
    

    Or:

    df1['color'] = df1['color'].combine_first(s)
    

    Or:

    df1['color'] = df1['color'].fillna(s)
    

    print (df1)
      object   color
    0  chair   black
    1   ball  yellow
    2   door   brown
    3   ball    blue
    4  chair   white
    5  chair   brown
    6   ball    grey
    

    If unique values in object:

    df = df1.set_index('object')['color']
            .combine_first(df2.set_index('object')['default_color'])
            .reset_index()
    

    Or:

    df = df1.set_index('object')['color']
            .fillna(df2.set_index('object')['default_color'])
            .reset_index()