
Pandas: Fill NaN Values row by row by group ID


I'm trying to fill NaN values row by row according to group ID.

I've tried fillna with the forward and backward fill options, but it fills the whole dataframe at once rather than group by group. I want to make sure the companies match before any NaN values are filled. In this case, a plain forward fill would cause company "Pear" to be filled with data from company "Banana".


appended = appended.sort_values(by=['Company','Intro'],na_position='last')
appended = appended.reset_index(drop=True)

for i in appended.index:

    if i==0:
        pass
    else:
        if appended.at[i,'Company']==appended.at[i-1,'Company']:
            appended.fillna(method='ffill',inplace=True)
        else:
            pass

appended dataframe

Company  Intro  Categories            Headquarters  Founded Date  Funding Stage
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    NaN    NaN                   NaN           NaN           NaN
Apple    NaN    NaN                   NaN           NaN           NaN
Banana   Lier   Government            Europe        2010          Series B
Pear     NaN    NaN                   NaN           NaN           NaN

This is the expected result that I hope to achieve:

Expected Result

Company  Intro  Categories            Headquarters  Founded Date  Funding Stage
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Apple    xyz    Healthcare, Big Data  New York      2018          Series A
Banana   Lier   Government            Europe        2010          Series B
Pear     NaN    NaN                   NaN           NaN           NaN

Solution

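  • Use groupby with ffill. The forward fill is then applied separately within each Company group, so values from "Banana" never carry over into "Pear". Here df is the appended dataframe from the question.

    df.groupby(['Company']).ffill()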
    
      Company Intro            Categories Headquarters  Founded Date Funding Stage
    0   Apple   xyz  Healthcare, Big Data     New York        2018.0      Series A
    1   Apple   xyz  Healthcare, Big Data     New York        2018.0      Series A
    2   Apple   xyz  Healthcare, Big Data     New York        2018.0      Series A
    3  Banana  Lier            Government       Europe        2010.0      Series B
    4    Pear   NaN                   NaN          NaN           NaN           NaN
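
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The dataframe is rebuilt by hand from the table in the question, and the names value_cols, leaky and filled are only illustrative. Depending on the pandas version, the result of groupby(...).ffill() may not include the grouping column, so this sketch assigns the filled columns back onto a copy instead of relying on the returned frame.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question's table.
df = pd.DataFrame({
    "Company": ["Apple", "Apple", "Apple", "Banana", "Pear"],
    "Intro": ["xyz", np.nan, np.nan, "Lier", np.nan],
    "Categories": ["Healthcare, Big Data", np.nan, np.nan, "Government", np.nan],
    "Headquarters": ["New York", np.nan, np.nan, "Europe", np.nan],
    "Founded Date": [2018, np.nan, np.nan, 2010, np.nan],
    "Funding Stage": ["Series A", np.nan, np.nan, "Series B", np.nan],
})

# A plain forward fill ignores company boundaries:
# the "Pear" row picks up the values from "Banana".
leaky = df.ffill()

# Forward fill within each company only. The filled columns are
# written back so the Company column is kept whatever the pandas version.
value_cols = [c for c in df.columns if c != "Company"]
filled = df.copy()
filled[value_cols] = df.groupby("Company")[value_cols].ffill()

print(filled)
```

Note that Founded Date is stored as float (2018.0) because the column contains NaN. Rows of a company with no earlier non-NaN value in its own group, such as "Pear" here, stay NaN.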