Search code examples
pandasfillna

Pandas fill missing values with groupby


I have a table of various indicators grouped by Date and Code. I am trying to fill missing values with the previous day's data OR if not available - with the next day's data for each Code.

The problem is when I group by 'Code' and 'Date', nothing happens

df = pd.DataFrame([['2019-05-01', 'APL', 15951, 303, 49],  
['2019-05-02', 'APL', 16075, 301, 46],  
['2019-05-03', 'APL', np.nan, 300, 45],  
['2019-05-04', 'APL', 15868, 298.8, 33],  
['2019-05-01', 'MSK', 2222, np.nan, np.nan],  
['2019-05-02', 'MSK', 2224, 243, 53],  
['2019-05-03', 'MSK', 2266, 233, 33],  
['2019-05-04', 'MSK', np.nan, 253, 55]],  
columns=['Date', 'Code', 'Price', 'Volume', 'ATM'])

Here is what I am trying:

df.groupby(['Code','Date'])['Price','Volume', 'ATM'].fillna(method = 'ffill')

Solution

  • You need:

    df.groupby(['Code']).apply(lambda x: x.ffill().bfill())
    

    Output:

        Code      Date  Price   Volume  ATM
    0   APL 2019-05-01  15951.0 303.0   49.0
    1   APL 2019-05-02  16075.0 301.0   46.0
    2   APL 2019-05-03  16075.0 300.0   45.0
    3   APL 2019-05-04  15868.0 298.8   33.0
    4   MSK 2019-05-01  2222.0  243.0   53.0
    5   MSK 2019-05-02  2224.0  243.0   53.0
    6   MSK 2019-05-03  2266.0  233.0   33.0
    7   MSK 2019-05-04  2266.0  253.0   55.0
    

    If you groupby ['Date', 'Code'], each day becomes a group and within that group, there will not be any missing values.