Search code examples
pythonpandasfillna

Is there a function to impute date in pandas based on date in row above?


I have a dataframe with a date column where some of the dates are missing. I would like to fill these with the day after the date from the previous row.

  date        
0 2015-01-31  
1 NaT        
2 2015-02-02  

So for example in the data frame above, row 1 would be imputed with 2015-02-01. I would like to achieve this without reindexing!


Solution

  • Using a global variable with apply helps ur cause:

    import pandas as pd
    import numpy as np
    df = pd.DataFrame({"d":["2015-01-31", np.nan, "2015-02-02", np.nan, np.nan, "2015-02-02", np.nan]})
    
    df["d"] = pd.to_datetime(df["d"])
    
    print(df)
    

    #df

               d
    0 2015-01-31
    1        NaT
    2 2015-02-02
    3        NaT
    4        NaT
    5 2015-02-02
    6        NaT
    

    value = np.nan
    def impute_date_rowise(x):
        global value
        if pd.isnull(value):
            value = x
            return value
        elif not pd.isnull(x):
            value = x
            return x
        else:
            value = (value + pd.DateOffset(days=1)).date()
            return value
    
    df["d"].apply(impute_date_rowise)
    

    0   2015-01-31
    1   2015-02-01
    2   2015-02-02
    3   2015-02-03
    4   2015-02-04
    5   2015-02-02
    6   2015-02-03