Search code examples
pythonpandasdatetimestrftime

Pandas: Fill missing dates in Pandas dataframe


How do I fill the Date column so that when it detects a date it adds that date to the below rows, until it sees a new date starts adding that date?

Reproducible example:

Input:


                Date                                           Headline
0   Mar-20-21 04:03AM  Apple CEO Cook, executives on tentative list o...
1             03:43AM  Apple CEO Cook, execs on tentative list of wit...
2   Mar-19-21 10:19PM  Dow Jones Futures: Why This Market Rally Is So...
3             06:13PM  Zuckerberg: Apples Privacy Move Could Spur Mor...
4             05:45PM  Apple (AAPL) Dips More Than Broader Markets: W...
5             04:17PM  Facebook Stock Jumps As Zuckerberg Changes Tun...
6             04:03PM  Best Dow Jones Stocks To Buy And Watch In Marc...
7             01:02PM  The Nasdaq's on the Rise Friday, and These 2 S...

Desired Output:


                 Date                                           Headline
0   Mar-20-21 04:03AM  Apple CEO Cook, executives on tentative list o...
1   Mar-20-21 03:43AM  Apple CEO Cook, execs on tentative list of wit...
2   Mar-19-21 10:19PM  Dow Jones Futures: Why This Market Rally Is So...
3   Mar-19-21 06:13PM  Zuckerberg: Apples Privacy Move Could Spur Mor...
4   Mar-19-21 05:45PM  Apple (AAPL) Dips More Than Broader Markets: W...
5   Mar-19-21 04:17PM  Facebook Stock Jumps As Zuckerberg Changes Tun...
6   Mar-19-21 04:03PM  Best Dow Jones Stocks To Buy And Watch In Marc...
7   Mar-19-21 01:02PM  The Nasdaq's on the Rise Friday, and These 2 S...

Attempt:

df['Time'] = [x[-7:] for x in df['Date']]
df['Date'] = [x[:-7] for x in df['Date']]
# Some code that fills the date
# Then convert to datetime

Solution

  • Before you use ffill() you need to split the two columns to get the correct time, and only fill in the Date part. You will need to replace spaces with np.nan to use ffill(). Then put the columns back together and wrap that operation in pd.to_datetime to get the correct dtype.

    Lastly you can drop the time column.

    # Imports
    import numpy as np
    import pandas as pd
    
    # Split the column
    df[['Date','Time']] = df['Date'].str.split(' ',expand=True)
    
    # Replace space with nan and use ffill()
    df['Date'] = df['Date'].replace(r'^\s*$', np.nan, regex=True).ffill()
    
    # Put the columns back and convert to datetime
    df['Date'] =  pd.to_datetime(df['Date'] + ' ' + df['Time'])
    
    # Drop the time column
    del(df['Time'])
    

    Will get you back:

    df
                     Date                                           Headline
    0 2021-03-20 04:03:00  Apple CEO Cook, executives on tentative list o...
    1 2021-03-20 03:43:00  Apple CEO Cook, execs on tentative list of wit...
    2 2021-03-19 22:19:00  Dow Jones Futures: Why This Market Rally Is So...
    3 2021-03-19 18:13:00  Zuckerberg: Apples Privacy Move Could Spur Mor...
    4 2021-03-19 17:45:00  Apple (AAPL) Dips More Than Broader Markets: W...
    5 2021-03-19 16:17:00  Facebook Stock Jumps As Zuckerberg Changes Tun...
    6 2021-03-19 16:03:00  Best Dow Jones Stocks To Buy And Watch In Marc...
    7 2021-03-19 13:02:00  The Nasdaq's on the Rise Friday, and These 2 S...
    

    EDIT If you want your 'Date' to show exactly as you have it in your desired outcome, i.e. this format 'Mar-20-21', don't wrap it in pd.to_datetime() and keep it as an object:

    df['Date'] =  df['Date'] + ' ' + df['Time']
    
    df
                    Date                                           Headline
    0  Mar-20-21 04:03AM  Apple CEO Cook, executives on tentative list o...
    1  Mar-20-21 03:43AM  Apple CEO Cook, execs on tentative list of wit...
    2  Mar-19-21 10:19PM  Dow Jones Futures: Why This Market Rally Is So...
    3  Mar-19-21 06:13PM  Zuckerberg: Apples Privacy Move Could Spur Mor...
    4  Mar-19-21 05:45PM  Apple (AAPL) Dips More Than Broader Markets: W...
    5  Mar-19-21 04:17PM  Facebook Stock Jumps As Zuckerberg Changes Tun...
    6  Mar-19-21 04:03PM  Best Dow Jones Stocks To Buy And Watch In Marc...
    7  Mar-19-21 01:02PM  The Nasdaq's on the Rise Friday, and These 2 S...