Search code examples
pythonpandasdateautomationautofill

Auto Fill calendar with missing months


I have a table like this with date and values but, as you can see, some months are not in the list. (5, 6, 8, 10 and 1 from the next year).

Date Any other value
2020-01-01 value
2020-02-01 value
2020-02-04 value
2020-02-04 value
2020-03-11 value
2020-04-04 value
2020-07-04 value
2020-07-04 value
2020-09-01 value
2020-11-06 value
2020-12-02 value
2021-02-04 value
2021-03-11 value

Is any way that I can automatically fill this table with those months, to become?

Date Any other value
2020-01-01 value
2020-02-01 value
2020-02-04 value
2020-02-04 value
2020-03-11 value
2020-04-04 value
2020-05-01 NaN
2020-06-01 NaN
2020-07-04 value
2020-07-04 value
2020-08-01 NaN
2020-09-01 value
2020-10-01 NaN
2020-11-06 value
2020-12-02 value
2021-01-01 NaN
2021-02-04 value
2021-03-11 value

Thank to you all!


Solution

  • Logically it's an outer join to all the months you are interested in.

    import pandas as pd
    
    df = pd.DataFrame({"Date":["2019-12-31","2020-01-31","2020-02-03","2020-02-03","2020-03-10","2020-04-03","2020-07-03","2020-07-03","2020-08-31","2020-11-05","2020-12-01","2021-02-03","2021-03-10"],"Any other value":["value","value","value","value","value","value","value","value","value","value","value","value","value"]})
    df["Date"] = pd.to_datetime(df["Date"])
    
    df["month"] = (df['Date'] - pd.offsets.MonthBegin(1)).dt.floor('d')
    
    
    df = df.merge(
    pd.DataFrame({"month":pd.date_range(df["month"].min(), df["month"].max(), freq="MS")}),
        on="month", how="outer")
    df["Date"].fillna(df["month"], inplace=True)
    df = df.drop(columns="month")
    
    print(df.to_string(index=False))
    

    output

          Date Any other value
    2019-12-31           value
    2020-01-31           value
    2020-02-03           value
    2020-02-03           value
    2020-03-10           value
    2020-04-03           value
    2020-07-03           value
    2020-07-03           value
    2020-08-31           value
    2020-11-05           value
    2020-12-01           value
    2021-02-03           value
    2021-03-10           value
    2020-05-01             NaN
    2020-06-01             NaN
    2020-09-01             NaN
    2020-10-01             NaN
    2020-12-01             NaN
    2021-01-01             NaN