I have a table like this with date and values but, as you can see, some months are not in the list. (5, 6, 8, 10 and 1 from the next year).
Date | Any other value |
---|---|
2020-01-01 | value |
2020-02-01 | value |
2020-02-04 | value |
2020-02-04 | value |
2020-03-11 | value |
2020-04-04 | value |
2020-07-04 | value |
2020-07-04 | value |
2020-09-01 | value |
2020-11-06 | value |
2020-12-02 | value |
2021-02-04 | value |
2021-03-11 | value |
Is any way that I can automatically fill this table with those months, to become?
Date | Any other value |
---|---|
2020-01-01 | value |
2020-02-01 | value |
2020-02-04 | value |
2020-02-04 | value |
2020-03-11 | value |
2020-04-04 | value |
2020-05-01 | NaN |
2020-06-01 | NaN |
2020-07-04 | value |
2020-07-04 | value |
2020-08-01 | NaN |
2020-09-01 | value |
2020-10-01 | NaN |
2020-11-06 | value |
2020-12-02 | value |
2021-01-01 | NaN |
2021-02-04 | value |
2021-03-11 | value |
Thank to you all!
Logically it's an outer join to all the months you are interested in.
import pandas as pd
df = pd.DataFrame({"Date":["2019-12-31","2020-01-31","2020-02-03","2020-02-03","2020-03-10","2020-04-03","2020-07-03","2020-07-03","2020-08-31","2020-11-05","2020-12-01","2021-02-03","2021-03-10"],"Any other value":["value","value","value","value","value","value","value","value","value","value","value","value","value"]})
df["Date"] = pd.to_datetime(df["Date"])
df["month"] = (df['Date'] - pd.offsets.MonthBegin(1)).dt.floor('d')
df = df.merge(
pd.DataFrame({"month":pd.date_range(df["month"].min(), df["month"].max(), freq="MS")}),
on="month", how="outer")
df["Date"].fillna(df["month"], inplace=True)
df = df.drop(columns="month")
print(df.to_string(index=False))
Date Any other value
2019-12-31 value
2020-01-31 value
2020-02-03 value
2020-02-03 value
2020-03-10 value
2020-04-03 value
2020-07-03 value
2020-07-03 value
2020-08-31 value
2020-11-05 value
2020-12-01 value
2021-02-03 value
2021-03-10 value
2020-05-01 NaN
2020-06-01 NaN
2020-09-01 NaN
2020-10-01 NaN
2020-12-01 NaN
2021-01-01 NaN