python pandas date missing-data reindex

Incorrect reindex when filling missing date gap


I have a dataset in which I would like to fill in a specific range of missing dates. The dates are the index and have dtype period[D] (which I believe makes it a PeriodIndex).

The dataset is as follows:

Date           value
2020-05-01      8.2
2020-07-15      8.3
2020-07-23      8.4

My goal is to fill in the date gap between 7/15/2020 and 7/18/2020, and the filled-in values should be NA or NaN. I have tried using reindex.

I first converted the PeriodIndex of the dataset to timestamps using

df.index = df.index.to_timestamp()

and I did the following:

idx = pd.date_range('2020-07-16', '2020-07-22')
df = df['value']
df1 = df.reindex(idx, fill_value=0)
df1

But reindex gives me the following result:

Date           value
2020-07-16      0
2020-07-17      0
2020-07-18      0
2020-07-19      0
2020-07-20      0
2020-07-21      0

But my desired output is:

Date           value
2020-05-01      8.2
2020-07-15      8.3
2020-07-16      0
2020-07-17      0
2020-07-18      0
2020-07-19      0
2020-07-20      0
2020-07-21      0
2020-07-23      8.4

Does anyone know what went wrong?


Solution

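  • reindex keeps only the labels you pass to it, so every row whose date is not in idx is dropped. Instead of reindexing, try concatenating the missing dates onto the original data (this assumes df is still the DataFrame with a 'value' column, and that the end year 2013 in the question was meant to be 2020):

    df = pd.concat([df, pd.DataFrame(0, index=pd.date_range('2020-07-16', '2020-07-22'), columns=['value'])]
        ).sort_index()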
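
  • If you would rather stay with reindex, one option is to reindex on the union of the existing index and the new dates, so the original rows are kept. The sketch below rebuilds the sample data from the question as a Series named s (a name chosen here for illustration) and assumes the end date was meant to be 2020-07-22. Leaving out fill_value gives NaN in the gap, which is what the question text asks for.

```python
import pandas as pd

# Sample data from the question, with the index already converted to timestamps.
s = pd.Series(
    [8.2, 8.3, 8.4],
    index=pd.to_datetime(['2020-05-01', '2020-07-15', '2020-07-23']),
    name='value',
)

# Dates to fill in. The end year is assumed to be 2020 (the question has 2013);
# use '2020-07-21' instead to match the desired output in the question exactly.
gap = pd.date_range('2020-07-16', '2020-07-22')

# Reindex on old labels plus new labels so the existing rows survive.
full_index = s.index.union(gap)

filled_zero = s.reindex(full_index, fill_value=0)  # gap rows become 0
filled_nan = s.reindex(full_index)                 # gap rows become NaN

print(filled_zero)
print(filled_nan)
```

    Index.union returns a sorted index here, so no separate sort_index() call should be needed.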