python, pandas, date-range

From hourly DatetimeIndex to daily DatetimeIndex


I am trying to convert an hourly DatetimeIndex to a daily DatetimeIndex with no repetitions.

I have two DatetimeIndex objects:

import pandas as pd

snapshots = pd.date_range("2050-01-01 00:00", "2050-12-31 23:00", freq="H")

days = pd.date_range("2050-01-01", "2050-12-31", freq="D")

However, I would like to create days directly from snapshots. I tried converting to a PeriodIndex, but then I cannot get it down to unique dates:

days = snapshots.to_period().asfreq('d')

(This gives 'YYYY-MM-DD' values, but still 8760 of them, one for every hour of the year.)


Solution

  • The simplest approach is to pass the minimum and maximum values of snapshots to date_range:

    days = pd.date_range(snapshots.min(), snapshots.max(), freq="D")
    print(days)
    DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
                   '2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
                   '2050-01-09', '2050-01-10',
                   ...
                   '2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
                   '2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
                   '2050-12-30', '2050-12-31'],
                  dtype='datetime64[ns]', length=365, freq='D')
    

    Alternatively, use DatetimeIndex.normalize to set all times to 00:00:00, then remove the duplicates with Index.unique or Index.drop_duplicates:

    days = snapshots.normalize().unique()
    print(days)
    DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
                   '2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
                   '2050-01-09', '2050-01-10',
                   ...
                   '2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
                   '2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
                   '2050-12-30', '2050-12-31'],
                  dtype='datetime64[ns]', length=365, freq=None)
    

    days = snapshots.normalize().drop_duplicates()
    print(days)
    DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
                   '2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
                   '2050-01-09', '2050-01-10',
                   ...
                   '2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
                   '2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
                   '2050-12-30', '2050-12-31'],
                  dtype='datetime64[ns]', length=365, freq=None)
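
The approaches above agree for the index in the question, which starts at midnight and has no missing hours. As a rough sketch of where they can differ, the example below uses a made-up hourly index (an assumption, not data from the question) that starts at 06:00 and skips one whole day. It also finishes the PeriodIndex attempt from the question by adding unique() and to_timestamp(). The names part1, part2, calendar_days, observed_days and period_days are illustrative only.

```python
import pandas as pd

# Hypothetical sample: hourly stamps that start at 06:00 and skip 2050-01-03 entirely.
# "60min" is used instead of the hourly alias because that alias changed case
# between pandas versions.
part1 = pd.date_range("2050-01-01 06:00", "2050-01-02 23:00", freq="60min")
part2 = pd.date_range("2050-01-04 00:00", "2050-01-04 05:00", freq="60min")
snapshots = part1.union(part2)

# date_range between the normalized endpoints: a continuous daily calendar,
# which includes the day that has no snapshots at all.
calendar_days = pd.date_range(
    snapshots.min().normalize(), snapshots.max().normalize(), freq="D"
)

# normalize + unique: only the days that actually occur in snapshots.
observed_days = snapshots.normalize().unique()

# The PeriodIndex route from the question, completed with unique() and
# converted back to a DatetimeIndex with to_timestamp().
period_days = snapshots.to_period("D").unique().to_timestamp()

print(list(calendar_days.strftime("%Y-%m-%d")))
print(list(observed_days.strftime("%Y-%m-%d")))
print(list(period_days.strftime("%Y-%m-%d")))
```

With this sample, 2050-01-03 appears only in calendar_days, while observed_days and period_days contain the same three dates. Normalizing the endpoints matters when the first stamp is not at midnight: without it, date_range would keep the 06:00 time on every daily value. Which behaviour is preferable depends on whether a gap-free calendar or only the observed days is wanted.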