I am trying to convert an hourly DatetimeIndex to a daily DatetimeIndex with no repetitions.
I have two DatetimeIndex objects:
import pandas as pd
snapshots = pd.date_range("2050-01-01 00:00", "2050-12-31 23:00", freq="H")
days = pd.date_range("2050-01-01", "2050-12-31", freq="d")
However, I would like to create days directly from snapshots. I tried converting to a PeriodIndex, but then I struggle to get unique dates:
days = snapshots.to_period().asfreq('d')
(This gives me 'YYYY-MM-DD' values, but 8760 of them, one for every hour of the year.)
The simplest approach is to pass the minimum and maximum values of snapshots to date_range:
days = pd.date_range(snapshots.min(), snapshots.max(), freq="d")
print(days)
DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
'2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
'2050-01-09', '2050-01-10',
...
'2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
'2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
'2050-12-30', '2050-12-31'],
dtype='datetime64[ns]', length=365, freq='D')
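One caveat with this approach: date_range starts counting from the exact first timestamp, so it only lands on midnight because snapshots happens to start at 00:00. A minimal sketch, assuming a hypothetical index that starts at 06:00 (the dates and the "60min" frequency are illustrative, not from the question), of normalizing the endpoints first:

```python
import pandas as pd

# Hypothetical hourly index that starts at 06:00 instead of midnight.
# "60min" is used here only to avoid the "H"/"h" alias difference between pandas versions.
snapshots = pd.date_range("2050-01-01 06:00", "2050-01-03 18:00", freq="60min")

# Without normalizing, every generated day carries the 06:00 start time.
raw_days = pd.date_range(snapshots.min(), snapshots.max(), freq="D")

# Normalizing both endpoints pins the daily range to midnight.
days = pd.date_range(
    snapshots.min().normalize(),
    snapshots.max().normalize(),
    freq="D",
)

print(raw_days)
print(days)
```

If the index always starts at midnight, as in the question, the extra normalize calls are not needed.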
Or use DatetimeIndex.normalize to set the times to 00:00:00, then remove duplicates with Index.unique or Index.drop_duplicates:
days = snapshots.normalize().unique()
print(days)
DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
'2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
'2050-01-09', '2050-01-10',
...
'2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
'2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
'2050-12-30', '2050-12-31'],
dtype='datetime64[ns]', length=365, freq=None)
days = snapshots.normalize().drop_duplicates()
print(days)
DatetimeIndex(['2050-01-01', '2050-01-02', '2050-01-03', '2050-01-04',
'2050-01-05', '2050-01-06', '2050-01-07', '2050-01-08',
'2050-01-09', '2050-01-10',
...
'2050-12-22', '2050-12-23', '2050-12-24', '2050-12-25',
'2050-12-26', '2050-12-27', '2050-12-28', '2050-12-29',
'2050-12-30', '2050-12-31'],
dtype='datetime64[ns]', length=365, freq=None)
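The two approaches agree here because snapshots has no gaps. They can differ when whole days are missing: date_range fills in every day between the endpoints, while normalize plus unique keeps only the days that actually occur. A small sketch, assuming a hypothetical index with one day absent, which also shows how the PeriodIndex attempt from the question could be finished by deduplicating and converting back with to_timestamp:

```python
import pandas as pd

# Hypothetical snapshots where 2050-01-02 is missing entirely.
snapshots = pd.DatetimeIndex(
    ["2050-01-01 00:00", "2050-01-01 12:00", "2050-01-03 00:00", "2050-01-03 12:00"]
)

# date_range spans min..max, so the missing day is filled in.
filled = pd.date_range(snapshots.min(), snapshots.max(), freq="D")

# normalize + unique keeps only the days present in the data.
present = snapshots.normalize().unique()

# PeriodIndex route: convert to daily periods, deduplicate, convert back.
via_period = snapshots.to_period("D").unique().to_timestamp()

print(len(filled), len(present), len(via_period))
```

Which one is preferable depends on whether the daily index should describe the calendar span or the days actually covered by the data.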