Reading the docs for the pandas Period objects leaves me unsure whether it is possible to create a custom Period. By custom I mean a Period that does not follow a fixed frequency dtype, but where each Period in a PeriodIndex represents a manually defined start timestamp and end timestamp.
To illustrate:
Let's say I have n time intervals, where the beginning and end of each interval are sampled from a discrete uniformly distributed random variable, with the constraint that beginning < end.
The result would look something like this:
[(ts_start0, ts_end0), (ts_start1, ts_end1), (ts_start2, ts_end2)]
Is there any way to encode such "random" intervals/timespans/periods with a pandas Period or something similar?
I think you can use IntervalIndex:
In [18]: pd.IntervalIndex([pd.Interval(1,3), pd.Interval(4, 11), pd.Interval(13, 28)])
Out[18]:
IntervalIndex([(1, 3], (4, 11], (13, 28]]
closed='right',
dtype='interval[int64]')
The same works with timestamps:
In [25]: pd.IntervalIndex([
...: pd.Interval(pd.to_datetime('2018-01-01'), pd.to_datetime('2018-01-13')),
...: pd.Interval(pd.to_datetime('2018-03-08'), pd.to_datetime('2018-04-29')),
...: pd.Interval(pd.to_datetime('2018-05-03'), pd.to_datetime('2018-07-22'))
...: ])
...:
Out[25]:
IntervalIndex([(2018-01-01, 2018-01-13], (2018-03-08, 2018-04-29], (2018-05-03, 2018-07-22]]
closed='right',
dtype='interval[datetime64[ns]]')
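Once the intervals are in an IntervalIndex, you can ask which one a given timestamp falls into. The following is only a minimal sketch: the query timestamp `ts` is a made-up value for illustration, and `get_loc` returning a single integer assumes the intervals do not overlap, as in the example above.

```python
import pandas as pd

# Same three intervals as above, closed on the right by default
intervals = pd.IntervalIndex([
    pd.Interval(pd.Timestamp('2018-01-01'), pd.Timestamp('2018-01-13')),
    pd.Interval(pd.Timestamp('2018-03-08'), pd.Timestamp('2018-04-29')),
    pd.Interval(pd.Timestamp('2018-05-03'), pd.Timestamp('2018-07-22')),
])

# Hypothetical timestamp to look up
ts = pd.Timestamp('2018-03-15')

# Boolean array: True for every interval that contains ts
mask = intervals.contains(ts)

# Integer position of the matching interval (raises KeyError if none matches)
position = intervals.get_loc(ts)
```

Here `mask` marks the second interval and `position` is 1. A timestamp that falls in a gap between intervals, such as 2018-02-01, would make `get_loc` raise a KeyError.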
UPDATE: we can also use the pd.IntervalIndex.from_tuples() constructor:
In [16]: pd.IntervalIndex.from_tuples([(1,3), (4, 11), (13, 28)], closed='right')
Out[16]:
IntervalIndex([(1, 3], (4, 11], (13, 28]]
closed='right',
dtype='interval[int64]')
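To get closer to the "random" intervals from the question, here is a rough sketch using pd.IntervalIndex.from_arrays(). The reference date `base`, the seed, and the ranges for the offsets are all arbitrary assumptions of mine. To guarantee start < end, I draw a start offset and a strictly positive duration instead of two independent endpoints, which is a simplification of the sampling scheme described in the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
n = 5                               # number of intervals (assumed)
base = pd.Timestamp('2018-01-01')   # hypothetical reference date

# Discrete uniform start offsets in days, and durations of at least 1 day
start_days = rng.integers(0, 365, size=n)
duration_days = rng.integers(1, 31, size=n)

starts = base + pd.to_timedelta(start_days, unit='D')
ends = starts + pd.to_timedelta(duration_days, unit='D')

# Each (start, end) pair becomes one interval, no fixed frequency required
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed='right')

# Intervals built this way may overlap; IntervalIndex allows that
overlapping = intervals.is_overlapping
```

If you already have the pairs as a list of (start, end) tuples, from_tuples() as shown above should work the same way with Timestamp values.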