Pandas Period with custom duration


Reading the docs for pandas Period objects leaves me unsure whether it is possible to create a custom Period. By custom I mean a Period that does not follow a fixed frequency; instead, each Period in a PeriodIndex would represent a manually defined start timestamp and end timestamp.

To illustrate:

Let's say I have n time intervals t, where the beginning and end of each interval are samples from a discrete, uniformly distributed random variable, subject to the constraint that beginning < end.

The result would look something like this:

[(ts_start0, ts_end0), (ts_start1, ts_end1), (ts_start2, ts_end2)]

Is there any way to encode such "random" intervals/timespans/periods with pandas Period or something similar?


Solution

  • I think you can use an IntervalIndex, where each element is an Interval with its own start and end:

    In [18]: pd.IntervalIndex([pd.Interval(1,3), pd.Interval(4, 11), pd.Interval(13, 28)])
    Out[18]:
    IntervalIndex([(1, 3], (4, 11], (13, 28]]
                  closed='right',
                  dtype='interval[int64]')
    

    Using timestamps:

    In [25]: pd.IntervalIndex([
        ...:   pd.Interval(pd.to_datetime('2018-01-01'), pd.to_datetime('2018-01-13')),
        ...:   pd.Interval(pd.to_datetime('2018-03-08'), pd.to_datetime('2018-04-29')),
        ...:   pd.Interval(pd.to_datetime('2018-05-03'), pd.to_datetime('2018-07-22'))
        ...: ])
        ...:
    Out[25]:
    IntervalIndex([(2018-01-01, 2018-01-13], (2018-03-08, 2018-04-29], (2018-05-03, 2018-07-22]]
                  closed='right',
                  dtype='interval[datetime64[ns]]')
    

    UPDATE: we can also use the pd.IntervalIndex.from_tuples() constructor, which takes (start, end) tuples directly:

    In [16]: pd.IntervalIndex.from_tuples([(1,3), (4, 11), (13, 28)], closed='right')
    Out[16]:
    IntervalIndex([(1, 3], (4, 11], (13, 28]]
                  closed='right',
                  dtype='interval[int64]')
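
To tie this back to the "random" intervals in the question, here is a minimal sketch that draws start and end timestamps at random and builds the index with pd.IntervalIndex.from_arrays(). The base date, the 365-day range, the fixed seed and the variable names are assumptions made for illustration only; they are not from the question.

```python
import numpy as np
import pandas as pd

# Assumed setup for illustration: 5 intervals drawn from the days of 2018.
rng = np.random.default_rng(0)
n = 5
base = pd.Timestamp("2018-01-01")

# For each interval, draw two distinct day offsets and sort them,
# so that start < end always holds.
offsets = np.sort(
    np.stack([rng.choice(365, size=2, replace=False) for _ in range(n)]),
    axis=1,
)
starts = base + pd.to_timedelta(offsets[:, 0], unit="D")
ends = base + pd.to_timedelta(offsets[:, 1], unit="D")

# Build the index from separate start and end arrays.
intervals = pd.IntervalIndex.from_arrays(starts, ends, closed="right")

# The index can label a Series or DataFrame like any other index.
s = pd.Series(range(n), index=intervals)

# Per-interval duration (end - start) and a point-in-interval check.
durations = intervals.length
mask = intervals.contains(pd.Timestamp("2018-06-15"))

print(intervals)
print(durations)
print(mask)
```

Intervals built this way may overlap, which IntervalIndex allows; intervals.is_overlapping reports whether any of them do.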