python, time-series, pandas, data-analysis

Reindex time-stamped data with date_range


I have a pandas.Series of time-stamped data - basically a sequence of events:

0      2012-09-05 19:28:52
1      2012-09-05 19:28:52
2      2012-09-05 19:44:37
3      2012-09-05 19:44:37
4      2012-09-05 20:04:53
5      2012-09-05 20:04:53
6      2012-09-05 20:12:59
7      2012-09-05 20:13:15
8      2012-09-05 20:13:15
9      2012-09-05 20:13:15

I'd like to create a pandas.TimeSeries over a specific pandas.date_range (e.g. 15min interval; pandas.date_range(start, end, freq='15T')) which holds the count of events for each period. How can this be accomplished?

thanks, Peter


Solution

  • If you use the event timestamps as the index of the series rather than as its data, resample can do this. In the example below, the index of s holds the timestamps and the data is event_id, which is simply the index of your original series.

    In [47]: s
    Out[47]:
                          event_id
    timestamp
    2012-09-05 19:28:52          0
    2012-09-05 19:28:52          1
    2012-09-05 19:44:37          2
    2012-09-05 19:44:37          3
    2012-09-05 20:04:53          4
    2012-09-05 20:04:53          5
    2012-09-05 20:12:59          6
    2012-09-05 20:13:15          7
    2012-09-05 20:13:15          8
    2012-09-05 20:13:15          9
    

    resample (which also works on a DataFrame) returns a new series with, in this case, 15-minute buckets. Each bucket is labeled here by its end time; you can control this with the label argument.

    In [48]: s.resample('15min', closed='right', label='right').count()
    Out[48]:
                          event_id
    timestamp
    2012-09-05 19:30:00          2
    2012-09-05 19:45:00          2
    2012-09-05 20:00:00          0
    2012-09-05 20:15:00          6
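
For readers on a recent pandas release, here is a minimal sketch of the whole approach, starting from the question's data. It assumes a version where resample returns an object with a count() method and where 15-minute bins are left-closed and left-labeled by default, so closed='right' and label='right' are passed explicitly to reproduce the bucket labels shown above. The variable names (events, s, counts, target, aligned) and the 19:00 to 21:00 range are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Event timestamps from the question, stored as the *values* of a Series.
events = pd.Series(pd.to_datetime([
    "2012-09-05 19:28:52", "2012-09-05 19:28:52",
    "2012-09-05 19:44:37", "2012-09-05 19:44:37",
    "2012-09-05 20:04:53", "2012-09-05 20:04:53",
    "2012-09-05 20:12:59", "2012-09-05 20:13:15",
    "2012-09-05 20:13:15", "2012-09-05 20:13:15",
]))

# Move the timestamps into the index; the old integer index becomes the data.
s = pd.Series(
    events.index.to_numpy(),
    index=pd.DatetimeIndex(events, name="timestamp"),
    name="event_id",
)

# Count events per 15-minute bucket. Closing and labeling each bucket on its
# right edge makes an event at 19:28:52 fall under the 19:30:00 label.
counts = s.resample("15min", closed="right", label="right").count()

# Hypothetical target range (assumed start and end): align the counts to it,
# filling buckets that have no events with 0.
target = pd.date_range("2012-09-05 19:00", "2012-09-05 21:00", freq="15min")
aligned = counts.reindex(target, fill_value=0)

print(counts)
print(aligned)
```

Note that reindex only lines up if the date_range falls on the same 15-minute boundaries as the resampled buckets; any bucket whose label is missing from the range is dropped without warning, so it is worth checking that aligned.sum() still equals the number of events.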