Search code examples
pythontime-seriespandasdata-analysis

Convert interval datetime values to any-frequency time series


I have the following data structure:

2011-01-01 00:00, 2011-01-20 00:00, 200   # days-range
2011-01-20 00:00, 2011-03-08 00:00, 1288  # days-range
2011-04-11 00:00, 2012-01-08 00:00, 5987  # days-range

2012-02-01 00:00, 2012-02-01 01:00, 7     # hourly-range
2012-02-01 02:00, 2012-02-01 02:30, 3     # hourly-range

This is interval with start date, end date and value (some metric recorded between dates).

For further data analysis I need to generate time series with required frequency: monthly/daily/hourly/half-hourly time series. For example, hourly data:

2011-01-01 00:00, 2 
2011-01-01 01:00, 6
2011-01-01 02:00, 5
...

Is there any python lib which can help to implement this kind of data transformation?


Solution

  • import pandas as pd
    
    def stretch(start_date, end_date, value, freq):
        freq_dict = {'d': pd.datetools.day,
                     'h': pd.datetools.Hour(1)}
        dr = pd.DateRange(start_date, end_date, offset=freq_dict[freq])
        return pd.TimeSeries(value / dr.size, index=dr)
    
    
    print stretch('2011-01-01 00:00', '2011-01-20 00:00', 200, 'd')
    

    prints

    2011-01-01    10
    2011-01-02    10
    2011-01-03    10
    2011-01-04    10
    2011-01-05    10
    2011-01-06    10
    2011-01-07    10
    2011-01-08    10
    2011-01-09    10
    2011-01-10    10
    2011-01-11    10
    2011-01-12    10
    2011-01-13    10
    2011-01-14    10
    2011-01-15    10
    2011-01-16    10
    2011-01-17    10
    2011-01-18    10
    2011-01-19    10
    2011-01-20    10