python numpy pandas time-series data-analysis

Python key-value list to pandas Series


I have the following time series list in Python:

import datetime

data = [(datetime.datetime(2008, 7, 15, 15, 0), 0.134),
    (datetime.datetime(2008, 7, 15, 16, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 17, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 18, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 19, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 20, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 21, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 22, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 23, 0), 0.0),
    (datetime.datetime(2008, 7, 16, 0, 0), 0.0)]

Each element of this list is a key-value pair: the key is a datetime and the value is the number after the comma. I want to create a pandas Series from the keys (datetimes) and the values (floats). How can I split the list above into two lists (list1 and list2) so that I can create the pandas Series object for further analysis with the following code?

import pandas as pd
ts = pd.Series(list1, list2)

Solution

  • In [34]: pd.Series(*zip(*((b,a) for a,b in data)))
    Out[34]: 
    2008-07-15 15:00:00    0.134
    2008-07-15 16:00:00    0.000
    2008-07-15 17:00:00    0.000
    2008-07-15 18:00:00    0.000
    2008-07-15 19:00:00    0.000
    2008-07-15 20:00:00    0.000
    2008-07-15 21:00:00    0.000
    2008-07-15 22:00:00    0.000
    2008-07-15 23:00:00    0.000
    2008-07-16 00:00:00    0.000
    dtype: float64
    

    Or, eschewing the insane desire to make one-liners:

    dates, vals = zip(*data)
    s = pd.Series(vals, index=dates)
    

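    If the data is extremely long, on Python 2 you can avoid building the intermediate list that zip returns by using itertools.izip. On Python 3 the built-in zip is already lazy and itertools.izip no longer exists, so the two-line version above needs no change:

    import itertools as IT
    dates, vals = IT.izip(*data)  # Python 2 only
    s = pd.Series(vals, index=dates)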
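
For anyone on Python 3, here is a minimal self-contained sketch of the same idea. The three-row `data` list is just a shortened copy of the sample from the question, and the `s_from_dict` variant is an extra suggestion that is not part of the answer above; it assumes the timestamps are unique.

```python
import datetime

import pandas as pd

# Shortened sample in the same shape as the question: (timestamp, value) pairs.
data = [
    (datetime.datetime(2008, 7, 15, 15, 0), 0.134),
    (datetime.datetime(2008, 7, 15, 16, 0), 0.0),
    (datetime.datetime(2008, 7, 15, 17, 0), 0.0),
]

# Python 3: the built-in zip is lazy, so no itertools helper is needed.
# Unpacking yields one tuple of timestamps and one tuple of values.
dates, vals = zip(*data)
s = pd.Series(vals, index=dates)

# Alternative (assumes unique timestamps): build the Series from a dict.
# Duplicate timestamps would be collapsed, keeping only the last value.
s_from_dict = pd.Series(dict(data))

print(s)
```

The dict route is shorter, but it silently drops repeated timestamps, so the zip version is the safer default when duplicates are possible.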