Search code examples
pythonpandaspandas-groupbypandas-resample

resample time data from list data in pandas python


I have a list data like bellow which has 6 things in each element in list

"timestamp , open , high , low , close, volume"
['1606470300000,244.25,244.25,244.00,244.00,489269', 
 '1606470360000,244.05,244.30,244.00,244.15,452854', 
 '1606470420000,244.15,244.20,244.00,244.10,403043', 
 '1606470480000,244.15,244.15,243.95,244.00,370182', 
 '1606470540000,244.05,244.20,244.00,244.15,445286', 
 '1606470600000,244.15,244.25,244.05,244.20,473342', 
 '1606470660000,244.25,244.35,244.00,244.05,491117', 
 '1606470720000,244.05,244.20,244.00,244.20,298261', 
 '1606470780000,244.20,244.25,244.10,244.25,344172', 
 '1606470840000,244.20,244.35,244.20,244.30,347080', 
 '1606470900000,244.30,244.40,244.25,244.30,447630', 
 '1606470960000,244.30,244.30,244.00,244.00,360666', 
 '1606471020000,244.05,244.15,243.95,243.95,467724',
 '1606471080000,243.95,244.10,243.70,244.00,386080', 
 '1606471140000,244.00,244.20,243.70,244.20,166559']

this is a 1 minute data and I want to convert it in 15 minute data

please help


Solution

  • Here is one way to do it using pandas.

    import pandas as pd
    list_data = ['1606470300000,244.25,244.25,244.00,244.00,489269',
     '1606470360000,244.05,244.30,244.00,244.15,452854',
     '1606470420000,244.15,244.20,244.00,244.10,403043',
     '1606470480000,244.15,244.15,243.95,244.00,370182',
     '1606470540000,244.05,244.20,244.00,244.15,445286',
     '1606470600000,244.15,244.25,244.05,244.20,473342',
     '1606470660000,244.25,244.35,244.00,244.05,491117',
     '1606470720000,244.05,244.20,244.00,244.20,298261',
     '1606470780000,244.20,244.25,244.10,244.25,344172',
     '1606470840000,244.20,244.35,244.20,244.30,347080',
     '1606470900000,244.30,244.40,244.25,244.30,447630',
     '1606470960000,244.30,244.30,244.00,244.00,360666',
     '1606471020000,244.05,244.15,243.95,243.95,467724',
     '1606471080000,243.95,244.10,243.70,244.00,386080',
     '1606471140000,244.00,244.20,243.70,244.20,166559']
    
    # Convert to usable a dataframe  
    df = pd.DataFrame(list_data)\
             .loc[:, 0]\
             .str.rsplit(",", expand=True)
    df.columns = ["timestamp" , "open" , "high" , "low" , "close", "volume"]
    df.loc[:, "timestamp"] = pd.to_datetime(df.timestamp.astype(int)*1e6)
    df = df.set_index("timestamp")
    df = df.astype(float)
    
    # That is how it looks like
    df.head()
                           open    high     low   close    volume
    timestamp                                                    
    2020-11-27 09:45:00  244.25  244.25  244.00  244.00  489269.0
    2020-11-27 09:46:00  244.05  244.30  244.00  244.15  452854.0
    2020-11-27 09:47:00  244.15  244.20  244.00  244.10  403043.0
    2020-11-27 09:48:00  244.15  244.15  243.95  244.00  370182.0
    2020-11-27 09:49:00  244.05  244.20  244.00  244.15  445286.0
    
    
    # resample to 15 min data
    df.resample("15min").mean()