Convert Tick data to OHLC with Python (No External Libraries)


Let's say I have data like this:

[
 {'time': 1626459705, 'price': 278.989978},
 {'time': 1626459695, 'price': 279.437975}
]

Note: this is just sample data I created myself. In the actual data there may be any number of transactions per minute, so the number of ticks will vary from minute to minute.

How can I convert it into OHLC candlestick data for, say, 1-, 3-, or 5-minute intervals in Python without using any external library such as Pandas? Is there an easy way to do it?

Thanks in Advance


Solution

  • Here is code that generates random data and creates an OHLC table.

    import random
    import pprint
    
    # Generate random walk data..
    
    base = 1626459705
    price = 278.989978
    data = []
    for i in range(600):
        data.append( {'time':base+10*i, 'price':price} )
        price += random.random() * 3 - 1.5
    print(data)
    
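    # Produce 3 minute intervals.
    
    interval = 180
    
    # start time, open, high, low, close (None until the first tick arrives)
    rec = None
    ohlc = []
    for row in data:
        if rec is None or row['time'] >= rec[0]+interval:
            # This tick starts a new candle; emit the finished one first.
            if rec is not None:
                ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
            rec = [ row['time'], row['price'], row['price'], row['price'], row['price'] ]
        else:
            # Still inside the current candle: update high, low and close.
            rec[2] = max(rec[2],row['price'])
            rec[3] = min(rec[3],row['price'])
            rec[4] = row['price']
    
    # Emit the last (possibly partial) candle.
    if rec is not None:
        ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
    
    pprint.pprint(ohlc)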
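
The loop above hard-codes a 3-minute window. As a minimal sketch, assuming ticks are dicts with 'time' (Unix seconds) and 'price' keys and are sorted oldest first, the same idea can be wrapped in a function that takes the window length in minutes. The name `ticks_to_ohlc`, its parameters, and the sample ticks are illustrative and not part of the original answer. In this sketch a candle closes on the last tick inside its window, and windows start at the first tick of each candle rather than on clock boundaries.

```python
def ticks_to_ohlc(ticks, minutes):
    """Group ticks (oldest first) into candles that span `minutes` minutes."""
    interval = minutes * 60
    candles = []
    current = None
    for tick in ticks:
        t, p = tick['time'], tick['price']
        if current is None or t >= current['time'] + interval:
            # This tick opens a new candle; store the finished one first.
            if current is not None:
                candles.append(current)
            current = {'time': t, 'open': p, 'high': p, 'low': p, 'close': p}
        else:
            current['high'] = max(current['high'], p)
            current['low'] = min(current['low'], p)
            current['close'] = p
    # Store the last (possibly partial) candle.
    if current is not None:
        candles.append(current)
    return candles


# Hypothetical sample: one tick every 30 seconds, price rising by 1 each tick.
ticks = [{'time': 1626459660 + 30 * i, 'price': 100.0 + i} for i in range(12)]

for minutes in (1, 3, 5):
    print(minutes, ticks_to_ohlc(ticks, minutes))
```

Passing 1, 3, or 5 covers the intervals asked about in the question; any other whole number of minutes works the same way.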
    

    FOLLOWUP

    OK, here's a version that works with your data. I copied that file to "mydata.json" and removed the leading "data =". Note that this emits candles on actual 3-minute clock boundaries, rather than starting each window at whichever input row comes next.

    import json
    import time
    
    # Produce 3 minute intervals.
    
    with open('mydata.json') as f:
        data = json.load(f)
    data.reverse()   # input is assumed newest-first; process oldest first
    
    interval = 180
    # Align the first window to a 3-minute clock boundary.
    base = data[0]['time'] // interval * interval
    
    # start time, open, high, low, close
    rec = [ base, data[0]['price'], data[0]['price'], data[0]['price'], data[0]['price'] ]
    
    ohlc = []
    
    i = 0
    while i < len(data):
        row = data[i]
    
        # If this sample is beyond the current window, emit the candle and
        # advance one window without consuming the sample. A window with no
        # samples repeats the previous close.
        if row['time'] >= rec[0]+interval:
            ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
            rec[0] += interval
            rec[1] = rec[2] = rec[3] = rec[4]
        else:
            rec[2] = max(rec[2],row['price'])
            rec[3] = min(rec[3],row['price'])
            rec[4] = row['price']
            i += 1
    
    # Emit the last (possibly partial) candle.
    ohlc.append( dict(zip(('time','open','high','low','close'),rec)) )
    
    for row in ohlc:
        row['ctime'] = time.ctime(row['time'])
        print( "%(ctime)s: %(open)12f %(high)12f %(low)12f %(close)12f" % row )
    

    Sample output:

    Wed Dec 22 22:27:00 2021:   454.427421   454.427421   454.427421   454.427421
    Wed Dec 22 22:30:00 2021:   454.427421   454.427421   454.427421   454.427421
    Wed Dec 22 22:33:00 2021:   454.427421   454.427421   454.427421   454.427421
    Wed Dec 22 22:36:00 2021:   454.427421   457.058452   453.411757   453.411757
    Wed Dec 22 22:39:00 2021:   453.411757   455.199204   452.589304   455.199204
    Wed Dec 22 22:42:00 2021:   455.199204   455.199204   455.199204   455.199204
    Wed Dec 22 22:45:00 2021:   455.199204   455.199204   455.199204   455.199204
    Wed Dec 22 22:48:00 2021:   455.199204   455.768577   455.199204   455.768577
    Wed Dec 22 22:51:00 2021:   455.768577   455.768577   455.768577   455.768577
    Wed Dec 22 22:54:00 2021:   455.768577   455.768577   452.348469   454.374116
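
Flat rows in the sample output (open, high, low and close all equal) arise when a window has no ticks, in which case the loop carries the previous close forward, or when it has only one price. If empty windows should be skipped instead, one alternative is to bucket each tick by its clock-aligned window start. The sketch below is a variant under stated assumptions, not the answer's method: `bucket_ohlc` is a hypothetical name, ticks are dicts with 'time' (Unix seconds) and 'price' keys, and they may arrive in any order because they are sorted first.

```python
def bucket_ohlc(ticks, interval=180):
    """Return clock-aligned candles, skipping windows that have no ticks."""
    buckets = {}
    for tick in sorted(ticks, key=lambda t: t['time']):
        # Floor the timestamp to the start of its window.
        start = tick['time'] // interval * interval
        price = tick['price']
        candle = buckets.get(start)
        if candle is None:
            buckets[start] = {'time': start, 'open': price, 'high': price,
                              'low': price, 'close': price}
        else:
            candle['high'] = max(candle['high'], price)
            candle['low'] = min(candle['low'], price)
            candle['close'] = price
    return [buckets[start] for start in sorted(buckets)]


# The two ticks from the question, listed newest first as in the original.
ticks = [
    {'time': 1626459705, 'price': 278.989978},
    {'time': 1626459695, 'price': 279.437975},
]
print(bucket_ohlc(ticks))
```

Both sample ticks fall in the window that starts at 1626459660, so they produce a single candle that opens at the older price and closes at the newer one.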