Tags: python, performance, optimization, stream, stringio

Fast way to read from StringIO until some byte is encountered


Suppose I have a StringIO object (from cStringIO). I want to read from it until some character/byte is encountered, say 'Z', and get everything up to and including that character:

stringio = StringIO('ABCZ123')
buf = read_until(stringio, 'Z')  # buf is now 'ABCZ'
# stringio.tell() is now 4, pointing after 'Z'

What is the fastest way to do this in Python? Thank you.


Solution

  • I am very disappointed that this question got only one answer on Stack Overflow, because it is an interesting and relevant question. Anyway, since only ovgolovin gave a solution and I thought it might be slow, I came up with a faster one:

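    def foo(stringio):
        datalist = []
        while True:
            # Read a whole chunk at once instead of one character at a time.
            chunk = stringio.read(256)
            i = chunk.find('Z')
            if i == -1:
                # No 'Z' in this chunk: keep all of it and read on.
                datalist.append(chunk)
            else:
                # Keep everything up to and including 'Z'.
                datalist.append(chunk[:i+1])
                # Step back over what was read past 'Z', so that
                # stringio.tell() points just after it.
                stringio.seek(i + 1 - len(chunk), 1)
                break
            if len(chunk) < 256:
                # Short read: end of stream reached without finding 'Z'.
                break
        return ''.join(datalist)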
    

    This reads the stream in chunks (the terminating character may not be in the first chunk). It is very fast because no Python-level function is called for each character; instead, it makes maximal use of Python's C-implemented functions (read, find and join).
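For readers on Python 3, where cStringIO no longer exists, the same chunked idea might look roughly like the sketch below. It is not the code from the answer: it assumes a seekable binary stream such as io.BytesIO and a single-byte terminator, and the names read_until, terminator and chunk_size are chosen here for illustration. It also steps back after the match so that tell() ends up just past the terminator, as the question asks.

```python
import io


def read_until(stream, terminator=b"Z", chunk_size=256):
    """Read up to and including the first terminator byte.

    Assumes a seekable binary stream (e.g. io.BytesIO) and a
    single-byte terminator.
    """
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        i = chunk.find(terminator)
        if i != -1:
            parts.append(chunk[:i + 1])
            # Step back over the bytes read past the terminator.
            stream.seek(i + 1 - len(chunk), io.SEEK_CUR)
            break
        parts.append(chunk)
        if len(chunk) < chunk_size:
            # Short read: end of stream, terminator never found.
            break
    return b"".join(parts)


stream = io.BytesIO(b"ABCZ123")
buf = read_until(stream)
print(buf, stream.tell())  # b'ABCZ' 4
```

A terminator longer than one byte could straddle a chunk boundary, which this sketch does not handle.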

    This runs about 60x faster than ovgolovin's solution; I checked it with timeit.
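A timing comparison of this kind could be set up along the lines below. This is only a sketch for Python 3: read_until_per_byte is a hypothetical per-byte baseline written for illustration and is not ovgolovin's actual code, the payload size and repeat count are arbitrary, and the measured ratio will depend on the data and the machine.

```python
import io
import timeit


def read_until_per_byte(stream, terminator=b"Z"):
    """Hypothetical baseline: one read() call per byte."""
    parts = []
    while True:
        byte = stream.read(1)
        parts.append(byte)
        if byte == terminator or not byte:
            break
    return b"".join(parts)


def read_until_chunked(stream, terminator=b"Z", chunk_size=256):
    """Chunked variant: scan each chunk with bytes.find().

    The stream position is not restored here; only the returned
    data is compared.
    """
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        i = chunk.find(terminator)
        if i != -1:
            parts.append(chunk[:i + 1])
            break
        parts.append(chunk)
        if len(chunk) < chunk_size:
            break
    return b"".join(parts)


# Arbitrary payload: 10,000 filler bytes before the terminator.
DATA = b"A" * 10000 + b"Z" + b"123"


def time_it(func, number=200):
    # Use a fresh BytesIO per call so every run starts at position 0.
    return timeit.timeit(lambda: func(io.BytesIO(DATA)), number=number)


per_byte = time_it(read_until_per_byte)
chunked = time_it(read_until_chunked)
print("per-byte: %.4fs  chunked: %.4fs" % (per_byte, chunked))
```

Both functions return the same bytes for this payload, so the difference in timing comes only from how the stream is scanned.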