How do I clear a StringIO object?

I have a StringIO object that already has some text in it. I'd like to clear its existing contents and reuse it instead of creating a new one. Is there any way of doing this?


  • TL;DR

    Don't bother clearing it; just create a new one. It's faster.

    The method

    Python 2

    Here's how I would find such things out:

    >>> from StringIO import StringIO
    >>> dir(StringIO)
    ['__doc__', '__init__', '__iter__', '__module__', 'close', 'flush', 'getvalue', 'isatty', 'next', 'read', 'readline', 'readlines', 'seek', 'tell', 'truncate', 'write', 'writelines']
    >>> help(StringIO.truncate)
    Help on method truncate in module StringIO:
    truncate(self, size=None) unbound StringIO.StringIO method
        Truncate the file's size.
        If the optional size argument is present, the file is truncated to
        (at most) that size. The size defaults to the current position.
        The current file position is not changed unless the position
        is beyond the new file size.
        If the specified size exceeds the file's current size, the
        file remains unchanged.

    So, you want .truncate(0). But it's probably cheaper (and easier) to initialise a new StringIO. See below for benchmarks.

    Python 3

    (Thanks to tstone2077 for pointing out the difference.)

    >>> from io import StringIO
    >>> dir(StringIO)
    ['__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', 'close', 'closed', 'detach', 'encoding', 'errors', 'fileno', 'flush', 'getvalue', 'isatty', 'line_buffering', 'newlines', 'read', 'readable', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']
    >>> help(StringIO.truncate)
    Help on method_descriptor:
        Truncate size to pos.
        The pos argument defaults to the current file position, as
        returned by tell().  The current file position is unchanged.
        Returns the new absolute position.

    Note that in Python 3 the current file position is left unchanged by truncate, whereas in Python 2 truncating to size zero resets the position.
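    To see the Python 3 behaviour directly, here is a minimal sketch (the variable names are only for illustration) that records the position reported by tell() before and after truncating:

```python
from io import StringIO

s = StringIO()
s.write('foo')             # the position is now 3
pos_before = s.tell()
new_size = s.truncate(0)   # empties the buffer and returns the new size
pos_after = s.tell()       # the position has not moved

print(pos_before, new_size, pos_after, repr(s.getvalue()))
```

    The buffer is empty afterwards, but the position still points past its end, which is why a following write does not start at the beginning.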

    Thus, for Python 2, you only need

    >>> from cStringIO import StringIO
    >>> s = StringIO()
    >>> s.write('foo')
    >>> s.getvalue()
    'foo'
    >>> s.truncate(0)
    >>> s.getvalue()
    ''
    >>> s.write('bar')
    >>> s.getvalue()
    'bar'

    If you do this in Python 3, you won't get the result you expected:

    >>> from io import StringIO
    >>> s = StringIO()
    >>> s.write('foo')
    3
    >>> s.getvalue()
    'foo'
    >>> s.truncate(0)
    0
    >>> s.getvalue()
    ''
    >>> s.write('bar')
    3
    >>> s.getvalue()
    '\x00\x00\x00bar'

    So in Python 3 you also need to reset the position:

    >>> from io import StringIO
    >>> s = StringIO()
    >>> s.write('foo')
    3
    >>> s.getvalue()
    'foo'
    >>> s.truncate(0)
    0
    >>> s.seek(0)
    0
    >>> s.getvalue()
    ''
    >>> s.write('bar')
    3
    >>> s.getvalue()
    'bar'

    If you use the truncate method in Python 2 code, it's safer to call seek(0) at the same time (before or after, it doesn't matter) so that the code won't break when you inevitably port it to Python 3. That is one more reason to just create a new StringIO object!
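    If you do decide to reuse the object anyway, the two calls can be wrapped together. This is only a sketch: clear is a hypothetical helper name, not part of the StringIO API, and it is written and run here against Python 3's io.StringIO. Since it relies only on seek and truncate, the same two calls should behave the same way on the Python 2 classes.

```python
from io import StringIO

def clear(sio):
    """Empty a StringIO in place and rewind it (hypothetical helper)."""
    sio.seek(0)       # move the position back to the start
    sio.truncate(0)   # drop the existing contents
    return sio

s = StringIO()
s.write('foo')
clear(s)
s.write('bar')
print(repr(s.getvalue()))  # 'bar'
```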


    Timing

    Python 2

    >>> from timeit import timeit
    >>> def truncate(sio):
    ...     sio.truncate(0)
    ...     return sio
    >>> def new(sio):
    ...     return StringIO()

    When empty, with StringIO:

    >>> from StringIO import StringIO
    >>> timeit(lambda: truncate(StringIO()))
    >>> timeit(lambda: new(StringIO()))

    With 3KB of data in, with StringIO:

    >>> timeit(lambda: truncate(StringIO('abc' * 1000)))
    >>> timeit(lambda: new(StringIO('abc' * 1000)))

    And the same with cStringIO:

    >>> from cStringIO import StringIO
    >>> timeit(lambda: truncate(StringIO()))
    >>> timeit(lambda: new(StringIO()))
    >>> timeit(lambda: truncate(StringIO('abc' * 1000)))
    >>> timeit(lambda: new(StringIO('abc' * 1000)))

    So, ignoring potential memory concerns (del oldstringio), truncating is faster for StringIO.StringIO (3% faster when empty, 8% faster with 3KB of data), but creating a new object is faster for cStringIO.StringIO (8% faster when empty, 10% faster with 3KB of data). I'd therefore recommend whichever is easiest: presuming you're working with CPython, use cStringIO and create new objects.

    Python 3

    The same code, just with seek(0) put in.

    >>> from timeit import timeit
    >>> def truncate(sio):
    ...     sio.truncate(0)
    ...     sio.seek(0)
    ...     return sio
    >>> def new(sio):
    ...     return StringIO()

    When empty:

    >>> from io import StringIO
    >>> timeit(lambda: truncate(StringIO()))
    >>> timeit(lambda: new(StringIO()))

    With 3KB of data in:

    >>> timeit(lambda: truncate(StringIO('abc' * 1000)))
    >>> timeit(lambda: new(StringIO('abc' * 1000)))

    So for Python 3, creating a new StringIO instead of reusing an empty one is 11% faster, and creating a new one instead of reusing one holding 3KB is 5% faster. Again, create a new StringIO rather than truncating and seeking.
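    Timings like these depend on the machine and the Python version, so it is worth re-running them yourself. Below is a self-contained sketch of the Python 3 comparison as a script. The compare function and its number parameter are my own additions for illustration; truncate and new mirror the functions above.

```python
from io import StringIO
from timeit import timeit

def truncate(sio):
    # Reuse the object: empty it and rewind the position.
    sio.truncate(0)
    sio.seek(0)
    return sio

def new(sio):
    # Ignore the old object and hand back a fresh one.
    return StringIO()

def compare(initial, number=100_000):
    """Time both approaches on StringIO objects built from `initial`."""
    t_truncate = timeit(lambda: truncate(StringIO(initial)), number=number)
    t_new = timeit(lambda: new(StringIO(initial)), number=number)
    return t_truncate, t_new

for label, initial in (('empty', ''), ('3KB', 'abc' * 1000)):
    t_truncate, t_new = compare(initial)
    print(f'{label}: truncate={t_truncate:.3f}s new={t_new:.3f}s')
```

    As in the sessions above, both lambdas pay for constructing the initial StringIO, so only the difference between the two figures is meaningful.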