Python StringIO is not correctly capturing data from stderr

I have written some unittests that analyze data that is logged with the standard python logging function. Using some of the ideas that I found here: Capture stdout from a script in Python about how to capture data from stderr, I have come up with the following script, which I have simplified down to the bare minimum to illustrate a problem that I have encountered. (the loop below is simulates the fact that this function might be called from various unittests)

import logging, sys
from StringIO import StringIO

def get_stderr():
    saved_stderr = sys.stderr

    stderr_string_io = StringIO()
    sys.stderr = stderr_string_io
    try:
        logging.error("Foobar!!!")

    finally:
        # set the stdout and stderr back to their original values
        sys.stderr = saved_stderr

    err_output = stderr_string_io.getvalue()
    return err_output

for x in [1, 2]:
    err_output = get_stderr()
    print  "Run %d: %s" % (x, err_output)

If you run the script it will give the following output, in which the logging output from the second loop iteration is totally lost:

Run 1: ERROR:root:Foobar!!!
Run 2: 
Process finished with exit code 0

While I would expect it to give the following output:

Run 1: ERROR:root:Foobar!!!
Run 2: ERROR:root:Foobar!!!
Process finished with exit code 0

Note: that executing stderr_string_io.close() at the end of the function does not work, as the script then throws an ValueError the next time the function is executed.

Why does this code not behave as expected, and what is the solution to correct this problem?

Solution

When you call

logging.error

it runs

def error(msg, *args, **kwargs):
    if len(root.handlers) == 0:
        basicConfig()
    root.error(msg, *args, **kwargs)

Since there are no root handlers at the start, it runs basicConfig with no arguments, which does:

def basicConfig():
    _acquireLock()
    try:
        if len(root.handlers) == 0:
            h = StreamHandler(None)
            handlers = [h]
            dfs = None
            style = '%'
            fs = kwargs.get("format", _STYLES[style][1])
            fmt = Formatter(fs, dfs, style)
            for h in handlers:
                if h.formatter is None:
                    h.setFormatter(fmt)
                root.addHandler(h)
    finally:
        _releaseLock()

I've removed code that can't run when you have no arguments.

So this has set handlers = [StreamHandler(None)]:

class StreamHandler(Handler):
    def __init__(self, stream=None):
        Handler.__init__(self)
        if stream is None:
            stream = sys.stderr
        self.stream = stream

which means that you have the top level logger permanantly attached to whatever was stdout at the time you called it.

This causes your problem, because you throw away that output. This means that the output will go to a dead StringIO object, and be lost.

One way to deal with this is to go through handlers when updating stderr and also replace anything that refers to stderr:

import logging, sys
from StringIO import StringIO

def get_stderr():
    saved_stderr = sys.stderr
    stderr_string_io = StringIO()

    for handler in logging.root.handlers:
        if handler.stream is sys.stderr:
            handler.stream = stderr_string_io

    sys.stderr = stderr_string_io

    try:
        logging.error("Foobar!!!")

    finally:
        # set the stdout and stderr back to their original values
        for handler in logging.root.handlers:
            if handler.stream is sys.stderr:
                handler.stream = saved_stderr

        sys.stderr = saved_stderr

    err_output = stderr_string_io.getvalue()
    return err_output

for x in [1, 2]:
    err_output = get_stderr()
    print  "Run %d: %s" % (x, err_output)

I don't know how well this will work. It also won't catch any loggers that aren't root loggers. Personally the idea of capturing sys.stdout by value is absurd, and this seems like an inevitable result.