Tags: python, linux, windows, stdout

Difference in buffering of stdout on Linux and Windows


There seems to be a difference in how stdout is buffered on Windows and on Linux when it is written to the console. Consider this small Python script:

import time
for i in xrange(10):
    time.sleep(1)
    print "Working" ,

When running this script on Windows, each "Working" appears one after another, with a one-second wait in between. On Linux we have to wait for 10 seconds and then the whole line appears at once.

If we change the last line to print "Working" (dropping the trailing comma, so that each call ends with a newline), every line appears individually on Linux as well.

So on Linux, stdout seems to be line-buffered, and on Windows not buffered at all. We can switch off the buffering with the -u option (in which case the script behaves on Linux the same way it does on Windows). The documentation says:

-u Force stdin, stdout and stderr to be totally unbuffered.

So it does not actually say that stdin and stdout are buffered when the -u option is absent. Hence my questions:

  1. What is the reason for the different behavior on Linux and Windows?
  2. Is there some kind of guarantee that stdout will be buffered when redirected to a file, no matter which OS? At least this seems to be the case on Windows and Linux.

My main concern is not (as some answers assume) when the information is flushed, but whether stdout is buffered at all: if it isn't, that might be a severe performance hit, so I would like to know whether buffering can be relied on.

Edit: It might be worth noting that in Python 3 the behavior is the same on Linux and Windows (which is not really surprising, because the behavior is configured explicitly through parameters of the print function).


Solution

  • Assuming you're talking about CPython (likely), this has to do with the behaviour of the underlying C implementations.

    The ISO C standard mentions (C11 7.21.3 Files /3) three modes:

    • unbuffered (characters appear as soon as possible);
    • fully buffered (characters appear when the buffer is full); and
    • line buffered (characters appear on newline output).

    There are other triggers that cause the characters to appear (such as buffer filling up even if no newline is output, requesting input under some circumstances, or closing the stream) but they're not important in the context of your question.
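As a rough illustration of the three modes, here is a minimal sketch in Python 3 syntax. It uses a temporary file and the `buffering` argument of the built-in `open` as a stand-in for stdout; that is Python's own I/O layer rather than the C stdio layer the standard describes, so treat it as an analogy. The helper name `sizes_after_each_write` is made up for this example.

```python
import os
import tempfile


def sizes_after_each_write(buffering, chunks):
    """Write each chunk to a temp file and record the on-disk size after
    every write, i.e. how much has actually left the buffer so far."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    sizes = []
    try:
        if buffering == 0:
            # Python 3 only permits buffering=0 (unbuffered) in binary mode.
            stream = open(path, "wb", buffering=0)
            data = [chunk.encode("ascii") for chunk in chunks]
        else:
            # buffering=1 requests line buffering, -1 the default full buffering.
            # newline="" disables newline translation so sizes match everywhere.
            stream = open(path, "w", buffering=buffering, newline="")
            data = chunks
        with stream:
            for item in data:
                stream.write(item)
                sizes.append(os.path.getsize(path))
    finally:
        os.remove(path)
    return sizes


chunks = ["Working ", "Working ", "\n"]
unbuffered = sizes_after_each_write(0, chunks)       # grows with every write
line_buffered = sizes_after_each_write(1, chunks)    # grows only at the newline
fully_buffered = sizes_after_each_write(-1, chunks)  # nothing until flush/close
print(unbuffered, line_buffered, fully_buffered)
```

With such small writes the fully buffered file stays empty until it is closed, which is the same effect as the ten-second delay described in the question.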

    What is important is 7.21.3 Files /7 in that same standard:

    As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

    Note the wiggle room there. Standard output can either be line buffered or unbuffered unless the implementation knows for sure it's not an interactive device.

    In this case (the console), it is an interactive device, so the implementation is not permitted to use full buffering. It is, however, allowed to select either of the other two modes (line buffered or unbuffered).

    Unbuffered output would see the messages appear as soon as you output them (as per your Windows behaviour). Line-buffered would delay until output of a newline character (your Linux behaviour).
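The rule quoted from 7.21.3 /7 can be written down as a small lookup. This is only a sketch in Python 3 syntax: the function name `allowed_initial_modes` is invented for illustration, and using `isatty()` as the test for "interactive device" is an assumption, since the standard does not say how an implementation makes that determination.

```python
import sys


def allowed_initial_modes(is_interactive):
    """Buffering modes C11 7.21.3 /7 permits for stdout as initially opened."""
    if is_interactive:
        # Interactive device: full buffering is ruled out, either other mode is fine.
        return ("line buffered", "unbuffered")
    # Determined not to be interactive (e.g. redirected to a file).
    return ("fully buffered",)


# Assumption: isatty() approximates "refers to an interactive device".
stream_is_interactive = sys.stdout.isatty()
print(stream_is_interactive, allowed_initial_modes(stream_is_interactive))
```

Running this once from a console and once with output redirected to a file shows which branch of the rule applies in each case.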

    If you really want to ensure your messages are flushed regardless of mode, you can just flush them yourself with something like:

    import time, sys
    for i in xrange(10):
        time.sleep(1)
        print "Working",
        sys.stdout.flush()
    print
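For Python 3, where print is a function, the same idea might look like the sketch below. The function name `report_progress` and its `count`, `delay` and `stream` parameters are made up for this example; `end` and `flush` are the actual keyword arguments of the built-in `print`.

```python
import sys
import time


def report_progress(count=10, delay=1.0, stream=None):
    """Print "Working" `count` times on one line, flushing after each one."""
    stream = sys.stdout if stream is None else stream
    for _ in range(count):
        time.sleep(delay)
        # end=" " keeps everything on one line; flush=True pushes the text
        # out immediately, whatever buffering mode the stream is in.
        print("Working", end=" ", file=stream, flush=True)
    print(file=stream)  # finish the line with a newline


report_progress(count=3, delay=0.1)
```

Flushing after every write gives up the performance benefit of buffering, so it is probably best reserved for progress output like this.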
    

    In terms of guaranteeing that output will be buffered when redirecting to a file, that would be covered in the quotes from the standard I've already shown. If the stream can be determined to be using a non-interactive device, it will be fully buffered. That's not an absolute guarantee since it doesn't state how that's determined but I'd be surprised if any implementation couldn't figure that out.

    In any case, you can test specific implementations just by redirecting the output and monitoring the file to see if it flushes once per output or at the end.
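One way to run that experiment is sketched below in Python 3 syntax. It starts a child interpreter with stdout redirected to a temporary file and polls the file size while the child runs. The names `CHILD` and `watch_redirected_output` and the timing values are arbitrary choices for this example, and the sizes you observe depend on your platform and environment (for instance, whether PYTHONUNBUFFERED is set).

```python
import os
import subprocess
import sys
import tempfile
import time

# Child script: three delayed writes without a newline.
CHILD = (
    "import time\n"
    "for _ in range(3):\n"
    "    time.sleep(0.2)\n"
    "    print('Working', end=' ')\n"
)


def watch_redirected_output(extra_args=()):
    """Run CHILD with stdout redirected to a file; return the file sizes
    seen while it was running and the final file content."""
    fd, path = tempfile.mkstemp()
    sizes = []
    try:
        with os.fdopen(fd, "wb") as out:
            proc = subprocess.Popen(
                [sys.executable, *extra_args, "-c", CHILD], stdout=out
            )
            while proc.poll() is None:
                sizes.append(os.path.getsize(path))
                time.sleep(0.05)
        with open(path, "rb") as f:
            content = f.read()
    finally:
        os.remove(path)
    return sizes, content


default_sizes, default_content = watch_redirected_output()
unbuffered_sizes, unbuffered_content = watch_redirected_output(["-u"])
print("default:", sorted(set(default_sizes)))
print("with -u:", sorted(set(unbuffered_sizes)))
```

If the redirected stream is buffered, the file should stay empty until the child exits; with -u it should grow roughly once per write.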