Tags: python, python-2.7, cpython

Python read unicode stdin without batching


If I iterate over sys.stdin in Python, the for loop collects a number of lines before the loop body runs (at least in CPython).

from __future__ import print_function
import sys

for line in sys.stdin:
    print("Echo:", line.strip())

Outputs:

$ python ../test.py 
foo
bar
Echo: foo
Echo: bar

The lines are processed in batches rather than one at a time. I can avoid this by calling readline() through iter():

from __future__ import print_function
import sys

for line in iter(sys.stdin.readline, ''):
    print("Echo:", line.strip())

Outputs:

$ python ../test.py 
foo
Echo: foo
bar
Echo: bar

Which is what I need.

My problem is that I have to read UTF-8 input, and the iter() trick does not work once stdin is wrapped with codecs.getreader.

from __future__ import print_function
import sys
import codecs

sys.stdin = codecs.getreader('utf-8')(sys.stdin)
for line in iter(sys.stdin.readline, ''):
    print("Echo:", line.strip())

Outputs:

$ python ../test.py 
foo
bar
Echo: foo
Echo: bar
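
A plausible explanation for this behaviour, offered as a sketch rather than a confirmed diagnosis: the reader returned by codecs.getreader does not appear to call readline() on the stream it wraps. It requests a fixed-size chunk with read() instead, and on a Python 2 file object such a read can block until the chunk is full or EOF arrives. ProbeStream below is a hypothetical helper, not part of any library, that records which methods the reader calls; io.BytesIO stands in for stdin.

```python
import codecs
import io


class ProbeStream(object):
    """Hypothetical byte stream that records how a reader pulls data from it."""

    def __init__(self, data):
        self._raw = io.BytesIO(data)
        self.read_sizes = []      # size argument of every read() call
        self.readline_calls = 0   # number of readline() calls

    def read(self, size=-1):
        self.read_sizes.append(size)
        return self._raw.read(size)

    def readline(self, size=-1):
        self.readline_calls += 1
        return self._raw.readline(size)


probe = ProbeStream(b"foo\nbar\n")
reader = codecs.getreader("utf-8")(probe)

# Ask the codecs reader for a single line, then look at what it asked the
# underlying stream for.
first_line = reader.readline()
used_chunked_read = len(probe.read_sizes) > 0 and probe.readline_calls == 0
```

Inspecting probe.read_sizes and probe.readline_calls afterwards shows how the bytes were requested. If the reader asks for a chunk rather than a line, an interactive terminal would have to supply more than one line before the first one comes back.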

Is there any way to avoid this batching while reading UTF-8 data from stdin?


Edit: Added import statements for completeness.


Solution

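  • Leave sys.stdin unwrapped, read raw lines with sys.stdin.readline, and decode each line yourself. Both variants assume the imports from the question (print_function and sys). Using a lambda:

    for line in iter(lambda: sys.stdin.readline().decode('utf-8'), ''):
        print("Echo:", line.strip())

    or, decoding in the loop body:

    for line in iter(sys.stdin.readline, ''):
        print("Echo:", line.decode('utf-8').strip())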
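
The decode-per-line idea can also be wrapped in a small generator. This is a minimal sketch and not part of the original answer: iter_decoded_lines is a hypothetical helper name, and io.BytesIO stands in for stdin so the example runs without a terminal. On Python 2 you would pass sys.stdin; on Python 3 the byte stream is sys.stdin.buffer.

```python
import io


def iter_decoded_lines(byte_stream, encoding="utf-8"):
    """Yield decoded lines one at a time from a byte stream (hypothetical helper)."""
    # iter(callable, sentinel) keeps calling readline() until it returns b"",
    # which signals EOF, so each line is yielded as soon as readline() returns.
    for raw_line in iter(byte_stream.readline, b""):
        yield raw_line.decode(encoding)


# io.BytesIO stands in for stdin here (an assumption made for the demo).
fake_stdin = io.BytesIO(u"caf\u00e9\nbar\n".encode("utf-8"))
echoed = [line.strip() for line in iter_decoded_lines(fake_stdin)]
```

Keeping the decoding in one place means the loop body only ever sees unicode strings, while the line-by-line behaviour of readline() is preserved.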