My python script uses subprocess to call a linux utility that is very noisy. I want to store all of the output to a log file and show some of it to the user. I thought the following would work, but the output doesn't show up in my application until the utility has produced a significant amount of output.
# fake_utility.py, just generates lots of output over time
import time
i = 0
while True:
print(hex(i)*512)
i += 1
time.sleep(0.5)
In the parent process:
import subprocess
proc = subprocess.Popen(['python', 'fake_utility.py'], stdout=subprocess.PIPE)
for line in proc.stdout:
# the real code does filtering here
print("test:", line.rstrip())
The behavior I really want is for the filter script to print each line as it is received from the subprocess, like tee
does but within Python code.
What am I missing? Is this even possible?
I think the problem is with the statement for line in proc.stdout
, which reads the entire input before iterating over it. The solution is to use readline()
instead:
#filters output
import subprocess
proc = subprocess.Popen(['python','fake_utility.py'],stdout=subprocess.PIPE)
while True:
line = proc.stdout.readline()
if not line:
break
#the real code does filtering here
print "test:", line.rstrip()
Of course you still have to deal with the subprocess' buffering.
Note: according to the documentation the solution with an iterator should be equivalent to using readline()
, except for the read-ahead buffer, but (or exactly because of this) the proposed change did produce different results for me (Python 2.5 on Windows XP).