I'm trying to parse sar results in Python using subprocess. Every time I run the Python code that calls sar as a subprocess, it reports a different number of lines. Here's a minimal program to reproduce the problem:
import sys
import subprocess
fn = sys.argv[1]
p = subprocess.Popen(str('sar -r -f %s' % fn).split(' '), shell=True, stdout=subprocess.PIPE)
count = 0
while p.poll() is None:
    output = p.stdout.readline()
    count += 1
print "num lines = %d" % count
Calling this program several times produces a different number of lines each time, like this:
for i in {1..10}; do ./sarextractor.py /var/log/sysstat/sa02; done
Calling sar directly on the command line produces a constant number of lines:
for i in {1..10}; do sar -r -f /var/log/sysstat/sa02 | wc -l; done
Any idea how this can happen?
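Probably p.poll() returns a return code while the p.stdout buffer still holds unread data. The loop stops as soon as sar has exited, and how many lines have been read by that point can vary from run to run. Try something like this instead: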
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, bufsize=0)
for line in p.stdout:
    count += 1
That will drain the stdout buffer of all its lines.
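For completeness, here is a minimal self-contained sketch of the same idea: read until EOF first, and only then ask for the exit status. The helper name `count_output_lines` and the stand-in child process are assumptions for illustration, not part of the original program. It also passes the command as a list without `shell=True`, which is one possible way to hand the arguments straight to the program.

```python
import subprocess
import sys


def count_output_lines(argv):
    """Run argv and return (number of stdout lines, exit code)."""
    # A list without shell=True passes the arguments directly to the program.
    p = subprocess.Popen(argv, stdout=subprocess.PIPE)
    count = 0
    # Iterating over the pipe stops only at EOF, i.e. once the child has
    # closed its stdout, so no buffered lines are left unread.
    for _line in p.stdout:
        count += 1
    p.stdout.close()
    # Collect the exit status only after all output has been consumed.
    returncode = p.wait()
    return count, returncode


if __name__ == "__main__":
    # Stand-in child process (an assumption for the demo): prints 1000 lines.
    demo = [sys.executable, "-c", "for i in range(1000): print(i)"]
    print(count_output_lines(demo))
```

For the case in the question, the argument list would be something like `['sar', '-r', '-f', fn]`. Because the loop no longer depends on when the process exits, the count should be the same on every run.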