Tags: python, python-3.x, csv, subprocess

Handling live process output data inside of script?


I am trying to figure out how to handle the live, ongoing output of a process in Python. I was given a script by my teammates, let's call it applicationAPI.py, that does the following things:

  1. establishes a TCP connection to a host device
  2. defines a number of metrics to pull from the application on the host device
  3. outputs those metrics to stdout in CSV format, live, at a defined time interval, in this case every 60 seconds

What I am trying to do is capture this live output data from the process and parse it within my own Python script. I tried to use subprocess with the following code:

import subprocess
import csv
from io import StringIO

api_call = subprocess.run(['./applicationAPI.py', '-target_ip', '-username',
    '-password', 'metric1', 'metric2', 'metric3', '-u', '60'],
    capture_output=True, text=True, check=True)

api_response = api_call.stdout
f = StringIO(api_response)
reader = csv.reader(f)
for line in reader:
    print(line)

But it's not working. The problem is that the process is ongoing: every 60 seconds it keeps writing data to stdout, but it never finishes executing or returns an exit code, so subprocess.run never returns. I tried just redirecting to a file instead, but because it's the same never-ending process, the output file would just grow and grow forever. I don't know how to capture or manipulate this data so I can parse it within my script.
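The blocking behavior described above can be reproduced with a toy producer in place of applicationAPI.py (a hypothetical stand-in; the real script never exits). subprocess.run only hands back stdout after the child terminates, which is why it can never stream:

```python
import subprocess
import sys

# Hypothetical stand-in for applicationAPI.py: emits three CSV rows, then
# exits. The real script would loop forever instead of stopping.
producer = [sys.executable, "-c",
            "for i in range(3):\n"
            "    print(f'metric1,{i}')\n"]

# subprocess.run blocks here until the child terminates. With a process
# that never exits, this call would never return -- and .stdout is only
# available after it does.
result = subprocess.run(producer, capture_output=True, text=True)
print(result.stdout.splitlines())
```

Because the toy producer exits after three rows, run() returns and prints them all at once; swap in a never-ending process and the call hangs forever.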


Solution

  • You could use the lower-level Popen class, which gives you stdout and stderr pipes to read. Redirect stderr to a file when starting the process, then read CSV rows from stdout until it completes. Finally, wait for the process to end and check the return code.

    import subprocess
    import csv
    import tempfile

    with tempfile.TemporaryFile() as stderr_file:

        api_call = subprocess.Popen(['./applicationAPI.py', '-target_ip', '-username',
            '-password', 'metric1', 'metric2', 'metric3', '-u', '60'],
            stdout=subprocess.PIPE, stderr=stderr_file)
        # The pipe yields bytes; decode each line so csv.reader sees text.
        for row in csv.reader(line.decode() for line in api_call.stdout):
            print(row)
        api_call.wait()
        if api_call.returncode != 0:
            stderr_file.seek(0)
            print(stderr_file.read().decode())
    

    This isn't real time, though. Because you are writing to a pipe, not a terminal, the program will block-buffer its stdout: you'll only see output on block-size boundaries (perhaps every 2048 bytes, but it's system-dependent). Some programs include a flag that tells them to flush stdout on every newline. Perhaps this program has one, or it could be implemented?
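    Since applicationAPI.py appears to be a Python script, one way to get line-by-line delivery (an assumption worth testing) is to force the child interpreter to run unbuffered via the PYTHONUNBUFFERED environment variable or the -u flag. A minimal sketch, using a hypothetical toy producer in place of the real script:

    ```python
    import csv
    import os
    import subprocess
    import sys

    # Hypothetical producer standing in for applicationAPI.py: prints one
    # CSV row per iteration. Writing to a pipe, this output would normally
    # be block-buffered.
    producer_code = (
        "import time\n"
        "for i in range(5):\n"
        "    print(f'metric1,{i}')\n"
        "    time.sleep(0.2)\n"
    )

    # PYTHONUNBUFFERED=1 (equivalent to python -u) makes the child flush
    # stdout immediately, so each line reaches the pipe as it is printed.
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    proc = subprocess.Popen([sys.executable, "-c", producer_code],
                            stdout=subprocess.PIPE, text=True, env=env)

    # text=True gives a text-mode pipe, so csv.reader can consume it directly.
    for row in csv.reader(proc.stdout):
        print(row)  # each row arrives as soon as the child prints it

    proc.wait()
    ```

    This only works because the child is a Python process; for an arbitrary binary you would need the program's own flush flag, or a tool such as stdbuf on Linux.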