Tags: python, python-3.x, csv, subprocess

Handling live process output data inside of script?


I am trying to figure out how to handle the live, ongoing output of a process in Python. I was given a script by my teammates, let's call it applicationAPI.py, that does the following things:

  1. establishes a TCP connection to a host device
  2. defines a number of metrics to pull from the application on the host device
  3. outputs those metrics to stdout in CSV format, live, at a defined time interval, in this case every 60 seconds

What I am trying to do is capture this live output data from the process and parse it within my own Python script. I tried to use subprocess with the following code:

import subprocess
import csv
from io import StringIO

api_call = subprocess.run(['./applicationAPI.py', '-target_ip', '-username',
    '-password', 'metric1', 'metric2', 'metric3', '-u', '60'],
    capture_output=True, text=True, check=True)

api_response = api_call.stdout
f = StringIO(api_response)
reader = csv.reader(f)
for line in reader:
    print(line)

But it's not working. The problem is that the process is ongoing: every 60 seconds it keeps writing data to stdout, but it never finishes executing or returns an exit code, so subprocess.run never returns. I tried just redirecting to a file instead, but because it's the same never-ending process, the output file would just grow and grow forever. I don't know how to capture or manipulate this data so I can parse it within my script.
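The blocking behavior described above can be reproduced with a toy producer in place of applicationAPI.py (a hypothetical stand-in; the real script never exits). subprocess.run only hands back stdout after the child terminates, which is why it can never stream:

```python
import subprocess
import sys

# Hypothetical stand-in for applicationAPI.py: emits three CSV rows, then
# exits. The real script would loop forever instead of stopping.
producer = [sys.executable, "-c",
            "for i in range(3):\n"
            "    print(f'metric1,{i}')\n"]

# subprocess.run blocks here until the child terminates. With a process
# that never exits, this call would never return -- and .stdout is only
# available after it does.
result = subprocess.run(producer, capture_output=True, text=True)
print(result.stdout.splitlines())
```

Because the toy producer exits after three rows, run() returns and prints them all at once; swap in a never-ending process and the call hangs forever.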


Solution

  • You could use the lower-level Popen class, which gives you stdout and stderr pipes to read. Redirect stderr to a file when starting the process, then read CSV rows from stdout until it completes. Finally, wait for the process to end and check the return code.

    import subprocess
    import csv
    import tempfile

    with tempfile.TemporaryFile() as stderr_file:

        api_call = subprocess.Popen(['./applicationAPI.py', '-target_ip', '-username',
            '-password', 'metric1', 'metric2', 'metric3', '-u', '60'],
            stdout=subprocess.PIPE, stderr=stderr_file)
        # The pipe yields bytes; decode each line so csv.reader sees text.
        for row in csv.reader(line.decode() for line in api_call.stdout):
            print(row)
        api_call.wait()
        if api_call.returncode != 0:
            stderr_file.seek(0)
            print(stderr_file.read().decode())
    

    This isn't real time, though. Because you are writing to a pipe, not a terminal, the program will block-buffer its stdout: you'll only see output on block-size boundaries (perhaps every 2048 bytes, but it's system-dependent). Some programs include a flag that tells them to flush stdout on every newline. Perhaps this program has one, or it could be implemented?
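    Since applicationAPI.py appears to be a Python script, one way to get line-by-line delivery (an assumption worth testing) is to force the child interpreter to run unbuffered via the PYTHONUNBUFFERED environment variable or the -u flag. A minimal sketch, using a hypothetical toy producer in place of the real script:

    ```python
    import csv
    import os
    import subprocess
    import sys

    # Hypothetical producer standing in for applicationAPI.py: prints one
    # CSV row per iteration. Writing to a pipe, this output would normally
    # be block-buffered.
    producer_code = (
        "import time\n"
        "for i in range(5):\n"
        "    print(f'metric1,{i}')\n"
        "    time.sleep(0.2)\n"
    )

    # PYTHONUNBUFFERED=1 (equivalent to python -u) makes the child flush
    # stdout immediately, so each line reaches the pipe as it is printed.
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    proc = subprocess.Popen([sys.executable, "-c", producer_code],
                            stdout=subprocess.PIPE, text=True, env=env)

    # text=True gives a text-mode pipe, so csv.reader can consume it directly.
    for row in csv.reader(proc.stdout):
        print(row)  # each row arrives as soon as the child prints it

    proc.wait()
    ```

    This only works because the child is a Python process; for an arbitrary binary you would need the program's own flush flag, or a tool such as stdbuf on Linux.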