Tags: python, subprocess

More efficient way to capture output from multiple Python subprocesses (maybe not .communicate()?)


I've got the following block of code. cmdTable is a dict whose keys are strings describing a subprocess to open (like "out_From_hi_mom") and whose values are the executable commands (like "echo hi mom")... something like:

cmdTable['himom'] : "echo hi there momma"

This ultimately builds procOutput["himom"] : "hi there momma"

This all works just fine, but I'm launching about 100 subprocesses and I'm trying to figure out if it's actually running these in parallel. I'm deeply suspicious that it's not, because the log next to the .communicate() call always shows the subprocesses returning in exactly the same order that they were created.

If the debug timestamps are to be believed, the .communicate()'s also return in batches, which doesn't seem like the expected behavior to me anyway...

I was under the impression that I was launching a bunch of subprocesses here more or less simultaneously. The timestamps on the Popen calls support this theory: all ~100 of these launch within a second or so.

The various except blocks have been removed for brevity...

def runShowCommands(cmdTable) -> dict:
    """Return a dictionary of captured output from commands defined in cmdTable."""
    procOutput = {}  # dict to store the output text from show commands
    procHandles = {}
    for cmd in cmdTable.keys():
        try:
            log.debug(f"running subprocess {cmd} -- {cmdTable[cmd]}")
            procHandles[cmd] = subprocess.Popen(cmdTable[cmd], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        except Exception:
            ...  # error handling removed for brevity

    for handle, proc in procHandles.items():
        try:
            procOutput[handle] = proc.communicate(timeout=180)[0].decode("utf-8")  # turn stdout portion into text
            log.debug(f"subprocess returned {handle}")
        except Exception:
            ...  # error handling removed for brevity

    return procOutput

I suppose it's worth mentioning that all of these subprocesses are independent of each other: I do not care in exactly what order they run, and they share no input or output state. My primary goal is to minimize the total wall-clock execution time, and I'm reasonably sure that I'm missing something and these are all running serially as opposed to in parallel.

Is there something here in the Popen and .communicate() usage that I've got wrong? (I inherited this code and will freely admit that it's at the edge of my abilities...)


Solution

  • Popen is not blocking, so all the processes are in fact running in parallel. proc.communicate(), however, blocks until its process terminates, and you are waiting on the processes sequentially: to wait on the second process, the first one has to end first. That's why the log shows them returning in exactly the order they were created.

    To wait on them in parallel, you could use a ThreadPool:

    from multiprocessing.pool import ThreadPool
    
    def handle_proc_stdout(handle):
        procOutput[handle] = procHandles[handle].communicate(timeout=180)[0].decode("utf-8")
        log.debug(f"subprocess returned {handle}")
    
    
    # one worker per process; ThreadPool() defaults to os.cpu_count()
    # workers, which would still wait on the processes in batches
    threadpool = ThreadPool(len(procHandles))
    threadpool.map(handle_proc_stdout, procHandles.keys())
    

    This code should more or less replace your communicate() for loop.
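    Putting the pieces together, here is a self-contained sketch of the whole pattern using concurrent.futures.ThreadPoolExecutor instead of multiprocessing's ThreadPool (either works; the function and command names below are illustrative, not from your code):

    ```python
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    def run_show_commands(cmd_table: dict) -> dict:
        """Launch every command, then wait for all of them concurrently."""
        # Popen returns immediately, so all processes start in parallel.
        proc_handles = {
            name: subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            for name, cmd in cmd_table.items()
        }

        def collect(name):
            # communicate() blocks, but each call runs in its own worker thread.
            out, _err = proc_handles[name].communicate(timeout=180)
            return name, out.decode("utf-8")

        # One worker per process, so no communicate() call has to queue.
        with ThreadPoolExecutor(max_workers=len(proc_handles)) as pool:
            return dict(pool.map(collect, proc_handles))

    # Illustrative usage with argv lists (a plain string like "echo hi mom"
    # would need shell=True on POSIX systems):
    output = run_show_commands({
        "himom": ["echo", "hi there momma"],
        "today": ["echo", "today"],
    })
    print(output["himom"].strip())
    ```

    Note that the original cmdTable values are plain strings; on POSIX, Popen treats a string as a single program name unless you pass shell=True, so argv lists (or shlex.split) are the safer form.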