Which would be better for this ping script: multiprocessing or subprocess?

For work, I was asked to recreate a ping script that we use to check on machines. The old one was written in Perl, and I am more comfortable with Python, so I rewrote it in Python. I have been looking at the differences between multiprocessing and subprocess, and I think multiprocessing would work better: since we have a lot of hostnames, it makes sense to use multiple threads, each pinging a set number of hostnames. But I couldn't quite follow the multiprocessing guides, because I need subprocess to quiet the console output. My question is: how would I go about using multithreading if I am also using subprocess?

My working code is as follows:

import subprocess
from datetime import datetime
import re


hostfile = open("hosts.txt", "r")
lines = hostfile.readlines()

#Create an output file to write output to
with open('Ping_Results.txt', 'a') as f:
   
   
    #Append the current day to the end of the recently opened file
    today = datetime.now()
    dateStr = str(today)
    appendDate = "\n------------------------------------" + " " + dateStr + " " + "-----------------------------------\n"
    f.write(appendDate)

    for i in lines:
        #Ping each device on hostfile with 1 packet with size of 10 bytes. 
        #shell=True to enable multiple arguments for the ping command. capture_output is to capture the output, and text=True. 
        p1 = subprocess.run(['ping' ,'-n', '1', '-l', '10', i], shell= True, capture_output=True)
    
        #Error checking for host unreachable vs host does not exist. Bit buggy
        l = p1.stdout.decode()
        unreach = re.search(r'Destination host unreachable', l)
        
        if unreach:
            status = i.rstrip() + " was not reachable"
            unreachable = (p1.stdout.decode())
            f.write(unreachable)
            #print(status) 
        elif ( p1.returncode ==0 ):
            #if host is able to be pinged and is responsive.
            status = i.rstrip() + " was reachable"
            reachable=(p1.stdout.decode())
            f.write(reachable)
            print(status)
            print(reachable)
       
        else:   
            #If host is not reachable. 
            status = i.rstrip() + " was not found. Check logs for more information"

            #Decode the string to get the readable console output infomation. 
            notFound= (p1.stdout.decode())

            #Writes to the Ping_Results file.
            f.write(notFound)
            #print(status)

This works fine, but it gets a lot slower once I have 10-20 hostnames. I tried splitting up the subprocesses, but I couldn't figure it out. I kept finding examples of people using p1 = subprocess.run(['ping' ,'-n', '1', '-l', '10', i], shell= True, capture_output=True), and that does work, and I like how it works, but I need a way to speed it up so it can handle 200+ hosts. Any links to the proper guides, or any clarity on the topic in general, would be appreciated. Thank you.

I have tried various multiprocessing guides, but I wanted to quiet the console output, and subprocess was the only way I found that worked with how I had it set up. I may be misunderstanding exactly how subprocesses work. In my mind, each subprocess acts like a thread, but I can't seem to figure it out.

Solution

You need subprocess to run the external ping command. As for running multiple pings in parallel, the multiprocessing module implements both thread pools and process pools with very similar interfaces, so they are easy to interchange. Since your code is mostly I/O bound and each ping already runs in a separate process, a thread pool is a good choice. You pick a size for the pool and let it work out the details of how to run the threads.

Put all of the ping code in a function that returns a result instead of writing to the file. Have the pool map that function over your input data, then write the collected output to the file.

import subprocess
from datetime import datetime
import re
import multiprocessing as mp
import multiprocessing.pool

def ping_host(host):
    #Strip the trailing newline that readlines() leaves on each host name.
    host = host.strip()

    #Ping the host with 1 packet with size of 10 bytes (Windows ping flags).
    #The argument list is passed to ping directly, so shell=True is not needed.
    #capture_output=True keeps ping's output off the console so it can be returned.
    p1 = subprocess.run(['ping', '-n', '1', '-l', '10', host], capture_output=True)

    #Decode the captured bytes to get the readable console output information.
    output = p1.stdout.decode()

    #Error checking for host unreachable vs host does not exist. Bit buggy
    if re.search(r'Destination host unreachable', output):
        #Host exists but was not reachable.
        return output

    if p1.returncode == 0:
        #Host is able to be pinged and is responsive.
        status = host + " was reachable"
        print(f"{status}\n{output}")
        return output

    #Host was not found. The caller writes the output to the Ping_Results file.
    return output



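#Read the host names, skipping blank lines, and close the file afterwards
with open("hosts.txt", "r") as hostfile:
    lines = [line.strip() for line in hostfile if line.strip()]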

#Create an output file to write output to
with open('Ping_Results.txt', 'a') as f:
    #Append the current day to the end of the recently opened file
    today = datetime.now()
    dateStr = str(today)
    appendDate = "\n------------------------------------" + " " + dateStr + " " + "-----------------------------------\n"
    f.write(appendDate)

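    #Run up to 20 pings at a time. map() returns the results in the same order as lines.
    #max(1, ...) avoids asking for a pool of size 0 when the host list is empty.
    with mp.pool.ThreadPool(max(1, min(len(lines), 20))) as pool:
        #Write to the results file f (hostfile is the read-only input file)
        f.writelines(pool.map(ping_host, lines))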
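
The same pattern can also be written with concurrent.futures.ThreadPoolExecutor from the standard library. The following is only a minimal sketch and not part of the code above: run_quietly, run_all and ping_command are hypothetical helper names, and the demo at the bottom runs the Python interpreter as a stand-in command so it works on any machine without sending real pings. Swap in ping_command(host) to build the Windows ping command from the question.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_quietly(command):
    """Run one external command and capture its output instead of printing it."""
    done = subprocess.run(command, capture_output=True, text=True)
    return done.returncode, done.stdout


def run_all(commands, max_workers=20):
    """Run the commands in a thread pool. Results come back in input order."""
    if not commands:
        return []
    workers = min(len(commands), max_workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread just waits on its own child process,
        # so several commands are in flight at the same time.
        return list(pool.map(run_quietly, commands))


def ping_command(host):
    """Build the Windows ping command from the question (hypothetical helper)."""
    return ["ping", "-n", "1", "-l", "10", host]


if __name__ == "__main__":
    # Stand-in commands (assumption: used here only so the sketch runs anywhere).
    demo = [[sys.executable, "-c", f"print('host{n}')"] for n in range(3)]
    for code, out in run_all(demo):
        print(code, out.strip())
```

With real hosts, the call would look like run_all([ping_command(h) for h in hosts]), and the returned (returncode, output) pairs can be written to the results file in one place after the pool has finished.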