Search code examples
pythonpython-multiprocessingparamiko

multiprocessing over dictionary elements


Objective:

  1. ssh to each device listed in an excel file, execute the command, capture the output and present the output in a tabular format (output table Column A: devicename, Column B: command output)
  2. optimize the processing time with the use of multiprocessing to scale for 100s of devices

So far I am able to achieve this using the following code, however I feel there should be a better way of doing this. reference

Problem statement:

  1. my input is a dictionary instead of a list, as a workaround I converted it to list in the code below.
  2. the target function need elements of dictionary as input
  3. the output from each process (switchname and the command output) should be returned to main process to form the dataframe

Sample Table from excel file

DeviceName DeviceIPaddress
router1121 192.168.1.1
router1122 192.168.1.2
router1131 192.168.1.3
router1132 192.168.1.4

Sample Table converted to Dictionary

{0: {'DeviceName': 'router1121', 'DeviceIPaddress': '192.168.1.1'}, 1: {'DeviceName': 'router1122', 'DeviceIPaddress': '192.168.1.2'}, 2: {'DeviceName': 'router1131', 'DeviceIPaddress': '192.168.1.3'}, 3: {'DeviceName': 'router1132', 'DeviceIPaddress': '192.168.1.4'}}

Working code that needs optimization and simplification

import multiprocessing
import paramiko
from collections import defaultdict
import pandas as pd
import stdiomask

ssh = paramiko.SSHClient()

def connect_device(admin_username, admin_password, devicename, ipaddress):
    print(f"Trying to connect to {devicename}")
    
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    ssh.connect(hostname=ipaddress, port=22, allow_agent=False, username=admin_username, password=admin_password)

    print(f"Connection established to {devicename}")

    command = "show interface description | i Gi1/0/23"
    stdin, stdout, stderr = ssh.exec_command(command)
    out = stdout.read()
    ssh.close()
    return devicename, out

if __name__ == '__main__':
    
    df = exceltodataframe('Deviceinfo.xlsx', 'Devicedetails')

    # create dictionaries from the dataframes
    dfdict = df.T.to_dict()
    mylist = list(dfdict.items())

    admin_username = input("Enter the username: ")
    admin_password = stdiomask.getpass("Enter the password: ")

    def_dct_all = defaultdict(dict)
    
    with concurrent.futures.ProcessPoolExecutor() as executor:
        
        results = [executor.submit(connect_device, admin_username, admin_password, mylist[idx][1]['DeviceName'], mylist[idx][1]['DeviceIPaddress']) for idx, n in enumerate(mylist)]

        for index, f in enumerate(concurrent.futures.as_completed(results)):
            def_dct_all[index]['DeviceName'] = f.result()[0]
            def_dct_all[index]['Description'] = f.result()[1]

    df = pd.DataFrame(def_dct_all).T
    print(df)

Solution

  • I was able to find another better way to achieve the objective by using Pool class of multiprocessing.

    import multiprocessing
    import paramiko
    from collections import defaultdict
    import pandas as pd
    import stdiomask
    
    ssh = paramiko.SSHClient()
    
    def connect_device(admin_username, admin_password, devicename, ipaddress):
    
        devicename = mylist[1]['DeviceName']
        ipaddress = mylist[1]['DeviceIPaddress']
    
        print(f"Trying to connect to {devicename}")
        
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    
        ssh.connect(hostname=ipaddress, port=22, allow_agent=False, username=admin_username, password=admin_password)
    
        print(f"Connection established to {devicename}")
    
        command = "show interface description | i Gi1/0/23"
        stdin, stdout, stderr = ssh.exec_command(command)
        out = stdout.read()
        ssh.close()
        return devicename, out
    
    if __name__ == '__main__':
        
        df = exceltodataframe('Deviceinfo.xlsx', 'Devicedetails')
    
        # create dictionaries from the dataframes
        dfdict = df.T.to_dict()
        mylist = list(dfdict.items())
    
        admin_username = input("Enter the username: ")
        admin_password = stdiomask.getpass("Enter the password: ")
    
        p = multiprocessing.Pool()
        result = p.map(connect_device, mylist)
    
        def_dct_all = defaultdict(dict)
        
        for r in range(len(result)):
            def_dct_all[r]['DeviceName'] = result[r][0]
            def_dct_all[r]['Description'] = result[r][1].decode("utf-8").rstrip()
    
        resultdf = pd.DataFrame(def_dct_all).T
        print(resultdf)