Tags: python, sql, loops, impala, multiprocess

Running Python multiprocessing processes from a dynamic SQL list


Hi, I'm trying to make my code more dynamic. Instead of hardcoding the functions I want to run, I want to call them from a dynamically generated list. This will clean up the code and help with automatically rerunning failed scripts.

Below is a snippet of the code I have been working on. When I run it, I get this error:

TypeError: 'str' object is not callable

    cursor = conn.cursor()
    cursor.execute(
        f'''select distinct caller from {db}.log a where a.log_text like 'Failed:%' and a.log_time > DATE_TRUNC('DAY', NOW()) and caller not in (select caller from {db}.log a where a.log_text like 'Done' and a.log_time > DATE_TRUNC('DAY', NOW()))''')
    df = as_pandas(cursor)
    print('The following scripts will be rerun')
    print(df)

    c = df['caller']

    processes = []
    # Loop over failed scripts/modules
    for mod in c:
        print(f'Rerun of {mod}')
        p = multiprocessing.Process(target=mod, args=(db,))
        time.sleep(10)
        p.start()
        processes.append(p)

    for process in processes:
        process.join()

The full traceback:

    Traceback (most recent call last):
      File "/home/xxx/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/xxx/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
    TypeError: 'str' object is not callable
    Process Process-2:
    Traceback (most recent call last):
      File "/home/xxx/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
        self.run()
      File "/home/xxx/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
        self._target(*self._args, **self._kwargs)
    TypeError: 'str' object is not callable

    Process finished with exit code 0
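
The traceback points at `self._target(*self._args, **self._kwargs)`: `Process.run` simply calls whatever was passed as `target`, and here that is a string read from the `caller` column. The same failure can be reproduced without a database or subprocess. This is a minimal sketch, and `st_load` is a hypothetical stand-in for one of the scripts, not a name from the original code:

```python
def st_load(db):
    # Hypothetical stand-in for one of the failed scripts.
    return f"st_load ran against {db}"


def call_target(target, db):
    # Mirrors what Process.run does: call the target with the given args.
    return target(db)


def try_call(target, db):
    # Return the result, or the error message if the target is not callable.
    try:
        return call_target(target, db)
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_call("st_load", "dev"))  # the name as a string
    print(try_call(st_load, "dev"))    # the function object itself
```

Passing the name as a string fails exactly as in the traceback, while passing the function object works, so the names from the query have to be resolved to callables before they are handed to `multiprocessing.Process`.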


Solution

  • I found a solution to the problem and am posting it here in case it helps anyone in the future.

    Basically, I use the getattr and importlib.import_module functions to turn each name into a callable, then start the callables in a for loop.

    An SQL query returns the names of all failed modules. These are loaded into a DataFrame, and a for loop iterates over it and starts each failed module.

        import importlib

        # IN only matches exact strings, so the prefix filters use LIKE instead
        cursor.execute(
            f'''select distinct caller from {db}.sch_log_python a
                where a.log_text like 'Failed:%'
                and (lower(caller) like 'st_%' or lower(caller) like 'ctrl_%' or lower(caller) like 'ar%')
                and a.log_time > DATE_TRUNC('DAY', NOW())
                and caller not in (select caller from {db}.sch_log_python a
                                   where a.log_text like 'Done'
                                   and a.log_time > DATE_TRUNC('DAY', NOW()))''')
        df = as_pandas(cursor)
        print('The following scripts will be rerun')
        print(df)

        sch_log_func(caller, 'Rerun of failed scripts', db)

        # Loop over the failed scripts/modules, one name per row
        for name in df['caller']:
            cstr = str(name).strip().lower()
            print('Initialise ' + cstr)
            print(cstr + ' Module to import')
            print('Trying to run ' + cstr)
            # Each module is assumed to define a function with the same name as the module
            cls = getattr(importlib.import_module(cstr), cstr)
            cls(db)
            print(f'Started {cstr}')
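
The loop above calls each module's function directly, so the reruns happen one after another. To run them in parallel as in the question, the same getattr and importlib.import_module lookup can be moved into a small worker function that `multiprocessing.Process` accepts as its target. The following is only a sketch under assumptions: `resolve`, `run_module` and `rerun` are hypothetical helper names, each module is assumed to define a function with the same name as the module, and the stdlib module `pprint` stands in for a failed script because it happens to follow that convention.

```python
import importlib
import multiprocessing


def resolve(name):
    # Import the module called `name` and return its function of the same name.
    module = importlib.import_module(name)
    return getattr(module, name)


def run_module(name, db):
    # Worker: runs inside the child process, so the import happens there too.
    resolve(name)(db)


def rerun(names, db):
    # Start one process per module name, then wait for all of them.
    processes = []
    for name in names:
        p = multiprocessing.Process(target=run_module, args=(name.lower(), db))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    # A nonzero exit code means the module raised an exception again.
    return [p.exitcode for p in processes]


if __name__ == "__main__":
    # 'pprint' is a stand-in for a failed script name from the query.
    print(rerun(["pprint"], "dev"))
```

With the query result from the answer, the call would look like `rerun(df['caller'], db)`. Passing the name to a top-level worker, instead of passing the resolved function itself, keeps the target picklable regardless of the process start method.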