Best multiprocessing practice for a dependent loop nested inside another loop


This is a pretty basic question but my understanding of multiprocessing is not great. I have a program that has two generators, both of which generate a very long iterable. My code is of the form

def generator1():
    # Do heavy computations and yield objects
    ...

def generator2(results1):
    for result in results1:
        # Do heavy computations and yield objects
        ...

if __name__ == '__main__':
    for result in generator2(generator1()):
        # Do some checks. If result is what we want, then terminate the program.
        ...

Even though both generators might take longer than the universe's lifespan to terminate, theoretically it should take much less time to obtain a result that I'm looking for.

Ideally I would like n jobs to run generator2 with the first result from generator1, then n jobs to run with the second result, etc, as and when generator1 yields results. What is the best way to do this? I have been playing around with Queues and Processes, but I'm confused as to what approach I should be using.


Original post: For simplicity, let's say we have the following code:

from random import random


def generator1():
    for _ in range(int(1e100)):
        yield random()


def generator2(args):
    num1, num2 = args

    for i in range(int(1e100)):
        yield hash(str(num1**num2))


if __name__ == '__main__':
    for num1 in generator1():
        for num2 in generator1():
            for result in generator2((num1, num2)):
                ...

I would like to use multiprocessing in the code. I suppose a very simple way to do it would be something like

import multiprocessing as mp

...

if __name__ == '__main__':
    with mp.Pool() as pool:
        results = pool.imap(generator2, 
                            ((num1, num2) for num1 in generator1() for num2 in generator1()))
        for r in results:
            ...

... but ideally I would like generator2 with different values for num1 to be run simultaneously. For example, suppose I would like 2 processes to work on arguments (num1_a, ...), 2 of them to work on arguments (num1_b, ...), etc. I'm confused as to what approach I should use: should I be using different processes, or something like pool.apply_async() etc?


Solution

  • I was hoping to be able to ignore your original post, but I believe that it has important details that need to be considered. I also hope that the following is not too overwhelming:

    1. You are generating potentially a very large number of tasks to be computed in parallel, so many in fact that the program would run until doomsday before all the calculations have completed.
    2. You are expecting one of the calculations (more than one?) to yield a result that will cause you to stop doing any more calculations.

    With that as my premise, this is how I would proceed:

    First, let's observe that in your non-multiprocessing code you had:

    def generator1():
        for _ in range(int(1e100)):
            yield random()
    
    ... # Code omitted:
    
    if __name__ == '__main__':
        for num1 in generator1():
            for num2 in generator1():
                for result in generator2((num1, num2)):
                    ...
    

    It seems to me that instead of iterating generator1 twice, you can iterate just once if you rewrite the code as follows:

    def generator1():
        for _ in range(int(1e100)):
            num1 = random()
            for _ in range(int(1e100)):
                num2 = random()
                yield num1, num2
    
    ... # Code omitted:
    
    if __name__ == '__main__':
        for num1, num2 in generator1():
            for result in generator2((num1, num2)):
                ...
    

    To make it clearer how things work, I will be making the following changes:

    1. My generator1 will only be iterating each loop N times where N has been set to 4.

    2. Rather than producing two random numbers, I will be generating input tuples as follows:

    for num1 in range(N):
        for num2 in range(N):
            # Generate (num1, num2)
    
    3. Likewise, generator2 will complete its calculation by also iterating over values num3 in the range 0..N-1.

    4. The results that generator2 generates will be num1 ** num2 + num3 instead of a hash of a string, so that the program produces repeatable results.

    You will, of course, modify the above to fit your actual problem.

    I am assuming that each tuple generator1 produces is relatively inexpensive to produce (e.g. random numbers), since your original solution using multiprocessing.pool.Pool.imap did not attempt to parallelize generator1. Also, since imap returns results in the order in which the tasks were submitted to the pool, whereas the method I will be using does not guarantee any order, I will also be returning the three values that went into the result calculation.

    generator1, instead of being an actual generator, will be a process that writes its tuples to a multiprocessing.Queue instance. If there are multiprocessing.cpu_count() processors available and generator1 is running on one of them, that leaves n_processes = multiprocessing.cpu_count() - 1 processes to execute generator2, each getting tuples from the queue and generating the final results. These results get written to a second, output queue. When generator1 has no more tuples to generate (which presumably will be when hell freezes over in your real-world case, but not in my demo), it writes n_processes sentinel items (the value None) to its queue, signifying that no more data will be produced. When each of the n_processes generator2 processes retrieves a sentinel item instead of a tuple, it terminates. Likewise, each of those processes writes a sentinel value to the output queue before terminating, so that once the main process has retrieved all n_processes sentinel items it knows that no more data is coming and it too can terminate.

    The reality of your situation is, however, that when the main process retrieves a result created by generator2 that satisfies some termination criteria, the main process will just terminate all the other running processes. generator2 presumably also knows what a satisfactory result is and should not even bother outputting unsatisfactory results. Again, for demo purposes, I have generator2 putting to the output queue all results and the main process running until all results have been retrieved and printed.

    This is the basic idea:

    from multiprocessing import Process, Queue, cpu_count
    
    N = 4
    
    def generator1(tuples_queue: Queue, n_processes: int) -> None:
        for num1 in range(N):
            for num2 in range(N):
                tuples_queue.put((num1, num2))
    
        for _ in range(n_processes):
            # One sentinel per generator2 process to show there is no more data:
            tuples_queue.put(None)
    
    def generator2(tuples_queue: Queue, results_queue: Queue) -> None:
        # Loop until tuples_queue.get() returns None:
        for t in iter(tuples_queue.get, None):
            num1, num2 = t
            value = num1 ** num2
            for num3 in range(N):
                result = value + num3
                # In addition to putting the result, we also for demo
                # purposes put the 3 input values used to get the result:
                results_queue.put((num1, num2, num3, result))
    
        results_queue.put(None) # Show no more data from this process
    
    
    def main():
        n_processes = cpu_count() - 1
    
        # Queue of finite size because generator1 can generate values
        # faster than generator2 can process them:
        tuples_queue = Queue(n_processes)
        results_queue = Queue()
    
    
        p1 = Process(target=generator1, args=(tuples_queue, n_processes))
        p1.start()
    
        # rest of the processes to work on output from generator1:
        processes = [
            Process(target=generator2, args=(tuples_queue, results_queue))
            for _ in range(n_processes)
        ]
        for p in processes:
            p.start()
    
        # Get the results:
        seen_end_count = 0
        while seen_end_count < n_processes:
            t = results_queue.get()
            if t is None:
                seen_end_count += 1
            else:
                num1, num2, num3, result = t
                # Is this a result we want to terminate on?
                if False:
                    # Got the result we want
                    print(f'num1 = {num1}, num2 = {num2}, num3 = {num3}, result = {result}', flush=True)
                    # Terminate:
                    break
                else:
                    # Normally we would not want to display results
                    # we are not interested in, but we do so here for demo purposes:
                    print(f'num1 = {num1}, num2 = {num2}, num3 = {num3}, result = {result}', flush=True)
    
        # Kill the processes
        p1.terminate()
        for p in processes:
            p.terminate()
    
    if __name__ == '__main__':
        main()
    

    Prints:

    num1 = 0, num2 = 0, num3 = 0, result = 1
    num1 = 0, num2 = 0, num3 = 1, result = 2
    num1 = 0, num2 = 1, num3 = 0, result = 0
    num1 = 0, num2 = 1, num3 = 1, result = 1
    num1 = 0, num2 = 1, num3 = 2, result = 2
    num1 = 0, num2 = 0, num3 = 2, result = 3
    num1 = 0, num2 = 1, num3 = 3, result = 3
    num1 = 1, num2 = 1, num3 = 0, result = 1
    num1 = 1, num2 = 1, num3 = 1, result = 2
    num1 = 1, num2 = 1, num3 = 2, result = 3
    num1 = 1, num2 = 1, num3 = 3, result = 4
    num1 = 3, num2 = 0, num3 = 0, result = 1
    num1 = 0, num2 = 0, num3 = 3, result = 4
    num1 = 3, num2 = 0, num3 = 1, result = 2
    num1 = 3, num2 = 0, num3 = 2, result = 3
    num1 = 0, num2 = 2, num3 = 0, result = 0
    num1 = 3, num2 = 0, num3 = 3, result = 4
    num1 = 0, num2 = 2, num3 = 1, result = 1
    num1 = 0, num2 = 2, num3 = 2, result = 2
    num1 = 0, num2 = 2, num3 = 3, result = 3
    num1 = 0, num2 = 3, num3 = 0, result = 0
    num1 = 0, num2 = 3, num3 = 1, result = 1
    num1 = 3, num2 = 2, num3 = 0, result = 9
    num1 = 3, num2 = 2, num3 = 1, result = 10
    num1 = 3, num2 = 2, num3 = 2, result = 11
    num1 = 3, num2 = 2, num3 = 3, result = 12
    num1 = 0, num2 = 3, num3 = 2, result = 2
    num1 = 0, num2 = 3, num3 = 3, result = 3
    num1 = 1, num2 = 0, num3 = 0, result = 1
    num1 = 1, num2 = 0, num3 = 1, result = 2
    num1 = 1, num2 = 0, num3 = 2, result = 3
    num1 = 1, num2 = 0, num3 = 3, result = 4
    num1 = 1, num2 = 2, num3 = 0, result = 1
    num1 = 1, num2 = 2, num3 = 1, result = 2
    num1 = 1, num2 = 2, num3 = 2, result = 3
    num1 = 1, num2 = 2, num3 = 3, result = 4
    num1 = 1, num2 = 3, num3 = 0, result = 1
    num1 = 1, num2 = 3, num3 = 1, result = 2
    num1 = 1, num2 = 3, num3 = 2, result = 3
    num1 = 1, num2 = 3, num3 = 3, result = 4
    num1 = 2, num2 = 0, num3 = 0, result = 1
    num1 = 2, num2 = 0, num3 = 1, result = 2
    num1 = 2, num2 = 0, num3 = 2, result = 3
    num1 = 2, num2 = 0, num3 = 3, result = 4
    num1 = 2, num2 = 1, num3 = 0, result = 2
    num1 = 2, num2 = 1, num3 = 1, result = 3
    num1 = 2, num2 = 1, num3 = 2, result = 4
    num1 = 2, num2 = 1, num3 = 3, result = 5
    num1 = 2, num2 = 2, num3 = 0, result = 4
    num1 = 2, num2 = 2, num3 = 1, result = 5
    num1 = 2, num2 = 2, num3 = 2, result = 6
    num1 = 2, num2 = 2, num3 = 3, result = 7
    num1 = 2, num2 = 3, num3 = 0, result = 8
    num1 = 2, num2 = 3, num3 = 1, result = 9
    num1 = 2, num2 = 3, num3 = 2, result = 10
    num1 = 2, num2 = 3, num3 = 3, result = 11
    num1 = 3, num2 = 1, num3 = 0, result = 3
    num1 = 3, num2 = 1, num3 = 1, result = 4
    num1 = 3, num2 = 1, num3 = 2, result = 5
    num1 = 3, num2 = 1, num3 = 3, result = 6
    num1 = 3, num2 = 3, num3 = 0, result = 27
    num1 = 3, num2 = 3, num3 = 1, result = 28
    num1 = 3, num2 = 3, num3 = 2, result = 29
    num1 = 3, num2 = 3, num3 = 3, result = 30
    

    You can see that results are not generated in task submission order. If we sort the output, we get:

    num1 = 0, num2 = 0, num3 = 0, result = 1
    num1 = 0, num2 = 0, num3 = 1, result = 2
    num1 = 0, num2 = 0, num3 = 2, result = 3
    num1 = 0, num2 = 0, num3 = 3, result = 4
    num1 = 0, num2 = 1, num3 = 0, result = 0
    num1 = 0, num2 = 1, num3 = 1, result = 1
    num1 = 0, num2 = 1, num3 = 2, result = 2
    num1 = 0, num2 = 1, num3 = 3, result = 3
    num1 = 0, num2 = 2, num3 = 0, result = 0
    num1 = 0, num2 = 2, num3 = 1, result = 1
    num1 = 0, num2 = 2, num3 = 2, result = 2
    num1 = 0, num2 = 2, num3 = 3, result = 3
    num1 = 0, num2 = 3, num3 = 0, result = 0
    num1 = 0, num2 = 3, num3 = 1, result = 1
    num1 = 0, num2 = 3, num3 = 2, result = 2
    num1 = 0, num2 = 3, num3 = 3, result = 3
    num1 = 1, num2 = 0, num3 = 0, result = 1
    num1 = 1, num2 = 0, num3 = 1, result = 2
    num1 = 1, num2 = 0, num3 = 2, result = 3
    num1 = 1, num2 = 0, num3 = 3, result = 4
    num1 = 1, num2 = 1, num3 = 0, result = 1
    num1 = 1, num2 = 1, num3 = 1, result = 2
    num1 = 1, num2 = 1, num3 = 2, result = 3
    num1 = 1, num2 = 1, num3 = 3, result = 4
    num1 = 1, num2 = 2, num3 = 0, result = 1
    num1 = 1, num2 = 2, num3 = 1, result = 2
    num1 = 1, num2 = 2, num3 = 2, result = 3
    num1 = 1, num2 = 2, num3 = 3, result = 4
    num1 = 1, num2 = 3, num3 = 0, result = 1
    num1 = 1, num2 = 3, num3 = 1, result = 2
    num1 = 1, num2 = 3, num3 = 2, result = 3
    num1 = 1, num2 = 3, num3 = 3, result = 4
    num1 = 2, num2 = 0, num3 = 0, result = 1
    num1 = 2, num2 = 0, num3 = 1, result = 2
    num1 = 2, num2 = 0, num3 = 2, result = 3
    num1 = 2, num2 = 0, num3 = 3, result = 4
    num1 = 2, num2 = 1, num3 = 0, result = 2
    num1 = 2, num2 = 1, num3 = 1, result = 3
    num1 = 2, num2 = 1, num3 = 2, result = 4
    num1 = 2, num2 = 1, num3 = 3, result = 5
    num1 = 2, num2 = 2, num3 = 0, result = 4
    num1 = 2, num2 = 2, num3 = 1, result = 5
    num1 = 2, num2 = 2, num3 = 2, result = 6
    num1 = 2, num2 = 2, num3 = 3, result = 7
    num1 = 2, num2 = 3, num3 = 0, result = 8
    num1 = 2, num2 = 3, num3 = 1, result = 9
    num1 = 2, num2 = 3, num3 = 2, result = 10
    num1 = 2, num2 = 3, num3 = 3, result = 11
    num1 = 3, num2 = 0, num3 = 0, result = 1
    num1 = 3, num2 = 0, num3 = 1, result = 2
    num1 = 3, num2 = 0, num3 = 2, result = 3
    num1 = 3, num2 = 0, num3 = 3, result = 4
    num1 = 3, num2 = 1, num3 = 0, result = 3
    num1 = 3, num2 = 1, num3 = 1, result = 4
    num1 = 3, num2 = 1, num3 = 2, result = 5
    num1 = 3, num2 = 1, num3 = 3, result = 6
    num1 = 3, num2 = 2, num3 = 0, result = 9
    num1 = 3, num2 = 2, num3 = 1, result = 10
    num1 = 3, num2 = 2, num3 = 2, result = 11
    num1 = 3, num2 = 2, num3 = 3, result = 12
    num1 = 3, num2 = 3, num3 = 0, result = 27
    num1 = 3, num2 = 3, num3 = 1, result = 28
    num1 = 3, num2 = 3, num3 = 2, result = 29
    num1 = 3, num2 = 3, num3 = 3, result = 30
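
    In the demo the early-exit branch never fires because the test is hard-coded as `if False:`. The following is a minimal sketch of what the main process's loop might look like with a real check. The names `is_wanted`, `TARGET` and `drain_until_match` are hypothetical stand-ins for your own termination criterion, and a plain `queue.Queue` is used here only so the sketch runs without starting processes; `multiprocessing.Queue` offers the same blocking `get()` call.

```python
from queue import Queue  # stand-in for multiprocessing.Queue (same get() usage)

TARGET = 12  # hypothetical: the value we are searching for


def is_wanted(result: int) -> bool:
    """Hypothetical termination criterion; replace with your own check."""
    return result == TARGET


def drain_until_match(results_queue, n_processes: int):
    """Read results until one satisfies is_wanted() or until every worker
    has sent its None sentinel. Returns the matching tuple, or None."""
    seen_end_count = 0
    while seen_end_count < n_processes:
        t = results_queue.get()
        if t is None:
            seen_end_count += 1
        elif is_wanted(t[3]):  # t is (num1, num2, num3, result)
            return t
    return None


if __name__ == '__main__':
    q = Queue()
    for item in [(3, 1, 0, 3), (3, 2, 3, 12), (3, 3, 0, 27), None]:
        q.put(item)
    print(drain_until_match(q, n_processes=1))  # (3, 2, 3, 12)
```

    Once a match comes back, the main process would call terminate() on the generator1 process and on each generator2 process, exactly as the demo does after its loop.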
    

    Some final points:

    1. The generator1 process can presumably generate the tuples that are input to the generator2 processes much faster than the generator2 processes can generate results. On that assumption the queue of tuples can potentially grow extremely large and you will run out of memory. Therefore we throttle generator1 by putting a size on how big the tuples queue can grow before generator1 is blocked until a tuple is taken off the queue. I have set that value to be n_processes.

    2. There is a distinction between the number of logical cores returned by the call multiprocessing.cpu_count() and the number of physical cores you actually have. See the question "So what are logical cpu cores (as opposed to physical cpu cores)?". My desktop has 2 logical cores for every physical core. If your multiprocessing task is 100% CPU-bound (i.e. it never waits for I/O or network requests to complete), which is the case in my demo (except for the print statements), then my experience has been that the number of processes should be based on the number of physical cores rather than the number of logical cores. So in my case I probably should have used:

    n_processes = multiprocessing.cpu_count() // 2 - 1
    
    3. When you use the multiprocessing.pool.Pool.map method without specifying a chunksize argument, one is computed from the size of the iterable being passed and the pool size. This is because when the iterable is large, as in your case, fewer but larger task queue puts and gets are more efficient and can reduce the running time. The code above essentially uses a chunk size of 1 and may not be the most efficient. But if we do our own chunking, with generator1 writing to the tuples queue a list of chunksize tuples for some appropriate value of chunksize and generator2 iterating over the list it retrieves, then we have:

    from multiprocessing import Process, Queue, cpu_count
    
    N = 4
    CHUNKSIZE = 4 # 4 chunks of size 4
    
    def generator1(tuples_queue: Queue, n_processes: int) -> None:
        chunk = []
        for num1 in range(N):
            for num2 in range(N):
                chunk.append((num1, num2))
                if len(chunk) == CHUNKSIZE:
                    tuples_queue.put(chunk)
                    chunk = []
        # Any small chunk left over?
        if chunk:
            tuples_queue.put(chunk)
    
        # We cannot chunk up the sentinels (1 per generator2 process):
        for _ in range(n_processes):
            # A one-element chunk holding None shows there is no more data:
            tuples_queue.put([None])
    
    def generator2(tuples_queue: Queue, results_queue: Queue) -> None:
        while True:
            chunk = tuples_queue.get()
            if len(chunk) == 1 and chunk[0] is None:
                break
            for num1, num2 in chunk:
                value = num1 ** num2
                for num3 in range(N):
                    result = value + num3
                    # In addition to putting the result, we also for demo
                    # purposes put the 3 input values used to get the result:
                    results_queue.put((num1, num2, num3, result))
    
        results_queue.put(None) # Show no more data from this process
    ... # Rest of code unmodified
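
    The chunk-building logic inside generator1 could also be factored into a small reusable helper. This is only a sketch: `chunked` is a hypothetical name introduced for illustration, and the generator expression stands in for whatever lazily produces your (num1, num2) tuples.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunked(iterable: Iterable, size: int) -> Iterator[list]:
    """Lazily yield lists of up to `size` items; the last list may be shorter."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


N = 4
CHUNKSIZE = 4

if __name__ == '__main__':
    # Same (num1, num2) order as the nested loops in generator1:
    pairs = ((num1, num2) for num1 in range(N) for num2 in range(N))
    for chunk in chunked(pairs, CHUNKSIZE):
        print(chunk)  # generator1 would call tuples_queue.put(chunk) here
```

    Because the helper pulls from the source only as chunks are requested, it also works when the source is effectively endless. On Python 3.12 and later, itertools.batched does much the same job but yields tuples rather than lists.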
    

    I have not chunked the results queue since in your actual situation I would think you are only writing "satisfactory" results.
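
    To illustrate the throttling described in the first point: a queue created with a maximum size makes put() block once it is full. The sketch below uses a timeout so the blocking shows up as a False return value instead of hanging; `try_put` is a hypothetical helper written for this illustration, not something the demo code uses.

```python
import queue
from multiprocessing import Queue


def try_put(q, item, timeout: float = 0.1) -> bool:
    """Try to put an item; return False if the queue stayed full for
    `timeout` seconds (a plain q.put(item) would block instead)."""
    try:
        q.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False


if __name__ == '__main__':
    q = Queue(2)  # bounded, like tuples_queue = Queue(n_processes)
    print([try_put(q, i) for i in range(3)])  # third put finds the queue full
    q.get()                                   # a consumer takes one item off...
    print(try_put(q, 99))                     # ...so there is room again
```

    In the demo, generator1 uses a plain blocking put(), so it simply waits at this point until one of the generator2 processes takes a tuple off the queue.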

    Update

    You said in your original post, "For example, suppose I would like 2 processes to work on arguments (num1_a, ...), 2 of them to work on arguments (num1_b, ...), etc." The following should accomplish this by splitting the work for each num1 value into two tasks. Note that I am also returning to the main process the process id of the process that produced each result, to show which process is working on what, and I have added a call to time.sleep to simulate the processing time generator2 requires to generate a result:

    from multiprocessing import Process, Queue, cpu_count, current_process
    
    N = 4
    
    def generator1(tuples_queue: Queue, n_processes: int) -> None:
        half_N = N // 2
        for num1 in range(N):
            tuples_queue.put((num1, 0, half_N))
            tuples_queue.put((num1, half_N, N))
    
        for _ in range(n_processes):
            # One sentinel per generator2 process to show there is no more data:
            tuples_queue.put(None)
    
    def generator2(tuples_queue: Queue, results_queue: Queue) -> None:
        import time
    
        pid = current_process().pid
        # Loop until tuples_queue.get() returns None:
        for t in iter(tuples_queue.get, None):
            num1, start, end = t
            for num2 in range(start, end):
                value = num1 ** num2
                for num3 in range(N):
                    time.sleep(.1)
                    result = value + num3
                    # In addition to putting the result, we also for demo
                    # purposes put the 3 input values used to get the result:
                    results_queue.put((num1, num2, num3, result, pid))
    
        results_queue.put(None) # Show no more data from this process
    
    
    def main():
        n_processes = cpu_count() - 1
    
        # Queue of finite size because generator1 can generate values
        # faster than generator2 can process them:
        tuples_queue = Queue(n_processes)
        results_queue = Queue()
    
    
        p1 = Process(target=generator1, args=(tuples_queue, n_processes))
        p1.start()
    
        # rest of the processes to work on output from generator1:
        processes = [
            Process(target=generator2, args=(tuples_queue, results_queue))
            for _ in range(n_processes)
        ]
        for p in processes:
            p.start()
    
        # Get the results:
        seen_end_count = 0
        while seen_end_count < n_processes:
            t = results_queue.get()
            if t is None:
                seen_end_count += 1
            else:
                num1, num2, num3, result, pid = t
                # Is this a result we want to terminate on?
                if False:
                    # Got the result we want
                    print(f'num1 = {num1}, num2 = {num2}, num3 = {num3}, result = {result}, pid = {pid}', flush=True)
                    # Terminate:
                    break
                else:
                    # Normally we would not want to display results
                    # we are not interested in, but we do so here for demo purposes:
                    print(f'num1 = {num1}, num2 = {num2}, num3 = {num3}, result = {result}, pid = {pid}', flush=True)
    
        # Kill the processes
        p1.terminate()
        for p in processes:
            p.terminate()
    
    if __name__ == '__main__':
        main()
    

    Prints:

    num1 = 0, num2 = 0, num3 = 0, result = 1, pid = 8184
    num1 = 2, num2 = 2, num3 = 0, result = 4, pid = 10076
    num1 = 0, num2 = 2, num3 = 0, result = 0, pid = 14180
    num1 = 2, num2 = 0, num3 = 0, result = 1, pid = 1492
    num1 = 1, num2 = 2, num3 = 0, result = 1, pid = 16716
    num1 = 1, num2 = 0, num3 = 0, result = 1, pid = 15092
    num1 = 3, num2 = 0, num3 = 0, result = 1, pid = 9456
    num1 = 0, num2 = 0, num3 = 1, result = 2, pid = 8184
    num1 = 0, num2 = 2, num3 = 1, result = 1, pid = 14180
    num1 = 1, num2 = 0, num3 = 1, result = 2, pid = 15092
    num1 = 1, num2 = 2, num3 = 1, result = 2, pid = 16716
    num1 = 2, num2 = 0, num3 = 1, result = 2, pid = 1492
    num1 = 2, num2 = 2, num3 = 1, result = 5, pid = 10076
    num1 = 3, num2 = 0, num3 = 1, result = 2, pid = 9456
    num1 = 0, num2 = 0, num3 = 2, result = 3, pid = 8184
    num1 = 0, num2 = 2, num3 = 2, result = 2, pid = 14180
    num1 = 1, num2 = 2, num3 = 2, result = 3, pid = 16716
    num1 = 2, num2 = 2, num3 = 2, result = 6, pid = 10076
    num1 = 2, num2 = 0, num3 = 2, result = 3, pid = 1492
    num1 = 1, num2 = 0, num3 = 2, result = 3, pid = 15092
    num1 = 3, num2 = 0, num3 = 2, result = 3, pid = 9456
    num1 = 0, num2 = 0, num3 = 3, result = 4, pid = 8184
    num1 = 0, num2 = 2, num3 = 3, result = 3, pid = 14180
    num1 = 1, num2 = 2, num3 = 3, result = 4, pid = 16716
    num1 = 2, num2 = 0, num3 = 3, result = 4, pid = 1492
    num1 = 2, num2 = 2, num3 = 3, result = 7, pid = 10076
    num1 = 1, num2 = 0, num3 = 3, result = 4, pid = 15092
    num1 = 3, num2 = 0, num3 = 3, result = 4, pid = 9456
    num1 = 0, num2 = 1, num3 = 0, result = 0, pid = 8184
    num1 = 0, num2 = 3, num3 = 0, result = 0, pid = 14180
    num1 = 2, num2 = 3, num3 = 0, result = 8, pid = 10076
    num1 = 1, num2 = 1, num3 = 0, result = 1, pid = 15092
    num1 = 1, num2 = 3, num3 = 0, result = 1, pid = 16716
    num1 = 2, num2 = 1, num3 = 0, result = 2, pid = 1492
    num1 = 3, num2 = 1, num3 = 0, result = 3, pid = 9456
    num1 = 0, num2 = 1, num3 = 1, result = 1, pid = 8184
    num1 = 2, num2 = 1, num3 = 1, result = 3, pid = 1492
    num1 = 1, num2 = 3, num3 = 1, result = 2, pid = 16716
    num1 = 1, num2 = 1, num3 = 1, result = 2, pid = 15092
    num1 = 2, num2 = 3, num3 = 1, result = 9, pid = 10076
    num1 = 0, num2 = 3, num3 = 1, result = 1, pid = 14180
    num1 = 3, num2 = 1, num3 = 1, result = 4, pid = 9456
    num1 = 0, num2 = 1, num3 = 2, result = 2, pid = 8184
    num1 = 0, num2 = 3, num3 = 2, result = 2, pid = 14180
    num1 = 2, num2 = 3, num3 = 2, result = 10, pid = 10076
    num1 = 1, num2 = 1, num3 = 2, result = 3, pid = 15092
    num1 = 2, num2 = 1, num3 = 2, result = 4, pid = 1492
    num1 = 1, num2 = 3, num3 = 2, result = 3, pid = 16716
    num1 = 3, num2 = 1, num3 = 2, result = 5, pid = 9456
    num1 = 0, num2 = 1, num3 = 3, result = 3, pid = 8184
    num1 = 2, num2 = 1, num3 = 3, result = 5, pid = 1492
    num1 = 2, num2 = 3, num3 = 3, result = 11, pid = 10076
    num1 = 1, num2 = 3, num3 = 3, result = 4, pid = 16716
    num1 = 0, num2 = 3, num3 = 3, result = 3, pid = 14180
    num1 = 1, num2 = 1, num3 = 3, result = 4, pid = 15092
    num1 = 3, num2 = 1, num3 = 3, result = 6, pid = 9456
    num1 = 3, num2 = 2, num3 = 0, result = 9, pid = 8184
    num1 = 3, num2 = 2, num3 = 1, result = 10, pid = 8184
    num1 = 3, num2 = 2, num3 = 2, result = 11, pid = 8184
    num1 = 3, num2 = 2, num3 = 3, result = 12, pid = 8184
    num1 = 3, num2 = 3, num3 = 0, result = 27, pid = 8184
    num1 = 3, num2 = 3, num3 = 1, result = 28, pid = 8184
    num1 = 3, num2 = 3, num3 = 2, result = 29, pid = 8184
    num1 = 3, num2 = 3, num3 = 3, result = 30, pid = 8184
    

    The output sorted (taken from a different run, hence the different pids):

    num1 = 0, num2 = 0, num3 = 0, result = 1, pid = 14168
    num1 = 0, num2 = 0, num3 = 1, result = 2, pid = 14168
    num1 = 0, num2 = 0, num3 = 2, result = 3, pid = 14168
    num1 = 0, num2 = 0, num3 = 3, result = 4, pid = 14168
    num1 = 0, num2 = 1, num3 = 0, result = 0, pid = 14168
    num1 = 0, num2 = 1, num3 = 1, result = 1, pid = 14168
    num1 = 0, num2 = 1, num3 = 2, result = 2, pid = 14168
    num1 = 0, num2 = 1, num3 = 3, result = 3, pid = 14168
    num1 = 0, num2 = 2, num3 = 0, result = 0, pid = 11408
    num1 = 0, num2 = 2, num3 = 1, result = 1, pid = 11408
    num1 = 0, num2 = 2, num3 = 2, result = 2, pid = 11408
    num1 = 0, num2 = 2, num3 = 3, result = 3, pid = 11408
    num1 = 0, num2 = 3, num3 = 0, result = 0, pid = 11408
    num1 = 0, num2 = 3, num3 = 1, result = 1, pid = 11408
    num1 = 0, num2 = 3, num3 = 2, result = 2, pid = 11408
    num1 = 0, num2 = 3, num3 = 3, result = 3, pid = 11408
    num1 = 1, num2 = 0, num3 = 0, result = 1, pid = 19648
    num1 = 1, num2 = 0, num3 = 1, result = 2, pid = 19648
    num1 = 1, num2 = 0, num3 = 2, result = 3, pid = 19648
    num1 = 1, num2 = 0, num3 = 3, result = 4, pid = 19648
    num1 = 1, num2 = 1, num3 = 0, result = 1, pid = 19648
    num1 = 1, num2 = 1, num3 = 1, result = 2, pid = 19648
    num1 = 1, num2 = 1, num3 = 2, result = 3, pid = 19648
    num1 = 1, num2 = 1, num3 = 3, result = 4, pid = 19648
    num1 = 1, num2 = 2, num3 = 0, result = 1, pid = 10948
    num1 = 1, num2 = 2, num3 = 1, result = 2, pid = 10948
    num1 = 1, num2 = 2, num3 = 2, result = 3, pid = 10948
    num1 = 1, num2 = 2, num3 = 3, result = 4, pid = 10948
    num1 = 1, num2 = 3, num3 = 0, result = 1, pid = 10948
    num1 = 1, num2 = 3, num3 = 1, result = 2, pid = 10948
    num1 = 1, num2 = 3, num3 = 2, result = 3, pid = 10948
    num1 = 1, num2 = 3, num3 = 3, result = 4, pid = 10948
    num1 = 2, num2 = 0, num3 = 0, result = 1, pid = 9208
    num1 = 2, num2 = 0, num3 = 1, result = 2, pid = 9208
    num1 = 2, num2 = 0, num3 = 2, result = 3, pid = 9208
    num1 = 2, num2 = 0, num3 = 3, result = 4, pid = 9208
    num1 = 2, num2 = 1, num3 = 0, result = 2, pid = 9208
    num1 = 2, num2 = 1, num3 = 1, result = 3, pid = 9208
    num1 = 2, num2 = 1, num3 = 2, result = 4, pid = 9208
    num1 = 2, num2 = 1, num3 = 3, result = 5, pid = 9208
    num1 = 2, num2 = 2, num3 = 0, result = 4, pid = 14308
    num1 = 2, num2 = 2, num3 = 1, result = 5, pid = 14308
    num1 = 2, num2 = 2, num3 = 2, result = 6, pid = 14308
    num1 = 2, num2 = 2, num3 = 3, result = 7, pid = 14308
    num1 = 2, num2 = 3, num3 = 0, result = 8, pid = 14308
    num1 = 2, num2 = 3, num3 = 1, result = 9, pid = 14308
    num1 = 2, num2 = 3, num3 = 2, result = 10, pid = 14308
    num1 = 2, num2 = 3, num3 = 3, result = 11, pid = 14308
    num1 = 3, num2 = 0, num3 = 0, result = 1, pid = 18624
    num1 = 3, num2 = 0, num3 = 1, result = 2, pid = 18624
    num1 = 3, num2 = 0, num3 = 2, result = 3, pid = 18624
    num1 = 3, num2 = 0, num3 = 3, result = 4, pid = 18624
    num1 = 3, num2 = 1, num3 = 0, result = 3, pid = 18624
    num1 = 3, num2 = 1, num3 = 1, result = 4, pid = 18624
    num1 = 3, num2 = 1, num3 = 2, result = 5, pid = 18624
    num1 = 3, num2 = 1, num3 = 3, result = 6, pid = 18624
    num1 = 3, num2 = 2, num3 = 0, result = 9, pid = 11408
    num1 = 3, num2 = 2, num3 = 1, result = 10, pid = 11408
    num1 = 3, num2 = 2, num3 = 2, result = 11, pid = 11408
    num1 = 3, num2 = 2, num3 = 3, result = 12, pid = 11408
    num1 = 3, num2 = 3, num3 = 0, result = 27, pid = 11408
    num1 = 3, num2 = 3, num3 = 1, result = 28, pid = 11408
    num1 = 3, num2 = 3, num3 = 2, result = 29, pid = 11408
    num1 = 3, num2 = 3, num3 = 3, result = 30, pid = 11408
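
    The update hard-codes two tasks per num1 value via half_N. If you wanted k tasks per num1 instead, the split could be generalized along the following lines. This is a sketch under my own assumptions: `split_range` and `make_tasks` are hypothetical helpers, not part of the code above.

```python
from typing import Iterator, List, Tuple


def split_range(n: int, parts: int) -> List[Tuple[int, int]]:
    """Split range(n) into up to `parts` contiguous (start, end) slices
    whose sizes differ by at most one; empty slices are dropped."""
    if parts < 1:
        raise ValueError('parts must be at least 1')
    base, extra = divmod(n, parts)
    bounds = []
    start = 0
    for i in range(parts):
        end = start + base + (1 if i < extra else 0)
        if end > start:
            bounds.append((start, end))
        start = end
    return bounds


def make_tasks(n: int, parts: int) -> Iterator[Tuple[int, int, int]]:
    """Yield the (num1, start, end) tuples generator1 would put on the queue."""
    for num1 in range(n):
        for start, end in split_range(n, parts):
            yield num1, start, end


if __name__ == '__main__':
    print(split_range(4, 2))       # [(0, 2), (2, 4)], same as the half_N split
    print(list(make_tasks(2, 2)))  # [(0, 0, 1), (0, 1, 2), (1, 0, 1), (1, 1, 2)]
```

    Note that because all tasks go through one shared queue, this divides the work for each num1 into k tasks but does not pin specific processes to a given num1; whichever process is free takes the next task, as the pids in the output above show.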