Python multiprocessing takes too long to complete after main process has finished


I'm trying to familiarize myself with multiprocessing, so I wrote a brute-force search to find a string.

The script works as expected, and because it uses an iterator, RAM usage stays low. What I don't understand is what happens after the "password" has been found. The script always takes about twice as long to exit as it does to find the password (in this example, 70 s to find the password and 160 s to complete), whereas, as far as I understand, the only thing left to do at that point is terminate all the processes.

Is that what is happening, or is something else going on?

import itertools
import multiprocessing as mp
import string
import time

# start timer
tStart = time.time()

userPass = 'mypass'

def getPassword(passList):
    str = ''.join(passList)
    if userPass == str:
        print('\n')
        print('~~~ CRACKED ~~~')
        print('User password is {}'.format(str))
        print('Cracked password in {:.3f} seconds'.format(time.time() - tStart))

if __name__ == '__main__':
    # possible characters used in password
    chars = list(string.ascii_lowercase)    

    # get all character combinations
    allPasswords = itertools.product(chars, repeat=len(userPass))

    # calculate optimum chunk number
    totalComb = len(chars) ** len(userPass)
    nChunks = int(max(1, divmod(totalComb, mp.cpu_count() * 4)[0]))

    with mp.Pool(processes=mp.cpu_count()) as pool:
        for result in pool.imap_unordered(getPassword, allPasswords, chunksize=nChunks):
            if result == userPass:
                pool.terminate()
                break
            del result # trying to reduce memory usage
    tEnd = time.time()
    tElapsed = tEnd - tStart

    print('Total elapsed time {:.3f} seconds'.format(tElapsed))

Solution

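  • Make your getPassword function return the string (at least in the success case). As written, it always returns the default None, so result == userPass is never true, pool.terminate() is never executed, and the loop keeps consuming results until every remaining combination has been checked. That is the extra time you see after the password is printed.

    Also, you might want to use a much smaller chunksize.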
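
A minimal sketch of both suggestions, not a drop-in replacement for the script above. The names check_candidate and crack, the chunksize of 1000, and the short demo target 'abc' are all assumptions made for illustration. As far as I know, a worker hands back the results of a chunk only once the whole chunk is done, so with very large chunks the parent may learn about the match late; a smaller chunksize should shorten that delay at the cost of some extra inter-process traffic.

```python
import itertools
import multiprocessing as mp
import string
from functools import partial


def check_candidate(candidate, target):
    """Return the joined candidate if it equals target, otherwise None."""
    guess = ''.join(candidate)
    return guess if guess == target else None


def crack(target, chunksize=1000):
    """Search all lowercase strings of len(target); return the match or None.

    chunksize=1000 is an illustrative value, not a tuned one.
    """
    candidates = itertools.product(string.ascii_lowercase, repeat=len(target))
    # partial() binds the target so the worker needs no global variable
    worker = partial(check_candidate, target=target)
    with mp.Pool() as pool:
        for result in pool.imap_unordered(worker, candidates, chunksize=chunksize):
            if result is not None:
                # stop the workers right away instead of draining the iterator
                pool.terminate()
                return result
    return None


if __name__ == '__main__':
    # short target so the demo finishes quickly
    print(crack('abc'))
```

Because the worker now returns the matching string, the loop in the parent can tell a hit from a miss and leave as soon as one arrives. Leaving the with block also terminates the pool, so the explicit pool.terminate() is only there to make the intent obvious.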