Tags: python, numpy, python-multithreading, concurrent.futures

Why is ThreadPoolExecutor slower than a for loop?


Code 1

def feedforward(self,d):
    out = []
    for neuron in self.layer:
        out.append(neuron.feedforward(d))
    return np.array(out)

This is the original code I wrote to perform a feedforward pass. I wanted to speed up execution with multithreading, so I edited the code to use ThreadPoolExecutor from the concurrent.futures module.

Code 2

def parallel_feedforward(self,func,param):
    return func.feedforward(param)

def feedforward(self,d):
    out = []
    with ThreadPoolExecutor(max_workers = 4) as executor:
        new_d = np.tile(d,(len(self.layer),1))
        for o in executor.map(self.parallel_feedforward,self.layer,new_d):
            out.append(o)
    return np.array(out)

The variable d is a vector. I used np.tile() to repeat it once per neuron, so that executor.map receives one copy of the input for each element of self.layer.

After timing both versions, I found that Code 1 is significantly faster than Code 2 (Code 2 is almost 8-10 times slower). But shouldn't the multithreaded code be faster than its loop counterpart? Is it because the code I've written is wrong, or is it because of something else? If it is because of a mistake in my code, can someone tell me what I did wrong?

Thanks for your help in advance.


Solution

  • Hari,

    read up on Python threads and the GIL (global interpreter lock). In standard CPython, only one thread executes Python bytecode at a time, so Python "threads" do not run Python code in parallel. If your function above is CPU-bound, it won't actually run faster with threads as you have them.
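
A minimal sketch of the effect described above, assuming a standard (GIL-enabled) CPython build. The function `busy_sum` is a hypothetical CPU-bound task made up for illustration, not the neuron code from the question. The script times the same work run serially and through a thread pool; the exact numbers depend on the machine, so none are quoted here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n):
    """Hypothetical CPU-bound task: pure-Python arithmetic that holds the GIL."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_serial(tasks):
    # Plain loop: one task after another in the calling thread.
    return [busy_sum(n) for n in tasks]


def run_threaded(tasks, workers=4):
    # Same tasks, distributed over a pool of threads.
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(busy_sum, tasks))


def timed(fn, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


tasks = [200_000] * 8
serial_result, serial_time = timed(run_serial, tasks)
threaded_result, threaded_time = timed(run_threaded, tasks)
print(f"serial:   {serial_time:.3f} s")
print(f"threaded: {threaded_time:.3f} s")
```

Both versions return the same values. With the GIL in place, the threaded run should not be expected to beat the serial one for work like this.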

    To really run in parallel, use ProcessPoolExecutor instead. Its workers are separate processes, each with its own interpreter and its own GIL, so the limitation that applies to threads does not apply to them.
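
One way the process-based version could look is sketched below. The `Neuron` class here is a hypothetical stand-in (weighted sum plus sigmoid), since the question does not show the real one, and `feed_chunk`, `feedforward_processes`, and `n_chunks` are names invented for this example. Two choices are assumptions rather than requirements: neurons are sent in chunks so that each task carries more than one tiny call, and the pool is created once by the caller instead of on every feedforward.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


class Neuron:
    """Hypothetical stand-in for the question's neuron class."""

    def __init__(self, weights, bias):
        self.weights = np.asarray(weights, dtype=float)
        self.bias = float(bias)

    def feedforward(self, d):
        # Weighted sum followed by a sigmoid activation.
        return 1.0 / (1.0 + np.exp(-(self.weights @ d + self.bias)))


def feed_chunk(args):
    """Run a whole chunk of neurons in one task to amortize per-task cost."""
    neurons, d = args
    return [neuron.feedforward(d) for neuron in neurons]


def feedforward_serial(layer, d):
    return np.array([neuron.feedforward(d) for neuron in layer])


def feedforward_processes(layer, d, executor, n_chunks=4):
    # Contiguous slices keep the results in the original order.
    size = max(1, -(-len(layer) // n_chunks))  # ceiling division
    chunks = [(layer[i:i + size], d) for i in range(0, len(layer), size)]
    out = []
    for part in executor.map(feed_chunk, chunks):
        out.extend(part)
    return np.array(out)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer = [Neuron(rng.normal(size=16), rng.normal()) for _ in range(64)]
    d = rng.normal(size=16)
    # Create the pool once and reuse it; starting processes is expensive.
    with ProcessPoolExecutor(max_workers=4) as executor:
        parallel = feedforward_processes(layer, d, executor)
    print(np.allclose(parallel, feedforward_serial(layer, d)))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module. Whether this beats the plain loop depends on how expensive each neuron is; for small layers the cost of shipping work to other processes can still dominate.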


    As to why it might run 8-10 times slower: with ThreadPoolExecutor your arguments are not pickled, since threads share memory. What you pay for on every call to feedforward is starting and shutting down the pool, building the np.tile copy, and pushing each neuron through a queue and a Future, all while the threads take turns holding the GIL. If a single neuron.feedforward call is cheap, that overhead can easily outweigh the work itself.

    Pickling becomes the issue once you move to ProcessPoolExecutor: your arguments are pickled to pass to the worker processes, which un-pickle them before use, and the results come back the same way. If you have arguments that are non-trivial in size, this can take a large amount of time.

    I have seen a huge slowdown in my own code because I tried to pass large arguments to workers.
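
To get a rough feel for the serialization cost, which applies when arguments cross a process boundary as with ProcessPoolExecutor, the sketch below times a pickle round trip for a small and a large NumPy array. The helper `pickle_round_trip` and the array sizes are made up for illustration. It measures only `pickle.dumps` and `pickle.loads`, not the transfer between processes, so treat it as a lower bound.

```python
import pickle
import time

import numpy as np


def pickle_round_trip(obj):
    """Serialize and deserialize obj; return the copy, payload size, and elapsed time."""
    start = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    copy = pickle.loads(payload)
    return copy, len(payload), time.perf_counter() - start


small = np.zeros(10)
large = np.zeros(1_000_000)  # about 8 MB of float64 values

for name, arr in [("small", small), ("large", large)]:
    copy, n_bytes, elapsed = pickle_round_trip(arr)
    print(f"{name}: {n_bytes} bytes, {elapsed * 1e3:.3f} ms")
```

If the same large array is sent with every task, this cost is paid once per task, which is one reason to send fewer, larger tasks.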