Search code examples
pythonmultithreadingperformancesortingmergesort

Sorting seems to be slower with 2 threads instead of 1


I'm beginning with threads in Python, and trying to implement a merge sort where, at the beginning, the job gets splitted into 2 threads. I'm using collections.deque, itertools.islice, threading.Thread.

I create two threads at the beginning, they do each half the job normally, then I join them and merge the result. But I noticed that it takes way longer (almost 2 twice as long) with two threads than it does when I do it normally.

How can this be possible ? Here is a link to the code, I can reproduce the main parts here if needed (I posted that question on Code Review SE too, and I'd rather keep this one short)

Is it linked to this question (seems to be a similar problem in C++) ? Thank you very much.


Solution

  • How can this be possible?

    Unlike C++, Python is quite difficult to parallelize because of the GIL.

    While collections.deque's append and popleft are thread-safe, this does not guarantee that they'll perform well in a non-serial paradigm.

    Is it linked to this question?

    No. The GIL is a property of CPython. It is totally disjoint from false sharing.

    It takes way longer (almost 2 twice as long) with two threads than it does when I do it normally.

    This is because the GIL doesn't support shared memory multithreading. As such, you're essentially running your code serially twice.