python, python-multiprocessing, python-multithreading

Comparison between threading module and multiprocessing module


I am trying to find out whether threading or multiprocessing is faster. In theory, because of the GIL, multiprocessing should be faster than multithreading, since only one thread runs at a time. But I am getting the opposite result: threading takes less time than multiprocessing. What am I missing?

Below is the threading code:

import threading
from queue import Queue
import time

print_lock = threading.Lock()

def exampleJob(worker):                            # function simulating some work
    time.sleep(10)
    with print_lock:
        print(threading.current_thread().name, worker)

def threader():                                    # function where a thread picks up a job
    while True:
        worker = q.get()
        exampleJob(worker)
        q.task_done()

q = Queue()

for x in range(4):
    t = threading.Thread(target=threader)
    print(x)
    t.daemon = True  # daemon threads exit when the main thread does
    t.start()

start = time.time()

for worker in range(8):
    q.put(worker)

q.join()  # block until every queued job has been marked done

print('Entire job took:', time.time() - start)

Below is the multiprocessing code:

import multiprocessing as mp
import time

def exampleJob(print_lock,worker):                 # function simulating some computation
    time.sleep(10)
    with print_lock:
        print(mp.current_process().name,worker)

def processor(print_lock,q):                       # function where a process picks up a job
    while True:
        worker = q.get()
        if worker is None: # flag to exit the process
            break
        exampleJob(print_lock,worker)


if __name__ == '__main__':

    print_lock = mp.Lock()
    q = mp.Queue()
    processes = [mp.Process(target=processor,args=(print_lock,q)) for _ in range(4)]

    for process in processes:
        process.start()    

    start = time.time()
    for worker in range(8):
        q.put(worker)

    for process in processes:
        q.put(None) # quit indicator

    for process in processes:
        process.join()

    print('Entire job took:',time.time() - start)


Solution

  • Adding to @zmbq's answer: threading will be slower only when the task is computationally intensive, because the GIL lets only one thread execute Python bytecode at a time. If your operations are I/O-bound, or simply wait as time.sleep does in your code (sleeping releases the GIL), threading will usually be at least as fast, since threads have less startup and communication overhead than processes. Please refer to the following blog post for a better understanding:

    Exploiting Multiprocessing and Multithreading in Python as a Data Scientist
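
    To see the difference described above, here is a minimal sketch that times the same pool-based harness on a sleeping job and on a pure-Python loop. It uses concurrent.futures rather than the hand-written queues from the question; the names io_bound, cpu_bound and timed, along with the job sizes, are made up for illustration. The timings include pool startup on purpose, since that overhead is part of the comparison.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def io_bound(_):
    """Stand-in for I/O: sleeping releases the GIL, like a blocking read would."""
    time.sleep(0.2)
    return True


def cpu_bound(n):
    """Stand-in for computation: a pure-Python loop holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(executor_cls, func, args, workers=4):
    """Run func over args in a pool and return (elapsed_seconds, results)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(func, args))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # Sizes are arbitrary; raise them if the CPU-bound runs finish too quickly.
    for label, func, args in [
        ("I/O-bound", io_bound, [None] * 8),
        ("CPU-bound", cpu_bound, [500_000] * 8),
    ]:
        for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
            elapsed, _ = timed(executor_cls, func, args)
            print(f"{label:10s} {executor_cls.__name__:20s} {elapsed:.2f}s")
```

    On a multi-core machine you would typically expect the two I/O-bound rows to be close, with threads slightly ahead, and the CPU-bound row to favour processes once the work is large enough to outweigh process startup. Exact numbers depend on the machine and Python build.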