I am trying to compare whether threading or multiprocessing is faster. In theory, because of the GIL, multiprocessing should be faster than multithreading, since only one thread runs at a time. But I am getting the opposite result: threading takes less time than multiprocessing. What am I missing? Please help.
Below is the threading code:
import threading
from queue import Queue
import time

print_lock = threading.Lock()

def exampleJob(worker):
    time.sleep(10)
    with print_lock:
        print(threading.current_thread().name, worker)

def threader():
    while True:
        worker = q.get()
        exampleJob(worker)
        q.task_done()

q = Queue()
for x in range(4):
    t = threading.Thread(target=threader)
    print(x)
    t.daemon = True
    t.start()

start = time.time()
for worker in range(8):
    q.put(worker)
q.join()
print('Entire job took:', time.time() - start)
Below is the multiprocessing code:
import multiprocessing as mp
import time

def exampleJob(print_lock, worker):  # function simulating some computation
    time.sleep(10)
    with print_lock:
        print(mp.current_process().name, worker)

def processor(print_lock, q):  # function where a process picks up the job
    while True:
        worker = q.get()
        if worker is None:  # flag to exit the process
            break
        exampleJob(print_lock, worker)

if __name__ == '__main__':
    print_lock = mp.Lock()
    q = mp.Queue()
    processes = [mp.Process(target=processor, args=(print_lock, q)) for _ in range(4)]
    for process in processes:
        process.start()
    start = time.time()
    for worker in range(8):
        q.put(worker)
    for process in processes:
        q.put(None)  # quit indicator
    for process in processes:
        process.join()
    print('Entire job took:', time.time() - start)
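Adding to @zmbq's answer: threading is slower only for computationally intensive tasks, because the GIL lets just one thread execute Python bytecode at a time. If your work is I/O-bound, or otherwise spends most of its time waiting, threading will usually be faster because threads carry less overhead than processes. Your example falls into the second category: time.sleep(10) releases the GIL, so all four threads wait concurrently, and the multiprocessing version only adds the cost of starting processes and passing items between them. Please refer to the following blog post for a better understanding: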
Exploiting Multiprocessing and Multithreading in Python as a Data Scientist
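To see the difference for yourself, here is a minimal sketch that times the same pool of four workers on a waiting task and on a pure-Python computation. It uses concurrent.futures instead of the queues from the question, and the names io_bound_task, cpu_bound_task and timed_run are made up for this illustration. It assumes a standard CPython build with the GIL; the exact numbers will vary by machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def io_bound_task(seconds):
    """Simulate waiting on I/O; time.sleep releases the GIL while it waits."""
    time.sleep(seconds)
    return seconds


def cpu_bound_task(n):
    """Pure-Python arithmetic; on a GIL build only one thread runs this at a time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_run(executor_cls, task, args, workers=4):
    """Run task over args in a pool and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(task, args))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical workload sizes, chosen only to keep the demo short.
    workloads = {
        "I/O-bound (sleep)": (io_bound_task, [0.5] * 8),
        "CPU-bound (loop)": (cpu_bound_task, [2_000_000] * 8),
    }
    for label, (task, args) in workloads.items():
        for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
            _, elapsed = timed_run(executor_cls, task, args)
            print(f"{label:18} {executor_cls.__name__:20} {elapsed:.2f}s")
```

On a multi-core machine you should typically see the two pools finish the sleep workload in about the same time, with processes slightly behind because of startup cost, while the CPU-bound loop is where the process pool tends to pull ahead.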