Search code examples
pythonpython-3.xpython-multithreading

How do I use the threading module in python to handle simultaneous http requests?


I have some code that makes two calls to an api for each item it processes. It works as intended, but it takes 1-3 seconds per item. In an attempt to speed it up, I tried to use the threading module to be performing 10 requests at a time, but it seems to be behaving the same way it was before I added the threading. The time it takes to process the data from the api is ~0.2 milliseconds per call, so that should not be causing the holdup.

Here is the relevant portion of my code:

import threading

.
.
.

def secondary():
    global queue
    item = queue.pop()
    queue|=func1(item)# func1 returns data from an api using the requests module
    with open('data/'+item,'w+') as f:
        f.write(func2(item))# func2 also returns data from an api using the requests module
    global num_procs
    num_procs-=1
    
def primary():
    t=[]# threads
    global num_procs
    num_procs+=min(len(queue),10-num_procs)
    for i in range(min(len(queue),10-num_procs)):
        t+=[threading.Thread(target=secondary)]
    for i in t:
        i.start()
        i.join()

queue = {'initial_data'}
num_procs=0# number of currently running processes - when it reaches 10, stop creating new ones
while num_procs or len(queue):
    primary()

What do I need to do to make it run concurrently? I would rather use threading, but if asynchronous is better, how do I implement that?


Solution

  • Immediately after starting each thread, you wait for the thread to finish:

    for i in t:
        i.start()
        i.join()
    

    The threads never get a chance to execute in parallel. Instead, only wait for the threads to finish after you've started all of them.