Python - Multiprocessing and Threads for IO


Hello, I am trying to come up with an example that takes non-async code using threads and converts it to something that uses both processes and threads.

My goal: spawn 4 processes, and have each process spawn 10 threads at the same time.

import requests
import multiprocessing
from concurrent import futures


def poll_data_1(data):
    response = requests.get('https://breadcrumbscollector.tech/feed/')
    print(f'Got data of length: {len(response.content)} in just {response.elapsed}')


def thread_set(data):
    max_workers = 10
    concurrent = futures.ThreadPoolExecutor(max_workers)
    with concurrent as ex:
        ex.map(poll_data_1, data)


data = range(40)
data1 = []
for l in data:
    data1.append([l])

# Multiprocessing
with multiprocessing.Pool(processes=4, maxtasksperchild=1) as pool:
    pool.imap_unordered(thread_set, data1)     
    pool.close()
    pool.join()

This code "works", but it looks like it only runs 1 process at a time: 10 threads run, then 10 more. My goal is to run all 40 threads at once.

The reason I am trying to do this is that my real application makes 8,000-14,000 IO-bound requests, and threading alone is not scaling that high. If my real server could open one process per CPU, with each process spawning 1,000 threads, I think it would work better.

Or I'm super wrong... Thanks!


Solution

  • imap_unordered returns a lazy iterator immediately, so you need a loop over it to block the main thread from closing the pool until all the jobs are finished.

    Replace

    pool.imap_unordered(thread_set, data1)
    

    With

    for result in pool.imap_unordered(thread_set, data1):
        pass
    

    And then run your example again.

    Also you don't need:

    pool.close()
    pool.join()
    

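    because the loop only ends once every result has come back. Note that leaving the with block calls pool.terminate() rather than close() and join(), so the loop is what guarantees the jobs finish.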
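
To tie the fix back to the stated goal of 4 processes with 10 threads each, here is a minimal sketch rather than a definitive implementation. It assumes the 40 items are first grouped into 4 chunks of 10, so that each process receives enough work to keep 10 threads busy (in the question, every task is a one-element list). The names fetch, split, PROCESSES and THREADS_PER_PROCESS are hypothetical, and fetch sleeps instead of calling requests.get so the example runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Pool

PROCESSES = 4              # assumed: one worker process per chunk
THREADS_PER_PROCESS = 10   # assumed: threads started inside each process


def fetch(item):
    """Hypothetical stand-in for requests.get(); sleeps to mimic I/O latency."""
    time.sleep(0.1)
    return item


def thread_set(chunk):
    """Run one chunk of items on a thread pool inside the current process."""
    with ThreadPoolExecutor(max_workers=THREADS_PER_PROCESS) as ex:
        # list() consumes the map, so all results are collected in input order.
        return list(ex.map(fetch, chunk))


def split(items, size):
    """Split items into consecutive chunks of at most `size` elements."""
    items = list(items)
    return [items[i:i + size] for i in range(0, len(items), size)]


def main():
    chunks = split(range(40), THREADS_PER_PROCESS)  # 4 chunks of 10 items
    with Pool(processes=PROCESSES) as pool:
        # Iterating the lazy result iterator blocks until every chunk is done.
        for result in pool.imap_unordered(thread_set, chunks):
            print(f'chunk finished: {len(result)} items')


if __name__ == '__main__':
    main()
```

With this grouping, each of the 4 processes gets one chunk and runs its 10 items on 10 threads, so all 40 calls can be in flight together. The `if __name__ == '__main__':` guard is there because platforms that start workers with spawn re-import the main module.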