I want to send HTTP requests to 10,000 hosts, and I have 5 different header values. I want to send 10,000 requests with the 1st header value attached, wait for them to finish, and then start the second batch of 10,000 requests with the 2nd header value attached. Within each batch of 10,000 requests I want to use multithreading to speed up the process.
To just send requests to 10,000 hosts with multithreading, I can use the following code:
with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as executor:
    args = ((url, header) for url in urls)
    executor.map(lambda p: send_request(*p), args)
To implement what I eventually want, I am thinking of wrapping the above code this way:
for header in headers:
    with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as executor:
        args = ((url, header) for url in urls)
        executor.map(lambda p: send_request(*p), args)
But it does not seem to work. It runs the loop but seems to skip the whole loop body. I wonder what I did wrong, and what the correct way to do it is.
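You aren't capturing the return value of map and waiting for the results. Consuming those results is the sync point you need so that "all of header 1 finishes before header 2 starts". Note that map returns a lazy iterator over the results (not Future objects), and an exception raised inside send_request only surfaces when you iterate over it.
There are a number of ways to do this, such as using submit together with as_completed, but in your case all you need to do is evaluate the results, discarding them, to force the wait: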
for header in headers:
    with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as executor:
        args = ((url, header) for url in urls)
        futures = executor.map(lambda p: send_request(*p), args)
        list(futures)
In this case, list will iterate through the iterator returned by map (named futures here), waiting for every call to finish. That creates the sync point, after which the loop continues with the next header.
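For comparison, the as_completed route mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: send_request is a stand-in stub rather than your real HTTP call, the urls, headers, and CONNECTIONS values are placeholders, and run_batches is a hypothetical helper name.

```python
import concurrent.futures

CONNECTIONS = 8  # assumed pool size, tune for your workload
urls = [f"http://host{i}.example" for i in range(20)]  # placeholder hosts
headers = ["h1", "h2", "h3"]  # placeholder header values


def send_request(url, header):
    # Stand-in for the real HTTP call; it just echoes its arguments.
    return (url, header)


def run_batches(urls, headers, max_workers=CONNECTIONS):
    """Run one full batch per header and return (header, results) pairs."""
    results = []
    for header in headers:
        batch = []
        with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
            # submit() returns Future objects, which as_completed() can wait on.
            futures = [executor.submit(send_request, url, header) for url in urls]
            for future in concurrent.futures.as_completed(futures):
                # result() re-raises any exception from the worker thread.
                batch.append(future.result())
        # The whole batch for this header is done before the next one starts.
        results.append((header, batch))
    return results


if __name__ == "__main__":
    for header, batch in run_batches(urls, headers):
        print(header, len(batch))
```

Within a batch the results arrive in completion order, not in the order of urls, so keep the url in the return value (as the stub does) if you need to match responses to hosts.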