I am trying to simulate multiple user requests in parallel on a Flask server to measure its response times. I thought multiprocessing would be the right module for it. Here is my code:
import multiprocessing as mp
import requests
import datetime
from multiprocessing.pool import Pool

HOST = 'http://127.0.0.1:5000'
API_PATH = '/'
ENDPOINT = [HOST + API_PATH]
MAX_Processes = 10

def send_api_request(ENDPOINT):
    r = requests.get(ENDPOINT)
    print(mp.current_process())
    statuscode = r.status_code
    elapsedtime = r.elapsed
    return statuscode, elapsedtime

def main():
    with Pool() as pool:
        try:
            # define poolsize
            pool = mp.Pool(mp.cpu_count())
            print(pool)
            results = pool.map(send_api_request, ENDPOINT)
            print(results)
        except KeyboardInterrupt:
            pool.terminate()

if __name__ == '__main__':
    main()
When I run this code in the CLI, I only get one result printed, and I don't know whether the 8 processes are actually being used. Here is the output:
<multiprocessing.pool.Pool state=RUN pool_size=8>
<SpawnProcess name='SpawnPoolWorker-10' parent=19536 started daemon>
200 0:00:00.013491
The goal is to run 100 or more requests in parallel on the Flask server, get the response time of every single request, and put them in a CSV file.
Does anyone know how I can get every result from the processes?
Thank you!
You only get one result because your ENDPOINT list has length 1, so pool.map dispatches only one task. You need to set it like ENDPOINT = [HOST + API_PATH] * 8 (where 8 is the number of requests).
Also, multithreading is more suitable in your case, because you are only sending GET requests, which are I/O bound.
Here is an example using the thread pool approach:
import requests
import datetime
from concurrent.futures import ThreadPoolExecutor

HOST = 'http://127.0.0.1:5000'
API_PATH = '/'
ENDPOINT = [HOST + API_PATH] * 8

def send_api_request(endpoint):
    r = requests.get(endpoint)
    statuscode = r.status_code
    elapsedtime = r.elapsed
    return statuscode, elapsedtime

def main():
    with ThreadPoolExecutor(max_workers=8) as pool:
        iterator = pool.map(send_api_request, ENDPOINT)
        for result in iterator:
            print(result)

if __name__ == '__main__':
    main()
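And here is a sketch of how you could scale this to your stated goal of 100 requests written to a CSV file. To keep it self-contained (and stdlib-only), it spins up a throwaway local ThreadingHTTPServer as a stand-in for your Flask app and times each request with time.perf_counter instead of requests' r.elapsed; the file name results.csv and worker count are assumptions you can change:

```python
import csv
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

NUM_REQUESTS = 100
CSV_PATH = 'results.csv'  # assumed output file name

class Handler(BaseHTTPRequestHandler):
    """Throwaway stand-in for the Flask server; replace with your real HOST."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'ok')

    def log_message(self, *args):
        pass  # silence per-request logging

def send_api_request(url):
    # Time one GET request with a wall-clock timer
    start = time.perf_counter()
    with urllib.request.urlopen(url) as r:
        status = r.status
        r.read()
    return status, time.perf_counter() - start

def run_benchmark():
    # Port 0 asks the OS to pick any free port
    server = ThreadingHTTPServer(('127.0.0.1', 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f'http://127.0.0.1:{server.server_port}/'

    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(send_api_request, [url] * NUM_REQUESTS))
    server.shutdown()

    # One CSV row per request: status code and elapsed seconds
    with open(CSV_PATH, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['status_code', 'elapsed_seconds'])
        writer.writerows(results)
    return results

if __name__ == '__main__':
    rows = run_benchmark()
    print(f'wrote {len(rows)} rows to {CSV_PATH}')
```

Against your real server you would drop the Handler/ThreadingHTTPServer part and point the URL at HOST + API_PATH.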
EDIT:
As an answer to OP's question in the comments: a POST request is also an I/O-bound operation. Yes, Python threads do not offer parallelism (only one thread executes Python bytecode at a time), but they do offer concurrency: another thread can work on a task while your main thread is blocked on an I/O operation like time.sleep, a GET/POST request, or reading/writing a file.
The GIL is bad when you want to do a CPU-bound operation, but it is not bad for I/O-bound operations, because the CPython interpreter releases the GIL when you perform an I/O-bound operation.
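You can see this GIL-release behavior in a tiny demo: time.sleep (standing in for waiting on a GET/POST response) releases the GIL, so two threads sleeping 0.5 s each finish in roughly 0.5 s total rather than 1.0 s:

```python
import threading
import time

def io_bound_task():
    time.sleep(0.5)  # stands in for blocking on an HTTP response

start = time.perf_counter()
threads = [threading.Thread(target=io_bound_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The two sleeps overlap because the GIL is released while sleeping
print(f'two I/O-bound threads took {elapsed:.2f} s')
```

If io_bound_task were a CPU-bound loop instead, the two threads would take roughly twice as long, because only one of them can hold the GIL at a time.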
Let's say you want to send 4 requests to a web server. How does this work in Python? Let's examine the first two steps:
The first thread creates a socket object and asks the kernel, via the socket syscall: "hey kernel, connect to address 127.0.0.1 on port 5000, send this GET request, and give me the response." Python also releases the GIL before the socket system call (this is important).
The kernel accepts your request and handles the operation, and it also blocks your thread (unless you tell the kernel not to block your thread/process). After blocking, the kernel does a context switch to let another thread or process run; let's say it switches to the second thread.
The second thread now tries to run. First, it checks the status of the Global Interpreter Lock and sees that the GIL has been released, so it can proceed. This second thread does the same thing as the first one: it releases the GIL and asks the kernel, "hey kernel, connect to address 127.0.0.1 on port 5000, send this GET request, and give me the response."
Yes, these steps are sequential, but they are very fast, so you will dispatch your GET/POST requests quickly.
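The steps above can be sketched at the socket level. This is only an illustration: the throwaway local ThreadingHTTPServer stands in for the Flask server on 127.0.0.1:5000, and the hand-written GET request shows what the thread asks the kernel to send before it blocks in recv() (where CPython releases the GIL):

```python
import socket
import threading
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    """Minimal stand-in server so the example is self-contained."""
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'hello')

    def log_message(self, *args):
        pass  # silence per-request logging

server = ThreadingHTTPServer(('127.0.0.1', 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Step 1: create a socket object and ask the kernel to connect.
s = socket.create_connection(('127.0.0.1', server.server_port))

# Step 2: send the GET request, then block waiting for the response.
# While blocked in recv(), the GIL is released, so another thread
# could be doing the same thing concurrently.
s.sendall(b'GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n')
response = b''
while True:
    chunk = s.recv(4096)
    if not chunk:  # server closed the connection: response complete
        break
    response += chunk
s.close()
server.shutdown()

print(response.split(b'\r\n')[0])  # the HTTP status line
```

requests does essentially this for you under the hood, which is why wrapping requests.get calls in threads overlaps their waiting time.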
When you use threads, you will not wait for the result of the first request before sending the second one, because threads offer concurrency.
You can also check this answer
Please correct me if I'm wrong somewhere.