python, python-requests, python-multiprocessing

Simulating multiple user requests on a webserver with Python multiprocessing


I am trying to simulate multiple user requests in parallel on a Flask server to measure its response times. I thought multiprocessing would be the right module for it. Here is my code:

import multiprocessing as mp
import requests
import datetime
from multiprocessing.pool import Pool

HOST = 'http://127.0.0.1:5000'
API_PATH = '/'
ENDPOINT = [HOST + API_PATH]
MAX_Processes = 10

def send_api_request(ENDPOINT):
    r = requests.get(ENDPOINT)
    print(mp.current_process())
    statuscode = r.status_code
    elapsedtime = r.elapsed
    return statuscode, elapsedtime

def main():
    with Pool() as pool: 
        try:
            #define poolsize
            pool = mp.Pool(mp.cpu_count())
            print(pool)
            results= pool.map(send_api_request, ENDPOINT)
            print(results)
        except KeyboardInterrupt:
            pool.terminate()

if __name__ == '__main__':
    main()

When I run this code in the CLI, I only get one result printed, and I don't know whether the 8 processes are actually being used. Here is the output:

<multiprocessing.pool.Pool state=RUN pool_size=8>
<SpawnProcess name='SpawnPoolWorker-10' parent=19536 started daemon>
200 0:00:00.013491

The target is to run 100 or more requests in parallel against the Flask server, get the response time of every single request, and put them into a CSV sheet.

Does anyone know how I can get every result from the processes?

Thank you!


Solution

  • You only get one result because your ENDPOINT list has a length of 1, so you need to set it like ENDPOINT = [HOST + API_PATH] * 8 (where 8 is the number of requests).
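
    If you want to stay with multiprocessing, a minimal sketch of the original code with just that change applied (and with the pool created only once, via the with statement, instead of building a second one inside it) might look like this; it assumes the same local Flask server is running:

    import multiprocessing as mp
    import requests

    HOST = 'http://127.0.0.1:5000'
    API_PATH = '/'
    # one entry per request, so pool.map() produces one result per request
    ENDPOINT = [HOST + API_PATH] * 8

    def send_api_request(endpoint):
        r = requests.get(endpoint)
        return r.status_code, r.elapsed

    def main():
        # reuse the pool from the with statement; it is cleaned up automatically
        with mp.Pool(mp.cpu_count()) as pool:
            results = pool.map(send_api_request, ENDPOINT)
        print(results)

    if __name__ == '__main__':
        main()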

    Also, multithreading is more suitable in your case, because you are only sending GET requests, which are I/O bound.

    Here is an example using the thread-pool approach:

    import requests
    from concurrent.futures import ThreadPoolExecutor

    HOST = 'http://127.0.0.1:5000'
    API_PATH = '/'
    ENDPOINT = [HOST + API_PATH] * 8

    def send_api_request(endpoint):
        # send the GET request and return its status code and elapsed time
        r = requests.get(endpoint)
        statuscode = r.status_code
        elapsedtime = r.elapsed
        return statuscode, elapsedtime

    def main():
        # map() submits one task per endpoint; leaving the with block
        # waits for all of them to finish
        with ThreadPoolExecutor(max_workers=8) as pool:
            iterator = pool.map(send_api_request, ENDPOINT)

        for result in iterator:
            print(result)

    if __name__ == '__main__':
        main()
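
    Toward the original goal of firing 100 or more requests and collecting every response time in a CSV file, a sketch along the same lines could look like the following; NUM_REQUESTS and the results.csv filename are just placeholders:

    import csv
    import requests
    from concurrent.futures import ThreadPoolExecutor

    HOST = 'http://127.0.0.1:5000'
    API_PATH = '/'
    NUM_REQUESTS = 100
    ENDPOINT = [HOST + API_PATH] * NUM_REQUESTS

    def send_api_request(endpoint):
        r = requests.get(endpoint)
        return r.status_code, r.elapsed

    def main():
        # one thread per request so all of them are in flight at the same time
        with ThreadPoolExecutor(max_workers=NUM_REQUESTS) as pool:
            results = list(pool.map(send_api_request, ENDPOINT))

        # one row per request: status code and elapsed time in seconds
        with open('results.csv', 'w', newline='') as f:
            writer = csv.writer(f)
            writer.writerow(['status_code', 'elapsed_seconds'])
            for statuscode, elapsedtime in results:
                writer.writerow([statuscode, elapsedtime.total_seconds()])

    if __name__ == '__main__':
        main()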
    

    EDIT:

    In answer to OP's question in the comments:

    A POST request is also an I/O-bound operation. Yes, Python threads do not offer parallelism and run sequentially, but they do offer concurrency. You can work on another task while your main thread is blocked on some I/O operation (like time.sleep, a GET/POST request, or writing/reading a file).

    The GIL is bad when you want to do CPU-bound work, but it is not bad when you want to do I/O-bound work, because the GIL is released by the CPython interpreter while you perform an I/O-bound operation.
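
    A quick way to see this is a small, made-up timing sketch: four threads each block on time.sleep(1) (standing in for a blocking GET request), and because the GIL is released while they are blocked, the total wall time is roughly 1 second rather than 4.

    import time
    from concurrent.futures import ThreadPoolExecutor

    def io_bound_task(_):
        # stands in for a blocking I/O call such as requests.get();
        # the GIL is released while the thread is blocked here
        time.sleep(1)

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(io_bound_task, range(4)))
    print(f'4 blocking tasks took {time.perf_counter() - start:.2f}s')  # ~1s, not ~4s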

    Let's say you want to send 4 requests to a web server. How would this work in Python? Let's examine the first 2 steps:

    The first thread will create a socket object and ask the kernel via the socket syscalls: "hey kernel, connect to the 127.0.0.1 address on port 5000, send this GET request, and also give me the response to this GET request". Python will also release the GIL before the socket system call. (This is important.)

    The kernel will accept your request and handle this operation, and it will also block you (unless you tell the kernel not to block your thread/process). After blocking, the kernel will do a context switch to let another thread or process run; let's say it switches to the second thread.

    The second thread will try to run. First, it will check the status of the Global Interpreter Lock and see that the GIL has been released. Good, the second thread can run now, and it does the same thing as the first one: it will release the GIL and ask the kernel, "hey kernel, connect to the 127.0.0.1 address on port 5000, send this GET request, and also give me the response to this GET request".

    Yes, these steps are sequential, but they are very fast, so you will get your GET/POST requests sent quickly.

    You will not wait for the result of the first request to send the second request when you use threads, because they offer concurrency.
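
    For illustration, here is a rough sketch of what that "hey kernel" conversation looks like at the socket level for a single GET request against the same local Flask server. requests does much more than this; the sketch is only meant to show where the blocking system calls sit:

    import socket

    # connect() syscall: ask the kernel for a TCP connection to 127.0.0.1:5000
    sock = socket.create_connection(('127.0.0.1', 5000))
    # send() syscall: hand the raw GET request to the kernel
    sock.sendall(b'GET / HTTP/1.1\r\nHost: 127.0.0.1\r\nConnection: close\r\n\r\n')

    response = b''
    while True:
        chunk = sock.recv(4096)  # recv() blocks here; the GIL is released meanwhile
        if not chunk:
            break
        response += chunk
    sock.close()

    print(response.split(b'\r\n')[0])  # status line, e.g. b'HTTP/1.1 200 OK'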

    You can also check this answer

    Please correct me if I'm wrong somewhere.