Tags: android, python, multithreading, websocket, tornado

Python multithreaded server and asynchronous websocket communication with Android clients


I have an Android client app that sends data to a Python server, which is supposed to run a long, time-consuming computation and return the result to the client.

To do so, I initially used Flask on the server side and an asynchronous Android HTTP library on the client side to send the data via HTTP POST. However, I quickly noticed that this was not the way to go: the computation on the server takes so long that the client runs into problems such as timeout errors.

Then I switched to Tornado's WebSockets on the server side and an Android WebSocket library on the client side. The first main problem is that while the server is running the time-consuming operation for one client, all other clients have to wait, and it is painful to make Tornado work in a multi-threaded setting, since it is designed to be single-threaded. A second, minor problem is that if the client goes offline while the server is processing its request, it may never receive the result when it reconnects.

Therefore, I would like to ask for solutions or recommendations: what should I use to build an asynchronous Python server that runs heavy CPU computations on data from one client without making the other clients wait their turn, and that ideally lets a client retrieve its result when it reconnects?


Solution

  • First of all, if you're going to do CPU-heavy operations in your backend, you [most probably] need to run them in a separate process, not in a thread/coroutine/etc. The reason is that Python is limited to executing one thread at a time (you may read more about the GIL). Doing CPU-heavy work in multiple threads gives your backend some availability, but hurts overall performance.

    1. The simple/old solution is to run your backend in multiple processes (and, preferably, threads), i.e. deploy your Flask app with gunicorn and give it several worker processes; see the sketch after this list. This way you'll have a system capable of running number_of_processes - 1 heavy computations while still staying available to handle requests. The limit for processes is usually around cpu_cores * 2, depending on the CPU architecture.

    2. Slightly more complicated:

      • accept data
      • run heavy function in different process
      • gather the result and return it

      A great interface for this is ProcessPoolExecutor (see the Tornado sketch after this list). The drawback is that failures and hung processes are harder to handle.

    3. Another option is a task queue plus workers. One of the most widely used is Celery (see the sketch after this list). The idea is to:

      • open WS connection
      • put task in queue
      • a worker (in a different process, or even on a different physical node) eventually picks up the task, computes it, and puts the result in some DB
      • the main process gets a callback, or long-polls the result DB
      • the main process sends the result over the WS

      This is better suited to really heavy, non-real-time tasks, but gives you out-of-the-box handling of failures, restarts, etc. Because results are persisted in the result backend, a client that reconnects later can still fetch its result.
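
A minimal sketch of option 1, assuming Flask and gunicorn; `heavy_compute`, the route, and the payload format are placeholders for your real code:

```python
# app.py -- minimal Flask app meant to be run under gunicorn
from flask import Flask, jsonify, request

app = Flask(__name__)

def heavy_compute(numbers):
    # Stand-in for the real CPU-bound computation
    return sum(x * x for x in numbers)

@app.route("/compute", methods=["POST"])
def compute():
    data = request.get_json()
    # Blocks only the worker process that handles this request
    result = heavy_compute(data["numbers"])
    return jsonify({"result": result})

# Deploy with several worker processes, e.g.:
#   gunicorn --workers 4 app:app
# With 4 workers, up to 3 can be busy with heavy computations
# while the rest keep handling incoming requests.
```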
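For option 2, here is a sketch of the accept/offload/return flow with Tornado and ProcessPoolExecutor; `heavy_compute` and the JSON message format are assumptions:

```python
import json
from concurrent.futures import ProcessPoolExecutor

import tornado.ioloop
import tornado.web
import tornado.websocket

executor = ProcessPoolExecutor(max_workers=4)

def heavy_compute(numbers):
    # Must be a module-level (picklable) function; it runs in a worker
    # process, so it does not hold the main process's GIL.
    return sum(x * x for x in numbers)

class ComputeHandler(tornado.websocket.WebSocketHandler):
    async def on_message(self, message):
        numbers = json.loads(message)
        # Offload to the pool; the IOLoop keeps serving other clients
        result = await tornado.ioloop.IOLoop.current().run_in_executor(
            executor, heavy_compute, numbers)
        self.write_message(json.dumps({"result": result}))

if __name__ == "__main__":
    app = tornado.web.Application([(r"/ws", ComputeHandler)])
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```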
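For option 3, a Celery sketch, assuming Redis as both broker and result backend (the URLs and `heavy_compute` are illustrative):

```python
# tasks.py -- start a worker with:  celery -A tasks worker --loglevel=info
from celery import Celery

celery_app = Celery(
    "tasks",
    broker="redis://localhost:6379/0",   # assumed broker
    backend="redis://localhost:6379/0",  # results are persisted here
)

@celery_app.task
def heavy_compute(numbers):
    # Runs in a Celery worker process, possibly on another machine
    return sum(x * x for x in numbers)
```

The WebSocket server then only enqueues the task and stores the task id per client, so the result can be delivered even after a reconnect:

```python
from tasks import celery_app, heavy_compute

# On an incoming WS message: enqueue and remember the task id
async_result = heavy_compute.delay([1, 2, 3])
task_id = async_result.id  # persist this, keyed by client id

# Later -- possibly after the client has reconnected:
res = celery_app.AsyncResult(task_id)
if res.ready():
    payload = res.get()  # send this back over the WebSocket
```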