python multithreading python-asyncio python-anyio

Asyncio async_wrap to convert sync functions to async in Python. How does it work?


I have a huge chunk of legacy sync code (one large function that calls other sync functions, makes sync HTTP API calls with the requests library, and so on). This function runs as part of a queue worker, which picks up tasks from the queue and executes the function. I was facing performance issues, and moving to async functions seemed like a way out.

To save time and avoid converting every single function to async def, I used anyio.to_thread.run_sync. Performance looked great until it broke midway (it works 9 times out of 10) without any errors being thrown. The queue worker times out with no errors caught.

I'm planning to move to the plain asyncio approach described at https://dev.to/0xbf/turn-sync-function-to-async-python-tips-58nn (although I don't expect much, since anyio already uses asyncio as a backend).

But help me understand this. How does converting a sync function to run in a thread speed up my worker? Will the GIL not block the main thread while the new thread runs to completion? Or perhaps, when the second thread is idle doing I/O, it gets context-switched with the main thread?

As for my concrete problem, I believe it's happening because of the increasing number of total threads in the system. Is it fair to assume so, given that I've assigned just 0.25 vCPU to the worker container? Also, are threads CPU-intensive or memory-intensive? In the context of Python (GIL), it doesn't make sense to have threads for a CPU-intensive workload, right? Does that mean more threads implies more load on memory and not on CPU per se?


Solution

  • A thread is not inherently CPU-heavy or RAM-heavy; it depends on what it does. If the slow part of your old code was waiting for requests to respond, then it is neither: it is IO-heavy.

    Indeed, async can be a solution to performance problems when they are IO-related. But threading can be too.

    The blog link you shared shows how to wrap a sync task in an async one, such that you can (mostly transparently) run them concurrently.

    I always recommend Real Python's tutorial on concurrency.

    The requests in your task do not magically become asynchronous; it's just the task that appears that way. You could do exactly the same with threading.

    I doubt the speedup you observed was caused by true concurrency achieved thanks to anyio.to_thread.run_sync, as its documentation simply says:

    Call the given function with the given arguments in a worker thread.

    I think you are confusing many things together, and fell victim to the async hype. But async is not a miracle solution. You may need to rewrite parts of the code to speed it up. Or not. It depends on exactly what takes time and why it is slow. Without more details on what the task does (and exactly how), I can't tell you how to speed things up.
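
To make the wrapping idea concrete, here is a minimal sketch of the general pattern rather than the exact code from the blog post or from anyio. It assumes the slow part is a blocking wait, simulated by a hypothetical `fake_request` that sleeps instead of calling `requests`. The names `async_wrap`, `fake_request`, `run_with_asyncio`, `run_with_threads` and `timed` are all illustrative. A blocking wait such as `time.sleep` or a socket read releases the GIL, which is why several such calls can overlap in threads; a CPU-bound function would not benefit in the same way.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial, wraps


def async_wrap(func):
    """Return an async function that runs the blocking `func` in a worker thread."""
    @wraps(func)
    async def run(*args, **kwargs):
        loop = asyncio.get_running_loop()
        # Passing None selects the event loop's default thread pool.
        return await loop.run_in_executor(None, partial(func, *args, **kwargs))
    return run


def fake_request(task_id, delay=0.2):
    """Hypothetical stand-in for a blocking HTTP call made with requests."""
    time.sleep(delay)  # blocking wait; the GIL is released while sleeping
    return task_id


async def run_with_asyncio(n):
    # The calls themselves stay synchronous; each one just runs in its own thread.
    wrapped = async_wrap(fake_request)
    return await asyncio.gather(*(wrapped(i) for i in range(n)))


def run_with_threads(n):
    # The same overlap of waiting time, without any async code.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(fake_request, range(n)))


def timed(fn):
    """Call fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 5
    _, sequential = timed(lambda: [fake_request(i) for i in range(n)])
    _, via_asyncio = timed(lambda: asyncio.run(run_with_asyncio(n)))
    _, via_threads = timed(lambda: run_with_threads(n))
    print(f"sequential:        {sequential:.2f}s")
    print(f"asyncio + threads: {via_asyncio:.2f}s")
    print(f"thread pool only:  {via_threads:.2f}s")
```

On a typical machine the two concurrent variants should finish in roughly the time of a single call, while the sequential loop takes about `n` times longer. Both variants still create one thread per in-flight call, so the number of threads grows with the number of concurrent tasks.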