Search code examples
pythonconcurrencyhttprequestpython-asyncio

is it really good idea to use asyncio with files


I've been reading and watching videos a lot about asyncio in python, but there's something I can't wrap my head around.

I understand that asyncio is great when dealing with databases or http requests, because database management systems and http servers can handle multiple rapid requests, but...

How can it be good for files, writing in particular..

Let's discuss this case with this code that is running on a server as fastAPI webApp.

@app.get('/end_point')
async function ()
    do_something()
    await write_to_a_file
    do_more_stuff_and_return

Now, as a server, it may get like 50, 100, or even more rapid requests that will trigger this function, yes await will pause the next instruction inside function() , but if there are multiple requests coming this function will start so many times and execute multiple write attempt at once to the file, which may corrupt the data inside.

That's why I can't really understand how it is recommended to use asyncio for files.

Am I getting something wrong here?
Is there a way to instruct the event_loop to give this higher priority to that function event or some another way to avoid this?

What I'm doing now is delaying writing files until I have X amount of data to write [write in bursts], but I still think that any function that does interaction with files shouldn't be async function.

But then, I don't know how am I going to deal with tons of requests....
Your thoughts will be highly appreciated as I'm having a hard time with this.

thanks, have a good day.

I'm trying to write data to file, but the server will surly receive bursts of requests.
So in a way I don't know how to handle this situation, and I fear the corruption of data.


Solution

  • I'm trying to write data to file, but the server will surly receive bursts of requests. So in a way I don't know how to handle this situation, and I fear the corruption of data.

    Yes, corruption could happen from time to time especially if you use aiofiles. If I'm not wrong it uses threads to achieve this.

    I think you have two options here:

    1. Using lock and background tasks: in FastApi(since you mentioned) you can do the writing in the background (after sending the respond to the user), and there, acquire a lock object for writing to the file.

    2. Queue the messages for writing: instead of directly writing to the file, put them on a queue and let another task/process get do the writing part.

    In both cases user won't wait for writing to be finished and get the response immediately and the writing part will happen in sync.