Tags: python, python-asyncio, aiohttp, python-aiofiles

How to limit the number of concurrent read / write with aiofiles?


My program concurrently downloads about 10 million pieces of data with aiohttp and then writes the data to about 4000 files on disk.

I use the aiofiles library because I want my program to also do other stuff when it is reading/writing a file.

But I worry that if the program tries to write to all 4000 files at the same time, the hard disk won't be able to keep up with the writes.

Is it possible to limit the number of concurrent writes with aiofiles (or another library)? Does aiofiles already do this?

Thanks.

test code:

import aiofiles
import asyncio


async def write_to_disk(fname):
    async with aiofiles.open(fname, "w+") as f:
        await f.write("asdf")


async def main():
    tasks = [asyncio.create_task(write_to_disk("%d.txt" % i)) 
             for i in range(10)]
    await asyncio.gather(*tasks)


asyncio.run(main())

Solution

  • You can use asyncio.Semaphore to limit the number of concurrent tasks. Simply force your write_to_disk function to acquire the semaphore before writing:

    import aiofiles
    import asyncio
    
    
    async def write_to_disk(fname, sema):
        # Open the file first, then acquire the semaphore, so that only the write is limited
        async with aiofiles.open(fname, "w+") as f, sema:
            print("Writing", fname)
            await f.write("asdf")
            print("Done writing", fname)
    
    
    async def main():
        sema = asyncio.Semaphore(3)  # Allow 3 concurrent writers
        tasks = [asyncio.create_task(write_to_disk("%d.txt" % i, sema)) for i in range(10)]
        await asyncio.gather(*tasks)
    
    
    asyncio.run(main())
    

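    Note the two changes: the sema = asyncio.Semaphore(3) line in main, and the added sema in the async with statement. Because the file is opened before the semaphore is acquired, all of the files may be open at once; only the writes are limited to three at a time. Listing sema before aiofiles.open(...) in the async with would limit the number of open files as well.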

    Output (the exact order can vary between runs):

    """
    Writing 1.txt
    Writing 0.txt
    Writing 2.txt
    Done writing 1.txt
    Done writing 0.txt
    Done writing 2.txt
    Writing 3.txt
    Writing 4.txt
    Writing 5.txt
    Done writing 3.txt
    Done writing 4.txt
    Done writing 5.txt
    Writing 6.txt
    Writing 7.txt
    Writing 8.txt
    Done writing 6.txt
    Done writing 7.txt
    Done writing 8.txt
    Writing 9.txt
    Done writing 9.txt
    """
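
To check that the semaphore really caps the number of simultaneous writers, here is a minimal sketch that records the peak number of tasks inside the guarded section. It is an illustration under assumptions, not part of the answer above: asyncio.sleep stands in for the aiofiles write so the example needs only the standard library, and the names fake_write, run and state are hypothetical.

```python
import asyncio


async def fake_write(sema, state):
    # Stand-in for the aiofiles write; asyncio.sleep simulates slow I/O.
    async with sema:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)
        state["active"] -= 1


async def run(n_tasks=10, limit=3):
    sema = asyncio.Semaphore(limit)  # Allow `limit` concurrent writers
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(fake_write(sema, state) for _ in range(n_tasks)))
    return state["peak"]


peak = asyncio.run(run())
print("Peak concurrent writers:", peak)
```

With 10 tasks and a limit of 3, the recorded peak never exceeds the semaphore's value. The same counter could be placed around the real f.write call to measure an actual workload.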