
How to limit memory usage when requesting big data files?


I want to download a file from my MinIO server:

response = client.get_object(bucket_name, object_name, version_id=version_id)
res_data: str = response.data.decode('utf8')  # reads the entire object into memory

When I call the decode method, all of the data is loaded into memory, and if the file is too large (e.g. >4 GB), the Python process crashes.

So how can I limit memory usage when requesting big data files?


Solution

  • When dealing with large files, it is important to read the data in chunks instead of loading it all into memory at once.

    For example:

    response = client.get_object(bucket_name, object_name, version_id=version_id)
    try:
        # Define a chunk size (for example, 1 MB)
        chunk_size = 1024 * 1024
        # Process the object in chunks so only one chunk is held in memory at a time
        with open(local_path, 'wb') as file:  # local_path: destination file on disk
            for chunk in response.stream(chunk_size):
                file.write(chunk)
    finally:
        # Release the connection back to the pool
        response.close()
        response.release_conn()
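
    If the goal is to process the object as text rather than save raw bytes, note that decoding each chunk on its own can fail when a multi-byte UTF-8 character is split across two chunks. One way around this is the incremental decoder from Python's standard codecs module. The sketch below assumes the same client, bucket_name, object_name and version_id as above, and process() is a hypothetical handler for the decoded text:

    import codecs

    response = client.get_object(bucket_name, object_name, version_id=version_id)
    decoder = codecs.getincrementaldecoder('utf-8')()
    try:
        for chunk in response.stream(1024 * 1024):
            # The incremental decoder buffers partial multi-byte sequences
            # between chunks, so each call yields only complete characters.
            text = decoder.decode(chunk)
            process(text)  # hypothetical handler for the decoded text
        tail = decoder.decode(b'', final=True)  # flush any buffered bytes
        if tail:
            process(tail)
    finally:
        response.close()
        response.release_conn()

    If the object only needs to end up on disk, MinIO's client.fget_object(bucket_name, object_name, file_path) downloads it straight to a file without holding the whole object in memory.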