Data consistency for a large S3 file when using aioboto3 / boto3

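import aioboto3

async def download(bucket, key):
  session = aioboto3.Session()
  async with session.client("s3") as s3:
    response = await s3.get_object(Bucket=bucket, Key=key)  # avoid shadowing the builtin `object`
    body = response["Body"]
    for i in range(1000):
      process_data(await body.read(i + 100))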

I need to download a large S3 file with aioboto3. Am I correct to assume that the get_object call only creates the connection, while the read calls on the body are what actually read the data from S3?

If a new version of the object is uploaded while I'm still downloading, will the remaining read calls keep reading from the version I started with?

Solution

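Yes. get_object sends the GET request and returns as soon as S3 has answered with the response headers (ContentLength, ETag and so on). The Body in the response is a stream over that still-open HTTP response, so the object's bytes are only pulled off the connection as you call await body.read().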
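To make the split between "request" and "read" concrete, here is a minimal sketch of reading the body in fixed-size chunks until it is exhausted. The helper name stream_object, the handle_chunk callback (standing in for your process_data), the 1 MiB chunk size and the bucket/key names in main are all assumptions for illustration; the only library calls used are get_object and the awaitable read on the Body.

```python
CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


async def stream_object(s3, bucket, key, handle_chunk, chunk_size=CHUNK_SIZE):
    """Read one object over a single GET response, chunk by chunk.

    `s3` is an already-open aioboto3 S3 client. `handle_chunk` is a
    hypothetical callback standing in for process_data.
    Returns the total number of bytes read.
    """
    # Sends the GET request; returns once the response headers have arrived.
    response = await s3.get_object(Bucket=bucket, Key=key)
    body = response["Body"]
    total = 0
    while True:
        # Each call pulls up to chunk_size bytes from the open response.
        chunk = await body.read(chunk_size)
        if not chunk:  # empty bytes means the stream is exhausted
            break
        handle_chunk(chunk)
        total += len(chunk)
    return total


async def main():
    # Imported here so the helper above has no hard dependency on aioboto3.
    import aioboto3

    session = aioboto3.Session()
    async with session.client("s3") as s3:
        # "my-bucket" and "big/file.bin" are placeholder names.
        size = await stream_object(s3, "my-bucket", "big/file.bin", print)
        print(f"read {size} bytes")
```

A read may return fewer bytes than requested, so the loop stops on an empty result rather than on a short one.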

Regarding your second question: a get_object call is a single HTTP GET, and S3 serves that entire response from the version of the object that was current when it handled the request. Writes to a key are atomic, so a GET returns either the old data or the new data, never a mix of the two. If a new version is uploaded while you are still reading, your remaining read calls on the same Body keep returning bytes from the original version.

That guarantee covers one response only. A new get_object call (a retry after a dropped connection, or a separate ranged request) looks the key up again and can return the newer version, unless you pin the request with VersionId on a versioned bucket or with IfMatch and the original ETag.
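If the download is split across several requests, one way to keep them all on the same version is to look the object up once and pass what you learned to every ranged GET. The sketch below assumes that approach; download_pinned, byte_ranges, handle_chunk and the 8 MiB part size are hypothetical names and values, while head_object and the Range, IfMatch and VersionId parameters of get_object are the real boto3 / aioboto3 ones.

```python
def byte_ranges(size, part_size):
    """Return HTTP Range header values that together cover `size` bytes."""
    return [
        f"bytes={start}-{min(start + part_size, size) - 1}"
        for start in range(0, size, part_size)
    ]


async def download_pinned(s3, bucket, key, handle_chunk, part_size=8 * 1024 * 1024):
    """Fetch an object in several ranged GETs that all target one version.

    `s3` is an already-open aioboto3 S3 client. `handle_chunk` is a
    hypothetical callback that receives each part in order.
    Returns the total number of bytes fetched.
    """
    head = await s3.head_object(Bucket=bucket, Key=key)

    # IfMatch makes S3 reject the GET (412 Precondition Failed, raised as a
    # ClientError) if the object's ETag has changed since head_object.
    pin = {"IfMatch": head["ETag"]}
    # VersionId is only returned for versioned buckets; use it when present.
    if head.get("VersionId"):
        pin["VersionId"] = head["VersionId"]

    total = 0
    for byte_range in byte_ranges(head["ContentLength"], part_size):
        response = await s3.get_object(
            Bucket=bucket, Key=key, Range=byte_range, **pin
        )
        part = await response["Body"].read()  # read this part in full
        handle_chunk(part)
        total += len(part)
    return total
```

With this shape, an overwrite in the middle of the download either has no effect (the VersionId still resolves to the old data) or surfaces as an error you can handle, rather than silently mixing bytes from two versions.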