I'm new to Azure and am trying to upload a large number of files (tens of thousands) to Azure Blob Storage using their Python SDK. All the examples I came across on the web open a file before uploading it:
Why is this necessary? I am concerned that this will slow down the upload. Boto3 for AWS S3 doesn't do this. Can you please explain the reason behind it?
To upload data, the Azure Blob Storage client libraries need a file-like object. Opening a file before uploading it does not necessarily slow down the upload: the SDK reads the file-like object in chunks, so even large files are streamed efficiently rather than being loaded into memory all at once.
Code:
with open("./SampleSource.txt", "rb") as data:
    blob.upload_blob(data)
This opens the file as a binary file-like object and uploads its contents to the blob (here, blob is a BlobClient).
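Opening the file just hands the SDK a stream to read from. As an illustration (plain Python, not the actual SDK internals), here is a minimal sketch of how a client can consume a file-like object in fixed-size chunks, so memory use stays bounded by the chunk size rather than the file size:

```python
import io

# Assumed chunk size for this sketch; the real SDK picks its own.
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB

def read_in_chunks(fileobj, chunk_size=CHUNK_SIZE):
    """Yield successive chunks from any file-like object."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Works the same for an in-memory stream as for a file opened from disk.
data = io.BytesIO(b"x" * (10 * 1024 * 1024))  # 10 MiB of sample data
chunks = list(read_in_chunks(data))
print(len(chunks))  # 3 chunks: 4 MiB + 4 MiB + 2 MiB
```

This is why passing an open file object adds essentially no overhead: only one chunk is ever held in memory at a time.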
The same pattern applies when uploading to AWS S3 with Boto3: you can pass a file object to upload_fileobj.
Code:
with open('filename', 'rb') as data:
    s3.upload_fileobj(data, 'mybucket', 'mykey')
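For your tens-of-thousands-of-files case, the per-file open/upload pattern also parallelizes cleanly. A hedged sketch using a thread pool, where upload_one is a hypothetical stand-in for the real per-file upload call:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_one(path):
    # Placeholder for a real upload, which would look something like:
    #   with open(path, "rb") as data:
    #       container_client.upload_blob(name=path, data=data)
    return path

# Hypothetical file list; worker count is an assumption to tune.
paths = [f"file_{i}.txt" for i in range(100)]
with ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(upload_one, paths))
print(len(results))  # 100
```

Uploads are I/O-bound, so threads usually help far more than the cost of opening each file.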
Reference:
Uploading files - Boto3 1.34.64 documentation (amazonaws.com)