I have a Numpy array in memory that represents an image. I want to upload it to AWS S3 using boto3.
I have tried 4 different approaches:
import io
import numpy as np
import boto3
from PIL import Image
# Mock image
np_image = np.zeros((height, width, channels), dtype=np.uint8)
# 1st approach - using io.BytesIO()
image = io.BytesIO(np_image)
# 2nd approach - using PIL.Image
image = Image.fromarray(np_image)
# 3rd approach - using io.BytesIO() and PIL.Image
image = Image.fromarray(np_image)
byte_io = io.BytesIO()
image.save(byte_io, format="jpeg")
# Upload image using boto3
s3 = boto3.client("s3")
s3.upload_fileobj(
    Fileobj=image,
    Bucket=BUCKET_NAME,
    Key=KEY_NAME,
    ExtraArgs={
        "ContentType": "image/jpeg"
    }
)
None of these worked, as I keep getting the error ValueError: Fileobj must implement read.
So far the only thing that's "worked" is the following:
# 4th approach - using tobytes() from numpy.ndarray
io.BytesIO(np_image.tobytes())
With this approach the file is successfully uploaded. However, the file is corrupted: the image cannot be displayed when I open it.
One solution is saving the file to disk and then uploading it, but I believe that's not efficient.
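A quick way to see why the 4th approach produces a corrupted file: np_image.tobytes() returns raw pixel data with no image-format header, so S3 stores it happily but an image viewer has nothing to decode. A minimal sketch of the difference (assuming NumPy and Pillow are installed):

```python
import io

import numpy as np
from PIL import Image

np_image = np.zeros((100, 100, 3), dtype=np.uint8)

# Raw pixel bytes: no file-format signature, so viewers cannot decode them.
raw = np_image.tobytes()  # 100 * 100 * 3 = 30000 bytes of bare pixel data

# Encoded bytes: an actual PNG file, starting with the 8-byte PNG signature.
buf = io.BytesIO()
Image.fromarray(np_image).save(buf, format="PNG")
encoded = buf.getvalue()  # starts with b'\x89PNG\r\n\x1a\n'
```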
You need to both encode the image into a file format and upload that encoded data to S3. Your first attempts fail because a PIL Image is not a file object (it has no read() method), and the 4th approach uploads raw pixel bytes with no encoding, which is why the result is unreadable. One method is to call Image.save() to write the encoded data to an in-memory buffer, then use S3.put_object() to upload that data:
# Create a simple array, with red in the top left corner,
# and blue in the bottom right corner
import numpy
np_array = numpy.zeros((100, 100, 3), numpy.uint8)
np_array[0:40, 0:40] = (255, 0, 0)
np_array[60:100, 60:100] = (0, 0, 255)
# Generate the image file
from PIL import Image
import io
im = Image.fromarray(np_array)
bits = io.BytesIO()
im.save(bits, "png")
# Upload this file to S3
import boto3
# Seek back to the beginning of the file
bits.seek(0, 0)
s3 = boto3.client('s3')
# Make the file public for this demo, and set the content type
s3.put_object(
    Bucket=example_bucket, Key=example_key,
    Body=bits,
    ACL="public-read", ContentType="image/png",
)