Tags: python, numpy, amazon-s3, boto3

How to create JPEG file from Numpy array and upload it to AWS S3 using boto3?


I have a Numpy array in memory that represents an image. I want to upload it to AWS S3 using boto3.

I have tried 4 different approaches:

import io

import numpy as np
import boto3
from PIL import Image

# Mock image
np_image = np.zeros((height, width, channels), dtype=np.uint8) 

# 1st approach - using io.BytesIO()
image = io.BytesIO(np_image)

# 2nd approach - using PIL.Image
image = Image.fromarray(np_image)

# 3rd approach - using io.BytesIO() and PIL.Image
image = Image.fromarray(np_image)
byte_io = io.BytesIO()
image.save(byte_io, format="jpeg")

# Upload image using boto3
s3.upload_fileobj(
    Fileobj=image,
    Bucket=BUCKET_NAME,
    ExtraArgs={
        "ContentType": "image/jpeg"
    }
)

None of these worked: I get the error ValueError: Fileobj must implement read.

So far the only thing that's "worked" is the following:

# 4th approach - using tobytes() from numpy.ndarray

io.BytesIO(np_image.tobytes())

With this approach the file is uploaded successfully. However, the file is corrupted when I open it.

One solution is to save the file to disk and then upload it, but I believe that's not efficient.


Solution

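  • You need to both create the image file and upload that file to S3. Calling np_image.tobytes() only yields the raw pixel values, not an encoded image file, which is why that upload looks corrupted. One method is to call Image.save() to write the encoded data to an in-memory buffer, then use S3.put_object() to upload that data: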

    # Create a simple array, with red in the top left corner, 
    # and blue in the bottom right corner
    import numpy
    np_array = numpy.zeros((100, 100, 3), numpy.uint8)
    np_array[0:40, 0:40] = (255, 0, 0)
    np_array[60:100, 60:100] = (0, 0, 255)
    
    # Generate the image file
    from PIL import Image
    import io
    im = Image.fromarray(np_array)
    bits = io.BytesIO()
    im.save(bits, "png")
    
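    # Upload this file to S3
    import boto3
    # Placeholder names for this demo; replace with your own bucket and key
    example_bucket = "my-bucket"
    example_key = "image.png"
    # Seek back to the beginning of the buffer so the upload reads from byte 0
    bits.seek(0, 0)
    s3 = boto3.client('s3')
    # Make the file public for this demo, and set the content type
    s3.put_object(
        Bucket=example_bucket, Key=example_key,
        Body=bits,
        ACL="public-read", ContentType="image/png",
    )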
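
Since the question asks for a JPEG and uses upload_fileobj(), the same idea can be sketched for that case. This is a minimal sketch, not a definitive implementation: the helper names array_to_jpeg_buffer and upload_jpeg and the quality value are made up for illustration, and the array is assumed to be (height, width, 3) with dtype uint8, because JPEG has no alpha channel. The buffer that holds the encoded bytes is what gets passed as Fileobj, not the PIL image.

```python
import io

import numpy as np
from PIL import Image


def array_to_jpeg_buffer(np_array, quality=90):
    """Encode an (H, W, 3) uint8 array as JPEG into an in-memory buffer."""
    image = Image.fromarray(np_array)  # mode is inferred from shape and dtype
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)  # rewind so the next read starts at the first byte
    return buffer


def upload_jpeg(s3_client, np_array, bucket, key):
    """Upload the encoded JPEG; s3_client is expected to be boto3.client("s3")."""
    buffer = array_to_jpeg_buffer(np_array)
    s3_client.upload_fileobj(
        Fileobj=buffer,  # a file-like object with read(), unlike a PIL image
        Bucket=bucket,
        Key=key,
        ExtraArgs={"ContentType": "image/jpeg"},
    )


# Mock image: black background with a red square in the top left corner
np_image = np.zeros((100, 100, 3), dtype=np.uint8)
np_image[0:40, 0:40] = (255, 0, 0)
jpeg_buffer = array_to_jpeg_buffer(np_image)
```

With a real client, the call would look something like upload_jpeg(boto3.client("s3"), np_image, BUCKET_NAME, "image.jpg"), where the key name is only an example. boto3 is left out of the sketch so that the encoding step can be checked without AWS credentials.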