Tags: python, matplotlib, amazon-s3, boto3

How to read image file from S3 bucket directly into memory?


I have the following code:

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import boto3
s3 = boto3.resource('s3', region_name='us-east-2')
bucket = s3.Bucket('sentinel-s2-l1c')
object = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
object.download_file('B01.jp2')
img=mpimg.imread('B01.jp2')
plt.imshow(img)
plt.show()  # show() displays the current figure; it does not take the image as an argument

This works, but it first downloads the file into the current directory. Is it possible to read the file and decode it as an image directly in RAM?


Solution

  • I would suggest using the io module to read the file directly into memory, without using a temporary file at all.

    For example:

    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    import numpy as np
    import boto3
    import io
    
    s3 = boto3.resource('s3', region_name='us-east-2')
    bucket = s3.Bucket('sentinel-s2-l1c')
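    obj = bucket.Object('tiles/10/S/DG/2015/12/7/0/B01.jp2')
    
    # download_fileobj writes bytes, so the buffer must be a binary stream
    file_stream = io.BytesIO()
    obj.download_fileobj(file_stream)
    file_stream.seek(0)  # rewind so the reader starts at the first byte
    # a file-like object has no extension, so name the format explicitly
    img = mpimg.imread(file_stream, format='jp2')
    # whatever you need to do
    

    Use io.BytesIO rather than io.StringIO here: image data is binary, and download_fileobj needs a file-like object that accepts bytes. For formats other than PNG, mpimg.imread hands decoding to Pillow, so reading JPEG 2000 this way assumes your Pillow installation was built with JPEG 2000 (OpenJPEG) support.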
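
The buffer handling can be tried without any S3 access. The sketch below is only an illustration: download_to_memory and fake_download are hypothetical names, and fake_download stands in for a real download_fileobj call by writing a few bytes (the 12-byte JPEG 2000 signature) into whatever file-like object it receives. It shows the two details that matter: the buffer has to be binary, and it has to be rewound before anything reads from it.

```python
import io


def download_to_memory(download_fileobj):
    """Run a download_fileobj-style callable against an in-memory buffer.

    `download_fileobj` is any callable that writes bytes into the file-like
    object it is given, for example a bound method such as
    `obj.download_fileobj` (assumed here, not called in this sketch).
    """
    buffer = io.BytesIO()      # binary buffer; io.StringIO only accepts str
    download_fileobj(buffer)   # the writer leaves the position at the end
    buffer.seek(0)             # rewind so the next reader starts at byte 0
    return buffer


def fake_download(fileobj):
    # Hypothetical stand-in for a real download, so the sketch runs offline.
    fileobj.write(b"\x00\x00\x00\x0cjP  \r\n\x87\n")  # JPEG 2000 signature box


stream = download_to_memory(fake_download)
print(len(stream.read()))  # number of bytes now held in memory

# Writing bytes into a text buffer fails, which is why StringIO is unsuitable.
try:
    fake_download(io.StringIO())
except TypeError as exc:
    print("StringIO rejects bytes:", exc)
```

With boto3, the same helper could presumably be called as download_to_memory(obj.download_fileobj), and the returned stream passed on to the image reader.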