Tags: python, amazon-web-services, amazon-s3, aws-lambda, pickle

How to load a pickle file from S3 to use in AWS Lambda?


I am currently trying to load a pickled file from S3 into AWS Lambda and store it in a list (the pickle is a list).

Here is my code:

import pickle
import boto3

s3 = boto3.resource('s3')
with open('oldscreenurls.pkl', 'rb') as data:
    old_list = s3.Bucket("pythonpickles").download_fileobj("oldscreenurls.pkl", data)

I get the following error even though the file exists:

FileNotFoundError: [Errno 2] No such file or directory: 'oldscreenurls.pkl'

Any ideas?


Solution

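  • The FileNotFoundError comes from open(), not from S3: opening a file in 'rb' mode requires the local file to exist already, and nothing has been downloaded yet. As shown in the documentation for download_fileobj, you need to open the local file in binary write mode and save the object to it first. Once the file is downloaded, you can open it for reading and unpickle it.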

    import pickle
    import boto3
    
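    s3 = boto3.resource('s3')
    # On Lambda, /tmp is the only writable directory
    local_path = '/tmp/oldscreenurls.pkl'
    
    # Download the S3 object into a local file opened for binary writing
    with open(local_path, 'wb') as data:
        s3.Bucket("pythonpickles").download_fileobj("oldscreenurls.pkl", data)
    
    # Reopen the file for binary reading and unpickle it
    with open(local_path, 'rb') as data:
        old_list = pickle.load(data)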
    

    download_fileobj takes the name of an object in S3 plus a handle to a local file opened for binary writing, and saves the contents of that object to the file. It returns None, so its result cannot be assigned to old_list directly. There is also a version of this function called download_file that takes a filename instead of an open file handle and handles opening it for you.

    In this case it would probably be better to use S3Client.get_object though, to avoid having to write and then immediately read a file. You could also write to an in-memory BytesIO object, which acts like a file but doesn't actually touch a disk. That would look something like this:

    import pickle
    import boto3
    from io import BytesIO
    
    s3 = boto3.resource('s3')
    with BytesIO() as data:
        s3.Bucket("pythonpickles").download_fileobj("oldscreenurls.pkl", data)
        data.seek(0)    # move back to the beginning after writing
        old_list = pickle.load(data)
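
The answer mentions get_object and download_file without showing them, so here is a rough sketch of both. The helper names, the handler, and the choice of the system temp directory are assumptions made for illustration, not part of the original answer. Each helper takes the boto3 client or bucket as an argument so it can be tried with stand-in objects outside AWS.

```python
import os
import pickle
import tempfile


def load_pickle_with_get_object(s3_client, bucket_name, key):
    """Unpickle an S3 object straight from the get_object response body."""
    response = s3_client.get_object(Bucket=bucket_name, Key=key)
    # response["Body"] is a file-like streaming body; read() returns bytes.
    return pickle.loads(response["Body"].read())


def load_pickle_with_download_file(bucket, key):
    """Download to a temp path with download_file, then unpickle from disk."""
    # Assumption: the temp directory is writable (on Lambda this is /tmp).
    local_path = os.path.join(tempfile.gettempdir(), os.path.basename(key))
    bucket.download_file(key, local_path)  # opens and writes the file itself
    with open(local_path, "rb") as f:
        return pickle.load(f)


def lambda_handler(event, context):
    # Hypothetical handler; boto3 is imported here so the helpers above
    # can be exercised with stand-in objects outside AWS.
    import boto3

    s3_client = boto3.client("s3")
    old_list = load_pickle_with_get_object(
        s3_client, "pythonpickles", "oldscreenurls.pkl"
    )
    return {"url_count": len(old_list)}
```

With a resource-style bucket, the second helper would be called as load_pickle_with_download_file(boto3.resource("s3").Bucket("pythonpickles"), "oldscreenurls.pkl"). The get_object variant never touches the filesystem, which is probably the simpler fit for a small pickle in Lambda.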