Tags: amazon-s3, boto3

How to update metadata on an S3 object larger than 5GB?


I am using the boto3 API to update the S3 metadata on an object.

I am making use of the approach from How to update metadata of an existing object in AWS S3 using python boto3?

My code looks like this:

    s3_object = s3.Object(bucket, key)
    new_metadata = {'foo': 'bar'}
    s3_object.metadata.update(new_metadata)
    s3_object.copy_from(
        CopySource={'Bucket': bucket, 'Key': key},
        Metadata=s3_object.metadata,
        MetadataDirective='REPLACE',
    )

This code fails when the object is larger than 5GB. I get this error:

botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: The specified copy source is larger than the maximum allowable size for a copy source: 5368709120

How does one update the metadata on an object larger than 5GB?


Solution

  • I haven't tried it myself, but it seems you can use the boto3.resource('s3').meta.client.copy method to overwrite the object and specify the new metadata in the ExtraArgs parameter. This method performs a multipart copy under the hood, so it is suitable for objects larger than 5 GB.

    EDIT: I've just tried it, and it works. The only thing to keep in mind is to add the MetadataDirective key to the ExtraArgs dictionary; otherwise you'll encounter the following error:

    botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the CopyObject operation: This copy request is illegal because it is trying to copy an object to itself without changing the object's metadata, storage class, website redirect location or encryption attributes.
    

    Code example:

    import boto3

    s3_resource = boto3.resource("s3")

    bucket, key = "your-bucket", "your-key"

    copy_source = {"Bucket": bucket, "Key": key}

    extra_args = {
        "Metadata": {"foo": "bar"},  # the new user metadata to apply
        "ContentType": "text/csv",
        "MetadataDirective": "REPLACE",  # required when copying an object onto itself
    }

    # copy() switches to multipart copy automatically for large objects,
    # so this also works for objects over 5 GB.
    s3_resource.meta.client.copy(copy_source, bucket, key, ExtraArgs=extra_args)
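    One caveat worth noting: with MetadataDirective set to REPLACE, any user metadata you don't resend is discarded. A small helper (my own untested sketch, not from the answer above; the function name update_metadata is hypothetical) could first read the current metadata with head_object and merge in the new keys before the in-place copy:

    ```python
    def update_metadata(s3_client, bucket, key, new_metadata):
        """Merge new_metadata into the object's existing user metadata,
        then rewrite the object in place via a multipart-capable copy."""
        # Fetch the current user-defined metadata so REPLACE doesn't wipe it.
        head = s3_client.head_object(Bucket=bucket, Key=key)
        merged = {**head["Metadata"], **new_metadata}

        extra_args = {
            "Metadata": merged,
            "MetadataDirective": "REPLACE",  # required when copying onto itself
        }
        # client.copy() falls back to multipart copy above the managed-transfer
        # threshold, so it works for objects larger than 5 GB.
        s3_client.copy({"Bucket": bucket, "Key": key}, bucket, key,
                       ExtraArgs=extra_args)
    ```

    Called as, e.g., update_metadata(boto3.client("s3"), "your-bucket", "your-key", {"foo": "bar"}).
    
    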