python amazon-web-services amazon-s3 aws-glue

Invalid type for parameter Body in Amazon S3


I'm trying to save a Python list, data_issues (of type <class 'list'>), to an Amazon S3 bucket from a Glue job, but I'm getting the error below:

Parameter validation failed:
Invalid type for parameter Body, value: 

[('document_entity_sdi', Exception("For column, int and YES don't match.")), 
    ('account_status_sdi', Py4JJavaError('An error occurred while calling o80.getCatalogSink.\n', JavaObject id=o654)), 
    ('account_transaction_fcs_status', Py4JJavaError('An error occurred while calling o80.getCatalogSink.\n', JavaObject id=o797)),
    ('purchase_order_agreement', Py4JJavaError('An error occurred while calling o80.getCatalogSink.\n', JavaObject id=o21565))], 

type: <class 'list'>, valid types: <class 'bytes'>, <class 'bytearray'>, file-like object

I tried converting the list with bytes(data_issues) as a workaround, but it doesn't work:

import boto3

BUCKET = 'bucket_name'
s3 = boto3.client('s3')
keyid = 'keyID'

print("Uploading S3 object with SSE-KMS")
# bytes() cannot convert a list of tuples, so this call fails
s3.put_object(Bucket=BUCKET,
              Key='encrypt-key',
              Body=bytes(data_issues),
              ServerSideEncryption='aws:kms',
              SSEKMSKeyId=keyid)
print("Saving to S3, Done")

Solution

  • It seems that your data_issues is a list of strings. You can convert it to bytes in several ways. One is to serialize it to JSON first and then encode the result as bytes:

    Body=json.dumps(data_issues).encode()
    

    Update:

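    If the list contains values that json cannot serialize (such as the exception objects shown in your error message), pass default=str so they are converted with str():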

    Body=json.dumps(data_issues, default=str).encode()
    

    Depending on what exactly you want to do, you can also pickle your payload. This lets you reconstruct the original objects later if the data is not suited to JSON, provided every object in the list can be pickled.
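
To make the difference between the two options concrete, here is a minimal sketch that builds both byte payloads locally, without calling S3. The sample data_issues list and the helper names to_json_bytes and to_pickle_bytes are assumptions made for illustration; plain Python exceptions stand in for the Py4JJavaError objects from the question, which may not behave the same way under pickle.

```python
import json
import pickle

# Stand-in for the list from the question: (table name, exception) pairs.
# Assumption: plain exceptions replace Py4JJavaError so this runs without Glue.
data_issues = [
    ("document_entity_sdi", Exception("For column, int and YES don't match.")),
    ("account_status_sdi", RuntimeError("An error occurred while calling o80.getCatalogSink.")),
]


def to_json_bytes(issues):
    """Serialize to UTF-8 JSON bytes; default=str turns each exception into text."""
    return json.dumps(issues, default=str).encode("utf-8")


def to_pickle_bytes(issues):
    """Serialize with pickle; works only if every object in the list is picklable."""
    return pickle.dumps(issues)


json_body = to_json_bytes(data_issues)
pickle_body = to_pickle_bytes(data_issues)

# Both payloads are bytes, one of the types the error message lists for Body.
print(type(json_body).__name__, type(pickle_body).__name__)

# JSON keeps only the text of each exception, and tuples come back as lists.
print(json.loads(json_body.decode("utf-8"))[0])

# Pickle restores the original objects, including the exception type.
restored = pickle.loads(pickle_body)
print(type(restored[1][1]).__name__)
```

Either json_body or pickle_body could then be passed as Body to put_object. JSON is readable by other tools but loses the exception types; pickle keeps them, but it should only be loaded from a source you trust.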