Search code examples
pythonamazon-web-serviceshttpchalice

Parsing raw_body from AWS Chalice multipart/form-data http request


I have the following simple AWS Chalice route:

@app.route('/submit', methods=['POST'],
           content_types=['multipart/form-data'])
def submit():
    request_info = app.current_request.raw_body

    return request_info

I then use a simple form with multipart data, including a docx file upload:

<form enctype="multipart/form-data" method="POST" action="http://localhost:8000/submit">
    <input name='foo' type="text">
    <br>
    <input name="bar" type="file">
    <br>
    <button type='submit'>
    Submit
    </button>
</form>

The raw_body attribute of the request is just the bytes of the http request, and I'm looking for a pre-existing Python library that will let me extract each form field and also write the bytes of the docx file to disk (in this case a tmp folder in AWS Lambda). Is there a library that will take raw_body as an argument and allow me to parse the individual fields so that I don't have to write such a parser myself? Trying to google for this is difficult, since most of the results that come back all have to do with using python to consume web APIs, which is not what I want.


Solution

  • Below is sample lambda code which will take a multipart/form-data and parse it and get the files and get the file type as well.

    import magic
    from io import BytesIO
    import json
    import cgi
    
    def lambda_handler(event, context):
       content_type_obj = event['params']['header']['content-type']
       content_type, property_dict = cgi.parse_header(content_type_obj)
       form_data = cgi.parse_multipart(BytesIO(event['body-json'].decode('base64')), property_dict)
       form_file = form_data['file'][0]
       file_type = magic.from_buffer(form_file, mime=True)
       file_name = "new_file." + file_type.split('/')[-1] or "txt"
       # process your file 
       # file_type will give you mime type of the file like "image/png"
       print file_type
    
       return {'statusCode': 200,
               'body': json.dumps({"status": "success",
                                   "message": "your request for uploading has been accepted."}),
               'headers': {
               'Content-Type': 'application/json',
              }}
    

    for adding magic to lambda refer packaging

    http://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html