Tags: python, amazon-web-services, aws-lambda, serverless, fastapi

Python 3.8 - FastAPI and Serverless (AWS Lambda) - Unable to process files sent to API endpoint


I've been using FastAPI with Serverless on AWS Lambda functions for a couple of months now, and it has worked perfectly.

I'm creating a new API endpoint that requires a single file to be uploaded.

It works perfectly on my local machine, but once I deploy to AWS Lambda, calling the endpoint with the exact same file produces the error below. For now I'm testing through the Swagger UI, and nothing differs between the serverless deployment and my local setup besides the machine the code runs on.

Do you have any idea what is going on?

Versions: Python 3.8, FastAPI 0.54.1

My code:

from fastapi import FastAPI, File, UploadFile
import pandas as pd

app = FastAPI()

@app.post('/process_data_import_quote_file')
def process_data_import_quote_file(file: UploadFile = File(...)):  # same error when declared as bytes instead of UploadFile
    file = file.file.read()
    print(f"file {file}")
    quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()

It fails on the last line.

I tried printing the file contents: the bytes printed on Lambda differ from what I read locally. I'm certain I'm using the same file in both cases, so I don't know what could explain the difference. It's a very basic Excel file, nothing special about it.
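Comparing printed bytes by eye is error-prone. A minimal sketch of a more reliable comparison is to log a short fingerprint of the payload in both environments and compare the two. The helper name `fingerprint` is hypothetical and not part of FastAPI or pandas. The check on the first four bytes relies on .xlsx files being ZIP archives.

```python
import hashlib


def fingerprint(data: bytes) -> dict:
    """Summarise a payload so two copies can be compared byte for byte."""
    return {
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        # .xlsx files are ZIP archives, whose first entry starts with b"PK\x03\x04".
        "starts_like_zip": data[:4] == b"PK\x03\x04",
    }
```

Calling `fingerprint(file.file.read())` locally and on Lambda should give identical dictionaries if the upload arrives intact. A different size or hash would point to the payload being altered before it reaches the endpoint.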

[ERROR] 2020-05-07T14:25:17.878Z    25ff37a5-e313-4db5-8763-1227e8244457    Exception in ASGI application

Traceback (most recent call last):
  File "/var/task/mangum/protocols/http.py", line 39, in run
    await app(self.scope, self.receive, self.send)
  File "/var/task/fastapi/applications.py", line 149, in __call__
    await super().__call__(scope, receive, send)
  File "/var/task/starlette/applications.py", line 102, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/var/task/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/var/task/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/var/task/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/var/task/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/var/task/starlette/routing.py", line 550, in __call__
    await route.handle(scope, receive, send)
  File "/var/task/starlette/routing.py", line 227, in handle
    await self.app(scope, receive, send)
  File "/var/task/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/var/task/fastapi/routing.py", line 196, in app
    raw_response = await run_endpoint_function(
  File "/var/task/fastapi/routing.py", line 150, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/var/task/starlette/concurrency.py", line 34, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/var/lang/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/var/task/app/quote/processing.py", line 100, in process_data_import_quote_file
    quote_number = pd.read_excel(file, sheet_name='Data').iloc[:, 0].dropna()
  File "/var/task/pandas/io/excel/_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/var/task/pandas/io/excel/_base.py", line 821, in __init__
    self._reader = self._engines[engine](self._io)
  File "/var/task/pandas/io/excel/_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "/var/task/pandas/io/excel/_base.py", line 355, in __init__
    self.book = self.load_workbook(BytesIO(filepath_or_buffer))
  File "/var/task/pandas/io/excel/_xlrd.py", line 34, in load_workbook
    return open_workbook(file_contents=data)
  File "/var/task/xlrd/__init__.py", line 115, in open_workbook
    zf = zipfile.ZipFile(timemachine.BYTES_IO(file_contents))
  File "/var/lang/lib/python3.8/zipfile.py", line 1269, in __init__
    self._RealGetContents()
  File "/var/lang/lib/python3.8/zipfile.py", line 1354, in _RealGetContents
    fp.seek(self.start_dir, 0)
ValueError: negative seek value -62703616
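
The traceback ends inside `zipfile`, which reads offsets stored in the archive and seeks to them. When the bytes have been altered in transit, those offsets no longer match the data. One way this can happen (an assumption about the cause, not something the traceback proves) is a layer in front of the application treating the binary upload as UTF-8 text. The sketch below uses hypothetical helper names and only shows that such a round trip changes the bytes of a ZIP archive:

```python
import io
import zipfile


def make_zip() -> bytes:
    """Build a small in-memory ZIP archive containing every byte value."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("data.bin", bytes(range(256)))
    return buf.getvalue()


def mangle_as_text(data: bytes) -> bytes:
    """Simulate a binary payload being decoded as UTF-8 text and re-encoded."""
    # Invalid UTF-8 sequences become U+FFFD, which re-encodes as three bytes.
    return data.decode("utf-8", errors="replace").encode("utf-8")


original = make_zip()
mangled = mangle_as_text(original)
```

Here `mangled` is longer than `original`, so any offset recorded inside the archive now points at the wrong place.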

Solution

  • The problem was caused by AWS API Gateway, not by FastAPI or pandas.

    I had to allow multipart/form-data in API Gateway (as a binary media type), and then change the code to file = BytesIO(file).read() (with BytesIO imported from io) so that the file stream could be read properly.
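
For context on why the API Gateway setting matters: with a binary media type configured, a Lambda proxy integration delivers the request body base64-encoded and sets `isBase64Encoded` to true in the event. An ASGI adapter such as Mangum is expected to decode this before FastAPI sees the request. The sketch below is not Mangum's actual code; `raw_body` is a hypothetical helper that only illustrates the decoding step:

```python
import base64


def raw_body(event: dict) -> bytes:
    """Return the request body of an API Gateway proxy event as raw bytes."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads arrive base64-encoded and decode back to the exact bytes.
        return base64.b64decode(body)
    # Text payloads arrive as a plain string.
    return body.encode("utf-8")
```

If the bytes returned here match the uploaded file exactly, pandas should be able to open the workbook the same way it does locally.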