I'm working on my first FastAPI and Pydantic project and ran into a problem when trying to return a generator from an endpoint. The problem is described below; any input would be much appreciated!
I have an API endpoint that first fetches records from a database, then formats each record using map(). However, when the mapped result is returned from the endpoint, the key-value mapping in the response comes out wrong. Note that I'd like to keep the endpoint's return type as a generator for performance reasons (the data volume is large).
My pseudo-code:
@app.get("/records", response_model=Iterable[RecordModel])
async def get_records() -> Iterable[RecordModel]:
    # queried_records is a generator returned from the database query
    queried_records = get_records_from_database()
    formatted_records = map(lambda record: __format(record), queried_records)
    return formatted_records

async def __format(queried_record: Dict[str, Union[str, HttpUrl]]) -> Union[RecordModel, None]:
    formatted_record = RecordModel(
        key_1=queried_record[key_a],
        key_2=queried_record[key_b],
        key_3=queried_record[key_c],
    )
    return formatted_record
With this, I get the following error when calling the endpoint:
ValueError: [ValueError('dictionary update sequence element #0 has length 3; 2 is required'), TypeError('vars() argument must have __dict__ attribute')]
If I change __format so that it only sets two fields:
async def __format(queried_record: Dict[str, Union[str, HttpUrl]]) -> Union[RecordModel, None]:
    formatted_record = RecordModel(
        key_1=queried_record[key_a],
        key_2=queried_record[key_b],
    )
    return formatted_record
then in Swagger UI I can see that the endpoint runs without an error, but the response body is:
{ key_1: key_2 }
Very strange. I spent quite a while debugging but couldn't sort it out. How can I fix the ValueError above? Many thanks in advance for any input!
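A note on reading the error message, offered as a sketch rather than a confirmed diagnosis: the first half of the message is exactly what Python's built-in dict() raises when it is handed an iterable whose first element has three items instead of a (key, value) pair. A Pydantic model iterates as (field name, value) tuples, so a lazy map of three-field models passed to dict() would fail this way, while a two-field model would silently collapse into a single key-value entry, which resembles the { key_1: key_2 } body above. Whether FastAPI actually falls back to dict() on a map object depends on the version in use, so treat that part as an assumption. The lists below are plain stand-ins for models, not real RecordModel instances.

```python
# Stand-ins for one record each, written as the (field name, value) pairs
# that a Pydantic model yields when iterated. These are NOT real models.
three_fields = [("key_1", "a"), ("key_2", "b"), ("key_3", "c")]
two_fields = [("key_1", "a"), ("key_2", "b")]


def try_dict(records):
    """Pass a lazy map of records to dict() and report what happens."""
    lazy = map(lambda record: record, records)  # same shape as in the question
    try:
        return dict(lazy)
    except ValueError as exc:
        return str(exc)


# One record with three fields: dict() sees an element of length 3 and fails.
result_three = try_dict([three_fields])

# One record with two fields: dict() treats the two pairs as key and value.
result_two = try_dict([two_fields])

print(result_three)
print(result_two)
```

If this is what happens in the endpoint, materialising the map into a list before returning it would give the encoder a type it handles as a sequence.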
After tinkering for a while, I still haven't solved the ValueError mentioned in the post above, but I found a workaround: for large database queries, paginate the queries instead of returning generators as query results.
So the improved, working pseudo-code is:
@app.get("/records", response_model=List[RecordModel])
async def get_records(
    offset: int = 0,  # start position of the query
    limit: int = 1000,  # size of the query
) -> List[RecordModel]:
    queried_records = get_records_from_database(offset, limit)
    formatted_records = map(lambda record: __format(record), queried_records)
    return list(formatted_records)

def __format(queried_record: Dict[str, Union[str, HttpUrl]]) -> Union[RecordModel, None]:
    formatted_record = RecordModel(
        key_1=queried_record[key_a],
        key_2=queried_record[key_b],
        key_3=queried_record[key_c],
    )
    return formatted_record
This way, the responsibility for handling large database queries is handed over to the API users, who page through the results using offset and limit.
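A minimal sketch of what that paging could look like from the client side, assuming the endpoint returns fewer than limit records on the last page. fetch_page is a hypothetical stand-in for an HTTP call to /records with offset and limit query parameters; here it reads from an in-memory list so the example runs on its own.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]

# Hypothetical in-memory data standing in for the real backend.
FAKE_ROWS: List[Record] = [{"key_1": f"value_{i}"} for i in range(2500)]


def fetch_page(offset: int, limit: int) -> List[Record]:
    """Stand-in for GET /records?offset=...&limit=... (assumed behaviour)."""
    return FAKE_ROWS[offset:offset + limit]


def iter_all_records(
    fetch: Callable[[int, int], List[Record]],
    limit: int = 1000,
) -> Iterator[Record]:
    """Yield every record by requesting one page at a time."""
    offset = 0
    while True:
        page = fetch(offset, limit)
        yield from page
        # A short (or empty) page means there is nothing left to request.
        if len(page) < limit:
            break
        offset += limit


records = list(iter_all_records(fetch_page))
page_sizes = [len(fetch_page(offset, 1000)) for offset in (0, 1000, 2000)]
print(len(records), page_sizes)
```

The client still consumes the records lazily through a generator, but each request to the server stays bounded by limit.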