python mongodb pymongo fastapi

FastAPI issues with MongoDB - TypeError: 'ObjectId' object is not iterable


I am having some issues inserting into MongoDB via FastAPI.

The code below works as expected. Notice that the response variable is not passed to response_to_mongo().

The model is an sklearn ElasticNet model.

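from typing import List

import pandas as pd
import pymongo
from fastapi import FastAPI

# `model` is the fitted sklearn ElasticNet model, loaded elsewhere
app = FastAPI()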


def response_to_mongo(r: dict):
    client = pymongo.MongoClient("mongodb://mongo:27017")
    db = client["models"]
    model_collection = db["example-model"]
    model_collection.insert_one(r)


@app.post("/predict")
async def predict_model(features: List[float]):

    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )

    response = {"predictions": prediction.tolist()}
    response_to_mongo(
        {"predictions": prediction.tolist()},
    )
    return response

However, when I write predict_model() like this and pass the response variable to response_to_mongo():

@app.post("/predict")
async def predict_model(features: List[float]):

    prediction = model.predict(
        pd.DataFrame(
            [features],
            columns=model.feature_names_in_,
        )
    )

    response = {"predictions": prediction.tolist()}
    response_to_mongo(
        response,
    )
    return response

I get an error stating that:

TypeError: 'ObjectId' object is not iterable

From my reading, it seems that this is due to BSON/JSON issues between FastAPI and Mongo. However, why does it work in the first case when I do not use a variable? Is this due to the asynchronous nature of FastAPI?


Solution

  • As per the documentation:

    When a document is inserted a special key, "_id", is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult. For more information on "_id", see the documentation on _id.

    Thus, in the second case of the example you provided, the dictionary you pass to insert_one() is the same object that you later return from the endpoint. PyMongo adds the unique identifier (i.e., an ObjectId under the "_id" key) to that dictionary in place, so by the time you return response, it contains a value that cannot be serialized. As described in this answer in detail, FastAPI, by default, will automatically convert the return value into JSON-compatible data using the jsonable_encoder (to ensure that objects that are not serializable are converted to a str), and then return a JSONResponse, which uses the standard json library to serialize the data. The jsonable_encoder does not know how to handle an ObjectId, hence the error.

    In the first case, you pass a separate dictionary literal to response_to_mongo(), so PyMongo modifies that throwaway dictionary and response is left untouched. The difference has nothing to do with the asynchronous nature of FastAPI.
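
The difference between the two cases can be reproduced without a database. The following is a minimal sketch, not PyMongo itself: FakeObjectId and fake_insert_one are hypothetical stand-ins that only mimic the documented side effect of insert_one() (adding "_id" to the dict it receives), and json.dumps stands in for the final serialization step.

```python
import json
import uuid


class FakeObjectId:
    """Hypothetical stand-in for bson.ObjectId: a type json cannot serialize."""

    def __init__(self):
        self.value = uuid.uuid4().hex[:24]

    def __str__(self):
        return self.value


def fake_insert_one(document: dict) -> None:
    """Mimic the side effect of insert_one(): add "_id" to the dict in place."""
    if "_id" not in document:
        document["_id"] = FakeObjectId()


# Case 1: a fresh dict literal is passed, so `response` is never touched.
response = {"predictions": [1.5]}
fake_insert_one({"predictions": [1.5]})
case_1_keys = sorted(response)

# Case 2: the same dict object is passed and then serialized.
response = {"predictions": [1.5]}
fake_insert_one(response)
case_2_keys = sorted(response)

try:
    json.dumps(response)
    case_2_serializable = True
except TypeError:
    # The stand-in id cannot be serialized, just like a real ObjectId.
    case_2_serializable = False
```

Only in the second case does the returned dict carry the extra "_id" key, which is what breaks serialization.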

    Solution 1

    Use the approach demonstrated here: register str as the default encoder for ObjectId, so that you can return the response as usual inside your endpoint.

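    # place these at the top of your .py file
    # (written against Pydantic v1; the pydantic.json module was reworked
    # in Pydantic v2, so this may not work there)
    import pydantic
    from bson import ObjectId
    pydantic.json.ENCODERS_BY_TYPE[ObjectId] = str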
    
    return response # as usual
    

    Solution 2

    Dump the loaded BSON to a valid JSON string and then reload it as a dict, as described here and here. Note that json_util represents the ObjectId as {"$oid": "..."} rather than as a plain string.

    from bson import json_util
    import json
    
    response = json.loads(json_util.dumps(response))
    return response
    

    Solution 3

    Define a custom JSONEncoder, as described here, to convert the ObjectId into str:

    import json
    from bson import ObjectId
    
    class JSONEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, ObjectId):
                return str(o)
            return json.JSONEncoder.default(self, o)
    
    
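    # encode() returns a JSON string; load it back so the endpoint returns a dict
    # (returning the string itself would make FastAPI serialize it a second time)
    response = json.loads(JSONEncoder().encode(response))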
    return response
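
For a quick variant of the same idea, the standard json module also accepts a default callable, which avoids the subclass. This is a sketch under assumptions: StandInObjectId is a hypothetical placeholder for bson.ObjectId so that the snippet runs without PyMongo installed. Keep in mind that default=str stringifies every value json cannot handle, not only ObjectId, so it is less selective than the custom encoder.

```python
import json


class StandInObjectId:
    """Hypothetical placeholder for bson.ObjectId, used only for this demo."""

    def __init__(self, hex_value: str):
        self.hex_value = hex_value

    def __str__(self) -> str:
        return self.hex_value


response = {
    "_id": StandInObjectId("53ad61aa06998f07cee687c3"),
    "predictions": [1.0],
}

# default=str is called only for values json cannot serialize on its own.
encoded = json.dumps(response, default=str)  # a JSON string
decoded = json.loads(encoded)  # back to a plain, JSON-compatible dict
```

Returning decoded from the endpoint should then behave like any ordinary dict response.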
    

    Solution 4

    You can have a separate output model without the 'ObjectId' (_id) field, as described in the documentation. You can declare the model used for the response with the parameter response_model in the decorator of your endpoint. Example:

    from bson import ObjectId
    from pydantic import BaseModel
    
    class ResponseBody(BaseModel):
        name: str
        age: int
    
    
    @app.get('/', response_model=ResponseBody)
    def main():
        # response sample
        response = {'_id': ObjectId('53ad61aa06998f07cee687c3'), 'name': 'John', 'age': 25}
        return response
    

    Solution 5

    Remove the "_id" entry from the response dictionary before returning it (see here on how to remove a key from a dict):

    response.pop('_id', None)
    return response
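
A closely related option is to keep "_id" out of response in the first place by handing the insert a copy of the dictionary. The sketch below is an illustration under assumptions: response_to_mongo is a hypothetical stand-in that only reproduces the side effect of insert_one() (adding "_id") rather than talking to MongoDB, and build_response is a made-up name for the endpoint body.

```python
def response_to_mongo(r: dict) -> None:
    # Hypothetical stand-in for the question's helper: instead of calling
    # insert_one(), it only reproduces the side effect of adding "_id".
    r.setdefault("_id", "stand-in-for-ObjectId")


def build_response(predictions: list) -> dict:
    response = {"predictions": predictions}
    # dict(response) makes a shallow copy, so "_id" lands on the copy
    # and the dict returned to the client stays unchanged.
    response_to_mongo(dict(response))
    return response


result = build_response([0.42])
print(result)
```

A shallow copy is enough here because only a top-level key is added; the nested list is shared but not modified.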