bson.json_util provides functions to convert to either canonical or relaxed JSON format. However, both of them stick to the same representation of the ObjectId:
from pymongo import MongoClient
from bson.objectid import ObjectId
from bson import json_util
from bson.json_util import RELAXED_JSON_OPTIONS
from bson.json_util import CANONICAL_JSON_OPTIONS, DEFAULT_JSON_OPTIONS
db = MongoClient(URL)['DB_NAME']
mongo_query_result = db.collection.find_one({'_id': ObjectId('ID')}, {'_id': 1})
# returns {'_id': ObjectId('ID')}
print(json_util.dumps(mongo_query_result, json_options=RELAXED_JSON_OPTIONS))
print(json_util.dumps(mongo_query_result, json_options=CANONICAL_JSON_OPTIONS))
print(json_util.dumps(mongo_query_result, json_options=DEFAULT_JSON_OPTIONS))
# Results
{"_id": {"$oid": "ID"}}
{"_id": {"$oid": "ID"}}
{"_id": {"$oid": "ID"}}
# Desired Output
{"_id": "ID"}
The problem is that this doesn't match the results I get in the prod environment. I am using PyMongo just to build test cases; the actual prod format is
{'_id': "ID", ..etc}
I looked a bit in the documentation over here, and here are the findings:
uuid_representation=PYTHON_LEGACY
and I cannot seem to find a way around it. Is there something I am missing to convert a PyMongo query result to:
{'_id' : 'ID', ..}
# not
{'_id' : {'$oid' : 'ID'}, ..}
I would hate to extend my code just to handle the different format of the test cases.
As a workaround, I was able to accomplish the same result with re regular expressions:
import re

# Matches {"$oid": "<id>"} and captures the quoted id.
OID_PATTERN = re.compile(r'\{\s*"\$oid":\s*("[a-z0-9]+")\s*\}')

def remove_oid(string):
    # Replace every {"$oid": "<id>"} wrapper with the bare quoted id.
    return OID_PATTERN.sub(r'\1', string)

string = json_util.dumps(mongo_query_result)
string = remove_oid(string)
This essentially converts the canonical JSON to a normalized form, replacing each {"$oid": "ID"} key-value pair with just its value.
Although this gets the job done, it is not ideal: I am manipulating JSON as a string, which is very error prone, and it doesn't work for Date or other types.
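One less fragile variant of the same workaround might be to parse the Extended JSON first and unwrap the typed values on the parsed structure instead of on the string. The sketch below is only an illustration under some assumptions: the helper name unwrap_extended_json and the WRAPPER_KEYS tuple are hypothetical, the ids and date are made-up sample values, and it assumes relaxed output where a date appears as {"$date": "<ISO string>"}. Other wrappers are left untouched unless their key is added to the tuple.

```python
import json

# Hypothetical list of Extended JSON wrapper keys to collapse to a bare value.
WRAPPER_KEYS = ("$oid", "$date")

def unwrap_extended_json(value):
    """Recursively replace single-key wrappers such as {"$oid": "..."} with the inner value."""
    if isinstance(value, dict):
        if len(value) == 1:
            key, inner = next(iter(value.items()))
            if key in WRAPPER_KEYS:
                return unwrap_extended_json(inner)
        return {k: unwrap_extended_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_extended_json(v) for v in value]
    return value

# Sample string shaped like the output of json_util.dumps (values are made up).
extended = (
    '{"_id": {"$oid": "5f1d7f3a9d1e8b3c4a5b6c7d"}, '
    '"created": {"$date": "2020-07-26T12:00:00Z"}, '
    '"refs": [{"$oid": "5f1d7f3a9d1e8b3c4a5b6c7e"}]}'
)
plain = unwrap_extended_json(json.loads(extended))
print(json.dumps(plain))
```

In the test setup this would be applied as unwrap_extended_json(json.loads(json_util.dumps(mongo_query_result))). It still post-processes the output rather than changing how json_util serializes, so it is a workaround and not an answer to the original question.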