I am using python 3.9.12 to query mongodb, I then read the values into variables and continue with my logic. Problem is, some of my values have keys that start with dollar sign. Here is an example of a json I get:
[
{
"_id": {
"$oid": "234876234875236752309823"
},
"createdAt": {
"$date": "2022-11-13T20:50:18.184Z"
},
"moreFields": {
"key1": "blabla1",
"key2": "blabla2",
"key3": "blabla3"
},
"entityId": {
"$binary": {
"base64": "z0kWDTHiSlawpI2wHjyrWA==",
"subType": "04"
}
}
}
]
I understand that those mongodb field types (bson, datetime...). But this makes my life hard in trying to access those values using python.
I was reading and looking but I couldn't find a method to convert them to "normal" keys. Ideally I would want to correct my mongodb query (get datetime as strings and $binary as UUID strings). I have found a stupid workaround in python but unfortunately it is very stupid and I want to correct my ways.
Any ideas? Thanks :)
I would really be happy if the result of my mongodb query would change to:
[
{
"_id": "234876234875236752309823",
"createdAt": "2022-11-13T20:50:18.184Z",
"moreFields": {
"key1": "blabla1",
"key2": "blabla2",
"key3": "blabla3"
},
"entityId": "e87b22b2-ea15-4176-9100-c65f79f0e5b2"
}
]
If your data is in a string format (say, from a file), use loads
from the bson.json_util
module. https://pymongo.readthedocs.io/en/stable/api/bson/json_util.html
For the second part, that is just formatting; but beware, this just creates another string output. Chances are the data you are interested in is actually in the record object.
The following snippet converts the input string, loads it into MongoDB, and then formats it back to a string using a custom encoder:
import datetime
import json
import bson
from bson import json_util
from pymongo import MongoClient
db = MongoClient()['mydatabase']
records = '''[
{
"_id": {
"$oid": "234876234875236752309823"
},
"createdAt": {
"$date": "2022-11-13T20:50:18.184Z"
},
"moreFields": {
"key1": "blabla1",
"key2": "blabla2",
"key3": "blabla3"
},
"entityId": {
"$binary": {
"base64": "z0kWDTHiSlawpI2wHjyrWA==",
"subType": "04"
}
}
}
]'''
db.mycollection.insert_many(json_util.loads(records))
class MyJsonEncoder(json.JSONEncoder):
def default(self, obj):
if isinstance(obj, datetime.datetime):
return obj.isoformat() # Format dates as ISO strings
elif isinstance(obj, bson.Binary) and obj.subtype == bson.binary.UUID_SUBTYPE:
return obj.as_uuid() # Format binary data as UUIDs
elif hasattr(obj, '__str__'):
return str(obj) # This will handle ObjectIds
return super(MyJsonEncoder, self).default(obj)
record = db.mycollection.find_one()
print(json.dumps(record, cls=MyJsonEncoder, indent=4))
prints:
{
"_id": "234876234875236752309823",
"createdAt": "2022-11-13T20:50:18.184000",
"moreFields": {
"key1": "blabla1",
"key2": "blabla2",
"key3": "blabla3"
},
"entityId": "cf49160d-31e2-4a56-b0a4-8db01e3cab58"
}