In a PostgreSQL table "private_notion", I have a JSONB column "record_map" that may or may not contain nested objects, E.g.
{
"blocks": {
"7a9abf0d-a066-4466-a565-4e6d7a960a37": {
"name": "block1",
"value": 1,
"child": {
"7a9abf0d-a066-4466-a565-4e6d7a960a37": {
"name": "block2",
"value": 2,
"child": {
"7a9abf0d-a066-4466-a565-4e6d7a960a37": {
"name": "block3",
"value": 3
}
}
},
"7a9abf0d-a066-4466-a565-4e6d7a960a38": {
"name": "block4",
"value": 4,
"child": {
"7a9abf0d-a066-4466-a565-4e6d7a960a39": {
"name": "block5",
"value": 5,
"child": {
"7a9abf0d-a066-4466-a565-4e6d7a960a40": {
"name": "block6",
"value": 6
}
}
}
}
},
}
}
}
}
To retrieve data, We don't know which block has the data we want, we only have the key. Let's assume we are looking for the object with this key "7a9abf0d-a066-4466-a565-4e6d7a960a40", but we don't know that it is located in child block6 of parent block4 and block5. Another request might look for the parent block4 and so on, and I must find the block by it's key.
The entire code looks like this;
async def get_private_notion_page(
site_uuid: str, page_id: str, db_session: AsyncSession
) -> PrivateNotionPage:
page_id_path = f"{page_id}" # page_id looks like this 7a9abf0d-a066-4466-a565-4e6d7a960a37
path = f"$.** ? (@.{page_id_path})"
stmt = text(
f"""
SELECT jsonb_path_query(record_map, {path})
FROM private_notion
WHERE site_id = {site_uuid}
"""
)
result = await db_session.execute(stmt)
result = result.scalars().first()
if result:
return result
else:
raise PrivateNotionSiteWasNotFound
So I came up with the following query statements which use sqlalchemy "text" method to accept raw SQL query, but jsonb_path_query_array
and jsonb_path_query
throw similar errors; syntax error at or near "$"
.
page_id_path = f"{page_id}"
path = f"$.** ? (@.{page_id_path})"
stmt = text(
f"""
SELECT jsonb_path_query(record_map, {path})
FROM private_notion
WHERE site_id = {site_uuid}
"""
)
Error:
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "$"
[SQL:
SELECT jsonb_path_query(record_map, $.** ? (@.7a9abf0d-a066-4466-a565-4e6d7a960a37))
FROM private_notion
WHERE site_id = 26f52d8e-a380-46ab-9131-e6f7f62c528f
]
I would later learn that "The $** operator is not valid in a SQL query. Instead, you can use the jsonb_path_query_array function to search recursively through all levels of the JSONB object."
Apparently I got the same error after refactoring the code.
page_id_path = f"{page_id}"
path = f"$[*] ? (@ like_regex {page_id_path})"
stmt = text(
f"""
SELECT jsonb_path_query_array(record_map -> 'block', {path})
FROM private_notion
WHERE site_id = {site_uuid}
"""
)
Error:
sqlalchemy.exc.ProgrammingError: (sqlalchemy.dialects.postgresql.asyncpg.ProgrammingError) <class 'asyncpg.exceptions.PostgresSyntaxError'>: syntax error at or near "$"
[SQL:
SELECT jsonb_path_query_array(record_map -> 'block', $[*] ? (@ like_regex 7a9abf0d-a066-4466-a565-4e6d7a960a37))
FROM private_notion
WHERE site_id = 26f52d8e-a380-46ab-9131-e6f7f62c528f
]
My question is two-pronged, what is the error all about? And is there a better way to retrieve a nested object by key in a JSONB column? Thank you for your time.
This extracts entire objects at any level that have your target uuid-based key in them: demo at db<>fiddle
SELECT jsonb_path_query(record_map,
'strict $.**?(@.keyvalue().key==$target_id)',
jsonb_build_object('target_id',
'7a9abf0d-a066-4466-a565-4e6d7a960a37'))
FROM private_notion
WHERE site_id = '45bf37be-ca0a-45eb-838b-015c7a89d47b';
jsonb_path_query |
---|
{ "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block1", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block2", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block3", "value": 3 } }, "value": 2 }, "7a9abf0d-a066-4466-a565-4e6d7a960a38": { "name": "block4", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a39": { "name": "block5", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a40": { "name": "block6", "value": 6 } }, "value": 5 } }, "value": 4 } }, "value": 1 } } |
{ "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block2", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block3", "value": 3 } }, "value": 2 }, "7a9abf0d-a066-4466-a565-4e6d7a960a38": { "name": "block4", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a39": { "name": "block5", "child": { "7a9abf0d-a066-4466-a565-4e6d7a960a40": { "name": "block6", "value": 6 } }, "value": 5 } }, "value": 4 } } |
{ "7a9abf0d-a066-4466-a565-4e6d7a960a37": { "name": "block3", "value": 3 } } |
Note the object duplication through unnesting: they appear both alone as well as inside each matched parent structure.
JSONPath expression needs to be single-quoted. This gets rid of the syntax error:
ERROR: syntax error at or near "$"
LINE 2: $.**.7a9abf0d-a066-4466-a565-4e6d7a9...
^
Your uuid-based key inside the JSONPath needs to be double-quoted. This will get rid of a problem inside the expression that would soon follow:
ERROR: trailing junk after numeric literal at or near ".7a" of jsonpath input
LINE 2: '$.**.7a9abf0d-a066-4466-a565-4e6d7a...
^
When using .**
accessor, default to using strict
mode.
You can use the SQLAlchemy JSONPath type to pass the expression.