Search code examples
sqlpostgresqljsonbjsonpath

Postgres jsonpath for referencing either singleton element or array of elements


The Postgres JSONPath notation states that the .key accessor returns an object member with the specified key. It appears that this operator works on both arrays and objects. For example, the following queries both return matches, despite the differences in structure.

select * from (
select '{"a": [{"b": 2}, {"b": 3}]}'::jsonb as pl
) jb where (jsonb_path_exists(jb.pl, '$.a.b ? (@ == 2)'))

and

select * from (
select '{"a": {"b": 2}}'::jsonb as pl
) jb where (jsonb_path_exists(jb.pl, '$.a.b ? (@ == 2)'))

I would expect in the first case, I might have had to write:

select * from (
select '{"a": {"b": 2}}'::jsonb as pl
) jb where (jsonb_path_exists(jb.pl, '$.a[*].b ? (@ == 2)'))

but it doesn't seem to be necessary.

My goal is to be able to query into nested objects like this where the leaf may sometimes be in an array and sometimes not. It appears that I can ignore this distinction but not sure if what I am seeing is the expected behavior, since the Postgres documentation does not (afaik) explicitly say this is the behavior.

Is it safe to assume the key operator will expand lists as I am seeing here?


Solution

  • If you switch to strict mode, the expression will no longer try and apply the key accessor to array elements but that also means it will no longer match the structure and raise an error:
    demo at db<>fiddle

    select jsonb_path_query_array(pl,'$.a.b ? (@ == 2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl)
    where pl @? '$.a.b ? (@ == 2)';
    
    jsonb_path_query_array
    [2]
    select jsonb_path_query_array(pl,'lax $.a.b ? (@ == 2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl)
    where pl @? 'lax $.a.b ? (@ == 2)';
    
    jsonb_path_query_array
    [2]
    select jsonb_path_query_array(pl,'strict $.a.b ? (@ == 2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl)
    where pl @? 'strict $.a.b ? (@ == 2)';
    
    ERROR:  jsonpath member accessor can only be applied to an object
    

    This means you'll have to slightly restructure your JSONPath expression - the fix you suggested works fine:

    select jsonb_path_query_array(pl,'strict $.a[*].b ? (@ == 2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl)
    where pl @? 'strict $.a[*].b ? (@ == 2)';
    
    jsonb_path_query_array
    [2]

    As shown in the examples above, by default, you're in lax mode:

    Lax mode facilitates matching of a JSON document and path expression when the JSON data does not conform to the expected schema. If an operand does not match the requirements of a particular operation, it can be automatically wrapped as an SQL/JSON array, or unwrapped by converting its elements into an SQL/JSON sequence before performing the operation. Also, comparison operators automatically unwrap their operands in lax mode, so you can compare SQL/JSON arrays out-of-the-box. An array of size 1 is considered equal to its sole element. Automatic unwrapping is not performed when:

    • The path expression contains type() or size() methods that return the type and the number of elements in the array, respectively.

    • The queried JSON data contain nested arrays. In this case, only the outermost array is unwrapped, while all the inner arrays remain unchanged. Thus, implicit unwrapping can only go one level down within each path evaluation step.

    It's more of a problem when you really don't want to match the alternative path accepted by lax mode, but it's also important to switch between these whenever you extract things. In lax mode, the .** accessor - which might be useful to you if you want to match varying paths and levels of nesting - could match twice: once when it reaches the hit by normal descend down the structure, once again when it attempts the descend with wrapping/unwrapping enabled by the mode.

    select jsonb_path_query(pl,'$.**.b?(@==2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl);
    
    jsonb_path_query
    2
    2
    select jsonb_path_query(pl,'strict $.**.b?(@==2)')
    from(values('{"a": [{"b": 2}, {"b": 3}]}'::jsonb))as jb(pl);
    
    jsonb_path_query
    2