Search code examples
arraysjsonpostgresqlcoalesce

Postgresql coalesce columns with JSON array


Let's say that we have the following JSON object:

{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}}

inserted into a table:

drop table if exists modeling.modeling.pl_nested_json_array;
select '{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}' as json_data
into modeling.modeling.pl_nested_json_array
from modeling.modeling.some_other_table
limit 1;

In order to get to the I_need_version_* I must travel semi-constant way. 'Semi' means that some elements are varying (keys: var123, xyz, htg, uy1). There is a finite number of their combinations so I can handle them by writing different ways in my code such as:

json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz' as col1

The problem is that my current approach:

/* many columns version - unwanted */
select 
    json_array_elements(cast(json_data as json) -> 'key_to_array') ->> 'key1' as key1,
    json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz' as col1,
    json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'htg' ->> 'uy1' as col2
from modeling.modeling.pl_nested_json_array;

produces as many columns as there are ways to get to I_need_version_*. Since for each JSON_data only one way is valid (produces I_need_version_*) the rest of the columns col* are NULL, I decided to use COALESCE() in order to get one column col with the non-NULL value:

select 
    json_array_elements(cast(json_data as json) -> 'key_to_array') ->> 'key1' as key1,
    coalesce(
    json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz',
    json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'htg' ->> 'uy1') as col
from modeling.modeling.pl_nested_json_array;

Upon executing this query I get ERROR: set-returning functions are not allowed in COALESCE.

The output I'm currently getting:

key1          col1               col2
value1        I_need_version_1   [NULL]
value1        [NULL]             I_need_version_2

What I'd like to get:

key1     col
value1   I_need_version_1
value1   I_need_version_2

Solution

  • What about just fully expanding the json and picking out the elements that you need? I'm not sure how many rows we're talking about, so performance could be an issue, but at least you wouldn't need to worry about the different paths.

    WITH data as (
      SELECT '{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}'::jsonb as data
    )
    SELECT key1, k3_val
    FROM data
    CROSS JOIN jsonb_array_elements(data.data -> 'key_to_array') arr(elem)
    CROSS JOIN jsonb_to_record(elem) as x(key1 text, key2 jsonb)
    CROSS JOIN jsonb_each(key2) as k2(k2_key, k2_val)
    CROSS JOIN jsonb_each_text(k2_val) as k3(k3_key, k3_val)
    WHERE k3_val like 'I_need_version_%'
    ;
    key1    k3_val
    value1  I_need_version_1
    value1  I_need_version_2