Let's say that we have the following JSON object:
{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}}
inserted into a table:
drop table if exists modeling.modeling.pl_nested_json_array;
select '{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}' as json_data
into modeling.modeling.pl_nested_json_array
from modeling.modeling.some_other_table
limit 1;
In order to get to the I_need_version_*
I must travel semi-constant way. 'Semi' means that some elements are varying (keys: var123
, xyz
, htg
, uy1
). There is a finite number of their combinations so I can handle them by writing different ways in my code such as:
json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz' as col1
The problem is that my current approach:
/* many columns version - unwanted */
select
json_array_elements(cast(json_data as json) -> 'key_to_array') ->> 'key1' as key1,
json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz' as col1,
json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'htg' ->> 'uy1' as col2
from modeling.modeling.pl_nested_json_array;
produces as many columns as there are ways to get to I_need_version_*
. Since for each JSON_data
only one way is valid (produces I_need_version_*
) the rest of the columns col*
are NULL, I decided to use COALESCE()
in order to get one column col
with the non-NULL value:
select
json_array_elements(cast(json_data as json) -> 'key_to_array') ->> 'key1' as key1,
coalesce(
json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'var123' ->> 'xyz',
json_array_elements(cast(json_data as json) -> 'key_to_array') -> 'key2' -> 'htg' ->> 'uy1') as col
from modeling.modeling.pl_nested_json_array;
Upon executing this query I get ERROR: set-returning functions are not allowed in COALESCE
.
The output I'm currently getting:
key1 col1 col2
value1 I_need_version_1 [NULL]
value1 [NULL] I_need_version_2
What I'd like to get:
key1 col
value1 I_need_version_1
value1 I_need_version_2
What about just fully expanding the json and picking out the elements that you need? I'm not sure how many rows we're talking about, so performance could be an issue, but at least you wouldn't need to worry about the different paths.
WITH data as (
SELECT '{"key_to_array":[{"key1":"value1","key2":{"var123":{"xyz":"I_need_version_1"}}},{"key1":"value1","key2":{"htg":{"uy1":"I_need_version_2"}}}]}'::jsonb as data
)
SELECT key1, k3_val
FROM data
CROSS JOIN jsonb_array_elements(data.data -> 'key_to_array') arr(elem)
CROSS JOIN jsonb_to_record(elem) as x(key1 text, key2 jsonb)
CROSS JOIN jsonb_each(key2) as k2(k2_key, k2_val)
CROSS JOIN jsonb_each_text(k2_val) as k3(k3_key, k3_val)
WHERE k3_val like 'I_need_version_%'
;
key1 k3_val
value1 I_need_version_1
value1 I_need_version_2