I'm trying to parse out the test-N-value keys from my JSON column. I tried using a wildcard, but the resulting column just returns NULL:
SELECT get_json_object('{"random_field": "ABC", "test-0-value":"1", "test-1-value":"2"}', '$.test-*-value');
Does anyone have any tips on how to extract these dynamic keys? The expected result is below, but keep in mind the output has to allow a dynamic number of key-value pairs.
extracted_out
-------------
{"test-0-value":"1", "test-1-value":"2"}
You can convert the JSON to a map using from_json, explode the map to get key/value rows, filter only the keys you need, collect an array of key:value strings, concatenate the array with a comma delimiter, and use str_to_map to build a map containing only the keys you need; optionally, convert it back to a JSON string using to_json. Note that str_to_map's default delimiters are ',' between pairs and ':' between key and value, which is exactly the format the concat and concat_ws calls produce.
Demo:
spark.sql("""
with src as (select '{"random_field": "ABC", "test-0-value":"1", "test-1-value":"2"}' as original_json)
select s.original_json,
       -- rebuild a map from the filtered key:value pairs, then serialize it as JSON
       to_json(str_to_map(concat_ws(',', collect_list(concat(key, ':', value))))) result
from src s
-- parse the JSON into a map and explode it into one key/value row per entry
lateral view explode(from_json(s.original_json, 'map<STRING, STRING>')) m as key, value
where key rlike 'test-\\d+-value'  -- keep only the test-N-value keys
group by s.original_json
""").show(100, false)
Result:
+---------------------------------------------------------------+---------------------------------------+
|original_json |result |
+---------------------------------------------------------------+---------------------------------------+
|{"random_field": "ABC", "test-0-value":"1", "test-1-value":"2"}|{"test-0-value":"1","test-1-value":"2"}|
+---------------------------------------------------------------+---------------------------------------+
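If you want to see what the intermediate lateral view explode step produces before the filtering and aggregation, you can run just that part of the query; a minimal sketch using the same sample row:

spark.sql("""
with src as (select '{"random_field": "ABC", "test-0-value":"1", "test-1-value":"2"}' as original_json)
select key, value
from src s
lateral view explode(from_json(s.original_json, 'map<STRING, STRING>')) m as key, value
""").show(100, false)

This returns one row per JSON entry: (random_field, ABC), (test-0-value, 1) and (test-1-value, 2). The where clause in the full query then drops the random_field row before the pairs are collected.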
Update:
Found a much better solution using map_filter:
spark.sql("""
with src as (select '{"random_field": "ABC", "test-0-value":"1", "test-1-value":"2"}' as original_json)
select s.original_json,
       -- keep only the map entries whose key matches test-N-value; no explode/group by needed
       to_json(map_filter(from_json(s.original_json, 'map<STRING, STRING>'),
                          (k, v) -> k rlike 'test-\\d+-value')) result
from src s
""").show(100, false)