I'm translating a BigQuery below to Snowflake.
select event_date,user_pseudo_id,event_name, event_timestamp,
(SELECT COALESCE( value.string_value, CAST(value.int_value AS string), CAST(value.double_value AS string), CAST(value.float_value AS string) ) FROM UNNEST (event_params) WHERE KEY = 'key1' ) AS key1,
(SELECT COALESCE( value.string_value, CAST(value.int_value AS string), CAST(value.double_value AS string), CAST(value.float_value AS string) ) FROM UNNEST (event_params) WHERE KEY = 'key2' ) AS key2
FROM `bigquery.analytics.events_20240812`;
I expect the result as below. The way I'm doing in Snowflake is creating 2 separate tables for key1 and key2, then full join them together. However, I will have hundreds of keys in the future. It would difficult to accomplish the goal with hundreds of joins.
Thank you so much!
Below is what I got from BigQuery and would expect the same for Snowflake.
If your events look like:
with events(event_date, user_pseudo_id, event_timestamp,event_params) as (
select $1, $2, $3, parse_json($4) from values
('2024-08-12'::date, '111.11', 123456, '[
{"key":"key0", "float_value":-1.234},{"key":"key1", "string_value":"Prifilled"},{"key":"key2", "string_value":"[UUID]"}
]'),
('2024-08-12'::date, '111.11', 123456, '[
{"key":"key0", "float_value":4.567},{"key":"key1", "string_value":"Prifilled"},{"key":"key2", "string_value":"ot-group-id-COOO4"}
]')
)
which is too say, a row of values, and some JSON values, then FLATTEN is the most related to UNNEST command:
select
e.event_date,
e.user_pseudo_id,
e.event_timestamp,
f.value:key as k,
COALESCE( f.value:string_value, f.value:int_value::text, f.value:double_value::text, f.value:float_value::text) as v,
f.seq
from events as e,
table(flatten(input=>e.event_params)) as f
where f.value:key in ('key1', 'key2')
gives:
which grouping via the SEQ and using MAX and IFF to pivot the values like:
select
event_date,
user_pseudo_id,
event_timestamp,
min(iff(k='key1', v, null)) as key1,
min(iff(k='key2', v, null)) as key2
from (
select
e.event_date,
e.user_pseudo_id,
e.event_timestamp,
f.value:key as k,
COALESCE( f.value:string_value, f.value:int_value::text, f.value:double_value::text, f.value:float_value::text) as v,
f.seq
from events as e,
table(flatten(input=>e.event_params)) as f
where f.value:key in ('key1', 'key2')
)
group by 1,2,3,seq
gives:
which appears to match your impled data and desired result.