I normally unnest like so :
SELECT h.field1, rp.p.key1, rp.p.key2
FROM
mytable h
CROSS JOIN UNNEST(h.field2) rp (p)
However, I am now looking at querying AWS CloudTrail data. Here the field that I want to unnest varies...
Sometimes it's a single object:
{"principal":{"dataLakePrincipalIdentifier":"arn:aws:iam::......
Other times it contains a wrapper object around an array of objects
{"entries":[{"id":"0","principal":{"dataLakePrincipalIdentifier":"arn:aws:iam::.........
The only way that I can currently think of to tackle this, is to create 2 separate queries; one to unpack the singletons, another to unpack the array entries - and then union the two results together.
I'd be grateful if anyone knows of a more efficient approach ?
You should be able to achieve this by checking on the existence of a specific key inside the object, and do the following:
map(varchar, json)
. Convert it to look like a map(varchar, array(json)
.Let's say, we check on the existence of key entries
:
with data as (
select CAST(json_parse(your_json_string) AS MAP(VARCHAR, json )) as p
from mydataset.mytable
)
select cast(json_extract(j, '$.principle') as map(varchar, integer)) as record from (
select
if(
cardinality(filter(map_keys(p), x -> x = 'entries')) = 1,
p,
MAP(ARRAY['entries'], array[cast(array[p] as json)])
) as x from data2
), unnest(cast(x['entries'] as array(json))) as z(j)
Please note this part, specifically:
if(
cardinality(filter(map_keys(p), x -> x = 'entries')) = 1,
p,
MAP(ARRAY['entries'], array[cast(array[p] as json)])
)
It does the job of converting a single object to look like a regular map(varchar, array(json))
which makes everything look alike.