I have a BigQuery RECORD
user_details
, which holds their details like address, email, etc. I have several records like this, each has several fields underneath them, and they will change over time.
So in a typical BigQuery SQL query, I'll ask something like:
SELECT
user_details.*
FROM `my_table`
This is nice because it unravels the user_details RECORD
to the top level, so that there will now be a column called email
instead of a hierarchical column called user_details.record
. My downstream users / customers are not SQL gurus, so best to keep it simple for them.
But now I have to worry about name clashes. What if I have a RECORD
called spouse_details
which also has an email, address, etc?
Now if I do a query like
SELECT
user_details.*,
spouse_details.*
FROM `my_table`
I have a column called email
and another called email_1
. Not only is this ugly, but I obviously don't know which one aligns to which.
I would like to do something like
SELECT
user_details.* AS user_,
spouse_details.* AS spouse_
FROM `my_table`
such that I get columns labelled user_email
and spouse_email
.
This is not possible, at least not the way that I wrote it.
SELECT user_details.email as user_email, user_details.address as user_address...
because:
Consider below approach
SELECT * FROM (
SELECT id, 'user' AS type, user_details.* FROM your_table UNION ALL
SELECT id, 'spouse', spouse_details.* FROM your_table
)
PIVOT (ANY_VALUE(email) email, ANY_VALUE(address) address FOR type IN ('user', 'spouse'))
it assumes presence of some sort of id as in below data sample
WITH your_table AS (
SELECT 1 AS ID, STRUCT<email STRING, address STRING>('email11', 'address11') AS user_details, STRUCT<email STRING, address STRING>('mail12', 'address12') AS spouse_details UNION ALL
SELECT 2, ('email21', 'address21'), ('mail22', 'address22') UNION ALL
SELECT 3, ('email31', 'address31'), ('mail32', 'address32') UNION ALL
SELECT 4, ('email41', 'address41'), ('mail42', 'address42')
)
and output will be