I have a very wide table that I'm trying to unpivot without literally listing the hundreds of columns.
I thought something like this should work
WITH column_names_table as (
SELECT column_name
FROM `my_project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE
table_name = 'my_table'
AND NOT column_name = 'time_utc'
)
SELECT time_utc, measurement, identifier
FROM `my_project.dataset.my_table`
UNPIVOT(
measurement FOR identifier IN
(SELECT column_name from column_names_table)
)
but I get a Syntax error: Unexpected keyword SELECT
. If i just replace the SELECT statement with a literal list of the columns this works. I seems like i'm so close here...
Can I make this work? If not, is there another way to unpivot all columns except a couple? I'm also wondering if there's a good reason this isn't possible..
Input data:
time_utc | id1 | id2 | ... | idN |
---|---|---|---|---|
2019-01-24 05:00:00 UTC | 0.5 | 1.2 | 12 | |
2019-01-24 06:00:00 UTC | 0.6 | 1.3 | 1.2 |
Output data:
time_utc | measurement | identifier |
---|---|---|
2019-01-24 05:00:00 UTC | 0.5 | id1 |
2019-01-24 06:00:00 UTC | 0.6 | id1 |
2019-01-24 05:00:00 UTC | 1.2 | id2 |
2019-01-24 06:00:00 UTC | 1.3 | id2 |
... | ... | ... |
Since you can't specify columns to unpivot dynamically, you need to consider a dynamic sql if you use INFORMATION_SCHEMA.COLUMNS view.
EXECUTE IMMEDIATE FORMAT("""
SELECT time_utc, measurement, identifier
FROM `my_project.dataset.my_table`
UNPIVOT (measurement FOR identifier IN (%s))
""", (
SELECT STRING_AGG(column_name)
FROM `my_project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'my_table' AND NOT column_name = 'time_utc'
));
And you might consider @Mikhail's another approach in below link if INFORMATION_SCHEMA or a dynamic sql is not available to you.
below query is a slightly modified one to suit your case having same output above.
select time_utc, measurement, 'id' || (offset + 1) identifier
from `dataset.my_table` t,
unnest(split(translate(format('%t', (select as struct * except(time_utc) from unnest([t]))), '() ', ''))) measurement with offset;
CREATE SCHEMA IF NOT EXISTS dataset;
CREATE TABLE IF NOT EXISTS dataset.my_table AS
SELECT '2019-01-24 05:00:00 UTC' time_utc, 0.5 id1, 1.2 id2, 1.0 id3, 1.2 id4 UNION ALL
SELECT '2019-01-24 06:00:00 UTC', 0.6, 1.3, 1.2, 1.4;