Search code examples
jsonsnowflake-cloud-data-platformvariantarray-merge

Merging Variant rows in Snowflake


I have a table structure in snowflake with variant data type as shown below, you can see the a single ID is having multiple variant objects.

+-----+--------------------------+
| ID  | STATE_INFO               |
|-----+--------------------------|
| IND | {                        |
|     |   "population": "1000k", |
|     |   "state": "KA"          |
|     | }                        |
| IND | {                        |
|     |   "population": "2000k", |
|     |   "state": "AP"          |
|     | }                        |
| IND | {                        |
|     |   "population": "3000K", |
|     |   "state": "TN"          |
|     | }                        |
| US  | {                        |
|     |   "population": "100k",  |
|     |   "state": "Texas"       |
|     | }                        |
| US  | {                        |
|     |   "population": "200k",  |
|     |   "state": "Florida"     |
|     | }                        |
| US  | {                        |
|     |   "population": "300K",  |
|     |   "state": "Iowa"        |
|     | }                        |
+-----+--------------------------+

I want to combine these variant objects into a single object like below by merging the rows into one array or dictionary object

+-----+---------------------------+
| ID  | STATE_INFO                |
|-----+---------------------------|
| IND | [{                        |
|     |   "population": "1000k",  |
|     |   "state": "KA"           |
|     | },                        |
|     | {                         |
|     |   "population": "2000k",  |
|     |   "state": "AP"           |
|     | },                        |
|     | {                         |
|     |   "population": "3000K",  |
|     |   "state": "TN"           |
|     | }]                        |
| US  | [{                        |
|     |   "population": "100k",   |
|     |   "state": "Texas"        |
|     | },                        |
|     | {                         |
|     |   "population": "200k",   |
|     |   "state": "Florida"      |
|     | },                        |
|     | {                         |
|     |   "population": "300K",   |
|     |   "state": "Iowa"         |
|     | }]                        |
+-----+---------------------------+

Like in SQL terminologies, we can say like below SQL statement

Select id,merge(STATE_INFO) from table group by id;

Solution

  • Like Mike said ARRAY_AGG function is what you need and it works on a variant column

    select id, array_agg(STATE_INFO) within group (order by id) STATE_INFO
    from table
    group by 1
    order by 1