Search code examples
google-bigquery

Concatenate the values of a nested field on BigQuery


Supose I have the following schema:

[
   {
        'name': 'id',
        'type': 'INTEGER' 
   }
   {
        'name': 'record',
        'type': 'RECORD',
        'fields': [
            {
                'name': 'repeated',
                'type': 'STRING',
                'mode': 'REPEATED'
            }
         ]
   }
]

And the following data:

+--------------------+
|id  |record.repeated|
+--------------------+
|1   |'a'            |
|    |'b'            |
|    |'c'            |
+--------------------+
|2   |'a'            |
|    |'c'            |
+--------------------+
|3   |'d'            |
+--------------------+

What I need is to create a query that returns this:

+--------------------+
|id  |record.repeated|
+--------------------+
|1   |'a,b,c'        |
+--------------------+
|2   |'a,c'          |
+--------------------+
|3   |'d'            |
+--------------------+

In other words, I need to query that allows me to concatenate the values of a nested field using a separator (in this case, comma). Something like the GROUP_CONCAT function of MySQL, but on BigQuery.

Related idea: Concat all column values in sql

Is that possible?

Thanks.


Solution

  • It's very simple

    select group_concat(record.repeated) from table
    

    an example from publicdata is

    SELECT group_concat(payload.shas.encoded)
    FROM [publicdata:samples.github_nested]
    WHERE repository.url='https://github.com/dreamerslab/workspace'