Search code examples
google-bigquerypivotrowmultiple-columnstranspose

Bigquery - Transposing Rows into Multiple Cells but not using cell values name as column name


Previously I've asked a question within this link and has been successfully answered by JayTiger's Answer and Mikhail's Answer

Their answers helped my issue successfully, but I have another case that cannot be solved by those ones.

For example, I have this kind of data:

transaction_id item_name
123 snacks
123 marbles
124 tooth_paste
124 tooth_brush
124 pen

By using [JayTiger's Answer] (Bigquery - Best way to transpose rows into multiple columns) and Mikhail's Answer it will generate list of columns like below

transaction_id item_name_snacks item_name_marbles item_name_tooth_paste item_name_tooth_brush item_name_pen

However, what I desired is something like this where I can define the name of the columns by using a sequence of number for example:

transaction_id item_name_1 item_name_2 item_name_3
123 snacks marbles
124 tooth_paste tooth_brush pen

Since in my sample data the maximum number of item per transaction_id is 3 items, then the generated column is also 3.

Is there any way to pull this off? Thanks!


Solution

  • You might consider below and I think you can generalize the query using a dynamic sql answered by Mikhail and me in previous answers.

    WITH sample_table AS (
      SELECT '123' transaction_id, 'snacks' item_name UNION ALL
      SELECT '123' transaction_id, 'marbles' item_name UNION ALL
      SELECT '124' transaction_id, 'tooth_paste' item_name UNION ALL
      SELECT '124' transaction_id, 'tooth_brush' item_name UNION ALL
      SELECT '124' transaction_id, 'pen' item_name
    )
    SELECT * 
      FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY transaction_id) rn FROM sample_table)
     PIVOT (ANY_VALUE(item_name) item_name FOR rn IN (1, 2, 3));
    

    Query results

    enter image description here

    For a dynamic sql, you can refere to the following.

    CREATE TEMP TABLE sample_table AS
      SELECT '123' transaction_id, 'snacks' item_name UNION ALL
      SELECT '123' transaction_id, 'marbles' item_name UNION ALL
      SELECT '124' transaction_id, 'tooth_paste' item_name UNION ALL
      SELECT '124' transaction_id, 'tooth_brush' item_name UNION ALL
      SELECT '124' transaction_id, 'pen' item_name;
    
    EXECUTE IMMEDIATE FORMAT("""
      SELECT * 
        FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY transaction_id) rn FROM sample_table)
       PIVOT (ANY_VALUE(item_name) item_name FOR rn IN (%s))
      """, (SELECT STRING_AGG('' || rn, ',') FROM (
              SELECT DISTINCT ROW_NUMBER() OVER (PARTITION BY transaction_id) rn 
                FROM sample_table ORDER BY rn
           )));
    
    • '' || rn is same as CAST(rn AS STRING)