Search code examples
google-bigquerybigquery-udf

All possible unique combination of a single column value partitioned by groups


I have a table like the following in Google BigQuery. I am trying to get all possible unique combination(all subsets except the null subset) of the Item column partitioned on Group.

Group   Item
1       A
1       B
1       C
2       X
2       Y
2       Z

I am looking for an output like the following:

Group   Item
1        A
1        B
1        C
1        A,B
1        B,C
1        A,C
1        A,B,C
2        X
2        Y
2        Z
2        X,Y
2        Y,Z
2        X,Z
2        X,Y,Z

I have tried to use this accepted answer to incorporate Group to no avail: How to get combination of value from single column?


Solution

  • Consider below approach

    CREATE TEMP FUNCTION generate_combinations(a ARRAY<STRING>) 
    RETURNS ARRAY<STRING>
    LANGUAGE js AS '''
      var combine = function(a) {
        var fn = function(n, src, got, all) {
          if (n == 0) {
            if (got.length > 0) {
              all[all.length] = got;
            } return;
          }
          for (var j = 0; j < src.length; j++) {
            fn(n - 1, src.slice(j + 1), got.concat([src[j]]), all);
          } return;
        }
        var all = []; for (var i = 1; i < a.length; i++) {
          fn(i, a, [], all);
        }
        all.push(a);
        return all;
      } 
      return combine(a)
    ''';
    with your_table as (   
      select 1 as _Group,'A' as Item union all
      select 1, 'B' union all
      select 1, 'C' union all
      select 2, 'X' union all
      select 2, 'Y' union all
      select 2, 'Z' 
    )
    select _group, item
    from (
      select _group, generate_combinations(array_agg(item)) items
      from your_table
      group by _group
    ), unnest(items) item                  
    

    with output

    enter image description here