Search code examples
sqlclickhouse

How to group sum Array column by another Array column's elements in Clickhouse


I have a Clickhouse table having multiple Array columns, such as

timestamp  sensor_type                    priority                values
10:00:00   ['a','b','b','a','c','a','c']  [3, 2, 1, 5, 1, 2, 1]   [7, 4, 1, 12, 3, 9, 2]
10:01:00   ['c','e','g','e','g']          [2, 4, 1, 2, 4]         [23, 3, 5, 8, 6]
...

The timestamps are unique and monotonically increasing. The sensors that have values recorded are dynamically changing at each timestamp. I'm trying to group and sum up the values array by sensor_type or priority for each timestamp, so the expected aggregated columns are like:

timestamp  sensor_type_sorted  sum_val_by_sensor_type  priority_sorted  sum_val_by_priority
10:00:00   ['a', 'b', 'c']     [28, 5, 5]              [1, 2, 3, 5]     [6, 13, 7, 12]
10:01:00   ['c', 'e', 'g']     [23, 11, 11]            [1, 2, 4]        [5, 31, 9]
...

How to achieve this?


Solution

  • Try this way:

    WITH data AS
      (
        SELECT timestamp, sensor_type, priority, values
        FROM VALUES('timestamp DateTime, sensor_type Array(String), priority Array(Int32), values Array(Int32)', 
          ('2024-01-01 10:00:00', ['a','b','b','a','c','a','c'], [3, 2, 1, 5, 1, 2, 1], [7, 4, 1, 12, 3, 9, 2]), 
          ('2024-01-01 10:01:00', ['c','e','g','e','g'],         [2, 4, 1, 2, 4],       [23, 3, 5, 8, 6]))
      )
    SELECT
        timestamp,
        arrayReduce('sumMap', [sensor_type], [values]) AS sensor_result,
        sensor_result.1 AS sensor_type_sorted,
        sensor_result.2 AS sum_val_by_sensor_type,
        arrayReduce('sumMap', [priority], [values]) AS priority_result,
        priority_result.1 AS priority_sorted,
        priority_result.2 AS sum_val_by_priority    
    FROM data
    
    /*
    ┌───────────timestamp─┬─sensor_result──────────────┬─sensor_type_sorted─┬─sum_val_by_sensor_type─┬─priority_result─────────┬─priority_sorted─┬─sum_val_by_priority─┐
    │ 2024-01-01 10:00:00 │ (['a','b','c'],[28,5,5])   │ ['a','b','c']      │ [28,5,5]               │ ([1,2,3,5],[6,13,7,12]) │ [1,2,3,5]       │ [6,13,7,12]         │
    │ 2024-01-01 10:01:00 │ (['c','e','g'],[23,11,11]) │ ['c','e','g']      │ [23,11,11]             │ ([1,2,4],[5,31,9])      │ [1,2,4]         │ [5,31,9]            │
    └─────────────────────┴────────────────────────────┴────────────────────┴────────────────────────┴─────────────────────────┴─────────────────┴─────────────────────┘
    */