I have a Clickhouse table having multiple Array columns, such as
timestamp sensor_type priority values
10:00:00 ['a','b','b','a','c','a','c'] [3, 2, 1, 5, 1, 2, 1] [7, 4, 1, 12, 3, 9, 2]
10:01:00 ['c','e','g','e','g'] [2, 4, 1, 2, 4] [23, 3, 5, 8, 6]
...
The timestamps are unique and monotonically increasing. The sensors that have values recorded are dynamically changing at each timestamp. I'm trying to group and sum up the values
array by sensor_type
or priority
for each timestamp, so the expected aggregated columns are like:
timestamp sensor_type_sorted sum_val_by_sensor_type priority_sorted sum_val_by_priority
10:00:00 ['a', 'b', 'c'] [28, 5, 5] [1, 2, 3, 5] [6, 13, 7, 12]
10:01:00 ['c', 'e', 'g'] [23, 11, 11] [1, 2, 4] [5, 31, 9]
...
How to achieve this?
Try this way:
WITH data AS
(
SELECT timestamp, sensor_type, priority, values
FROM VALUES('timestamp DateTime, sensor_type Array(String), priority Array(Int32), values Array(Int32)',
('2024-01-01 10:00:00', ['a','b','b','a','c','a','c'], [3, 2, 1, 5, 1, 2, 1], [7, 4, 1, 12, 3, 9, 2]),
('2024-01-01 10:01:00', ['c','e','g','e','g'], [2, 4, 1, 2, 4], [23, 3, 5, 8, 6]))
)
SELECT
timestamp,
arrayReduce('sumMap', [sensor_type], [values]) AS sensor_result,
sensor_result.1 AS sensor_type_sorted,
sensor_result.2 AS sum_val_by_sensor_type,
arrayReduce('sumMap', [priority], [values]) AS priority_result,
priority_result.1 AS priority_sorted,
priority_result.2 AS sum_val_by_priority
FROM data
/*
┌───────────timestamp─┬─sensor_result──────────────┬─sensor_type_sorted─┬─sum_val_by_sensor_type─┬─priority_result─────────┬─priority_sorted─┬─sum_val_by_priority─┐
│ 2024-01-01 10:00:00 │ (['a','b','c'],[28,5,5]) │ ['a','b','c'] │ [28,5,5] │ ([1,2,3,5],[6,13,7,12]) │ [1,2,3,5] │ [6,13,7,12] │
│ 2024-01-01 10:01:00 │ (['c','e','g'],[23,11,11]) │ ['c','e','g'] │ [23,11,11] │ ([1,2,4],[5,31,9]) │ [1,2,4] │ [5,31,9] │
└─────────────────────┴────────────────────────────┴────────────────────┴────────────────────────┴─────────────────────────┴─────────────────┴─────────────────────┘
*/