I have the following code:
drop table if exists statusCodes;
create table statusCodes (statusCode UInt16) engine=Memory;
insert into statusCodes values (0);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (201);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (304);
insert into statusCodes values (404);
insert into statusCodes values (500);
select
multiIf(statusCode >= 500, '5xx', statusCode >= 400, '4xx', statusCode >= 300, '3xx', statusCode >= 200, '2xx', 'errored'),
groupArray(statusCode),
count(1) as count
from statusCodes
group by 1;
fiddle here: https://fiddle.clickhouse.com/b3b3dfb9-1826-47a8-b201-831498c941f2
It gives me
1 | 2 | 3 |
---|---|---|
4xx |
[404] |
1 |
3xx |
[301, 301, 301, 301, 304] |
5 |
2xx |
[200, 200, 200, 200, 200, 201] |
6 |
5xx |
[500] |
1 |
errored |
[0] |
1 |
But I want to have something like
1 | 2 | 3 |
---|---|---|
2xx |
(200, 5), (201, 1) |
6 |
3xx |
(301, 3), (304, 1) |
5 |
4xx |
(404, 1) |
1 |
5xx |
(500, 1) |
1 |
errored |
(0, 1) |
1 |
So, to summarize, I want to group all status code by main category (2xx, 3xx, 4xx, 5xx) and in row, another grouping by real status code of this category.
Here is a way to do it using two aggregations, the first by error group and error code, and the second by error only :
with cte as (
select
multiIf(statusCode >= 500, '5xx', statusCode >= 400, '4xx', statusCode >= 300, '3xx', statusCode >= 200, '2xx', 'errored') as error,
statusCode,
count(2) as cnt
from status_code
group by 1, 2
)
select error, groupArray(tuple(statusCode, cnt)) as arr, sum(cnt) as total_count
from cte
group by error
order by 1;
Results :
2xx [(200,5),(201,1)] 6
3xx [(301,4),(304,1)] 5
4xx [(404,1)] 1
5xx [(500,1)] 1
errored [(0,1)] 1