Search code examples
clickhouse

How can I count and group by values in a groupArray cell in clickhouse


I have the following code:

drop table if exists statusCodes;

create table statusCodes (statusCode UInt16) engine=Memory;

insert into statusCodes values (0);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (200);
insert into statusCodes values (201);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (301);
insert into statusCodes values (304);
insert into statusCodes values (404);
insert into statusCodes values (500);

select
    multiIf(statusCode >= 500, '5xx', statusCode >= 400, '4xx', statusCode >= 300, '3xx', statusCode >= 200, '2xx', 'errored'),
    groupArray(statusCode),
    count(1) as count
from statusCodes
group by 1;

fiddle here: https://fiddle.clickhouse.com/b3b3dfb9-1826-47a8-b201-831498c941f2

It gives me

1 2 3
4xx [404] 1
3xx [301, 301, 301, 301, 304] 5
2xx [200, 200, 200, 200, 200, 201] 6
5xx [500] 1
errored [0] 1

But I want to have something like

1 2 3
2xx (200, 5), (201, 1) 6
3xx (301, 3), (304, 1) 5
4xx (404, 1) 1
5xx (500, 1) 1
errored (0, 1) 1

So, to summarize, I want to group all status code by main category (2xx, 3xx, 4xx, 5xx) and in row, another grouping by real status code of this category.


Solution

  • Here is a way to do it using two aggregations, the first by error group and error code, and the second by error only :

    with cte as (
      select
        multiIf(statusCode >= 500, '5xx', statusCode >= 400, '4xx', statusCode >= 300, '3xx', statusCode >= 200, '2xx', 'errored') as error,
        statusCode,
        count(2) as cnt
      from status_code
      group by 1, 2
    )
    select error, groupArray(tuple(statusCode, cnt)) as arr, sum(cnt) as total_count
    from cte
    group by error
    order by 1;
    

    Results :

    2xx     [(200,5),(201,1)]   6
    3xx     [(301,4),(304,1)]   5
    4xx     [(404,1)]           1
    5xx     [(500,1)]           1
    errored [(0,1)]             1
    

    Demo here