I can't seem to get the groupArrayInsertAt aggregation function to work on Clickhouse. I am passing zero as the second argument which is position, but it keeps throwing an error.
I am trying to run this sample query, and it executes quite alright, with the result as below.
with dat_ as
(select 'K' as f, 'P' as g, 2 as h
union all
select 'K' as f, 'P' as g, 7 as h
union all
select 'K' as f, 'P' as g, 5 as h
union all
select 'G' as f, 'P' as g, 1 as h
union all
select 'G' as f, 'K' as g, 3 as h
union all
select 'G' as f, 'K' as g, 8 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'J' as g, 5 as h)
,recs_ as (
select f, g, h, row_number() over(partition by f, g order by h desc) as rnk
from dat_
)
select f, g, groupArrayInsertAt(h, rnk) as arr_
from recs_
group by f, g
g f arr_
------------------
P G [0,1]
J P [0,5]
Y P [0,2,2,2]
K G [0,8,3]
P K [0,7,5,2]
However, because the index for clickhouse arrays start at zero, I am getting 0 at the start of the arrays in the result, which is not the output I want. So, I try deducting 1 from the rank as below to prevent the resulting arrays from starting with zeros.
with dat_ as
(select 'K' as f, 'P' as g, 2 as h
union all
select 'K' as f, 'P' as g, 7 as h
union all
select 'K' as f, 'P' as g, 5 as h
union all
select 'G' as f, 'P' as g, 1 as h
union all
select 'G' as f, 'K' as g, 3 as h
union all
select 'G' as f, 'K' as g, 8 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'Y' as g, 2 as h
union all
select 'P' as f, 'J' as g, 5 as h)
,recs_ as (
select f, g, h, row_number() over(partition by f, g order by h desc) - 1 as rnk
from dat_
)
select f, g, groupArrayInsertAt(h, rnk) as arr_
from recs_
group by f, g
However, when I run this, I get the following error.
SQL Error [43] [07000]: Code: 43. DB::Exception: Second argument of aggregate function groupArrayInsertAt must be unsigned integer. (ILLEGAL_TYPE_OF_ARGUMENT) (version 23.8.9.54 (official build))
The output I expect to have is as below.
g f arr_
------------------
P G [1]
J P [5]
Y P [2,2,2]
K G [8,3]
P K [7,5,2]
I have checked Clickhouse Docs and confirmed that zero is a valid UInt32 which is the argument type expected by the groupArrayInsertAt function. So, I don't see why this shouldn't work.
ClickHouse inferred the type for rnk as Int64:
WITH
dat_ AS
(
SELECT
f,
g,
h
FROM VALUES('f String, g String, h Int32', ('K', 'P', 2), ('K', 'P', 7), ('K', 'P', 5), ('G', 'P', 1), ('G', 'K', 3), ('G', 'K', 8), ('P', 'Y', 2), ('P', 'Y', 2), ('P', 'Y', 2), ('P', 'J', 5))
),
recs_ AS
(
SELECT
f,
g,
h,
row_number() OVER (PARTITION BY f, g ORDER BY h DESC) - 1 AS rnk
FROM dat_
)
SELECT toTypeName(rnk)
FROM recs_
LIMIT 1
/*
┌─toTypeName(rnk)─┐
│ Int64 │
└─────────────────┘
*/
It needs to do the explicit conversion to UInt32 or other unsigned types:
WITH
dat_ AS
(
SELECT
f,
g,
h
FROM VALUES('f String, g String, h Int32', ('K', 'P', 2), ('K', 'P', 7), ('K', 'P', 5), ('G', 'P', 1), ('G', 'K', 3), ('G', 'K', 8), ('P', 'Y', 2), ('P', 'Y', 2), ('P', 'Y', 2), ('P', 'J', 5))
),
recs_ AS
(
SELECT
f,
g,
h,
row_number() OVER (PARTITION BY f, g ORDER BY h DESC) - 1 AS rnk
FROM dat_
)
SELECT
f,
g,
groupArrayInsertAt(h, toUInt32(rnk)) AS arr_
FROM recs_
GROUP BY
f,
g
/*
┌─f─┬─g─┬─arr_────┐
│ G │ P │ [1] │
│ P │ J │ [5] │
│ P │ Y │ [2,2,2] │
│ G │ K │ [8,3] │
│ K │ P │ [7,5,2] │
└───┴───┴─────────┘
*/