Search code examples
clickhouse

Clickhouse combine multiple row vectors/arrays as an average


In Clickhouse, I have a table like so:

    ID | frame | vector
    123, 0, [-0.1,0.3,0.2]
    123, 1, [0.3, 0.2, -0.1]
    456, 0, [-0.2, 0.1, 0.2]

I want to query this data so that it outputs:

    ID | avg_vector
    123, [0.1, 0.25, 0.05]
    456, [-0.2, 0.1, 0.2]

but the functions i seem to find only provide averaging an array, not averaging two arrays together.

i tried:

    SELECT
    id,
    arrayMap(
        i -> arrayReduce('avg', arrayMap(x -> x[i], groupArray(vector))),
        range(3)
    ) AS avg_vector
    FROM embeddings
    GROUP BY id

but this I believe is working across each element, not between arrays


Solution

  • avgForEach function

    WITH vectors AS
       (
           SELECT c1 AS vector
           FROM VALUES([1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12])
       )
    SELECT avgForEach(vector) AS centroid
    FROM vectors
    
    ┌─centroid──┐
    │ [5,6,7,8] │
    └───────────┘