Search code examples
sqlclickhouseclickhouse-client

ClickHouse view - AggregateFunction column vs usual column


I have declared my aggregation table as:

CREATE TABLE test.system_aggr
(
         system_id String,

         aggr_val AggregateFunction(sum, Int64)
)
ENGINE = AggregatingMergeTree
ORDER BY (system_id);

correspondingly my mat.view is:

CREATE MATERIALIZED VIEW test.system_aggr_mv
TO test.system_aggr
AS
SELECT
    
    system_id,

    sumState(raw_val) as aggr_val

FROM test.system_base
GROUP BY
    system_id;

now my question is, what's the advantage of using AggregateFunction(...), instead of just declaring aggr_val as Int64 and using sum(...) in materialized view? see below:

CREATE TABLE test.system_aggr
(
         system_id String,

         aggr_val Int64
)
ENGINE = AggregatingMergeTree
ORDER BY (system_id);

and

CREATE MATERIALIZED VIEW test.system_aggr_mv
TO test.system_aggr
AS
SELECT
    
    system_id,

    sum(raw_val) as aggr_val

FROM test.system_base
GROUP BY
    system_id;

UPD. One thing that I noticed is that I need to use GROUP BY while selecting from aggregation table in case if it has at least one column of type AggregateFunction, so all columns that are not AggregateFunction should be in GROUP BY. This further makes me doubt about usefulness of having AggregateFunction columns


Solution

  • Will answer by myself, seems that the values don't get aggregated properly if usual aggregation functions are used instead of <function_name>State(...) in mat.views (and correspondingly AggregateFunction in aggregation tables)

    For example, if I insert 4, 5, 6 sequentially (i.e. with three separate inserts) into base table, and then run OPTIMIZE TABLE over aggregation table, I see only 4, instead of expected 15