I have declared my aggregation table as:
CREATE TABLE test.system_aggr
(
system_id String,
aggr_val AggregateFunction(sum, Int64)
)
ENGINE = AggregatingMergeTree
ORDER BY (system_id);
correspondingly my mat.view is:
CREATE MATERIALIZED VIEW test.system_aggr_mv
TO test.system_aggr
AS
SELECT
system_id,
sumState(raw_val) as aggr_val
FROM test.system_base
GROUP BY
system_id;
now my question is, what's the advantage of using AggregateFunction(...)
, instead of just declaring aggr_val
as Int64
and using sum(...)
in materialized view? see below:
CREATE TABLE test.system_aggr
(
system_id String,
aggr_val Int64
)
ENGINE = AggregatingMergeTree
ORDER BY (system_id);
and
CREATE MATERIALIZED VIEW test.system_aggr_mv
TO test.system_aggr
AS
SELECT
system_id,
sum(raw_val) as aggr_val
FROM test.system_base
GROUP BY
system_id;
UPD. One thing that I noticed is that I need to use GROUP BY
while selecting from aggregation table in case if it has at least one column of type AggregateFunction
, so all columns that are not AggregateFunction
should be in GROUP BY
. This further makes me doubt about usefulness of having AggregateFunction
columns
Will answer by myself,
seems that the values don't get aggregated properly if usual aggregation functions are used instead of <function_name>State(...)
in mat.views (and correspondingly AggregateFunction
in aggregation tables)
For example, if I insert 4, 5, 6 sequentially (i.e. with three separate inserts) into base table, and then run OPTIMIZE TABLE
over aggregation table, I see only 4, instead of expected 15