Search code examples
clickhouse

Unexpected result of sumMerge & maxMerge on empty initialization


SELECT 
  uniqMerge (uniq_state) as uniq_merge  -- 0: ok
, sumMerge  (sum_state)  as sum_merge   -- 0: why not NULL?
, maxMerge  (max_state)  as max_merge   -- 0: why not NULL?
FROM 
(
  SELECT 
    initializeAggregation ('uniqStateIf', '', 0) as uniq_state
  , initializeAggregation ('sumStateIf',   0, 0) as sum_state
  , initializeAggregation ('maxStateIf',   0, 0) as max_state
)
uniq_merge sum_merge max_merge
0 0 0

It's not clear why I get 0 instead of NULL on sumMerge & maxMerge.
Is this expected?

Update:
Seems, it depends on the base column's nullability:

SELECT 
  sumMerge (sum_state_null)  as sum_merge_null
, sumMerge (sum_state_0)     as sum_merge_0
FROM
(
  SELECT 
    sumState (cast (null as Nullable(UInt8)))   as sum_state_null
  , sumState (0)                                as sum_state_0
/*  
    UNION ALL
  SELECT 
    sumState (cast (1 as Nullable(UInt8)))
  , sumState (0)
*/  
)
sum_merge_null sum_merge_0
0

Solution

  • By default Clickhouse server avoids Nulls (Nullable types), because processing of Nullable types can be up to twice slower.

    There is a settings for more SQL ANSI compatibility

    select sum(x) sx from (select 1 x where 0);
    ┌─sx─┐
    │  0 │
    └────┘
    
    set aggregate_functions_null_for_empty=1;
    
    select sum(x) sx from (select 1 x where 0);
    ┌───sx─┐
    │ ᴺᵁᴸᴸ │
    └──────┘
    

    Unfortunately this functionality is not implemented for State/Merge.

    And actually you don't need AggregateFunction for SUM/MAX because these two functions does not need state. So you can use SimpleAggregateFunction.

    see for example Clickhouse Materialized View on ReplicatedAggregatingMergeTree

    So in real-life examples with AggregatingMergeTree+SimpleAggregateFunction+Nullable types it's possible to get Nulls here.


    Upd:

    also it's possible like this

    CREATE TABLE test
    ENGINE = AggregatingMergeTree
    ORDER BY i AS
    SELECT
        1 AS i,
        uniqState('a') AS uniq_state,
        sumState(CAST('10', 'Nullable(Int64)')) AS sum_state,
        maxState(CAST('10', 'Nullable(Int64)')) AS max_state
    WHERE 0
    
    SELECT
        uniqMerge(uniq_state) AS uniq_merge,
        sumMerge(sum_state) AS sum_merge,
        maxMerge(max_state) AS max_merge
    FROM test
    ┌─uniq_merge─┬─sum_merge─┬─max_merge─┐
    │          0 │      ᴺᵁᴸᴸ │      ᴺᵁᴸᴸ │
    └────────────┴───────────┴───────────┘