Search code examples
postgresqlnullsumaggregate-functions

How to detect NULL rows in PostgreSQL sum()


I want to aggregate the sum of a column, while tracking the presence of NULL values which signal an error condition. E.g., take the table numbers:

# select * from numbers;
  n   |  l  
------+-----
    1 | foo
    2 | foo
 NULL | bar
    4 | bar

Given a label l, I want to compute the sum of numbers n with that label, provided there are no NULL values. Ideally, for a label without any rows, the sum would be 0. So I'm looking for some query q such that q('foo') = 3, q('baz') = 0 and q('bar') somehow signals an error, e.g. by returning NULL.

I started with the sum() aggregate function, but that converts NULL rows to 0. One solution would be a variant that returns NULL provided there are any NULL values.

sum() gives

# select sum(n) from numbers where l = 'bar';
 sum 
-----
   4

but I'd rather have sumnull() with

# select sumnull(n) from numbers where l = 'bar';
 sumnull 
---------
  NULL

The best solution I've found so far is to also count non-NULL rows and compare to the total count:

# select sum(n), count(*), count(n) as notnull from numbers;
 sum | count | notnull 
-----+-------+---------
   7 |     4 |       3

Then if count is unequal to notnull, I know the result is not valid.


Solution

  • Is an empty set good enough?

    create table numbers (n int);
    insert into numbers values (1),(2),(null),(4);
    
    select sum(n)
    from numbers
    having bool_and(n is not null);
     sum 
    -----
    (0 rows)
    

    If you really need a null value it is a bit more complex:

    with sum_null as (
        select sum(n) as sum_n
        from numbers
        having bool_and(n is not null)
    )
    select case
        when not exists (select 1 from sum_null) then null
        else (select sum_n from sum_null) end
    ;
     sum_n 
    -------
    
    (1 row)
    

    Replacing the having line for:

    having not bool_or(n is null)
    

    is less readable but potentially much faster since it can stop searching at the first null found.

    https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-AGGREGATE-TABLE