Search code examples
sqlsql-serveraverage

How do I Average the values across a row when some values might be Null?


long time listener, first time caller here...

I am trying to use SQL to calculate the average of several values across a row. All suggestions I've seen thus far say to simply Sum the values and divide by the number of values used, which would work except that it is common for some of my values to be NULL, in which case this calculation does not work. Is there a SQL statement I can use to get the average per Identifier per date across a row of values when some of those values may be NULL?

in attached image from Excel - Excel properly captures the average and ignores the NULL values. Can SQL do the same and get the same values?

enter image description here

Tried known suggestions for summing the values and dividing by the number of values, which does not work when those values are NULL.


Solution

  • You can use a combination of sum, coalesce and conditional aggregation.

    This has been UPDATED to account for all null values. Basically, wrap the denominator with a NULLIF.

    create table my_data (
      identifier integer, 
      date date, 
      value1 double precision, 
      value2 double precision, 
      value3 double precision, 
      value4 double precision, 
      value5 double precision, 
      value6 double precision
    );
      
    insert into my_data values 
    (123, '2023-08-17', 4.9176, 5.0943, 5.0925, 3.6750, 3.7625, 3.3895), 
    (987, '2023-08-17', 5.2808, null, 5.3000, 5.8750, null, 6.4117), 
    (222, '2023-08-17', null, null, null, null, null, null); 
    
    select identifier, date, 
        sum(
        coalesce(value1, 0) + 
        coalesce(value2, 0) + 
        coalesce(value3, 0) + 
        coalesce(value4, 0) + 
        coalesce(value5, 0) + 
        coalesce(value6, 0)
        ) / 
        nullif(sum(
        case when value1 is null then 0 else 1 end +
        case when value2 is null then 0 else 1 end + 
        case when value3 is null then 0 else 1 end + 
        case when value4 is null then 0 else 1 end + 
        case when value5 is null then 0 else 1 end + 
        case when value6 is null then 0 else 1 end
        ), 0) as average
    from my_data
    group by identifier, date
    order by identifier
    
    identifier date average
    123 2023-08-17 4.3219
    222 2023-08-17 null
    987 2023-08-17 5.716875

    fiddle