long time listener, first time caller here...
I am trying to use SQL to calculate the average of several values across a row. All suggestions I've seen thus far say to simply Sum the values and divide by the number of values used, which would work except that it is common for some of my values to be NULL, in which case this calculation does not work. Is there a SQL statement I can use to get the average per Identifier per date across a row of values when some of those values may be NULL?
in attached image from Excel - Excel properly captures the average and ignores the NULL values. Can SQL do the same and get the same values?
Tried known suggestions for summing the values and dividing by the number of values, which does not work when those values are NULL.
You can use a combination of sum, coalesce and conditional aggregation.
This has been UPDATED to account for all null values. Basically, wrap the denominator with a NULLIF
.
create table my_data (
identifier integer,
date date,
value1 double precision,
value2 double precision,
value3 double precision,
value4 double precision,
value5 double precision,
value6 double precision
);
insert into my_data values
(123, '2023-08-17', 4.9176, 5.0943, 5.0925, 3.6750, 3.7625, 3.3895),
(987, '2023-08-17', 5.2808, null, 5.3000, 5.8750, null, 6.4117),
(222, '2023-08-17', null, null, null, null, null, null);
select identifier, date,
sum(
coalesce(value1, 0) +
coalesce(value2, 0) +
coalesce(value3, 0) +
coalesce(value4, 0) +
coalesce(value5, 0) +
coalesce(value6, 0)
) /
nullif(sum(
case when value1 is null then 0 else 1 end +
case when value2 is null then 0 else 1 end +
case when value3 is null then 0 else 1 end +
case when value4 is null then 0 else 1 end +
case when value5 is null then 0 else 1 end +
case when value6 is null then 0 else 1 end
), 0) as average
from my_data
group by identifier, date
order by identifier
identifier | date | average |
---|---|---|
123 | 2023-08-17 | 4.3219 |
222 | 2023-08-17 | null |
987 | 2023-08-17 | 5.716875 |