Search code examples
sqlsumcaseparentheses

Why does the parenthesis make a different in this sql query


The objective:

Find the percentage of high elevation airports (elevation >= 2000) by state from the airports table.

In the query, alias the percentage column as percentage_high_elevation_airports.

Could someone explain why the following 2 SQL statements give different results:

Correct result:

SELECT state,
100.0 * sum(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / count(*)  as percentage_high_elevation_airports 
FROM airports 
GROUP BY state;

sample result:

MS  0.0
MT  100.0
NC  11.1111111111111
ND  10.0

and wrong result:

select 
state,
100.0 * (sum(case when elevation >= 2000 then 1 else 0 end)/count(*)) as percentage_high_elevation_airports
from airports
group by 1;

sample result:

MS  0.0
MT  100.0
NC  0.0
ND  0.0

Only difference is the additional placement of () around the sum.


Solution

  • I would write this as:

    SELECT state,
           AVG(CASE WHEN elevation >= 2000 THEN 100.0 ELSE 0 END) as percentage_high_elevation_airports 
    FROM airports 
    GROUP BY state;
    

    The issue is integer arithmetic. Some databases do an integer division and return an integer. So, 1/2 is 0 rather than 0.5. Some databases also apply this to avg() (but even some that do integer division to numeric averages).

    I should note that this is database-specific.