Search code examples
sqlpostgresql

Conditions in aggregate functions


I was solving some SQL tasks and got confused with the following thing:

CREATE TABLE data (
  len INT
);

INSERT INTO data (len)
VALUES 
  (NULL),
  (100), 
  (200),
  (300),
  (400);

suppose we've created such a table and we wanna count all values that are less than 200. I am using this online compiler to do it.

my first attempt gives 1 as expected

SELECT COUNT(*) FILTER (WHERE len < 200)
FROM data

then I thought whether I can simplify it and just do

SELECT COUNT(len < 200) FROM data

but it gives 4 as an answer.

So my question is: can we use logic expressions inside of aggregate functions? If the answer is NO, why?


Solution

  • COUNT(expression) computes the number of rows for which expression is not NULL. COUNT(*) computes the total number of rows because * is the row value, which is never NULL for any row, even if every column in a row is NULL. The < operator can return one of three results: TRUE, FALSE, or NULL. Keep in mind that NULL is not a value, it is a state. The expression COUNT(len < 200) only returns NULL when len is NULL; therefore, any valued len will contribute to the count. COUNT(NULL) will always return 0 regardless of the number of rows.