The COUNT
of GROUPS
and NULL
s don't add up to total rows in the table.
I first wondered, maybe the SQL Server total row count could be off (I'm new), so I thought to do a double-check by running a COUNT
on the RentalID
column, then checked for DISTINCT
values just to verify that the total number of rows in the table called AllData
was accurate. All results concurred; the table has a total of 3,471,007 rows.
Then I ran COUNT
on all values in the Gender
column using GROUP BY
, and then separately ran a COUNT
of the NULL
s in that column. I expected when I added the three together, the figure would match the total number of rows in the table, but it doesn't. It's off by 3,444.
Here's a copy of my code:
-- HOW MANY TOTAL ROWS IN THE TABLE? -------------------------------
SELECT *
FROM AllData
-- 3,471,007 rows
-- HOW MANY total RentalIDs ARE THERE? -----------------------------
SELECT RentalID
FROM AllData
-- 3,471,007 rows.
-- How many UNIQUE RentalIDs ARE THERE? ----------------------------
SELECT COUNT(DISTINCT(RentalID))
FROM AllData
-- 3,471,007 rows. Means no missing RentalIDs.
-- HOW IS GENDER POPULATED?
SELECT Gender, COUNT(Gender) AS GenderCount
FROM AllData
GROUP BY Gender
/* Results:
Gender GenderCount
NULL 0
Male 2174743
Female 737307
*/
-- HOW MANY ROWS ARE MISSING GENDER? -------------------------------
SELECT Gender
FROM AllData
WHERE Gender IS NULL
-- 562,401 rows are empty.
/* Why don't these numbers add up to total rows in table?
2,174,743 Count of Gender = Male
737,307 Count of Gender = Female
562,401 Count of Gender = NULL
3,474,451 This should equal total rows in table, but it doesn't
3,471,007 Total rows in table
3,444 Difference: these rows not accounted for... */
SELECT Gender, COUNT(Gender) AS GenderCount
FROM AllData
GROUP BY Gender
The above statement is giving 0 against NULL
in gender because the COUNT
function will not count the NULL
values. So Instead, You can go for any of the below approaches
Recommended Approach
SELECT Gender, COUNT(1) AS GenderCount
FROM AllData
GROUP BY Gender
Alternative #1
SELECT Gender, COUNT(*) AS GenderCount
FROM AllData
GROUP BY Gender
Alternative #2
SELECT Gender, COUNT(ISNULL(Gender,'')) AS GenderCount
FROM AllData
GROUP BY Gender