Search code examples
sqlsql-servert-sqlcountaudit

COUNT of GROUPS and NULLs don't add up to total rows in the table


The COUNT of GROUPS and NULLs don't add up to total rows in the table.

I first wondered, maybe the SQL Server total row count could be off (I'm new), so I thought to do a double-check by running a COUNT on the RentalID column, then checked for DISTINCT values just to verify that the total number of rows in the table called AllData was accurate. All results concurred; the table has a total of 3,471,007 rows.

Then I ran COUNT on all values in the Gender column using GROUP BY, and then separately ran a COUNT of the NULLs in that column. I expected when I added the three together, the figure would match the total number of rows in the table, but it doesn't. It's off by 3,444.

Here's a copy of my code:

-- HOW MANY TOTAL ROWS IN THE TABLE?  -------------------------------
SELECT  *
FROM AllData
-- 3,471,007 rows 

-- HOW MANY total RentalIDs ARE THERE?  -----------------------------
SELECT RentalID
FROM AllData
-- 3,471,007 rows. 

-- How many UNIQUE RentalIDs ARE THERE?  ----------------------------
SELECT COUNT(DISTINCT(RentalID))
FROM AllData
-- 3,471,007 rows. Means no missing RentalIDs.


-- HOW IS GENDER POPULATED?
SELECT Gender, COUNT(Gender) AS GenderCount
FROM AllData
GROUP BY Gender
/* Results:
   Gender  GenderCount
   NULL    0
   Male    2174743
   Female  737307
*/

-- HOW MANY ROWS ARE MISSING GENDER?  -------------------------------
SELECT  Gender
FROM AllData
WHERE Gender IS NULL
-- 562,401 rows are empty.


/* Why don't these numbers add up to total rows in table?
   2,174,743 Count of Gender = Male
     737,307 Count of Gender = Female
     562,401 Count of Gender = NULL
   3,474,451 This should equal total rows in table, but it doesn't

   3,471,007 Total rows in table
       3,444 Difference: these rows not accounted for...  */

Solution

  • SELECT Gender, COUNT(Gender) AS GenderCount
    FROM AllData
    GROUP BY Gender
    

    The above statement is giving 0 against NULL in gender because the COUNT function will not count the NULL values. So Instead, You can go for any of the below approaches

    Recommended Approach

    SELECT Gender, COUNT(1) AS GenderCount
    FROM AllData
    GROUP BY Gender
    

    Alternative #1

    SELECT Gender, COUNT(*) AS GenderCount
        FROM AllData
        GROUP BY Gender
    

    Alternative #2

    SELECT Gender, COUNT(ISNULL(Gender,'')) AS GenderCount
        FROM AllData
        GROUP BY Gender