Search code examples
sqlcountmariadbdistinct

MariaDB: is COUNT DISTINCT bugged?


I have 3 requests :

SELECT COUNT(DISTINCT origine_client_id,  annee_imputation)
         FROM dossier d1;

34438

SELECT COUNT(DISTINCT d2.origine_client_id,  d2.annee_imputation)
         FROM (SELECT origine_client_id,  annee_imputation
                  FROM dossier) as d2;

34438

SELECT COUNT(*)
         FROM (SELECT DISTINCT origine_client_id,  annee_imputation
                 FROM dossier) as d3;

34478

But i haven't the same result, why ? (I am using Mariadb)

EDIT :

@jarlh

SELECT COUNT(DISTINCT origine_client_id) FROM dossier; => 19 488

SELECT COUNT(DISTINCT annee_imputation) FROM dossier; => 42

@a_horse_with_no_name

yes, there is null value

SELECT COUNT(id) FROM dossier WHERE annee_imputation IS NULL; => 1

SELECT COUNT(id) FROM dossier WHERE origine_client_id IS NULL; => 289 711

SELECT COUNT(id) FROM dossier WHERE origine_client_id IS NULL AND annee_imputation IS NULL; => 1


Solution

  • Based on doc:

    COUNT(DISTINCT expr,[expr...])

    Returns a count of the number of different non-NULL values.

    This syntax is non-standard MySQL/MariaDB extension. It seems to treat "non-NULL" to be all not NULL to be counted.

    Demo:

    CREATE TABLE dossier
    AS
    SELECT 1 origine_client_id, 2 annee_imputation  UNION ALL  -- both values provided
    SELECT NULL, NULL UNION ALL
    SELECT NULL origine_client_id, 1 annee_imputation UNION ALL
    SELECT 1 origine_client_id, NULL annee_imputation;
    

    Queries:

    SELECT COUNT(DISTINCT origine_client_id,  annee_imputation) FROM dossier d1;
    -- 1
    
    SELECT COUNT(DISTINCT d2.origine_client_id,  d2.annee_imputation) 
    FROM (SELECT origine_client_id,  annee_imputation FROM dossier) as d2;
    -- 1
    
    SELECT COUNT(*) 
    FROM (SELECT DISTINCT origine_client_id,  annee_imputation FROM dossier) as d3;
    -- 4
    

    db<>fiddle demo