Search code examples
mysqlmysql-5.7

MySQL query results were confusing. When selected with year separately, it gives count greater than 1, but overall it is not


mysql> select count(distinct message_id) from table1 where message_notification=0 and YEAR(created_at) = '2024';

| count(distinct message_id) |
|                         33 |


mysql> select count(distinct message_id) from table1 where message_notification=0 and YEAR(created_at) = '2023';

| count(distinct message_id) |
|                       5796 |

1 row in set (0.07 sec)

mysql> select count(distinct message_id) from table1 where message_notification=0 and YEAR(created_at) = '2022';

| count(distinct message_id) |
|                        431 |

1 row in set (0.03 sec)

Adding 431+5796+33=6260

but

mysql> select count(distinct message_id) from table1 where message_notification=0;
| count(distinct message_id) |
|                       6259 |

1 row in set (0.05 sec)

There is no record with created_at NULL or message_notification NULL.
There is no record with created_at 2021 or lesser than that.
There is no record with created_at 2024 or greater than that.
Checked by selecting created_at asc and desc


Solution

  • I think the problem here is the distinct part. If you have two (or more) rows with the same message_id but in different years, they will appear separately in the queries for the individual years, but will counted as one when you query across the years.

    If you add the year to the distinct clause, you should get the "expected" 6260:

    select count(distinct message_id, YEAR(created_at)) 
    -- Here --------------------------^
    from   table1 
    where  message_notification=0;