Search code examples
sqlmysqlindexingquery-optimizationexplain

index is used with SELECT COUNT(1) but not with SELECT *


Here's my query:

SELECT COUNT(1)
FROM nhd
WHERE gnis_id IN (01372336);

The table definition has an INDEX gnis_id (gnis_id) on it and, indeed, that index is used with this query. BUT when I replace the COUNT(1) with * the index is not used, at least according to EXPLAIN. Why is that? Why would changing the columns I'm getting back change how MySQL is deciding which INDEX to use for the WHERE part of the query?

Here's the output of the EXPLAIN with *:

id: 1
select_type: SIMPLE
table: nhd
partitions: NULL
type: ALL
possible_keys: gnis_id
key: NULL
key_len: NULL
ref: NULL
rows: 2752840
filtered: 10.00
Extra: Using where

Here's the output of SHOW INDEX FROM nhd:

https://pastebin.com/KXpqBeJm

I'm running MySQL 8.0.33


Solution

  • A guess. Is this the declaration of the column?

    gnis_id VARCHAR(8)
    

    Meanwhile, you are checking those strings against a number:

    WHERE gnis_id IN (01372336)
    

    That means it has to convert each gnis_id to a number before testing.

    Solution:

    WHERE gnis_id IN ("01372336")
    

    With that change, it will look at only 96 rows, not 2,758,717 rows. (Actually, it will be 97 rows -- the extra one is to discover that it is finished.)

    Discussion

    SELECT COUNT(1) can be done entirely in the INDEX's BTree. That is the index is "covering"; EXPLAIN will say "Using index". And, because the index is ordered by gnis_id, the rows it needs to look at are consecutive once the datatypes match.

    SELECT * needs to fetch all the columns, which are found only in the data's BTree.

    The clustered index on id does not need to be involved.