Here's my query:
SELECT COUNT(1)
FROM nhd
WHERE gnis_id IN (01372336);
The table definition has an INDEX gnis_id (gnis_id)
on it and, indeed, that index is used with this query. BUT when I replace the COUNT(1)
with *
the index is not used, at least according to EXPLAIN
. Why is that? Why would changing the columns I'm getting back change how MySQL is deciding which INDEX to use for the WHERE part of the query?
Here's the output of the EXPLAIN with *
:
id: 1
select_type: SIMPLE
table: nhd
partitions: NULL
type: ALL
possible_keys: gnis_id
key: NULL
key_len: NULL
ref: NULL
rows: 2752840
filtered: 10.00
Extra: Using where
Here's the output of SHOW INDEX FROM nhd
:
I'm running MySQL 8.0.33
A guess. Is this the declaration of the column?
gnis_id VARCHAR(8)
Meanwhile, you are checking those strings against a number:
WHERE gnis_id IN (01372336)
That means it has to convert each gnis_id
to a number before testing.
Solution:
WHERE gnis_id IN ("01372336")
With that change, it will look at only 96 rows, not 2,758,717 rows. (Actually, it will be 97 rows -- the extra one is to discover that it is finished.)
Discussion
SELECT COUNT(1)
can be done entirely in the INDEX's
BTree. That is the index is "covering"; EXPLAIN
will say "Using index". And, because the index is ordered by gnis_id
, the rows it needs to look at are consecutive once the datatypes match.
SELECT *
needs to fetch all the columns, which are found only in the data's BTree.
The clustered index on id
does not need to be involved.