Search code examples
sqlpostgresqlsql-execution-planpostgresql-performance

How do I know if any index is used in a query | PostgreSQL 11?


I am little bit confused and need some advice. I use PostgreSQL 11 database. I have such pretty simple sql statement:

SELECT DISTINCT "CITY", "AREA", "REGION"
    FROM youtube
WHERE
    "CITY" IS NOT NULL
AND
    "AREA" IS NOT NULL
AND
    "REGION" IS NOT NULL

youtube table which I use in sql statement has 25 million records. I think for thats why query takes 15-17 seconds to complete. For web project where I use that query it's too long. I'm trying to speed up the request.

I create such index for youtube table:

CREATE INDEX youtube_location_idx ON public.youtube USING btree ("CITY", "AREA", "REGION");

After this step I run query again but it takes the same time to complete. It seems like query don't use index. How do I know if any index is used in a query?

EXPLAIN ANALYZE return: enter image description here


Solution

  • You answered the question in the title yourself by running EXPLAIN. The query plan shows which indexes are used and how. For details see the chapter "Using EXPLAIN" in the manual.

    As for why the query uses a sequential scan and no indexes: 25 million row, 992781 rows removed. You are fetching 24709900 rows, that's almost all rows.

    This is never going to be fast. And it's only going to use an index in special situations.

    Using an index typically only makes sense for a small fraction of all rows. Else it would just add additional cost. Depending on a number of co-factors, the Postgres query planner starts to consider a btree index for around 5% of all rows or less. Related:

    Well, if your table rows are substantially wider than the three columns in your SELECT list, a (partial) covering index might help somewhat if you get index-only scans out of it. Again, needs to meet some preconditions. And every index also adds storage and maintenance costs.

    Aside: A comment claimed NULL values couldn't be indexed. This is incorrect, NULL values can be indexed. Not as efficient as other values, but doesn't make much of a difference. Also irrelevant to the case at hand.