Search code examples
databasepostgresqlindicespostgresql-9.5

Postgresql - Multiple Indexes with the Same Columns


Suppose I

  • have a table with columns a, b, c, and d
  • want SELECT queries with combinations of these columns to be blazing fast
  • expect a to be in every query
  • don't care about slow UPDATEs and INSERTs

What indexes should I create and is creating multiple indexes with the same columns in differing order a horrible idea?

(a, b)

(a, c)

(a, b, c)

(a, c, b)

...

(a, b, c, d)

(a, d, c, b)

...


Solution

  • Theoretically, if you totally don't care about update/insert speed and excessive indexes size (disk space), then you will need all possible combinations of columns used in WHERE clause of the query, and the query planner will decide which one to use. But whether the index will be useful at all depends on the table data.

    The order of index columns plays very important role. The columns should be ordered by cardinality. Let's look at the example:

    We have a table of people (id, surname, firstname, year_of_birth, sex). What indexes are appropriate here?

    1. Filtering by name

    Which index should we add - (surname, firstname) or (firstname, surname)? The right answer is (surname, firstname), because if you group records by surname, it will certainly have higher number of records.

    1. Filtering by name and year of birth

    Let's say we already have (surname, firstname) index. Should we change it to (surname, firstname, year)? There might be some benefit, but I'm doubtful about that. For any given surname & firstname, there will be probably just a few records with different age. The point is that if we have almost unique combination (like surname + first name), then adding more columns to the index won't help much, if at all.

    1. Filtering by sex

    No index needed. Because there are only two possible values: male/female. So the index will not be efficient.


    Apart from that indexes are great and essential, a few things I'd like to note:

    • indexes take up extra disk space
    • indexes affect update/insert speed
    • not all indexes are efficient (for a small set of records sequential scan is faster, because index-lookup isn't free performance-wise)
    • eventually, which index is going to be used is decided by query planner, and it depends on A LOT of factors. Sometimes it may prefer sequential scan even if you have indexes. So you never know until you test it.

    A good point from the documentation: Combining Multiple Indexes

    In all but the simplest applications, there are various combinations of indexes that might be useful, and the database developer must make trade-offs to decide which indexes to provide. Sometimes multicolumn indexes are best, but sometimes it's better to create separate indexes and rely on the index-combination feature.