
Overlapping multi-column indices


Consider a table like so (contrived example; I am trying to make it vaguely relatable):

CREATE TABLE `roles` (
  `host` varchar(50) NOT NULL,
  `port` smallint(6) NOT NULL,
  `user_id` varchar(50) NOT NULL,
  `role` varchar(50) NOT NULL
) ENGINE=InnoDB;

Typical data:

INSERT INTO `roles`
(host, port, user_id, role)
VALUES
('localhost', 8080, 'root', 'admin'),
('localhost', 8080, 'guest', 'readonly');

I know that the data is unique and non-null on:
(host, port, user_id)
I intend to use that multi-column index as my primary key.

But I would also like the covering index optimization for my SELECTs, which select all columns. To that end, I am also considering adding a wider multi-column index on:
(host, port, user_id, role)

So here's the question… if I create two indices, (a,b,c) and (a,b,c,d): is MySQL/InnoDB smart enough to implement that as a single index under-the-hood?
Or will it create & maintain both indices, wasting space and time — and fail to notice that (a,b,c,d) is a superset of (a,b,c)?

I am tempted to avoid the problem and just make the primary key (a,b,c,d), so that there's only one index. But that seems like a sub-optimal primary key (it considers more columns than necessary to guarantee uniqueness).


Solution

  • When both indexes exist, the Optimizer may pick (a,b,c) in preference to (a,b,c,d) because it is the smaller index.

    MySQL does not merge the two into one; it builds and maintains both. Having both wastes disk space, and it wastes room in the cache whenever parts of both indexes are cached. That can lead to slower queries.
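
    One way to see this on your own server is to define both indexes and then inspect the table. This is only a sketch: it assumes the `roles` table from the question, and the index name `idx_covering` is made up for illustration.

    ```sql
    -- Define the primary key and the wider, overlapping secondary index.
    ALTER TABLE `roles`
      ADD PRIMARY KEY (`host`, `port`, `user_id`),
      ADD INDEX `idx_covering` (`host`, `port`, `user_id`, `role`);

    -- Lists every index on the table, one row per indexed column.
    SHOW INDEX FROM `roles`;

    -- The `key` column of the output names the index the Optimizer chose.
    EXPLAIN SELECT `host`, `port`, `user_id`, `role`
    FROM `roles`
    WHERE `host` = 'localhost' AND `port` = 8080 AND `user_id` = 'root';
    ```

    If `SHOW INDEX` reports both `PRIMARY` and `idx_covering`, both are being stored and maintained separately.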

    Instead of having both, keep the longer one. It will handle both situations.

    The exception is when d is a "big" column: any index that includes it will be "big" as well. In this case, (a,b,c) may be the preferred one to keep.

    On the other hand, (a,b,c) and (b,c,a) handle different queries, so you may need both.
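
    To illustrate with the question's columns, here is a rough sketch. It assumes the primary key is (host, port, user_id); the index name `idx_port_user_host` is hypothetical. An index is normally searched by its leftmost columns, so the column order decides which WHERE clauses it can serve.

    ```sql
    -- Filters on the leading columns: an index on (host, port, user_id) fits.
    SELECT `host`, `port`, `user_id`, `role`
    FROM `roles`
    WHERE `host` = 'localhost' AND `port` = 8080;

    -- No condition on `host`, so (host, port, user_id) cannot be searched
    -- by its leftmost column. An index that starts with `port` can.
    ALTER TABLE `roles`
      ADD INDEX `idx_port_user_host` (`port`, `user_id`, `host`);

    SELECT `host`, `port`, `user_id`, `role`
    FROM `roles`
    WHERE `port` = 8080 AND `user_id` = 'root';
    ```

    Running `EXPLAIN` on each SELECT should show which index, if any, the Optimizer picks for it.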

    Cookbook on building indexes