
Big MySQL InnoDB table: how to optimize and partition it


I have a MySQL InnoDB table:

  • The table has 22 columns, some of which may be NULL.
  • The first column, phone, is required and always contains a 10-digit phone number (any digits).
  • The phone column is not unique: several different rows can contain the same phone number.
  • The table is static; no rows will be added.
  • The table holds about 500 million rows.

How can I optimize SELECT queries by phone on this table? Will partitioning help?


Solution

  • Partitioning will be useful only if your data (the phone numbers) is distributed uniformly across the partitions. Having 10 partitions is pointless if 90% of the data ends up in the first one.

    First, if you use partitioning at all, it would ideally be by RANGE, to avoid the overhead of HASH, but RANGE has a drawback:

    Phone numbers usually contain some structure (country code, area code, ...), so ranges defined on the leading digits tend to produce unevenly sized partitions.
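
    To make the skew concrete, here is a minimal Python sketch that compares bucket sizes when numbers are grouped by their first digit versus their last digit. The sample is synthetic: the area codes in AREA_CODES and the helpers make_sample and bucket_shares are made up for illustration and are not taken from the question's data.

```python
import random
from collections import Counter

# Hypothetical area codes, chosen only to make the sample skewed.
AREA_CODES = ["212", "213", "215", "310", "312", "415", "917"]

def make_sample(n, seed=0):
    """Generate n synthetic 10-digit numbers: area code + 7 random digits."""
    rng = random.Random(seed)
    return [rng.choice(AREA_CODES) + f"{rng.randrange(10**7):07d}"
            for _ in range(n)]

def bucket_shares(phones, key):
    """Return the fraction of rows that fall into each bucket."""
    counts = Counter(key(p) for p in phones)
    total = len(phones)
    return {bucket: count / total for bucket, count in sorted(counts.items())}

phones = make_sample(10_000)
by_first = bucket_shares(phones, lambda p: p[0])   # leading digit
by_last = bucket_shares(phones, lambda p: p[-1])   # trailing digit

print("largest bucket by first digit:", max(by_first.values()))
print("largest bucket by last digit: ", max(by_last.values()))
```

    Running the same two groupings against a sample of the real table would show whether the leading digits are actually as skewed as assumed here.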

    What I would suggest, if you have the option, is (1) to store and index the phone numbers in reverse order (right to left), so that the last digits come first; these are much more likely to be uniformly distributed.
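
    A minimal sketch of the reversal, assuming the application reverses the number both when loading the data and when looking it up. The table name phones, the column name phone_rev, and the %s placeholder style are hypothetical; adapt them to your schema and driver.

```python
def reverse_phone(phone: str) -> str:
    """Return the 10-digit phone number with its digits in reverse order."""
    if len(phone) != 10 or not (phone.isascii() and phone.isdigit()):
        raise ValueError(f"expected exactly 10 digits, got {phone!r}")
    return phone[::-1]

# Hypothetical query against a column that stores the reversed number.
LOOKUP_SQL = "SELECT * FROM phones WHERE phone_rev = %s"

def lookup_params(phone: str) -> tuple:
    """Build the parameter tuple for LOOKUP_SQL from a normal phone number."""
    return (reverse_phone(phone),)

print(reverse_phone("2125550123"))
```

    Because the same function is applied on write and on read, callers keep working with phone numbers in their normal order.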

    Then, (2) if you can, I would split the data into as many tables as required (10 for a start), one table per value of the first X digits of the stored number. This limits each table to a subset of the rows.
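
    The routing could be sketched in Python as follows, assuming the numbers are stored reversed as suggested in (1) and that each table is named after its digit prefix. The phones_ prefix, the phone_rev column, and the helpers table_for and build_query are hypothetical.

```python
TABLE_PREFIX = "phones_"  # hypothetical naming scheme: phones_0 ... phones_9

def stored_form(phone: str) -> str:
    """Validate a 10-digit number and return it reversed, as stored."""
    if len(phone) != 10 or not (phone.isascii() and phone.isdigit()):
        raise ValueError(f"expected exactly 10 digits, got {phone!r}")
    return phone[::-1]

def table_for(phone: str, prefix_len: int = 1) -> str:
    """Pick the table from the first prefix_len digits of the stored number."""
    return TABLE_PREFIX + stored_form(phone)[:prefix_len]

def build_query(phone: str, prefix_len: int = 1):
    """Return (sql, params) for a lookup in the table that holds this number."""
    # The table name contains only validated digits, so interpolating it is safe;
    # the phone value itself is still passed as a bound parameter.
    sql = f"SELECT * FROM {table_for(phone, prefix_len)} WHERE phone_rev = %s"
    return sql, (stored_form(phone),)

print(build_query("2125550123"))
```

    With prefix_len=1 this gives the 10 tables mentioned above; a longer prefix gives more, smaller tables.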

    On the server configuration side, store the InnoDB data per table (the innodb_file_per_table setting), which can help with indexing and concurrency.

    Of course, the above applies only if you have some flexibility in your schema and do not need to run consolidated queries against the entire dataset.