Search code examples
mysqlpivot-tablepercona

Better structure of Mysql table for Mysql performance


I'm evaluating better table structure in terms of Mysql performance, let's say i have 2 table structures mentioned below

Reference table structure 1 :

CREATE TABLE `references_1` (
  `id` bigint(30) NOT NULL AUTO_INCREMENT,
  `entity_id` int(11) DEFAULT NULL,
  `reference_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_on_entity_id` (`entity_id`),
  KEY `index_mappings_on_reference_id_and_reference_type` (`reference_id`,`reference_type`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8

mysql> select * from references_1 where entity_id = 1;
+----+-----------+--------------+------------------+
| id | entity_id | reference_id | reference_type   |
+----+-----------+--------------+------------------+
|  1 |         1 | 25636826     | reference_type_1 |
|  2 |         1 | 2563XCDA6826 | reference_type_2 |
|  3 |         1 | 16992176     | reference_type_3 |
|  4 |         1 | 4521882      | reference_type_4 |
+----+-----------+--------------+------------------+
4 rows in set (0.00 sec)


Reference table structure 2 :


CREATE TABLE `references_2` (
  `id` bigint(30) NOT NULL AUTO_INCREMENT,
  `entity_id` int(11) DEFAULT NULL,
  `reference_type_1` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_2` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_3` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  `reference_type_4` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `index_on_entity_id` (`entity_id`),
  KEY `index_on_reference_type_1` (`reference_type_1`),
  KEY `index_on_reference_type_2` (`reference_type_2`),
  KEY `index_on_reference_type_3` (`reference_type_3`),
  KEY `index_on_reference_type_4` (`reference_type_4`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8

mysql> select * from references_2 where entity_id = 1;
+----+-----------+------------------+------------------+------------------+------------------+
| id | entity_id | reference_type_1 | reference_type_2 | reference_type_3 | reference_type_4 |
+----+-----------+------------------+------------------+------------------+------------------+
|  2 |         1 | 25636826         | 2563XCDA6826     | 16992176         | 4521882          |
+----+-----------+------------------+------------------+------------------+------------------+

       
  • Which structure will help in better Query performance with data growth?
  • How does mysql IO works? Does increase in no of rows fetched will affect IO performance?
  • What other factors to consider here, if i am missing any.

Please share your views, Thanks in advance.

Edition:

Queries :

select * from references_1 where entity_id = 1; //assuming entity_id has index.

How would INSERT (Write operations) performance w.r.t both table structures?


Solution

  • MySQL's InnoDB storage engine (the default) stores rows in pages of fixed size (default is 16KB per page). Some number of rows fit on a single page, depending on the row size. I.e. if rows are smaller, more rows fit per page.

    Pages are the increment of data loaded from storage to RAM. So if your query references one row on that page, the whole page is loaded into RAM, and then all rows on the same page are faster to access.

    A single row will not be split. It will be stored in the same page (except for very long varchar or text/blob columns, those can expand to other pages).

    Assuming rows for the same entity_id are probably grouped together, the difference in storage and performance between your two table designs is really very close. It's true that in the first design, you have extra rows so there will be extra instances of id and entity_id. But those are only one bigint and one int, so not much overhead. The other columns will use identical storage.

    Other considerations:

    Will you will ever expect to extend the reference types to 5 or higher? The second design would require you to add a column with ALTER TABLE.

    Do you need a varchar for the reference type? Could you encode it as a tinyint or an ENUM? That would save space.

    On the other hand, using your second design saves even more space because the reference types are only part of the metadata. Therefore they take space only once, not on every row.