I'm evaluating better table structure in terms of Mysql performance, let's say i have 2 table structures mentioned below
Reference table structure 1 :
CREATE TABLE `references_1` (
`id` bigint(30) NOT NULL AUTO_INCREMENT,
`entity_id` int(11) DEFAULT NULL,
`reference_id` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`reference_type` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `index_on_entity_id` (`entity_id`),
KEY `index_mappings_on_reference_id_and_reference_type` (`reference_id`,`reference_type`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8
mysql> select * from references_1 where entity_id = 1;
+----+-----------+--------------+------------------+
| id | entity_id | reference_id | reference_type |
+----+-----------+--------------+------------------+
| 1 | 1 | 25636826 | reference_type_1 |
| 2 | 1 | 2563XCDA6826 | reference_type_2 |
| 3 | 1 | 16992176 | reference_type_3 |
| 4 | 1 | 4521882 | reference_type_4 |
+----+-----------+--------------+------------------+
4 rows in set (0.00 sec)
Reference table structure 2 :
CREATE TABLE `references_2` (
`id` bigint(30) NOT NULL AUTO_INCREMENT,
`entity_id` int(11) DEFAULT NULL,
`reference_type_1` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`reference_type_2` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`reference_type_3` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
`reference_type_4` varchar(50) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
KEY `index_on_entity_id` (`entity_id`),
KEY `index_on_reference_type_1` (`reference_type_1`),
KEY `index_on_reference_type_2` (`reference_type_2`),
KEY `index_on_reference_type_3` (`reference_type_3`),
KEY `index_on_reference_type_4` (`reference_type_4`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8
mysql> select * from references_2 where entity_id = 1;
+----+-----------+------------------+------------------+------------------+------------------+
| id | entity_id | reference_type_1 | reference_type_2 | reference_type_3 | reference_type_4 |
+----+-----------+------------------+------------------+------------------+------------------+
| 2 | 1 | 25636826 | 2563XCDA6826 | 16992176 | 4521882 |
+----+-----------+------------------+------------------+------------------+------------------+
Please share your views, Thanks in advance.
Edition:
Queries :
select * from references_1 where entity_id = 1; //assuming entity_id has index.
How would INSERT (Write operations) performance w.r.t both table structures?
MySQL's InnoDB storage engine (the default) stores rows in pages of fixed size (default is 16KB per page). Some number of rows fit on a single page, depending on the row size. I.e. if rows are smaller, more rows fit per page.
Pages are the increment of data loaded from storage to RAM. So if your query references one row on that page, the whole page is loaded into RAM, and then all rows on the same page are faster to access.
A single row will not be split. It will be stored in the same page (except for very long varchar or text/blob columns, those can expand to other pages).
Assuming rows for the same entity_id are probably grouped together, the difference in storage and performance between your two table designs is really very close. It's true that in the first design, you have extra rows so there will be extra instances of id
and entity_id
. But those are only one bigint and one int, so not much overhead. The other columns will use identical storage.
Other considerations:
Will you will ever expect to extend the reference types to 5 or higher? The second design would require you to add a column with ALTER TABLE.
Do you need a varchar for the reference type? Could you encode it as a tinyint or an ENUM? That would save space.
On the other hand, using your second design saves even more space because the reference types are only part of the metadata. Therefore they take space only once, not on every row.