Search code examples
mysqlinnodb

Why is InnoDB table size much larger than expected?


I'm trying to figure out storage requirements for different storage engines. I have this table:

CREATE TABLE  `mytest` (
  `num1` int(10) unsigned NOT NULL,
  KEY `key1` (`num1`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

When I insert some values and then run show table status; I get the following:

+----------------+--------+---------+------------+---------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------------+----------+----------------+---------+
| Name           | Engine | Version | Row_format | Rows    | Avg_row_length | Data_length | Max_data_length  | Index_length | Data_free | Auto_increment | Create_time         | Update_time         | Check_time | Collation         | Checksum | Create_options | Comment |
+----------------+--------+---------+------------+---------+----------------+-------------+------------------+--------------+-----------+----------------+---------------------+---------------------+------------+-------------------+----------+----------------+---------+
| mytest         | InnoDB |      10 | Compact    | 1932473 |             35 |    67715072 |                0 |     48840704 |   4194304 |           NULL | 2010-05-26 11:30:40 | NULL                | NULL       | latin1_swedish_ci |     NULL |                |         |

Notice avg_row_length is 35. I am baffled that InnoDB would not make better use of space when I'm just storing a non-nullable integer.

I have run this same test on myISAM and by default myISAM uses 7 bytes per row on this table. When I run

ALTER TABLE mytest MAX_ROWS=50000000, AVG_ROW_LENGTH = 4;

causes myISAM to finally correctly use 5-byte rows.

When I run the same ALTER TABLE statement for InnoDB the avg_row_length does not change.

Why would such a large avg_row_length be necessary when only storing a 4-byte unsigned int?


Solution

  • InnoDB tables are clustered, that means that all data are contained in a B-Tree with the PRIMARY KEY as a key and all other columns as a payload.

    Since you don't define an explicit PRIMARY KEY, InnoDB uses a hidden 6-byte column to sort the records on.

    This and overhead of the B-Tree organization (with extra non-leaf-level blocks) requires more space than sizeof(int) * num_rows.