Search code examples
mysqlinnodb

InnoDB index size with NULLs allowed (MySQL)


I wonder if anyone ever found confirmation in MySQL docs that for InnoDB, a column that allows NULL in the index takes 1 extra byte?

Example: create a column SMALLINT UNSIGNED DEFAULT NULL; (2 bytes). The index uses 3 bytes (without taking into account PK links).

The same column that does not allow NULL: SMALLINT UNSIGNED NOT NULL; The index will be as it should - 2 bytes.

UPD: I found this in docs: "Due to the key storage format, the key length is one greater for a column that can be NULL than for a NOT NULL column." But still, I dont understand, whether index size is 1 byte greater with NULLable column or not.

P.S. Sorry for my bad English :)


Solution

  • There are several flaws in the key_len of EXPLAIN.

    • There are differences between Engines, but Explain does not take account of such.
    • The null bit may or may not take a full byte. Still, 3 vs 2 is a handy clue that the SMALLINT is NULL or NOT NULL.
    • VAR... actually takes a variable amount of space.
    • InnoDB` has a 1- or 2-byte prefix to each column; that is not mentioned.
    • The key_len usually accounts for any column(s) that are tested with =. If there also a "range" test (BETWEEN, >, LIKE 'foo%', etc) that can use part of the index, key_len does not indicate such.
    • Ditto for using part of the index for GROUP BY and ORDER BY.

    You can get more information (but still not 'everyting') by using EXPLAIN FORMAT=JSON SELECT ....

    Logically, if not in reality, there is no room for NULL in a 2-byte SMALLINT. So, more space is needed -- at least one bit.

    There are two separate issues -- The size of the index BTree, and the data structure(s) used during the query.

    I would argue that the extra byte or bit for NULL is not worth worrying about. Instead, it is better to say NOT NULL except when you have a "business logic" requirement for NULL (no value, N/A, not yet specified, etc, etc). Then let the table, index, etc, consume an extra bit or byte as needed.

    I think (without sufficient confirmation) that InnoDB takes no extra space for the null bit -- it is one of the 8 or 16 bits that prefixes each column.

    Note that in InnoDB, an index BTree is essentially identical to the data BTree. (And the PRIMARY KEY is the ordering of the data BTree.)