I am trying to estimate the real disk-usage required space for each record of my table in MySQL RDBMS.
The table has a structure like this:
So there are 5 INT fields and a VARCHAR of a maximum of 34 chars (i.e. 34 bytes).
I have 2 questions:
1) The total should be 54 bytes per record (with variable VARCHAR, of course) am I right when I am saying that, or there are also some over-head bytes which I should consider when estimating the disk-usage space?
2) I have also used INT(5) instead of CHAR(5) cause I need to store only exactly 5 digits in that field (I am going to do that by application, with regExp and string length, cause I know that INT(5) could be more than an int with 5 digits). But could this be considered such as an optimization by the disk-usage space cause I am using an INT (4 bytes) instead of a CHAR(5) which is 5 bytes, i.e. 1 more byte per record?
Thanks for the attention!
One record itself will use
1 byte in offsets
0 bytes in NULLable bits
5 bytes in "extra bytes" header
4 bytes ID
6 bytes transaction id
7 bytes rollback pointer
0-3*34 bytes in VARCHAR(34) (one character may take up to 3 bytes because of UTF8)
4*4 bytes in other integers
Each distinct value of FK will lead to one record in a secondary index. it will use
5 bytes in "extra bytes" header
4 bytes INT for FK value
4 bytes INT for Primary key
Other overhead is page level: 120 bytes per page (16k) in headers page fill factor 15/16 - i.e. one page may contain 15k in records.
And the last - add space used by non-leaf pages, which should be small anyway
So, answer to question - 1) yes there will be some overhead that you can calculate using information above.
2) CHAR(5) in UTF8 will add a byte for its length, so INT looks reasoaanle to use