I saw comment "If you have 50 million values between 10 and 15 characters in a varchar(20) column, and the same 50 million values in a varchar(50) column, they will take up exactly the same space. That's the whole point of varchar, as opposed to char.". Can Anybody tell me the reason? See What is a reasonable length limit on person "Name" fields?
MySQL offers a choice of storage engines. The physical storage of data depends on the storage engine.
MyISAM Storage of VARCHAR
In MyISAM, VARCHAR
s typically occupy just the actual length of the string plus a byte or two of length. This is made practical by the design limitation of MyISAM to table locking as opposed to a row locking capability. Performance consequences include a more compact cache profile, but also more complicated (slower) computation of record offsets.
(In fact, MyISAM gives you a degree of choice between fixed physical row size and variable physical row size table formats depending on column types occuring in the whole table. Occurrence of VARCHAR
changes the default method only, but the presence of a TEXT
blob forces VARCHAR
s in the same table to use the variable length method as well.)
The physical storage method is particularly important with indexes, which is a different story than tables. MyISAM uses space compression for both CHAR
and VARCHAR
columns, meaning that shorter data take up less space in the index in both cases.
InnoDB Storage of VARCHAR
InnoDB, like most other current relational databases, uses a more sophisticated mechanism. VARCHAR
columns whose maximum width is less than 768 bytes will be stored inline, with room reserved matching that maximum width. More accurately here:
For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes will only be needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the two-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.
InnoDB currently does not do space compression in its indexes, the opposite of MyISAM as described above.
Back to the question
All of the above is however just an implementational detail that may even change between versions. The true difference between CHAR
and VARCHAR
is semantic, and so is the one between VARCHAR(20)
and VARCHAR(50)
. By ensuring that there is no way to store a 30 character string in a VARCHAR(20)
, the database makes the life easier and better defined for various processors and applications that it supposedly integrates into a predictably behaving solution. This is the big deal.
Regarding personal names specifically, this question may give you some practical guidance. People with full names over 70 UTF-8 characters are in trouble anyway.