Search code examples
mysqlvarchar

Why does MySQL VARCHAR allow more than the max length?


I have created a table with an UTF-8 VARCHAR(5000), and filled it with data. But it looks like this field is allowing more data than it is instructed to:

mysql> DESCRIBE test;
+---------+------------------+------+-----+---------+----------------+
| Field   | Type             | Null | Key | Default | Extra          |
+---------+------------------+------+-----+---------+----------------+
| id      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| comment | varchar(5000)    | YES  |     | NULL    |                |
+---------+------------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)

mysql> SELECT MAX(LENGTH(comment)) FROM test;
+----------------------+
| MAX(LENGTH(comment)) |
+----------------------+
|                 5001 |
+----------------------+
1 row in set (0.01 sec)

Why is that?


Solution

  • Ok, the problem is that LENGTH() returns the length in bytes, not chars. Because the string is UTF-8, I need to use CHAR_LENGTH() instead:

    mysql> DESCRIBE test;
    +---------+------------------+------+-----+---------+----------------+
    | Field   | Type             | Null | Key | Default | Extra          |
    +---------+------------------+------+-----+---------+----------------+
    | id      | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
    | comment | varchar(5000)    | YES  |     | NULL    |                |
    +---------+------------------+------+-----+---------+----------------+
    2 rows in set (0.01 sec)
    
    mysql> SELECT MAX(LENGTH(comment)) FROM test;
    +----------------------+
    | MAX(LENGTH(comment)) |
    +----------------------+
    |                 5001 |
    +----------------------+
    1 row in set (0.01 sec)
    
    mysql> SELECT MAX(CHAR_LENGTH(comment)) FROM test;
    +----------------------+
    | MAX(LENGTH(comment)) |
    +----------------------+
    |                 5000 |
    +----------------------+
    1 row in set (0.01 sec)
    

    The length was 5001 because the string contained exactly one two-byte character!