Search code examples
mysqlcharacter-encodingcharacter

Why BIT_LENGTH(str) function in MySQL return the size of Chinese characters is 2 bytes?


MySQL version: Ver 8.0.34 for Win64 on x86_64 (MySQL Community Server - GPL)

BIT_LENGTH(str): Returns the length of the string str in bits. Returns NULL if str is NULL.

When I type:

select BIT_LENGTH('你好')/8;

I got:

+----------------------+
| BIT_LENGTH('你好')/8 |
+----------------------+
|               4.0000 |
+----------------------+

When I create a table and create a column called 'Hello', and insert '你好':

select Hello from chinese_test;
+-------+
| Hello |
+-------+
| NULL  |
| NULL  |
| NULL  |
| 你好  |
+-------+
select LENGTH(Hello) from chinese_test;
select LENGTH(Hello) from chinese_test;
+---------------+
| LENGTH(Hello) |
+---------------+
|          NULL |
|          NULL |
|          NULL |
|             6 |
+---------------+

So, why I got 2 bytes from the BIT_LENGTH funcion. I googled and checked the MySQL Reference Manual (It's thankful that someone will tell me what I missed). There is no related content with the BIT_LENGTH function inside.

I just want to know how the BIT_LENGTH function return 2 bytes of chinese character rather than 3 bytes or 4 bytes.


Solution

  • The BIT_LENGTH result is dependent on your connection character set.

    What is the output of:

    select @@character_set_connection, BIT_LENGTH('你好')/8;
    

    See a dbfiddle.