Search code examples
mysqlunicodemariadbutf8mb4

MariaDB: Insert fails due to random Unicode codepoint


I’m using MariaDB 10.6.16, but reproduced the same behaviour with 10.3.39, too. My problem is this little SQL snippet:

CREATE TABLE problem (
    text   TEXT
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

INSERT INTO problem (text) VALUES ('𤋮');

This results in the error message:

ERROR 1366 (22007) at line 5: Incorrect string value: '\xF0\xA4\x8B\xAE' for column `problem`.`text` at row 1

The bytes \xF0\xA4\x8B\xAE represent the character 𤋮 exactly, so the problem is that MariaDB doesn’t allow inserting this specific character.

What I’ve tried:

  • checking the character. It’s U+242EE, an ancient Unicode character defined in 2001. So it shouldn’t be a problem with too new a code point for the collation.
  • checking two MariaDB versions, as noted above.
  • removing the collation and charset from the table definition. No change.
  • using other 4 byte characters in the SQL query. They work just fine, it’s only this (and possibly some other random CJK characters, too).

How can I convince MariaDB to insert this character into my DB?


Solution

  • Getting the consistent character set for CHARACTER_SET_RESULTS, CHARACTER_SET_CONNECTION and CHARACTER_SET_CLIENT is required to use non-default character sets.

    To do this all in one statement, use SET NAMES, in this case:

    SET NAMES utf8mb4
    

    Will set all these client related system variables to a character set that supports the '𤋮' character.

    The collation is automaticity set to the default for the character set, though it can be explicit with set names too.