Search code examples
mysqlsqlunicode

mysql unicode text incorrect string warning on insert, despite character set variables set utf8mb4


First, I know, yes, this is yet another mysql unicode question.

Problem: I am unable to insert unicode text into my mysql database

I want to execute the following query:

INSERT INTO usert SET username='田中'

When I do, I get this warning:

Incorrect string value: '\x93c\x92\x86' for column 'username' at row 1

A blank space is inserted into the table instead of the data

I have tried as many answers and forums as I could, and I believe that all appropriate variables, table, and column settings are set to 'utf8mb4' character set, with collation 'utf8mb4_general_ci' or 'utfmb4_unicode_ci'

I will tell you why I believe that by giving you the details, and sql commands used to show them.

First, mysql version:

mysql:> SHOW VARIABLES LIKE 'version'

Confirms that the version is 5.6.23

To show the character set variables in mysql:

mysql:> SHOW VARIABLES LIKE '%char%'

That command shows (in slightly different format):

character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
...
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8

Collation:

mysql:> SHOW VARIABLES LIKE '%collat%'

RESULTS:

collation_connection: utf8mb4_unicode_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci

So far so good?

Now, for the table character set and collation:

Look at table details command:

mysql:> SHOW TABLE STATUS

shows that the collation is utf8mb4_general_ci

Command for looking at column details:

mysql:> SHOW FULL COLUMNS IN usert

Confirms that the collation for column 'username' is utf8mb4_general_ci

In summary, from what I have studied, all relevant variables, database, table, and column settings seem to be set to the relevant utf8mb4 setting. Despite that, I am unable to insert the unicode Japanese text.

(By the way, I dont think the 4-byte unicode settings utf8mb4 is necessary here, but it is what I am using because it seemed to fix many other unicode mysql problems)

What other settings in mysql or the system are likely causing this problem? What other settings can I/ should I change to allow inserting japanese text appropriately?

EDIT UPDATE: I am on a Japanese computer


Solution

  • The problem was the default system settings, which also affected the input settings at the command line.

    Its a Japanese computer, which apparently uses shift-jis encoding using, NOT unicode, by default. The text I was inputting was encoded in this way, and in similar input files I was trying to use.

    Therefore, I set the character set to be 'jsis' in the server, i.e. setting character-set-server=sjis in the my.ini initializer file, and set the mysql character set to be the same by entering skip-character-set-client-handshake into the same initilization file.

    The character set for the column of course must also be changed via

    ALTER TABLE usert MODIFY username varchar(30) CHARACTER SET sjis COLLATE sjis_japanese_ci

    Now, you can insert the japanese text from command line, and other japanese files which use shift-jis encoding.

    Another option for inputting japanese text seems to be cp932, which is the windows version of shift-jis.

    Incidentally, if you DO wish to use unicode via command line, apparently powershell has better support for it, rather than the normal cmd I was using, but I haven't tried it personally.