internet-explorer encoding utf-8 tinymce shift-jis

Pasting Japanese Shift-JIS encoded text into a UTF-8 application

I have a question related to the encoding of Japanese text.

Suppose I have a system comprising a JVM and a database. It serves pages through an application server to users on Internet Explorer. The JVM and database use UTF-8 throughout. There are a number of text areas, some but not all of which use TinyMCE.

I am concerned about the case where a Japanese user pastes text that is not encoded in UTF-8. Is this likely to cause problems? If the user pastes text encoded in Shift-JIS, can it be expected to work? Early tests have not shown any problems, but I have no knowledge of the language and am concerned that special cases may exist.

Solution

Based on http://msdn.microsoft.com/en-us/library/windows/desktop/ff729168%28v=vs.85%29.aspx:

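Unicode and non-Unicode text have the clipboard formats CF_UNICODETEXT and CF_TEXT respectively, and the operating system converts transparently between them, based on which format the target application requests. The browser therefore receives Unicode text when the user pastes, regardless of the encoding used by the application the text was copied from.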
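To make the conversion chain concrete, here is a rough sketch in Python (not part of your stack, just convenient for illustration). It only imitates the steps: legacy code-page bytes are decoded to Unicode, which is then encoded as UTF-8. The function name legacy_to_utf8 is made up for this example, and "cp932" (Python's name for the Windows flavour of Shift-JIS) is an assumption about the code page the source application uses.

```python
# Illustrative sketch only: imitates legacy bytes -> Unicode -> UTF-8.
# legacy_to_utf8 is a hypothetical helper, not an OS or browser API.

def legacy_to_utf8(raw: bytes, codepage: str = "cp932") -> bytes:
    """Decode bytes in a legacy code page to Unicode, then encode as UTF-8."""
    # Step 1: roughly what the OS does when a non-Unicode application
    # put the text on the clipboard and a Unicode application reads it.
    text = raw.decode(codepage)
    # Step 2: roughly what happens when the page submits the text as UTF-8.
    return text.encode("utf-8")


sample = "日本語のテキスト"
sjis_bytes = sample.encode("cp932")      # what a Shift-JIS application holds
utf8_bytes = legacy_to_utf8(sjis_bytes)  # what the UTF-8 back end receives

# The text is unchanged; only its byte representation differs.
assert utf8_bytes.decode("utf-8") == sample
assert sjis_bytes != utf8_bytes
```

The point is that the application never sees the Shift-JIS bytes. By the time the text reaches the text area it is already Unicode.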

If you are concerned about Japanese users but think you can't test the issue because you don't speak Japanese, you're wrong.

First of all, you can set the entire operating system to Japanese regional settings, which will set the non-Unicode encoding to Shift-JIS systemwide.

Second, you can start a single application with Japanese regional settings using AppLocale.

Third, you can test with any other non-Unicode encoding, such as Windows-1250/1251/1252, since the conversion works in practically the same way.
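If you want sample data for such tests, something along these lines could generate it. Again this is only a sketch in Python: survives_round_trip is a made-up helper, the sample strings are arbitrary, and the code page names are Python codec names rather than anything your application needs to know about.

```python
# Illustrative sketch: check that sample text in several legacy code pages
# comes through a legacy -> Unicode -> UTF-8 round trip unchanged.
# survives_round_trip is a hypothetical helper written for this example.

def survives_round_trip(text: str, codepage: str) -> bool:
    """Return True if text can be stored in codepage and recovered via UTF-8."""
    try:
        raw = text.encode(codepage)          # bytes as the legacy application holds them
    except UnicodeEncodeError:
        return False                         # text cannot be represented in this code page
    unicode_text = raw.decode(codepage)      # conversion to Unicode
    utf8 = unicode_text.encode("utf-8")      # storage in the UTF-8 back end
    return utf8.decode("utf-8") == text


# Arbitrary sample strings, one per code page mentioned above.
samples = {
    "cp932": "日本語テキスト",               # Windows variant of Shift-JIS
    "cp1250": "Zażółć gęślą jaźń",           # Central European
    "cp1251": "Привет, мир",                 # Cyrillic
    "cp1252": "Café déjà vu €",              # Western European
}

results = {cp: survives_round_trip(text, cp) for cp, text in samples.items()}
assert all(results.values())
```

Paste the same strings from a non-Unicode application into your text areas and compare what ends up in the database with the original.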