Search code examples
perlunicodemojolicious

UTF8 strings partially not recognized


When I fetch Cyrillic text from SQLite3 database, in some situations perl (or Mojolicious, or DBIx::Class - I honestly don't know) fails to decode bytestream. For example, given the text:

1984г1ф!!11четыре

output will be:

1984г1ф!!11������
1984\x{433}1\x{444}!!11\x{fffd}\x{fffd}\x{fffd}\x{fffd}\x{fffd}\x{fffd}

Why is this happening? How to fix this?

Update: I was able to trace the source of this issue. Looks like malformed string is taken from user input on a web-page, and dispatched to Controller action as a parameter: code here.

Executing save action produces the following log:

[2012/07/24 14:06:09] [DEBUG] 15703 Mojolicious.Plugin.RequestTimer - POST /admin/node/save (Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1).
[2012/07/24 14:06:09] [DEBUG] 15703 Mojolicious.Routes - Routing to a callback.
[2012/07/24 14:06:09] [DEBUG] 15703 Mojolicious.Routes - Routing to controller "MyApp::Admin" and action "save".
Wide character in print at /home/nikita/perl5/lib/perl5/Log/Log4perl/Appender/File.pm line 245.
Wide character in print at /home/nikita/perl5/lib/perl5/Log/Log4perl/Appender/Screen.pm line 39.
[2012/07/24 14:06:09] [DEBUG] 15703 MyApp.Admin - 123123!!!11������������������

Update 2: I am using Morbo as development server, and my application layout contains meta header:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

Update 3: Strange, but sometimes the string is encoded and displayed correctly:

[2012/07/24 14:55:52] [DEBUG] 16451 Mojolicious.Routes - Routing to a callback.
[2012/07/24 14:55:52] [DEBUG] 16451 Mojolicious.Routes - Routing to controller "MyApp::Admin" and action "save".
[2012/07/24 14:55:52] [DEBUG] 16451 MyApp.Admin - 112!!ЫВафывафывп
[2012/07/24 14:55:52] [DEBUG] 16451 Mojolicious.Plugin.RequestTimer - 302 Found (0.326543s, 3.062/s).
[2012/07/24 14:55:52] [DEBUG] 16451 Mojolicious.Plugin.RequestTimer - GET /admin (Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1).

and if I do the same thing second time, I get:

[2012/07/24 14:57:30] [DEBUG] 16451 Mojolicious.Routes - Routing to controller "MyApp::Admin" and action "save".
Wide character in print at /home/nikita/perl5/lib/perl5/Log/Log4perl/Appender/File.pm line 245.
Wide character in print at /home/nikita/perl5/lib/perl5/Log/Log4perl/Appender/Screen.pm line 39.
[2012/07/24 14:57:30] [DEBUG] 16451 MyApp.Admin - 112!!��������������������
[2012/07/24 14:57:30] [DEBUG] 16451 Mojolicious.Plugin.RequestTimer - 302 Found (0.362417s, 2.759/s).
[2012/07/24 14:57:30] [DEBUG] 16451 Mojolicious.Plugin.RequestTimer - GET /admin (Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:14.0) Gecko/20100101 Firefox/14.0.1).

Solution

  • Removed

    % use utf8;
    

    from one of the templates, and after that text is displayed as it should.