
Umlauts from Git::Repository output are garbled in Mojolicious and HTML::Entities


I have a problem displaying git log entries that contain umlauts on a website, and I don't know where to look for a solution. I suspect an encoding issue, but adding use utf8 had no effect when I tried it. I have described the problem in detail below in the hope of getting a helpful answer. Thanks a lot.

So I create a repo with umlauts in the commit message:

echo "Hello Wörld!" > a_file.txt
git init
git add a_file.txt
git commit -m "Some Ümlaut: üöä"

I can view the log on the command line without any problems:

$ git log
  ...
    Some Ümlaut: üöä

I can also print the log from Perl without issues, using:

use Git::Repository;
my $repo = Git::Repository->new(work_tree => ".");
my $log  = $repo->run( "log" );
print "$log\n";

which gives me the same output as the shell example above.
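
Printing presumably works because Perl passes the bytes it received from git straight through to the terminal, which interprets them as UTF-8. A minimal sketch of the difference between those bytes and decoded characters, using a hard-coded stand-in ($bytes, a hypothetical name) instead of a real repository:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for the git output (assumption): the UTF-8 bytes of "Some Ümlaut",
# written as escapes so the source file's own encoding does not matter.
my $bytes = "Some \xC3\x9Cmlaut";

# A byte string counts every byte, so "Ü" contributes 2 to the length.
my $byte_len = length $bytes;

# Decoding turns the bytes into characters; "Ü" now counts once.
my $chars    = decode('UTF-8', $bytes);
my $char_len = length $chars;

printf "bytes: %d, characters: %d\n", $byte_len, $char_len;
# prints "bytes: 12, characters: 11"
```

If the length of the real $log behaves like $byte_len here, the string is still undecoded bytes.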

The problem occurs when I'm using Mojolicious. Here is an example:

use Mojolicious::Lite;

get '/' => sub {
  my $self = shift;

  use Git::Repository;
  my $repo = Git::Repository->new(work_tree => ".");
  my $log  = $repo->run( "log" );
  $self->render(text => "$log  -- möre Ümläut\n" );
};

app->start;

When I run this, the umlauts in the string literal come out fine, but those from the commit message do not. To demonstrate, I start the script above as follows:

perl mojo.pl daemon

I then call the website with curl:

$ curl http://127.0.0.1:3000
...
    Some Ãmlaut: Ã¼Ã¶Ã¤  -- möre Ümläut

As I said: the umlauts from Git fail, the rest is fine.

So I thought I would be clever and translate them to HTML entities:

use strict;
use warnings;
use Git::Repository;
use HTML::Entities 'encode_entities';

my $repo = Git::Repository->new(work_tree => ".");
my $log = $repo->run( "log" );
print "$log\n";

my $htmlified = encode_entities($log);
print "$htmlified\n";

But when I run this, only the first output is good. HTML::Entities has the same problem as Mojolicious:

...
Some Ümlaut: üöä
...
Some Ümlaut: üöä

Is the problem in Git::Repository, or am I doing something wrong? I used Perl 5.16 on Ubuntu 12.04 for these tests. Thanks for any help.
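
The entity output becomes predictable once encode_entities is handed a character string rather than raw bytes. A minimal sketch, assuming the log is UTF-8 and using a hard-coded byte string ($log_bytes, a hypothetical name) in place of the real git output:

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Stand-in for the git output (assumption): UTF-8 bytes of "Some Ümlaut: üöä",
# written as escapes so the example does not depend on the file's encoding.
my $log_bytes = "Some \xC3\x9Cmlaut: \xC3\xBC\xC3\xB6\xC3\xA4";

# Encoding the raw bytes treats every single byte as its own character.
my $from_bytes = encode_entities($log_bytes);

# Decoding first gives encode_entities real characters to work with.
my $from_chars = encode_entities(decode('UTF-8', $log_bytes));

print "$from_bytes\n";
print "$from_chars\n";   # Some &Uuml;mlaut: &uuml;&ouml;&auml;
```

The first line printed contains two entities per umlaut, one for each byte, which matches the broken output described above.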


Solution

  • I found out how to do it: decode_utf8() is your friend here. I am still not sure why this step is needed, though.

    Here is how it goes:

    use Mojolicious::Lite;
    
    # we need this lib, part of core
    use Encode;
    
    get '/' => sub {
      my $self = shift;
    
      use Git::Repository;
      my $repo = Git::Repository->new(work_tree => ".");
      my $log  = $repo->run( "log" );
    
      # this call does the trick
      my $wtf  = decode_utf8($log);
    
      $self->render(text => "$wtf  -- möre Ümläut\n" );
    };
    
    app->start;
    

    Hope this helps other people as well. If someone thinks this warrants a bug report against one of the libraries mentioned, please say so here. I have no clue whether this is a workaround, a bug, or a feature :-P
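
As to why the decoding step is needed: render(text => ...) in Mojolicious expects characters and encodes them to UTF-8 for the response, while the output of an external command such as git arrives as undecoded bytes. Mojolicious::Lite also turns on the utf8 pragma, which would explain why the string literal in the script came out fine. Without decoding, each byte is taken as a Latin-1 character and encoded a second time. A minimal sketch of that double encoding, using only the core Encode module and a hard-coded stand-in for the git output (all variable names are hypothetical):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 bytes of "Ü", as an external program such as git would emit them.
my $bytes = "\xC3\x9C";

# Encoding without decoding first: each byte is taken as a Latin-1 character
# and encoded again, so two bytes become four (double encoding).
my $double = encode('UTF-8', $bytes);

# Decoding first yields one character, which encodes back to the same two bytes.
my $chars  = decode('UTF-8', $bytes);
my $single = encode('UTF-8', $chars);

printf "double: %s\n", join ' ', map { sprintf '%02X', ord } split //, $double;
printf "single: %s\n", join ' ', map { sprintf '%02X', ord } split //, $single;
# prints "double: C3 83 C2 9C" and "single: C3 9C"
```

Seen this way, decode_utf8() looks less like a workaround for a bug and more like the usual rule of decoding input at the boundary and letting the framework encode on output.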