
Accents not preserved when printing with Data::Dumper in Perl


I would like to print out the content of an associative array. For this I'm using Data::Dumper.

So, for example, if the associative array is called "%w", I write:

  print OUT Dumper(\%w);

Here's the problem: there are some words like "récente" that are printed out as "r\x{e9}cente".

If I just write:

print OUT %w;

I have no problem: "récente" is printed out as "récente".

All text files used by the script are in UTF-8. Moreover, I use the "utf8" pragma and always specify the character encoding.

For example:

open( IN, '<', $file_in);
binmode(IN,":utf8");

I'm pretty sure that the problem is related to Data::Dumper. Is there a way to solve this, or another way to print out the content of an associative array?

Thank you.


Solution

  • This is intentional. The output of Data::Dumper is meant to reproduce the same data structure when evaluated as Perl code. To limit the effect of character encodings, non-ASCII characters are dumped using escapes. In addition, it's sensible to set $Data::Dumper::Useqq = 1 so that any unprintable characters are dumped using escapes as well.

    Data::Dumper isn't really meant as a way to display data structures. If you have specific formatting requirements, just write the necessary code yourself. For example:

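    use strict;
    use warnings;
    use utf8;             # the source itself contains UTF-8 literals
    use feature 'say';

    my $filename = 'out.txt';   # hypothetical output path
    open my $out, '>:encoding(UTF-8)', $filename or die "Can't open $filename: $!";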
    my %hash = (
        bárewørdş => '–Uni·code–',
    );
    
    say { $out } "{";
    for my $key (sort keys %hash) {
        say { $out } "  $key: $hash{$key}";
    }
    say { $out } "}";
    

    produces

    {
      bárewørdş: –Uni·code–
    }
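
If the Data::Dumper layout is still wanted for reading by eye, one possible workaround is to turn the \x{...} escapes back into characters after calling Dumper. The following is only a minimal sketch under assumptions not in the question: the contents of %w are invented, and the output goes to STDOUT rather than the OUT handle. It also shows that the escaped dump evaluates back to the original data, which is the property the escapes are there to protect.

```perl
use strict;
use warnings;
use utf8;                      # this file contains UTF-8 string literals
use Data::Dumper;

binmode STDOUT, ':encoding(UTF-8)';

$Data::Dumper::Sortkeys = 1;   # stable key order between runs

# Invented sample data; the question's %w is not shown.
my %w = ( 'récente' => 'été' );

my $dumped = Dumper(\%w);      # may contain escapes such as \x{e9}

# The escaped dump is valid Perl: evaluating it rebuilds the data.
our $VAR1;
eval $dumped;
die "eval failed: $@" if $@;
my $copy = $VAR1;

# Display-only copy: turn each \x{HEX} escape back into its character.
(my $readable = $dumped) =~ s/\\x\{([0-9A-Fa-f]+)\}/chr hex $1/ge;

print $readable;
```

The un-escaped text is meant for display only. It is no longer guaranteed to evaluate back to the same data, for instance when a value contains a literal backslash followed by x{...}, so keep the original Dumper output if the dump has to be read back in.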