Tags: perl, utf-8, uri, lwp

HTTP request with Perl to a utf-8 URI (some non-ascii chars inside) throws a 404 Not Found error


I'm trying to request a URL that contains non-ASCII characters, for example http://perry.wikia.com/wiki/Página_principal, which contains an á.

I tried LWP::UserAgent, but the request fails with a 404 Not Found error:

#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use LWP::UserAgent;
use Encode qw(encode);

my $br = LWP::UserAgent->new;
#~ my $url = 'http://perry.wikia.com/wiki/Página_principal'; # doesn't work either
my $url = encode('UTF-8','http://perry.wikia.com/wiki/Página_principal');
my $response = $br->get($url);
# get() returns an HTTP::Response object, so use its methods rather than hash keys
if ($response->is_success) {
    my $html = $response->decoded_content;
} else {
    die "Unexpected error requesting $url : " . $response->status_line;
}

I also tried HTTP::Tiny, with the same result:

#!/usr/bin/perl

use utf8;
use HTTP::Tiny;
use Encode qw(decode encode);

my $url = 'http://perry.wikia.com/wiki/Página_principal';
#~ my $url = encode('UTF-8','http://perry.wikia.com/wiki/Página_principal'); # doesn't work either
my $response = HTTP::Tiny->new->get($url);
if ($response->{success}) {
    my $html = $response->{content};
} else {
  die "Unexpected error requesting $url : " . $response->{status};
}

Solution

  • This is not a bug in any of the Perl modules. This URL actually does return a 404.
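
One way to check this is to percent-encode the non-ASCII path segment yourself and request the escaped URL. The following is a minimal sketch using only core modules. `escape_segment` is a hypothetical helper name, not part of LWP or HTTP::Tiny, and the sketch assumes the server expects UTF-8 percent-encoding in the path.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                      # the source file itself is UTF-8
use Encode qw(encode);

# Hypothetical helper: percent-encode one path segment.
# The string is first turned into UTF-8 bytes, then every byte outside
# the unreserved set (letters, digits, "-", ".", "_", "~") is escaped.
sub escape_segment {
    my ($segment) = @_;
    my $bytes = encode('UTF-8', $segment);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

my $url = 'http://perry.wikia.com/wiki/' . escape_segment('Página_principal');
print "$url\n";   # http://perry.wikia.com/wiki/P%C3%A1gina_principal
```

Passing the printed URL to `$br->get` or `HTTP::Tiny->new->get` and looking at the status helps tell a server-side 404 apart from an encoding problem. If the URI distribution is installed, `uri_escape_utf8` from URI::Escape performs the same escaping.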