How to force ISO-8859-1 encoding for form data using LWP::UserAgent?


It seems that LWP::UserAgent always encodes form data as UTF-8, even if I explicitly encode it as ISO-8859-1, as follows:

use Encode;
use LWP::UserAgent;
use utf8;

my $ua = LWP::UserAgent->new;
$ua->post('http://localhost:8080/', {
    text => encode("iso-8859-1", 'è'),
});

The request content is text=%C3%A8. How can I get è encoded as %E8 instead?


Solution

  • Short answer to myself: just put the hash key (i.e. the form field name, "text") in quotes instead of writing it as a bareword.

    $ua->post('http://localhost:8080/', {
        'text' => encode("iso-8859-1", 'è'),
    });
    

    Rationale: this weird behaviour is caused by the combination of the following factors:

    • Perl bug #68812 caused the internal UTF-8 flag to be set on all barewords. This was fixed in more recent Perl versions (>= 5.12);
    • URI.pm concatenates keys and values (i.e. "text=è") before converting characters, so the value is always promoted to UTF-8 if the key has the internal flag set, even if you passed the value as octets.

    I don't think that the bug pointed out by @Lumi about URI.pm using \C has any effect on this specific issue.
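
The interaction between the two factors can be reproduced without LWP or a web server. The following is only a minimal sketch of the mechanism, not URI.pm's actual code: escape_pair is a hypothetical helper that joins key and value first and escapes afterwards, and utf8::upgrade is used here to simulate the flag that the buggy Perl versions set on bareword keys.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of URI.pm): a simplified model of
# "concatenate key and value first, percent-encode afterwards".
sub escape_pair {
    my ($key, $value) = @_;

    # If $key carries the internal UTF-8 flag, this concatenation upgrades
    # the octets in $value to characters.
    my $pair = "$key=$value";

    # A flagged string is serialised as UTF-8 before escaping.
    utf8::encode($pair) if utf8::is_utf8($pair);

    # Percent-encode every byte outside a small safe set.
    $pair =~ s/([^A-Za-z0-9=_.~-])/sprintf("%%%02X", ord($1))/ge;
    return $pair;
}

# A single octet, 0xE8, i.e. "è" in ISO-8859-1.
my $value = encode("iso-8859-1", "\x{E8}");

my $plain_key   = "text";
my $flagged_key = "text";
utf8::upgrade($flagged_key);    # assumption: stands in for the bareword bug

print escape_pair($plain_key,   $value), "\n";    # prints text=%E8
print escape_pair($flagged_key, $value), "\n";    # prints text=%C3%A8
```

With the unflagged key the octet survives as %E8; with the flagged key the same octet is treated as the character U+00E8 and comes out as its UTF-8 form, %C3%A8, which matches the request content seen in the question.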