
403 error when using LWP::UserAgent but not with WWW::Mechanize


I am trying to access a site using Perl 5 and LWP::UserAgent. However, upon connecting, the script dies with a "403 access denied" message. The weird part is that it works flawlessly using WWW::Mechanize, even though the fetch code is exactly the same. Normally I'd suspect the user agent as the cause, but as mentioned, the code is the same in both cases.

Is there a difference in how WWW::Mechanize and LWP::UserAgent handle requests that could cause this issue?

Here is some sample code that demonstrates two different approaches.

# Mechanize
use strict;
use warnings "all";
use WWW::Mechanize;

my $mech = WWW::Mechanize->new(
    agent_alias => 'Mozilla/5.0',
    show_progress => 1);

$mech->get("www.foo.com");

# LWP
use strict;
use warnings "all";
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(
    agent_alias => 'Mozilla/5.0',
    show_progress => 1);

my $r = $ua->get("www.foo.com");

Solution

  • Neither LWP::UserAgent nor WWW::Mechanize documents an agent_alias constructor argument, and neither implements one. The argument is ignored in both cases, so each module falls back to its built-in default User-Agent. Those defaults differ, which would explain why the server answers the two requests differently. WWW::Mechanize does have an agent_alias method, though. From the documentation:

    ...For instance,

    $mech->agent_alias( 'Windows IE 6' );
    

    sets your User-Agent to

    Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    

    According to the LWP::UserAgent documentation, the argument you actually want is called agent, and it defaults to libwww-perl/#.### (where #.### is the version number). WWW::Mechanize accepts the same argument, but its documented default is different: WWW-Mechanize/#.##.
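
To make the difference visible without sending a request, here is a minimal sketch that prints the User-Agent string each object would send. It assumes both modules are installed. The variable names are only for illustration, and 'Mozilla/5.0' is the placeholder value from the question rather than a complete browser string.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::Mechanize;

# agent_alias is not a constructor argument, so it has no effect here:
# each object keeps its module's built-in default User-Agent.
my $lwp_ignored  = LWP::UserAgent->new( agent_alias => 'Mozilla/5.0' );
my $mech_ignored = WWW::Mechanize->new( agent_alias => 'Mozilla/5.0' );
print "LWP, agent_alias ignored:       ", $lwp_ignored->agent,  "\n";
print "Mechanize, agent_alias ignored: ", $mech_ignored->agent, "\n";

# The documented constructor argument is agent; it works for both modules.
my $lwp  = LWP::UserAgent->new( agent => 'Mozilla/5.0' );
my $mech = WWW::Mechanize->new( agent => 'Mozilla/5.0' );
print "LWP, agent set:                 ", $lwp->agent,  "\n";
print "Mechanize, agent set:           ", $mech->agent, "\n";

# agent_alias exists only as a WWW::Mechanize method, called on the object.
my $mech_alias = WWW::Mechanize->new;
$mech_alias->agent_alias('Windows IE 6');
print "Mechanize, alias via method:    ", $mech_alias->agent, "\n";
```

If the first two lines show libwww-perl/... and WWW-Mechanize/..., the server is most likely rejecting the libwww-perl default, and passing the same agent value to both constructors should make the two requests behave alike.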