
Perl LWP::Simple won't "get" a webpage when running from remote server


I'm trying to use Perl to scrape a publications list as follows:

use XML::XPath;
use XML::XPath::XMLParser;
use LWP::Simple;

my $url = "https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxxx/xxxxxx.rdf";

my $content = get($url);
die "Couldn't get publications!" unless defined $content;

When I run it on my local (Windows 7) machine it works fine. When I run it on the Linux server where we host some websites, it dies with the "Couldn't get publications!" message. I installed XML::XPath and LWP using cpan, so those should be there. I'm wondering whether the problem could be some security or permissions setting on the server that keeps it from accessing an external website, but I don't know where to start with that. Any ideas?


Solution

  • Turns out I didn't have LWP::Protocol::https installed. I found this out by switching from

    LWP::Simple  
    

    to

    LWP::UserAgent 
    

    and adding the following:

    my $ua = LWP::UserAgent->new;
    my $resp = $ua->get('https://connects.catalyst.harvard.edu/Profiles/profile/xxxxxx/xxxxxxx.rdf');
    # Print the status line; printing $resp itself only shows the object reference
    print $resp->status_line, "\n";
    

    It then returned an error saying it could not handle the https protocol scheme without LWP::Protocol::https, so I installed it with

    cpan LWP::Protocol::https
    

    and all was good.
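
A quick way to catch this kind of problem before making any request is to check whether the required modules can be loaded at all. The following is only a minimal sketch using core Perl; `module_available` is a hypothetical helper name introduced here for illustration (it is not part of LWP), and the module list is an assumption based on the code in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): returns 1 if the named module
# can be loaded on this machine, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Modules the scraper depends on (assumed from the question's code);
# LWP::Protocol::https is the one that was missing on the server.
for my $module (qw(LWP::UserAgent LWP::Protocol::https XML::XPath)) {
    printf "%-22s %s\n", $module,
        module_available($module)
            ? 'installed'
            : "MISSING (try: cpan $module)";
}
```

Running this on both the local machine and the server should show which module differs between the two environments.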