I'm using the MediaWiki API to get search results. I simply want to grab the URL of the first result, the XML element named 'Url'. There are other things I will eventually want to do with the XML, but I expect that an answer to this will show me what I'm doing wrong and let me work out the rest. Here's the page I'm working with.
require HTTP::Request;
require LWP::UserAgent;
require XML::Simple;
my $url = URI->new("http://en.wikipedia.org/w/api.php?action=opensearch&search=rooney&limit=10&namespace=0&format=xml");
my $request = HTTP::Request->new(GET => $url);
my $ua = LWP::UserAgent->new;
my $response = $ua->request($request);
my $xml = XML::Simple->new();
my $data = $xml->XMLin($response->content);
Everything up to here seems to work fine. My HTTP request goes through all right: if I just print $response->content it returns the XML content fine, and if I print $data I am told that it is a hash.
In an attempt to get the 'Url' element, I have tried numerous approaches based on the searching I've done. A few are below:
print $data->{'Url'};
print $data->{Url};
print $data{Url}
Pro tip: use Data::Dumper to look inside your data structure.
use Data::Dumper;
print Dumper($data);
You'll get something like this ...
$VAR1 = {
          'xmlns' => 'http://opensearch.org/searchsuggest2',
          'Section' => {
                         'Item' => [
                                     {
                                       'Url' => {
                                                  'content' => 'http://en.wikipedia.org/wiki/Rooney',
                                                  'xml:space' => 'preserve'
                                                },
                                       'Description' => {
                                                          'content' => 'Rooney may refer to:',
                                                          'xml:space' => 'preserve'
                                                        },
                                       'Text' => {
                                                   'content' => 'Rooney',
                                                   'xml:space' => 'preserve'
                                                 }
                                     },
... much much more ...
from which you can deduce that the route to your desired data is through
$data->{Section}{Item}[0]{Url}{content}
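To see why the top-level lookups such as $data->{Url} come back empty while the longer path works, here is a minimal sketch that walks a hand-built copy of the structure instead of calling the API. The hash below is an assumption: it mirrors the dump above, trimmed down, and the second item (the example.org entry) is made up purely so the loop has something to iterate over.

```perl
use strict;
use warnings;

# Hand-built stand-in for what XMLin returned. The first item copies the
# dump above; the second item is hypothetical, for illustration only.
my $data = {
    xmlns   => 'http://opensearch.org/searchsuggest2',
    Section => {
        Item => [
            {
                Url  => { content => 'http://en.wikipedia.org/wiki/Rooney' },
                Text => { content => 'Rooney' },
            },
            {
                Url  => { content => 'http://example.org/second-result' },
                Text => { content => 'Second result' },
            },
        ],
    },
};

# 'Url' is not a top-level key, so this lookup yields undef.
my $top_level = $data->{Url};
print "top-level Url is ", (defined $top_level ? "set" : "undef"), "\n";

# Hash -> hash -> array element 0 -> hash -> hash: the first result's URL.
my $first_url = $data->{Section}{Item}[0]{Url}{content};
print "$first_url\n";

# Item holds an array reference, so dereference it to collect every URL.
my @urls = map { $_->{Url}{content} } @{ $data->{Section}{Item} };
print "$_\n" for @urls;
```

One thing to watch with the real response: by default XML::Simple folds a lone repeated element into a plain hash instead of a one-element array, so a search that returns a single Item would break the [0] index. Passing ForceArray => ['Item'] to XMLin should keep the shape stable.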
You should also look into using something like XML::XPath, which makes it much easier to conduct this kind of search.
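As a rough sketch of the XML::XPath approach, the snippet below runs against a small inline document rather than the live API. The sample XML is an assumption: the root element name SearchSuggestion is a guess not confirmed by the dump above, and the xmlns declaration and xml:space attributes of the real response are left out to keep the example simple.

```perl
use strict;
use warnings;
use XML::XPath;

# Simplified sample document (assumed shape; no namespace declaration).
my $xml = <<'XML';
<SearchSuggestion>
  <Section>
    <Item>
      <Text>Rooney</Text>
      <Description>Rooney may refer to:</Description>
      <Url>http://en.wikipedia.org/wiki/Rooney</Url>
    </Item>
  </Section>
</SearchSuggestion>
XML

my $xp = XML::XPath->new(xml => $xml);

# One expression replaces the chain of hash and array lookups.
my $first_url = $xp->findvalue('/SearchSuggestion/Section/Item[1]/Url');
print "$first_url\n";

# In list context findnodes returns the matching nodes, one per result.
for my $node ($xp->findnodes('//Item/Url')) {
    print $node->string_value, "\n";
}
```

Because the real response declares a default namespace, the expressions may need adjusting there; if unprefixed names do not match, registering a prefix with $xp->set_namespace and using it in the path is one thing to try.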