I'm trying to get the attribute @id1 from <Incoming> in the XML below:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Incomings xmlns:ns2="http://testme.org/foo/schema">
<Incoming id1="6bbaec22" id2="928c2081">
<ns2:Address>fubar@test.com</ns2:Address>
</Incoming>
</Incomings>
The only information that I can pass in is the email address fubar@test.com.
I'm using XML::LibXML and XML::LibXML::XPathContext as below:
my $dom = XML::LibXML->new->parse_file( $xml_file ); # XML contains as above
my $xpc = XML::LibXML::XPathContext->new( $dom->documentElement );
$xpc->registerNs('x', 'http://testme.org/foo/schema');
my $email = 'fubar@test.com';
my $xpath = "/x:Incomings/x:Incoming/x:ns2:Address[text()='$email']/../\@id1";
my @nodes = $xpc->findnodes( $xpath );
But it always gives me an invalid expression error for $xpath, around the ns2:Address part.
What mistake did I make above? If the node name is just <Address>, then removing ns2: from my $xpath statement gives me the correct values in @nodes.
Thanks!
I think there are two problems here. First off, XPath expressions find nodes. You can search based on the existence and content of an attribute, but findnodes will give you a node, not the attribute's string content.
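Secondly, you can't stack namespace prefixes: a name takes at most one prefix, so x:ns2:Address isn't valid. Note also that in your XML only Address is in the http://testme.org/foo/schema namespace; Incomings and Incoming carry no prefix and there is no default namespace, so x:Incomings and x:Incoming would not match them. Do you actually need to register your x namespace there? Based on your small XML snippet, you may not need to at all.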
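To make the prefix point concrete, here is a minimal sketch of the question's code with one prefix per step. It assumes the XML from the question, inlined as a string so the example is self-contained, and it uses findvalue to get the attribute's string value. The x prefix is just a local name; only the URI has to match the document.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;

# Inline copy of the XML from the question, so the sketch is self-contained.
my $xml_text = <<'XML';
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Incomings xmlns:ns2="http://testme.org/foo/schema">
<Incoming id1="6bbaec22" id2="928c2081">
<ns2:Address>fubar@test.com</ns2:Address>
</Incoming>
</Incomings>
XML

my $dom = XML::LibXML->load_xml( string => $xml_text );
my $xpc = XML::LibXML::XPathContext->new( $dom->documentElement );

# 'x' is an arbitrary local prefix; it is bound to the same URI as ns2.
$xpc->registerNs( 'x', 'http://testme.org/foo/schema' );

my $email = 'fubar@test.com';

# Incomings and Incoming are in no namespace, so they take no prefix.
# Address is in the registered namespace, so it takes exactly one prefix.
my $xpath = "/Incomings/Incoming[x:Address = '$email']/\@id1";

# findvalue returns the string value of the selected attribute node.
my $id1 = $xpc->findvalue($xpath);
print "$id1\n";
```

The predicate on Incoming avoids the trip down to Address and back up with "..", but that is a stylistic choice rather than a requirement.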
Can I offer an alternative option? Because you're working in Perl, you don't necessarily need to do everything via the XPath expression.
I'd perhaps use findnodes followed by grep:
NB: Using XML::Twig for illustration; I'm pretty sure something very similar works in XML::LibXML.
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new( 'pretty_print' => 'indented_a' )->parse( \*DATA );
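# Keep only the Address elements whose text contains the email address.
my @elt_list = grep { $_->trimmed_text =~ m{fubar\@test\.com} }
    ( $twig->findnodes('//ns2:Address') );
foreach my $elt (@elt_list) {
    # The id1 attribute lives on the parent <Incoming> element.
    print $elt->parent->att('id1'), "\n";
}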
__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Incomings xmlns:ns2="http://testme.org/foo/schema">
<Incoming id1="6bbaec22" id2="928c2081">
<ns2:Address>fubar@test.com</ns2:Address>
</Incoming>
</Incomings>
I'd also note that your XPath can select an element rather than an attribute, so you can select 'elements with an id1 attribute' like this:
my @elt_list = ( $twig->findnodes("//ns2:Address[string()='$email']/../.[\@id1]") );
foreach my $elt (@elt_list) {
print $elt -> att('id1');
}
It rather depends on how specific you want your findnodes search to be. Based on the snippet you've provided, your approach is more complicated than it needs to be, and you could simply do:
use XML::Twig;
my $twig = XML::Twig->parsefile('your_file.xml');
print $twig -> findnodes('//Incoming',0)->att('id1'),"\n";
Or:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $xml = XML::LibXML->new->parse_file( 'sample2.xml' );
foreach my $node ( $xml -> findnodes( '//Incoming' ) ) {
print $node ->getAttribute('id1'), "\n";
}
Or with a bit of grepping:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $email = 'fubar@test.com';
my $xml = XML::LibXML->new->parse_file( 'sample2.xml' );
# \Q...\E quotes regex metacharacters, such as the '.' in the address.
foreach my $node ( grep { $_->textContent =~ m{\Q$email\E} } $xml->findnodes('//Incoming') ) {
print $node ->getAttribute('id1'), "\n";
}
If you particularly want to use that x namespace prefix, though, this works:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $xml = XML::LibXML->new->parse_file('sample2.xml');
my $xpc = XML::LibXML::XPathContext->new( $xml->documentElement );
$xpc->registerNs( 'x', 'http://testme.org/foo/schema' );
my $email = 'fubar@test.com';
my ( $id1 ) = map { $_ -> getAttribute('id1') // () } $xpc->findnodes("/Incomings/Incoming/x:Address[text()='$email']/..");
print $id1,"\n";
(This also works if I mock up some XML with multiple 'Incoming' nodes: it selects the first one with the right email address. Note that // requires Perl 5.10 or later and tests for definedness. On older versions you could probably substitute ||, which tests for truth; the two only differ for empty strings and zeros.)
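As a small illustration of that last point, the sketch below runs both operators over a few values. The values and the 'fallback' string are made up for the demonstration and are not taken from the question's XML.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical attribute values, chosen to show where the operators differ.
my @values = ( '6bbaec22', '0', '', undef );

my ( @defined_or, @logical_or );
for my $value (@values) {
    # // falls back only when the left side is undef (Perl 5.10 onwards).
    push @defined_or, $value // 'fallback';

    # || falls back for any false value, including '0' and ''.
    push @logical_or, $value || 'fallback';
}

print join( '|', @defined_or ), "\n";
print join( '|', @logical_or ), "\n";
```

For an attribute such as id1, the difference only matters if the value could legitimately be 0 or empty.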