Given the following XML snippet:
<outline>
<node1 attribute1="value1" attribute2="value2">
text1
</node1>
</outline>
How do I get this output?
outline
node1=text1
node1 attribute1=value1
node1 attribute2=value2
I have looked into XML::LibXML::Reader, but that module appears to only provide access to attribute values referenced by their names. How do I get the list of attribute names in the first place?
You can get the list of attributes with $e->findnodes( "./@*"); the XPath expression @* selects every attribute of the context node.
Below is a solution, with plain XML::LibXML, not XML::LibXML::Reader, that works with your test data. It may be sensitive to extra whitespace and mixed-content though, so test it on real data before using it.
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
my $dom= XML::LibXML->load_xml( IO => \*DATA);
my $elements= $dom->findnodes( "//*");
foreach my $e (@$elements)
  { print $e->nodeName;
    # text needs to be trimmed or line returns show up in the output
    my $text= $e->textContent;
    $text=~s{^\s*}{};
    $text=~s{\s*$}{};
    # only print the text of leaf elements; length() keeps a text of "0"
    if( ! $e->getChildrenByTagName( '*') && length $text)
      { print "=$text"; }
    print "\n";
    my @attrs= $e->findnodes( "./@*");
    # or, as suggested by Borodin below, $e->attributes
    # (which also returns namespace declarations, if there are any)
    foreach my $attr (@attrs)
      { print $e->nodeName, " ", $attr->nodeName, "=", $attr->value, "\n"; }
  }
__END__
<outline>
<node1 attribute1="value1" attribute2="value2">
text1
</node1>
</outline>
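If you would rather stay with XML::LibXML::Reader, it can also walk the attributes without knowing their names, using moveToFirstAttribute and moveToNextAttribute. The following is only a minimal sketch under a few assumptions: the helper name attribute_lines is made up for this example, the XML is passed in as a string, and only element and attribute lines are produced (the node1=text1 line is left out).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML::Reader;   # also exports XML_READER_TYPE_ELEMENT

my $xml = <<'XML';
<outline>
<node1 attribute1="value1" attribute2="value2">
text1
</node1>
</outline>
XML

# attribute_lines is a hypothetical helper, not part of XML::LibXML
sub attribute_lines
  { my( $xml_string)= @_;
    my $reader= XML::LibXML::Reader->new( string => $xml_string);
    my @lines;
    while( $reader->read)
      { next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;
        my $element= $reader->name;
        push @lines, $element;
        # moveToFirstAttribute returns 1 only if the element has attributes
        if( $reader->moveToFirstAttribute == 1)
          { do
              { push @lines, "$element " . $reader->name . "=" . $reader->value; }
            while( $reader->moveToNextAttribute == 1);
            $reader->moveToElement;   # go back to the owning element
          }
      }
    return @lines;
  }

print "$_\n" foreach attribute_lines( $xml);
```

The call to moveToElement after the attribute loop puts the reader back on the element itself, so the next read continues from there.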