How to access an XML tag whose name ends with specific text using ends-with


I have an example XML file:

<?xml version="1.0" ?>
<Details date="2022-02-09" ver="1">
<VerNum>/14</VerNum>
<Info>
 <model>S22</model>
 <branch name="city_1">
  <prevstock>10000</prevstock>
  <def>1</def>
 </branch>
 <branch name="city_2">
  <presstock>2000</presstock>
  <def>2</def>
 </branch>
 <branch name="city_3">
  <futstock>3000</futstock>
  <def>0.3</def>
 </branch>
</Info>
</Details>

I need to access the stock value in each branch. The tag name is not consistent, and I can't depend on the position of the node either. I don't understand the correct usage of the XPath ends-with function.

use warnings;
use strict;
use feature 'say';
use Data::Dumper;
use XML::LibXML;

my $file = "ex.xml";#// die "Usage: $0 filename\n";
my $parser = XML::LibXML->load_xml(location => $file);
my %branch_stock;
foreach my $sec ($parser->findnodes('/Details/Info')) { 
    for my $branch ($sec->findnodes('./branch')) {
        my $branch_name = $branch->getAttribute('name');
        my $stock_value = $branch->findnodes('*[ends-with(name(),"stock")]')->[0]->textContent;

        #say "$branch_name --> $stock_value";

        $branch_stock{$branch_name} = $stock_value;
    }   
}

say Dumper \%branch_stock;

This gives me the following error:

 error : xmlXPathCompOpEval: function ends-with not found
XPath error : Unregistered function
XPath error : Stack usage errror
 error : xmlXPathCompiledEval: 2 objects left on the stack.

Could anyone please help me understand the problem and how to work around it? Thanks a lot in advance.


Solution

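  • The ends-with() function, along with many other features, was introduced in XPath 2.0. XML::LibXML is limited to XPath 1.0 because the underlying libxml2 library only implements XPath 1.0.

    One workable XPath 1.0 function for querying by partial text is contains. Note that it matches "stock" anywhere in the tag name, not only at the end, which is sufficient for the tag names shown here: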

    use warnings;
    use strict;
    use feature 'say';
    
    use Data::Dumper;
    use XML::LibXML;
    
    my $file = shift // die "Usage: $0 file\n";
    
    my $parser = XML::LibXML->load_xml(location => $file);
    
    my %branch_stock;
    for my $sec ($parser->findnodes('/Details/Info')) { 
        for my $branch ($sec->findnodes('./branch')) {
            my $branch_name = $branch->getAttribute('name');
    
            for my $stock ($branch->findnodes('*[contains(name(),"stock")]')) {
                say "$branch_name --> $stock";
                $branch_stock{$branch_name} = $stock->textContent;
    
                # XPath 1.0 has no ends-with(), so this would fail:
                # $branch->findnodes('*[ends-with(name(),"stock")]')
            }
        }   
    }
    print Dumper \%branch_stock;
    

    This prints

    city_1 --> <prevstock>10000</prevstock>
    city_2 --> <presstock>2000</presstock>
    city_3 --> <futstock>3000</futstock>
    $VAR1 = {
              'city_3' => '3000',
              'city_1' => '10000',
              'city_2' => '2000'
            };
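
    If matching only at the end of the tag name matters, ends-with can be emulated in XPath 1.0 with substring and string-length. The following is a minimal sketch of that idiom, not part of the original answer; the inline document and its <stockroom> element are hypothetical, added only to show a name that contains "stock" without ending in it.

    ```perl
    use strict;
    use warnings;
    use feature 'say';
    use XML::LibXML;

    # Hypothetical sample: <stockroom> contains "stock" but does not end with it
    my $doc = XML::LibXML->load_xml(string => q{<Info>
      <branch name="city_1"><prevstock>10000</prevstock><def>1</def></branch>
      <branch name="city_2"><stockroom>B7</stockroom><presstock>2000</presstock></branch>
    </Info>});

    # XPath 1.0 stand-in for ends-with(name(), "stock"): take the trailing
    # string-length("stock") characters of the name and compare them
    my $ends_with_stock = '*[substring(name(), '
        . 'string-length(name()) - string-length("stock") + 1) = "stock"]';

    my %branch_stock;
    for my $branch ($doc->findnodes('/Info/branch')) {
        my ($stock) = $branch->findnodes($ends_with_stock);   # first match, if any
        next unless defined $stock;
        $branch_stock{ $branch->getAttribute('name') } = $stock->textContent;
    }

    say "$_ --> $branch_stock{$_}" for sort keys %branch_stock;
    ```

    With contains, <stockroom> would also match for city_2; the substring form selects only <presstock>.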
    

    A word on sources and documentation, which I find hard to come by for XPath.

    There is an overview in perl-libxml-by-example. While XPath 1.0 lacks the powerful features of later versions, it does have a useful set of built-in functions. One can also register custom XPath functions written in Perl; XML::LibXML provides XML::LibXML::XPathContext for that.
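
    As a rough illustration of the custom-function route, the sketch below registers a Perl callback under the name ends-with using registerFunction from XML::LibXML::XPathContext. It is one possible way to do this, not the method from the original answer; the inline sample document and the variable names are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use feature 'say';
    use XML::LibXML;
    use XML::LibXML::XPathContext;
    use XML::LibXML::Boolean;

    # Hypothetical inline sample modelled on the question's data
    my $doc = XML::LibXML->load_xml(string => q{<Info>
      <branch name="city_1"><prevstock>10000</prevstock><def>1</def></branch>
      <branch name="city_3"><futstock>3000</futstock><def>0.3</def></branch>
    </Info>});

    my $xpc = XML::LibXML::XPathContext->new($doc);

    # Register a Perl callback under the XPath name "ends-with".
    # It returns an XPath boolean object rather than 0/1, because a bare
    # number inside a predicate would be read as a position test.
    $xpc->registerFunction('ends-with', sub {
        my ($string, $suffix) = map { "$_" } @_;   # stringify the arguments
        return $string =~ /\Q$suffix\E\z/
            ? XML::LibXML::Boolean->True
            : XML::LibXML::Boolean->False;
    });

    my %custom_stock;
    for my $branch ($xpc->findnodes('/Info/branch')) {
        # The second argument sets the context node for the relative path
        my ($stock) = $xpc->findnodes('*[ends-with(name(), "stock")]', $branch);
        next unless defined $stock;
        $custom_stock{ $branch->getAttribute('name') } = $stock->textContent;
    }

    say "$_ --> $custom_stock{$_}" for sort keys %custom_stock;
    ```

    The function is only known to queries run through $xpc, so the lookups go through the context object instead of calling findnodes on the node itself.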


    If there is indeed exactly one stock element under each branch, as appears to be the case here, we don't need the inner loop and can pick the first (and only) element:

    for my $sec ($parser->findnodes('/Details/Info')) { 
        for my $branch ($sec->findnodes('./branch')) {
            my $branch_name = $branch->getAttribute('name');
    
            my $stock = $branch->findnodes('*[contains(name(),"stock")]')->[0];
            say "$branch_name --> $stock";
            $branch_stock{$branch_name} = $stock->textContent;
        }
    }
    

    There is also a shortcut for that, findvalue, which returns the text of the match directly as a string:

    for my $sec ($parser->findnodes('/Details/Info')) { 
        for my $branch ($sec->findnodes('./branch')) {
            my $branch_name = $branch->getAttribute('name');
    
            my $stock_value = $branch->findvalue('*[contains(name(),"stock")]');
            $branch_stock{$branch_name} = $stock_value;
        }
    }
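
    One practical difference between the two forms is how they behave when a branch has no stock element at all. The sketch below assumes a hypothetical branch city_4 without one: findvalue yields an empty string there, while ->[0]->textContent would die by calling a method on an undefined value.

    ```perl
    use strict;
    use warnings;
    use feature 'say';
    use XML::LibXML;

    # Hypothetical sample: city_4 has no *stock element
    my $doc = XML::LibXML->load_xml(string => q{<Info>
      <branch name="city_1"><prevstock>10000</prevstock></branch>
      <branch name="city_4"><def>5</def></branch>
    </Info>});

    my %stock_or_empty;
    for my $branch ($doc->findnodes('/Info/branch')) {
        # findvalue gives '' when nothing matches, so no undef check is needed
        my $value = $branch->findvalue('*[contains(name(),"stock")]');
        $stock_or_empty{ $branch->getAttribute('name') } = $value;
    }

    say "$_ --> '$stock_or_empty{$_}'" for sort keys %stock_or_empty;
    ```

    If several elements match, findvalue concatenates their text, so it suits cases where at most one match is expected.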