
PHP DOMXPath->query()/->evaluate() not matching inner text


I am trying to build a menu traversal system in pure PHP. It is for an impromptu project, and the people I'm building it for want as little JS as possible (i.e. none).

I have a menu which looks like this:

ul {
  list-style-type: none;
}

nav > ul.sidebar-list ul.sub {
  display: none;
}

nav > ul.sidebar-list ul.sub.active {
  display: block;
}
<nav class="sidebar" aria-labelledby="primary-navigation">
  <ul class="sidebar-list">

    <!--each element has a sub-menu which is initially hidden by css when the page is loaded. Via php the appropriate path the current page and top-level links will be visible only-->
    <a href="#"><li>Home</li></a>
    <!--sub-items-->
    <ul class="sub active">
      <a href="#"><li>Barn</li></a>
      <a href="#"><li>Activities</li></a>
      <ul class="sub active">
        <a href="#"><li>News</li></a>
        <a href="#"><li>Movements</li></a>
        <a href="#"><li>Reviews</li></a>
        <a href="#"><li>About Us</li></a>
        <a href="#"><li>Terms of Use</li></a>
      </ul>
    </ul>
    <a href="#"><li>Events</li></a>
    <ul class="sub">
      <a href="#"><li>Overview</li></a>
      <a href="#"><li>Farming</li></a>
      <a href="#"><li>Practises</li></a>
      <a href="#"><li>Links</li></a>
      <ul class="sub">
        <a href="#"><li>Another Farm</li></a>
        <a href="#"><li>24m</li></a>
      </ul>
    </ul>
  </ul>
</nav>

To match the page title's text against a menu item's inner text (probably not the best way of doing things, but I'm still learning PHP), I run:

$menu = new DOMDocument();
assert($menu->loadHTMLFile($menu_path), "Loading nav.html (menu file) failed");
//show content to log of the html document
error_log("HTML file: \n\n".$menu->textContent);

//set up a query to find an element matching the title string found
$xpath = new DOMXPath($menu);

$menu_query = "//a/li[matches(text(), '$title_text', 'i')]";
$elements = $xpath->query($menu_query);
error_log($elements ? ("Result of xpath query is: ".print_r($elements, TRUE)): "The xpath query for searching the menu is incorrect and will not find you anything!\ntype of return: ".gettype($elements));

I get the expected result at https://www.freeformatter.com/xpath-tester.html, but not in my script. I have tried many variations of the text match, such as //x:a/x:li[lower-case(text())='$title_text'], but I always get an empty node list.


Solution

  • PHP's DOMXPath implements only XPath 1.0, because it is built on libxml2. matches() is an XPath 2.0 function (as is lower-case()), so the query fails, and PHP reports a warning like this one in your error log:

    PHP Warning:  DOMXPath::query(): xmlXPathCompOpEval: function matches not found in php shell code on line 1
    PHP Stack trace:
    PHP   1. {main}() php shell code:0
    PHP   2. DOMXPath->query() php shell code:1
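
    The failure can also be caught in code rather than in the log. The sketch below is only an illustration: it assumes a tiny inline XML stand-in for the menu, captures the warning text with set_error_handler(), and expects query() to return false for the unsupported function, which is how PHP reports an expression it cannot evaluate.

    ```php
    <?php
    // Tiny stand-in for the menu (an assumption, not the real nav.html).
    $menu = new DOMDocument();
    $menu->loadXML('<ul><a href="#"><li>Farming</li></a></ul>');
    $xpath = new DOMXPath($menu);

    // Collect warnings in an array instead of sending them to the log.
    $warnings = [];
    set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
        $warnings[] = $errstr;
        return true; // handled here; skip PHP's default reporting
    });
    // matches() is XPath 2.0, so this expression cannot be evaluated.
    $elements = $xpath->query("//a/li[matches(text(), 'farming', 'i')]");
    restore_error_handler();

    if ($elements === false) {
        echo "XPath query failed:", PHP_EOL;
        foreach ($warnings as $warning) {
            echo "  ", $warning, PHP_EOL;
        }
    }
    ```

    Comparing against false with === matters here, because a DOMNodeList object is truthy even when it is empty.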
    

    A simple case-sensitive match can be done with an equality check.

    $title_text = "Farming";
    $menu_query = "//a/li[. = '$title_text']";
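
    Interpolating $title_text straight into the expression breaks as soon as the title contains an apostrophe, and XPath 1.0 string literals have no escape sequence. A minimal sketch of one workaround follows; xpath_literal() is a hypothetical helper name, not part of PHP, and it falls back to concat() when the value contains both quote characters.

    ```php
    <?php
    // Hypothetical helper: turn an arbitrary PHP string into an XPath 1.0 string literal.
    function xpath_literal(string $value): string
    {
        if (strpos($value, "'") === false) {
            return "'" . $value . "'";   // no apostrophe: single quotes are safe
        }
        if (strpos($value, '"') === false) {
            return '"' . $value . '"';   // no double quote: double quotes are safe
        }
        // Both quote types present: join the apostrophe-free pieces with concat().
        $quoted = array_map(function ($part) {
            return "'" . $part . "'";
        }, explode("'", $value));
        return 'concat(' . implode(', "\'", ', $quoted) . ')';
    }

    $title_text = "Farmer's Market";
    $menu_query = '//a/li[. = ' . xpath_literal($title_text) . ']';
    echo $menu_query, PHP_EOL;
    ```

    The same helper can wrap the lowercased title in a case-insensitive query.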
    

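    A case-insensitive search in XPath 1.0 involves lowercasing the node text with translate() and comparing it against a lowercased search string. The mapping covers only the characters you list, here A to Z: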

    $title_text = "FaRmInG";
    $title_text = strtolower($title_text);
    $menu_query = "//a/li[translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz') = '$title_text']";
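
    If a regular expression with the i flag is closer to what matches() was meant to do, DOMXPath can call back into PHP. The following is a sketch under assumptions, not the only way to do it: it uses a small inline XML stand-in for the menu, whitelists preg_match via registerPhpFunctions(), and assumes the title contains no apostrophe, since the pattern is embedded in a single-quoted XPath literal.

    ```php
    <?php
    // Small stand-in for the menu (an assumption, not the real nav.html).
    $menu = new DOMDocument();
    $menu->loadXML(
        '<ul><a href="#"><li>Overview</li></a><a href="#"><li>Farming</li></a></ul>'
    );
    $xpath = new DOMXPath($menu);

    // Expose PHP functions to XPath under the php: prefix, restricted to preg_match.
    $xpath->registerNamespace('php', 'http://php.net/xpath');
    $xpath->registerPhpFunctions('preg_match');

    $title_text = 'FaRmInG';
    // Anchor the pattern so the whole item text must match, ignoring case.
    $pattern = '/^' . preg_quote($title_text, '/') . '$/i';
    $menu_query = "//a/li[php:functionString('preg_match', '$pattern', .) > 0]";

    $elements = $xpath->query($menu_query);
    foreach ($elements as $element) {
        echo $element->textContent, PHP_EOL;
    }
    ```

    Unlike translate(), the regular expression can be extended with the u flag for non-ASCII titles, at the cost of one PHP callback per candidate node.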
    

    In either case, query() returns a DOMNodeList that can be iterated over:

    $html = <<< HTML
    <nav class="sidebar" aria-labelledby="primary-navigation">
      <ul class="sidebar-list">
    
        <!--each element has a sub-menu which is initially hidden by css when the page is loaded. Via php the appropriate path the current page and top-level links will be visible only-->
        <a href="#"><li>Home</li></a>
        <!--sub-items-->
        <ul class="sub active">
          <a href="#"><li>Barn</li></a>
          <a href="#"><li>Activities</li></a>
          <ul class="sub active">
            <a href="#"><li>News</li></a>
            <a href="#"><li>Movements</li></a>
            <a href="#"><li>Reviews</li></a>
            <a href="#"><li>About Us</li></a>
            <a href="#"><li>Terms of Use</li></a>
          </ul>
        </ul>
        <a href="#"><li>Events</li></a>
        <ul class="sub">
          <a href="#"><li>Overview</li></a>
          <a href="#"><li>Farming</li></a>
          <a href="#"><li>Practises</li></a>
          <a href="#"><li>Links</li></a>
          <ul class="sub">
            <a href="#"><li>Another Farm</li></a>
            <a href="#"><li>24m</li></a>
          </ul>
        </ul>
      </ul>
    </nav>
    HTML;
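    // Parse the menu markup into a DOM tree.
    $menu = new DOMDocument();
    $menu->loadHTML($html);
    $xpath = new DOMXPath($menu);
    // $menu_query is either of the two expressions built above.
    $elements = $xpath->query($menu_query);
    foreach ($elements as $element) {
        // Each $element is the DOMElement for a matching <li>.
        print_r($element);
    }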
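
    Since the goal in the question is to reveal the path to the current page, the matched node can serve as the starting point for a second, relative query. The sketch below is an assumption about how that might look, using a trimmed-down XML copy of the menu: it walks the ancestor axis from the matched li and appends the active class to every enclosing ul.sub.

    ```php
    <?php
    // Trimmed-down stand-in for the menu (an assumption, not the real nav.html).
    $menu = new DOMDocument();
    $menu->loadXML(
        '<ul class="sidebar-list">'
        . '<a href="#"><li>Events</li></a>'
        . '<ul class="sub">'
        . '<a href="#"><li>Farming</li></a>'
        . '<ul class="sub"><a href="#"><li>24m</li></a></ul>'
        . '</ul>'
        . '</ul>'
    );
    $xpath = new DOMXPath($menu);

    // XPath 1.0 idiom for "the class attribute contains the token sub".
    $is_sub = "contains(concat(' ', normalize-space(@class), ' '), ' sub ')";

    foreach ($xpath->query("//a/li[. = 'Farming']") as $li) {
        // The second argument makes the query relative to the matched <li>.
        foreach ($xpath->query("ancestor::ul[$is_sub]", $li) as $ul) {
            $ul->setAttribute('class', $ul->getAttribute('class') . ' active');
        }
    }
    echo $menu->saveXML($menu->documentElement), PHP_EOL;
    ```

    Only the sub-menu that contains the match is changed; the nested sub-menu below it is not an ancestor, so it stays hidden. A list that is already active would need a guard to avoid a duplicated class token.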