php html dom domdocument domxpath

Get the value of a DOMXPath query from HTML


I need to extract the contents of the <span id="verbatim"> tag from my HTML.
The HTML looks like this:

...
<span id="userName" class="username"></span> 
<div class="main"> 
 <div class="menu"> 
  <div id="totals" class="totals" >
  </div> 
  <ul id="alter_menu"> 
  </ul> 
  <div class="content"> 
   <br /> 
   <table width="70%" style="margin-left: auto; margin-right: auto;"> 
   <tr> 
    <td class="major_text" align="center">
    <br/> 
    <span id="verbatim" class="sender"> Alexander</span>
    </td> 
   </tr> 
   <tr> 
   <td>
    </td> 
    </tr> 
   <tr> 
   <td class="newline"> 
  </td>
</div>
...

The code I run:

$dom = new domDocument($html);
$xpath = new domXPath($dom);
$nodes = $xpath->query('//span[@id="verbatim"]');
echo $nodes->item(0)->nodeValue;

The problem is that I keep getting NULL for $nodes->item(0)->nodeValue, and I am not sure how to inspect this DOMElement.

I need the value Alexander.


Solution

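  • The DOMDocument constructor does not accept markup (its arguments are the XML version and the encoding), so the document stays empty and the query matches nothing. Instantiate DOMDocument first, then call ->loadHTML() to actually load the HTML markup: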

    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // suppress warnings about malformed HTML
    $dom->loadHTML($html);            // this line is important: it parses the markup
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//span[@id="verbatim"]');
    echo $nodes->item(0)->nodeValue;
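
    The question also asks how to inspect the result. A minimal sketch, assuming a shortened stand-in for the page's HTML: check the node list's length before touching item(0), and serialize the matched node with saveHTML() to see what was actually selected. The variable names $matchCount and $markup are only illustrative.

```php
<?php
// Stand-in markup: a trimmed-down version of the HTML from the question.
$html = '<div class="content"><span id="verbatim" class="sender"> Alexander</span></div>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$nodes = $xpath->query('//span[@id="verbatim"]');

// length is 0 when nothing matched; item(0) would then be null,
// which is why reading ->nodeValue on it yields NULL.
$matchCount = $nodes->length;

// saveHTML($node) serializes just that node, which helps when debugging.
$markup = $matchCount > 0 ? $dom->saveHTML($nodes->item(0)) : '';

echo $matchCount, PHP_EOL;
echo $markup, PHP_EOL;
```

    With the code from the question (markup passed to the constructor and never loaded), the same check would report a length of 0.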
    


    ->evaluate() works as well:

    echo $xpath->evaluate('string(//span[@id="verbatim"])');
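
Both variants return the text with its leading space (" Alexander"), and the query() variant fails when the span is missing. A hedged sketch of one way to handle both cases follows. The helper name extractTextById is hypothetical and not part of the original answer, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
/**
 * Hypothetical helper (not from the original answer): returns the trimmed
 * text of the first element with the given id, or null if none matches.
 */
function extractTextById(string $html, string $id): ?string
{
    $previous = libxml_use_internal_errors(true); // silence warnings from sloppy HTML
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // restore the caller's setting

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($dom);
    // Assumption: $id is a trusted literal without quotes. Do not
    // interpolate user input into an XPath expression like this.
    $nodes = $xpath->query(sprintf('//*[@id="%s"]', $id));

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->nodeValue);
}

// Stand-in markup based on the span from the question.
$html = '<table><tr><td><span id="verbatim" class="sender"> Alexander</span></td></tr></table>';
var_dump(extractTextById($html, 'verbatim'));
var_dump(extractTextById($html, 'missing'));
```

Returning null for a missing node lets the caller decide how to react, instead of triggering a warning by reading a property on null.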