Why is my XPath query not working?

I want to extract the title and the YouTube link from the following XML:

`<?xml version="1.0" encoding="UTF-8"?><feed     xmlns=""><category term="videos" label="/r/videos"/>    <icon></icon><id>/r/videos/.xml</id><link     rel="self" href=""     type="application/atom+xml" /><link rel="alternate" href="" type="text/html" /><logo></logo><subtitle>A great place for video content of all kinds.</subtitle><title>Videos</title><entry><author><name>/u/LegendaryContent</name><uri></uri></author><category term="videos" label="/r/videos"/><content type="html">&lt;table&gt; &lt;tr&gt;&lt;td&gt; &lt;a href=&quot;;&gt; &lt;img src=&quot;; alt=&quot;1,400 Employees being laid off&quot; title=&quot;1,400 Employees being laid off&quot; /&gt; &lt;/a&gt; &lt;/td&gt;&lt;td&gt; &amp;#32; submitted by &amp;#32; &lt;a href=&quot;;&gt; /u/LegendaryContent &lt;/a&gt; &lt;br/&gt; &lt;span&gt;&lt;a href=&quot;;&gt;[link]&lt;/a&gt;&lt;/span&gt; &amp;#32; &lt;span&gt;&lt;a href=&quot;;&gt;[comments]&lt;/a&gt;&lt;/span&gt; &lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</content><id>t3_45crp7</id><link href="" /><updated>2016-02-12T03:22:38+00:00</updated><title>1,400 Employees being laid off</title></entry></feed>`

My code is here:

$videos = "";
$video_category = "Trending Videos";
$url = "";

// Load the Atom feed and collect its <entry> elements.
$feed_dom = new DOMDocument;
$feed_dom->preserveWhiteSpace = false;
$feed_dom->load($url);
$items = $feed_dom->getElementsByTagName('entry');

foreach ($items as $item) {
    $title = $item->getElementsByTagName('title')->item(0)->nodeValue;
    $desc_table = $item->getElementsByTagName('content')->item(0)->nodeValue;

    // The <content> element holds escaped HTML, so parse it as its own document.
    $table_dom = new DOMDocument;
    $table_dom->preserveWhiteSpace = false;
    $table_dom->loadHTML($desc_table);
    $xpath = new DOMXPath($table_dom);
    $yt_link_node = $xpath->query("//table/tr/td[2]/a[2]");

    foreach ($yt_link_node as $yt_link) {
        $yt = $yt_link->getAttribute('href');
        echo $title;
        echo $yt;
    }
}

For some reason it isn't working, even though I have tried almost every XPath query I could find on Google and Stack Overflow. The title is echoed correctly, but $yt is not. Can you spot what I am doing wrong?


  • It's because the DOM is slightly different from what you seem to expect.

    The HTML you are parsing there ($desc_table) typically has this structure:

        <table>
            <tr>
                <td>
                    <a href="">
                        <img src="" alt="..." title="..." />
                    </a>
                </td>
                <td> &#32; submitted by &#32;
                    <a href=""> /u/... </a>
                    <br/>
                    <span><a href="">[link]</a></span> &#32;
                    <span><a href="">[comments]</a></span>
                </td>
            </tr>
        </table>

    So there is no second anchor element (a) that is a direct child of the second td element: the second and third anchors are each wrapped in a span element, and the child step td[2]/a only matches anchors that sit directly inside the td.

    So if you want to get to this link:

                    <a href="">[link]</a>

    then use this XPath instead:

     $yt_link_node = $xpath->query("//table/tr/td[2]/span[1]/a");
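
    To see the difference between the two queries, here is a minimal sketch that runs both against a hand-written fragment with the same structure. The sample markup and the example.com URLs are placeholders I made up for illustration; the real feed will contain different values, and the variable names $old, $new and the test data are assumptions, not part of your code.

    ```php
    <?php
    // Hypothetical stand-in for $desc_table; every href/src below is a placeholder.
    $desc_table =
        '<table><tr>'
        . '<td><a href="https://example.com/thumb-link">'
        . '<img src="https://example.com/thumb.jpg" alt="..." title="..." /></a></td>'
        . '<td> &#32; submitted by &#32; '
        . '<a href="https://example.com/user"> /u/... </a> <br/> '
        . '<span><a href="https://example.com/video">[link]</a></span> &#32; '
        . '<span><a href="https://example.com/comments">[comments]</a></span>'
        . '</td></tr></table>';

    // Parse the fragment first, then build the XPath object on the loaded document.
    $table_dom = new DOMDocument;
    $table_dom->preserveWhiteSpace = false;
    $table_dom->loadHTML($desc_table);
    $xpath = new DOMXPath($table_dom);

    // Original query: looks for a second <a> that is a direct child of the second <td>.
    $old = $xpath->query("//table/tr/td[2]/a[2]");

    // Adjusted query: steps through the first <span> to reach the [link] anchor.
    $new = $xpath->query("//table/tr/td[2]/span[1]/a");

    // Guard against an empty result before reading the attribute.
    $yt = $new->length > 0 ? $new->item(0)->getAttribute('href') : null;

    echo "a[2] matches: ", $old->length, "\n";
    echo "span[1]/a href: ", $yt, "\n";
    ```

    With this sample, the first query should find no nodes and the second should return the href of the [link] anchor. Checking ->length before calling item(0) avoids a fatal error on entries whose content does not follow this layout.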