
DOMXPath union extract with PHP


I'm trying to get the img and the div that comes after the div containing that img, all in one query. So I did this:

$nodes = $xpath->query('//div[starts-with(@id, "someid")]/img | 
//div[starts-with(@id, "someid")]/following-sibling::div[@class="spec_class"][1]/text()');

Now, I'm able to get the attributes of the img tag, but I can't get the text of the following sibling. If I split it into two queries (one for the img and one for the sibling), it works. But how can I do this with only one query? There is no syntax error, so either the union doesn't work or I'm not extracting the sibling content correctly.

Here's the markup (which repeats many times with different text and id="someid_%randomNumber%"):

<div id="someid_1">
    <img src="link_to_image.png" />
    ...some text...
</div>

<div>...another text...</div>

<div class="spec_class">
...Important text...
</div>

I want to get both link_to_image.png and ...Important text... in one query.


Solution

  • Your query seems correct.

    Example XML:

    <div>
        <div id="someid-1"><img src="foo"/></div>
        <div class="spec_class">bar</div>
        <div class="spec_class">baz</div>
    </div>
    

    Example PHP Code:

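    // $xhtml holds the example XML shown above
    $dom = new DOMDocument;
    $dom->loadXML($xhtml);
    $xpath = new DOMXPath($dom);
    // The query is abbreviated here; it stands for the union expression from the question
    foreach ($xpath->query('//div…') as $node) {
        echo $dom->saveXML($node);
    }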
    

    Output:

    <img src="foo"/>bar
    

    Note that you will have to iterate the DOMNodeList returned by the XPath query.
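
A union returns a single DOMNodeList that mixes element nodes (the img) and text nodes (the sibling's text), so each node has to be read according to its type: an attribute lookup works on the img but not on a text node. The following is a minimal sketch of that idea, not a definitive implementation. It assumes the markup from the question is wrapped in a hypothetical `<root>` element so that it parses as XML, and the names `$query` and `$results` are illustrative only.

```php
<?php
// Sample markup modelled on the question; the <root> wrapper is an assumption.
$xhtml = <<<XML
<root>
  <div id="someid_1"><img src="link_to_image.png"/>some text</div>
  <div>another text</div>
  <div class="spec_class">Important text</div>
</root>
XML;

$dom = new DOMDocument;
$dom->loadXML($xhtml);
$xpath = new DOMXPath($dom);

// The union expression from the question, split across two strings for readability.
$query = '//div[starts-with(@id, "someid")]/img'
       . ' | //div[starts-with(@id, "someid")]/following-sibling::div[@class="spec_class"][1]/text()';

$results = [];
foreach ($xpath->query($query) as $node) {
    if ($node instanceof DOMElement) {
        // Element node (the img): read its src attribute.
        $results[] = $node->getAttribute('src');
    } elseif ($node instanceof DOMText) {
        // Text node (from text()): read its value; it has no attributes.
        $results[] = trim($node->nodeValue);
    }
}

print_r($results);
```

The nodes come back in document order, so with repeated blocks each image source is followed by the text of its matching sibling.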