
DOMXpath query on query


I have the following HTML:

[...]
<div class="row clearfix">
    <div class="col1">Data</div>
    <div class="col2">Data</div>
    <div class="col3">Data</div>
    <div class="col4">Data</div>
    <div class="col5">Data</div>
    <div class="col6">Data</div>
    <div class="col7">Data</div>
    <div class="col8">Data</div>
</div><!--// row-->

<div class="row clearfix otherClass">
    <div class="col1">Data</div>
    <div class="col2">Data</div>
    <div class="col3">Data</div>
    <div class="col4">Data</div>
    <div class="col5">Data</div>
    <div class="col6">Data</div>
    <div class="col7">Data</div>
    <div class="col8">Data</div>
</div><!--// row-->

<div class="row clearfix thirdClass">
    <div class="col1">Data</div>
    <div class="col2">Data</div>
    <div class="col3">Data</div>
    <div class="col4">Data</div>
    <div class="col5">Data</div>
    <div class="col6">Data</div>
    <div class="col7">Data</div>
    <div class="col8">Data</div>
</div><!--// row-->
[...]

I want to extract all of these divs from the HTML. Each one has a class that starts with "row clearfix", but it may carry additional class names. After that, I want to handle each column separately, i.e. get the value of col1, col2, col3, etc.

I have written this code, but am stuck now. Can someone help me out?

        $oDom = new DOMDocument();
        $oDom->loadHtml($a_sHTML);

        $oDomXpath = new DOMXpath($oDom);
        $oDomObject = $oDomXpath->query('//div[@class="row clearfix"]');

        foreach ($oDomObject as $oObject) {
            var_dump($oObject->query('//div[@class="col1"]')->nodeValue);
        }



UPDATE *Solution*
Thanks to the replies below, I got it working with the following code:

    $oDom = new DOMDocument();
    @$oDom->loadHtml($a_sHTML);

    $oDomXpath = new DOMXpath($oDom);
    $oDomObject = $oDomXpath->query('//div[contains(@class,"row") and contains(@class,"clearfix")]');

    foreach ($oDomObject as $oObject) {
        foreach($oObject->childNodes as $col)
        {
            if ($col->hasAttributes())
            {
                var_dump($col->getAttribute('class') . " == " . trim($col->nodeValue));
            }
        }
    }

Solution

  • To match the outer divs, any of these should work:

    //div[starts-with(@class,"row clearfix")]

    or

    //div[contains(@class,"row clearfix")]

    or

    //div[contains(@class,"row") and contains(@class,"clearfix")]

    I'd go for the last one because the class names could be in any order. Keep in mind that contains() does a substring match, so it would also match a class such as "arrow".

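    I'm not entirely sure what you want to do with the inner divs, but you can select them relative to each outer div (pass the outer div as the context node, and leave off the leading "//") with something like this: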

    div[starts-with(@class,"col")]
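
To tie the two expressions together, here is a minimal sketch of how they might be combined. The code in the question fails because DOMElement has no query() method; the query has to go through the DOMXPath object, with the outer div passed as the second (context node) argument. The sample markup, the cell values and the variable names ($rowQuery, $rows, $cols) are made up for illustration, and the stricter whole-token class test using concat() and normalize-space() is my own substitution for the plain contains() check, not something the original code used.

```php
<?php
// Sample markup modelled on the question; the cell values are invented.
$html = <<<'HTML'
<div class="row clearfix">
    <div class="col1">A1</div>
    <div class="col2">A2</div>
</div>
<div class="row clearfix otherClass">
    <div class="col1">B1</div>
    <div class="col2">B2</div>
</div>
<div class="arrow clearfix">
    <div class="col1">should not match</div>
</div>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of silencing with @
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Match "row" and "clearfix" as whole class tokens, in any order.
// Padding @class with spaces keeps "arrow" from matching "row".
$rowQuery = '//div[contains(concat(" ", normalize-space(@class), " "), " row ")'
          . ' and contains(concat(" ", normalize-space(@class), " "), " clearfix ")]';

$rows = [];
foreach ($xpath->query($rowQuery) as $row) {
    $cols = [];
    // No leading "//": the expression is evaluated relative to $row,
    // which is passed as the context node (second argument).
    foreach ($xpath->query('div[starts-with(@class, "col")]', $row) as $col) {
        $cols[$col->getAttribute('class')] = trim($col->nodeValue);
    }
    $rows[] = $cols;
}

print_r($rows);
```

Each entry of $rows is then an array keyed by the column's class name, so a single cell can be read as $rows[0]['col1']. Note that a relative expression starting with "//" would still search the whole document, even when a context node is supplied.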