Using multiple find in foreach with QueryPath


I'm using QueryPath and PHP.

This finds the .eventdate okay, but doesn't return anything for .dtstart:

$qp = htmlqp($url);
foreach ($qp->find('table#schedule')->find('tr') as $tr){
    echo 'date: ';
    echo $tr->find('.eventdate')->text();
    echo ' time: ';
    echo $tr->find('.dtstart')->text();
    echo '<br>';
}

If I swap the two, .dtstart works but .eventdate returns nothing. So it seems that find() in QueryPath destroys the element and only returns the value it needs, which makes it impossible to search the same $tr for more than one item.

Here's example HTML for a TR I'm dealing with:

<tr class="event"><th class="date first" scope="row"><abbr class="eventdate" title="Thursday, February 01, 2011" >02/01</abbr><span class="eventtime" ><abbr class="dtstart" title="2012-02-01T19:00:00" >7:00 PM</abbr><abbr class="dtend" title="2012-02-01T21:00:00" >9:00 PM</abbr></span></th><td class="opponent summary"><ul><li class="first">@ <a class="team" href="/high-schools/ridge-wolves/basketball-winter-11-12/schedule.htm" >Ridge </a> <span class="game-note">*</span></li><li class="location" title="Details: Ridge High School">Details: Ridge High School</li><li class="last"><a class="" href="/local/stats/pregame.aspx?contestid=4255-4c6c-906d&amp;ssid=381d-49f5-9f6d" >Preview Game</a></li></ul></td><td class="result last"><a class="pregame" href="/local/stats/pregame.aspx?contestid=4255-4c6c-906d&amp;ssid=381d-49f5-9f6d">Preview</a></td></tr>

I tried copying the $tr before the first find and replacing it before the second, but that didn't work.

How can I search within each $tr for several elements?

FYI, beyond .eventdate and .dtstart, I also want .opponent, the href of the a inside .opponent, and that link's anchor text.


Solution

  • I'm just learning QueryPath myself, but I think you should branch the row object. Otherwise $tr->find('.eventdate') moves $tr itself to the abbr element inside the row, and each following find() then searches beneath that abbr, which yields no matches. branch() (see documentation) creates a copy of the QueryPath object, leaving the original object (in this case $tr) intact.

    So your code would be:

    $qp = htmlqp($url);
    foreach ($qp->find('table#schedule')->find('tr') as $tr){
        echo 'date: ';
        echo $tr->branch()->find('.eventdate')->text();
        echo ' time: ';
        echo $tr->branch()->find('.dtstart')->text();
        echo '<br>';
    }
    

    I don't know if this is the preferred way to achieve what you want, but it seems to work.
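
The question also asks for the opponent link's href and anchor text. The same branching idea might extend to those fields roughly as sketched below. This is only a sketch under assumptions: QueryPath is taken to be installed via Composer and loaded from vendor/autoload.php, the markup is a trimmed copy of the sample row from the question, the selector .opponent a.team is read off that sample, and $rows is a hypothetical name for the collected results.

```php
<?php
// Assumption: QueryPath is installed via Composer; adjust this to however
// your project loads the library.
require 'vendor/autoload.php';

// Trimmed copy of the sample row from the question, wrapped in the table.
$html = '<table id="schedule"><tr class="event">'
      . '<th class="date first" scope="row"><abbr class="eventdate">02/01</abbr>'
      . '<span class="eventtime"><abbr class="dtstart">7:00 PM</abbr></span></th>'
      . '<td class="opponent summary"><ul><li class="first">@ '
      . '<a class="team" href="/high-schools/ridge-wolves/basketball-winter-11-12/schedule.htm">Ridge </a>'
      . '</li></ul></td></tr></table>';

$rows = array(); // hypothetical container for the extracted fields

foreach (htmlqp($html, 'table#schedule tr') as $tr) {
    // Each lookup starts from a fresh branch, so $tr keeps pointing at the row.
    $team = $tr->branch()->find('.opponent a.team');

    $rows[] = array(
        'date' => $tr->branch()->find('.eventdate')->text(),
        'time' => $tr->branch()->find('.dtstart')->text(),
        // attr() and text() only read from the matched link.
        'href' => $team->attr('href'),
        'team' => trim($team->text()),
    );
}

print_r($rows);
```

Keeping the branched link in $team avoids running the same selector twice for the href and the anchor text.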