Find and append hrefs of a certain class


I've been searching for a solution to this but haven't found quite the right thing yet.

The situation is this: I need to find all links on a page with a given class (say class="tracker") and append query string values to the end of each href, so that when a user loads the page, those links carry some dynamic information.

I know how this can be done with JavaScript, but I'd really like to run it server-side instead. I'm quite new to PHP. From the looks of it, XPath might be what I need, but I haven't found a suitable example to get started with. Is there anything like GetElementByClass?

Any help would be greatly appreciated!

Shadowise


Solution

  • Is there anything like GetElementByClass?

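    DOMDocument has no such method built in, so here is an implementation I whipped up. It returns an array of every element whose class attribute contains the given class name.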

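    // Returns an array of all elements in $domNode whose class attribute
    // contains $className as a whole, whitespace-separated token.
    function getElementsByClassName(DOMDocument $domNode, $className) {
        $elements = $domNode->getElementsByTagName('*');
        $matches = array();
        foreach($elements as $element) {
            // Skip elements that have no class attribute at all.
            if ( ! $element->hasAttribute('class')) {
                continue;
            }
            // An element can carry several classes, so split on whitespace.
            $classes = preg_split('/\s+/', $element->getAttribute('class'));
            if ( ! in_array($className, $classes)) {
                continue;
            }
            $matches[] = $element;
        }
        return $matches;
    }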
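
    Since the question mentions XPath: the same lookup can probably be done in a single query with DOMXPath. This is only a sketch. The function name getElementsByClassNameXPath and the sample markup are made up for illustration, and it assumes the class name contains no apostrophe, since the name is pasted straight into the XPath expression.

    ```php
    <?php
    // Hypothetical helper (not part of the answer's code): selects elements
    // by class name with XPath instead of looping over every element.
    function getElementsByClassNameXPath(DOMDocument $dom, string $className): array
    {
        $xpath = new DOMXPath($dom);

        // Pad the normalized class attribute with spaces so that "tracker"
        // matches as a whole token and not as part of "tracker-old".
        // Assumption: $className contains no apostrophe.
        $query = sprintf(
            "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
            $className
        );

        // iterator_to_array() turns the DOMNodeList into a plain array.
        return iterator_to_array($xpath->query($query));
    }

    // Made-up sample markup for the demonstration.
    $dom = new DOMDocument;
    $dom->loadHTML('<body><a href="/a" class="tracker big">a</a><a href="/b" class="tracker-old">b</a><a href="/c">c</a></body>');

    foreach (getElementsByClassNameXPath($dom, 'tracker') as $element) {
        echo $element->getAttribute('href'), "\n";
    }
    ```

    With this sample markup only /a is printed: "tracker-old" is a different class, and the third link has no class attribute.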
    

    For your actual problem, here is a complete example. It checks the class of each anchor inline, so it doesn't rely on getElementsByClassName().

    $str = '<body>
        <a href="">a</a>
            <a href="http://example.com" class="tracker">a</a>
            <a href="http://example.com?hello" class="tracker">a</a>
        <a href="">a</a>
    </body>
        ';
    
    $dom = new DOMDocument;
    
    $dom->loadHTML($str);
    
    $anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');
    
    foreach($anchors as $anchor) {
    
        if ( ! $anchor->hasAttribute('class')) {
            continue;
        }
    
        $classes = preg_split('/\s+/', $anchor->getAttribute('class'));
    
        if ( ! in_array('tracker', $classes)) {
            continue;
        }
    
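        $href = $anchor->getAttribute('href');
    
        // parse_url() tells us whether the link already has a query string.
        $url = parse_url($href);
    
        $attach = 'stackoverflow=true';
    
        // Join with "&" if a query string exists, otherwise start one with "?".
        if (isset($url['query'])) {
            $href .= '&' . $attach;
        } else {
            $href .= '?' . $attach;
        }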
    
        $anchor->setAttribute('href', $href);
    }
    
    echo $dom->saveHTML();
    

    Output

    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><body>
        <a href="">a</a>
            <a href="http://example.com?stackoverflow=true" class="tracker">a</a>
            <a href="http://example.com?hello&amp;stackoverflow=true" class="tracker">a</a>
        <a href="">a</a>
    </body></html>
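
    One case the loop above does not cover is an href that ends in a fragment such as #top: appending to the end of the string would put the new parameter after the fragment, where it is no longer part of the query string. A rough sketch of one way to handle that follows. The function name appendQuery is hypothetical and not part of the code above.

    ```php
    <?php
    // Hypothetical helper (not in the answer's code): appends a query string
    // to a URL while keeping any #fragment at the very end.
    function appendQuery(string $href, string $attach): string
    {
        // Split off the fragment, if any, so the query can go before it.
        $fragment = '';
        $hashPos = strpos($href, '#');
        if ($hashPos !== false) {
            $fragment = substr($href, $hashPos);
            $href = substr($href, 0, $hashPos);
        }

        // Use "&" when a "?" is already present, "?" otherwise.
        $separator = (strpos($href, '?') !== false) ? '&' : '?';

        return $href . $separator . $attach . $fragment;
    }

    echo appendQuery('http://example.com', 'stackoverflow=true'), "\n";
    echo appendQuery('http://example.com?hello', 'stackoverflow=true'), "\n";
    echo appendQuery('http://example.com/page#top', 'stackoverflow=true'), "\n";
    ```

    The first two calls give the same hrefs as in the output above. The third gives http://example.com/page?stackoverflow=true#top. Inside the loop you would then call $anchor->setAttribute('href', appendQuery($href, $attach)) instead of building the string by hand.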