Tags: php, regex, arrays, preg-replace, domparser

PHP: add nofollow to links if their domains are in an array?


How can I check whether a string contains links to one or more domains from a list stored in an array?

The array:

$domains = array('example.com', 'domain.com', 'example.net');

And the text:

Lorem ipsum <a href="http://example.net">dolor sit amet</a>, consectetur adipiscing <a href="http://domain.com">elit</a>. 
Quisque quam urna, <a href="http://example.com/some-page/">hendrerit ut</a> vestibulum sit amet, elementum interdum dolor.

What I want to do is add rel="nofollow" to the links whose domains are in the array.

Can somebody help me?


Solution

  • Don't parse HTML with a regex. Use a DOM parser such as DOMDocument instead:

    function getRootDomain($url) 
    {
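        // Source: http://stackoverflow.com/a/19068356/1438393
        // Keeps only the last two host labels, so "www.example.com" becomes
        // "example.com" (and "example.co.uk" would become "co.uk").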
        if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
            $url = "http://" . $url;
        }
        return implode('.', array_slice(explode('.', parse_url($url, PHP_URL_HOST)), -2));
    }
    
    // your domains array
    $domains = array('example.com','domain.com','example.net');
    
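    // $html is assumed to hold the text from the question
    $dom = new DOMDocument;
    // set the options before loading the markup
    $dom->preserveWhiteSpace = false;
    $dom->formatOutput = true;
    $dom->loadHTML($html);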
    
    // loop through all links
    foreach ($dom->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        if (in_array(getRootDomain($href), $domains)) {
            $link->setAttribute('rel', 'nofollow');
        }
    }
    
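    // prints the whole document, including the doctype and the
    // <html>/<body> wrappers that loadHTML() adds
    echo $dom->saveHTML();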
    

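
The snippet above overwrites any existing rel value and prints a full HTML document. The following is a minimal sketch of one possible variant, not part of the original answer: it keeps existing rel tokens, treats subdomains of a listed domain as matches, and returns only the fragment that was passed in. The names `hostMatches`, `addNofollow`, and the wrapper `<div>` are hypothetical, chosen for illustration, and the sketch assumes ASCII input (no encoding handling is attempted).

```php
<?php
// Hypothetical helper: true when $host equals one of $domains
// or is a subdomain of one of them (e.g. "www.example.com").
function hostMatches(string $host, array $domains): bool
{
    $host = strtolower($host);
    foreach ($domains as $domain) {
        $domain = strtolower($domain);
        $suffix = '.' . $domain;
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: adds "nofollow" to the rel attribute of every
// link in the HTML fragment $html whose host matches $domains.
function addNofollow(string $html, array $domains): string
{
    $dom = new DOMDocument();

    // Silence libxml warnings about imperfect markup while loading.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> so it can be serialized back
    // without the <html>/<body> wrappers that loadHTML() adds.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('a') as $link) {
        $host = parse_url($link->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($host) || !hostMatches($host, $domains)) {
            continue; // relative, malformed, or unlisted link
        }
        // Split the existing rel value into tokens and append
        // "nofollow" only if it is not already there.
        $rel = preg_split('/\s+/', $link->getAttribute('rel'), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array('nofollow', array_map('strtolower', $rel), true)) {
            $rel[] = 'nofollow';
        }
        $link->setAttribute('rel', implode(' ', $rel));
    }

    // The first <div> in document order is the wrapper; serialize its children.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Calling `addNofollow($html, $domains)` with the text and array from the question should add rel="nofollow" to all three links. Matching on a host suffix rather than on the last two labels is a design choice here: it avoids the "co.uk" problem, at the cost of requiring the array to list registrable domains exactly.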