RegEx find all anchor tags, including ones with images


I am trying to find all <a> tags and wrap each one in a div. How do I change the regex so that it matches <a><img></a> as well as <a>text</a>, or any other tags nested inside the <a> tag? I have:

<?php

    $a_pattern = '@<a\s*.*>.*(<.*>)?.*</a>@i';
    $out = preg_replace_callback($a_pattern,"match_callback",$html);
    function match_callback($matches)
    {
         var_dump($matches);
    }

?>

Solution

  • Here is how you can do it with PHP's built-in DOM parser (the HTML below is made-up sample markup, but you will get the idea):

    <?php
    $doc = new DOMDocument('1.0', 'UTF-8');
    // loadHTML() is an instance method; calling it statically throws an Error on PHP 8+
    $doc->loadHTML('<body>
         <a href="somewere"><img src="www.foo.com/example.gif" class="foo" alt="..."><br></a>
         <a href="somewere again"><img src="www.bar.com/1.jpg" class="bar" alt="..."></a>
         <a href="somewere again and back">Text</a>
         </body>
    ', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    
    foreach ($doc->getElementsByTagName('a') as $a_node) {
       $div = $a_node->ownerDocument->createElement('div');
       $node = $a_node->parentNode->insertBefore($div, $a_node);
       $node->appendChild($a_node);
    }
    echo $doc->saveHTML();
    

    Output of the sample demo:

    <body>
    <div><a href="somewere"><img src="www.foo.com/example.gif" class="foo" alt="..."><br></a></div>
    <div><a href="somewere%20again"><img src="www.bar.com/1.jpg" class="bar" alt="..."></a></div>
    <div><a href="somewere%20again%20and%20back">Text</a></div>
    </body>
    

    You can also add attributes to the new div inside the loop ($node is the <div> returned by insertBefore()):

    $node->setAttribute('class', 'title');
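
Putting the wrapping step and the attribute call together, a minimal sketch might look like the following. The function name `wrapAnchorsInDiv`, its `$class` parameter, and the sample markup are assumptions made for illustration and are not part of the original answer. The sketch copies the node list into an array before modifying the tree, which is a defensive choice rather than a requirement of the loop above, and it assumes the input has a single root element, as the `<body>` sample does.

```php
<?php
// Hypothetical helper (name and parameters are illustrative):
// wraps every <a> element in a <div> carrying the given class.
function wrapAnchorsInDiv(string $html, string $class = 'title'): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');

    // Collect libxml warnings internally instead of printing them,
    // then restore the previous setting afterwards.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list, so take a snapshot
    // before inserting new nodes into the document.
    $anchors = iterator_to_array($doc->getElementsByTagName('a'));

    foreach ($anchors as $a) {
        $div = $doc->createElement('div');
        $div->setAttribute('class', $class);
        // Put the empty <div> where the <a> is, then move the <a> inside it.
        $a->parentNode->insertBefore($div, $a);
        $div->appendChild($a);
    }

    return $doc->saveHTML();
}

// Sample usage with made-up markup.
echo wrapAnchorsInDiv(
    '<body><a href="page.html">Text</a><a href="img.html"><img src="1.jpg" alt="..."></a></body>'
);
```

Because the anchor is moved as a whole node, whatever it contains (text, an image, or other nested tags) travels with it, so no special handling is needed for the `<a><img></a>` case.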