Tags: php, regex, preg-replace, preg-match

Remove escaped characters like &amp;lt; but not anchor tags


How can I remove escaped characters such as &amp;lt; and &amp;gt; while leaving real anchor tags intact? For example:

&amp;lt;a href=&amp;quot;http://www.imdb.com/name/nm0005069/&amp;quot;&amp;gt;Spike Jonze&amp;lt;/a&amp;gt; This cause by <a class="primary-black" href="http://example.com/community/RobHallums">RobHallums</a> 

should become

Spike Jonze This cause by <a class="primary-black" href="http://example.com/community/RobHallums">RobHallums</a>

Solution

  • Here's a quick one for you:

    <?php
    
    // SET OUR DEFAULT STRING
    $string = '&amp;lt;a href=&amp;quot;http://w...content-available-to-author-only...b.com/name/nm0005069/&amp;quot;&amp;gt;Spike Jonze&amp;lt;/a&amp;gt; This cause by <a class="primary-black" href="http://e...content-available-to-author-only...e.com/community/RobHallums">RobHallums</a>';
    
    // USE PREG_REPLACE TO STRIP EVERY ESCAPED TAG (&amp;lt; ... &amp;gt;); REAL <a> TAGS ARE LEFT ALONE
    $string = preg_replace('~&amp;lt;.*?&amp;gt;~', '', $string);
    
    // PRINT OUT OUR NEW STRING
    print $string;
    

    The pattern looks for &amp;lt;, followed by any character (.), repeated any number of times (*) but as few times as possible (the ? makes the * lazy), up to the next &amp;gt;.

    Every match is replaced with an empty string. The real anchor tag uses literal < and > characters, so it never matches and is left untouched, and you're left with the text you want.

    Here is a working demo:

    http://ideone.com/uSnY0b
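
The role of the ? in the pattern is easiest to see by running the same replacement with and without it. The following is a minimal sketch, not part of the original answer: the shortened $input string and the variable names $lazy and $greedy are made up for illustration.

```php
<?php

// Hypothetical, shortened input: one escaped link followed by one real anchor tag.
$input = '&amp;lt;a href=&amp;quot;#&amp;quot;&amp;gt;Spike Jonze&amp;lt;/a&amp;gt; by <a href="#">Rob</a>';

// Lazy quantifier (.*?): each match stops at the nearest &amp;gt;,
// so the two escaped tags are removed separately and the text between them survives.
$lazy = preg_replace('~&amp;lt;.*?&amp;gt;~', '', $input);

// Greedy quantifier (.*): a single match runs from the first &amp;lt; to the LAST &amp;gt;,
// which also swallows the "Spike Jonze" text between the escaped tags.
$greedy = preg_replace('~&amp;lt;.*&amp;gt;~', '', $input);

echo $lazy, PHP_EOL;   // Spike Jonze by <a href="#">Rob</a>
echo $greedy, PHP_EOL; // " by <a href="#">Rob</a>" (note the leading space)
```

By default . does not match a newline, so if the escaped tags could span several lines, adding the s modifier ('~&amp;lt;.*?&amp;gt;~s') would likely be needed; that case is an assumption and does not arise in the sample string from the question.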