php regex replace hyperlink hashtag

Convert hashtags to hyperlinks without partially matching htmlentities


I want to replace all occurrences of #word with an HTML link. I have written a preg_replace() call for this:

$text = preg_replace('~#([\p{L}|\p{N}]+)~u', '<a href="/?aranan=$1">#$1</a>', $text);

The problem is that this regular expression also matches HTML character references such as &#039;, which corrupts the output.

I need to exclude alphanumeric substrings that are preceded by &#, but I do not know how to do that with a regular expression.


Solution

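  • '~(?<!&)#([\p{L}\p{N}]+)~u'
    

    (?<!&) is a negative lookbehind assertion: http://www.php.net/manual/en/regexp.reference.assertions.php

    It lets # match only when the character immediately before it is not &, so &#039; is left untouched. The | from the question's character class is dropped here: inside [...] it is a literal pipe rather than alternation, so it would have allowed | as a hashtag character.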
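
    To see the lookbehind in action, here is a minimal sketch that wraps the call in a function. The name linkHashtags is hypothetical and not part of the question or answer. The pattern is the one from the answer with the literal | removed from the character class, and the fallback to the original text on failure is an assumption about the desired behaviour.

```php
<?php
// Hypothetical helper: the function name is illustrative only.
function linkHashtags(string $text): string
{
    // (?<!&)  negative lookbehind: the "#" must not directly follow "&"
    // [\p{L}\p{N}]+  one or more Unicode letters or digits
    // preg_replace() returns null on error (for example, invalid UTF-8
    // with the "u" modifier); in that case the input is returned unchanged.
    return preg_replace(
        '~(?<!&)#([\p{L}\p{N}]+)~u',
        '<a href="/?aranan=$1">#$1</a>',
        $text
    ) ?? $text;
}

echo linkHashtags("It&#039;s #php time"), "\n";
// prints: It&#039;s <a href="/?aranan=php">#php</a> time
```

    The character reference &#039; passes through unchanged, while #php, which follows a space, is turned into a link.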