
preg_replace_callback pattern issue


I'm using the following pattern in a preg_replace_callback to capture links and turn them into HTML-friendly links. For the most part it works.

"#(https?|ftp)://(\S+[^\s.,>)\];'\"!?])#"

But this pattern fails when the text reads like so:

http://mylink.com/page[/b]

At that point it captures the [/b, assuming it is part of the link, resulting in this:

<a href="http://woodmill.co.uk[/b">woodmill.co.uk[/b</a>]

I've looked over the pattern, and used some cheat sheets to try and follow what is happening, but it has foxed me. Can any of you code ninjas help?
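
The behaviour can be reproduced in isolation. The following is a minimal sketch, assuming the sample input shown earlier; the second pattern is a hypothetical variant (not taken from the thread) that keeps "[" out of the URL body, and the variable names are illustrative.

```php
<?php
// Pattern copied from the question.
$original = "#(https?|ftp)://(\S+[^\s.,>)\];'\"!?])#";

// Hypothetical variant (an assumption, not from the thread): "[" is
// excluded from the URL body, so a match stops before a BBCode tag.
$variant = "#(https?|ftp)://([^\s\[]+[^\s.,>)\[\];'\"!?])#";

$text = 'http://mylink.com/page[/b]';

preg_match($original, $text, $before);
preg_match($variant, $text, $after);

echo $before[0], "\n"; // the greedy \S+ only backtracks as far as "b"
echo $after[0], "\n";  // stops before the "[" of the BBCode tag
```

With this input the first match is http://mylink.com/page[/b and the second is http://mylink.com/page. The trade-off of the variant is that URLs which legitimately contain "[" (bracketed IPv6 hosts, for example) would be truncated or missed.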


Solution

  • OK, I solved the problem. Thanks to @Cyborgx37 and @MikeBrant for your help. Here's the solution.

    Firstly I replaced my regexp pattern with the one that João Castro used in this question: Making a url regex global

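    The problem with that pattern is that it captured any trailing dots, so in the final section of the pattern I added ^. making the final part look like so: [^\s^.]. As I read it: do not match a trailing space or dot. (Strictly, only the leading ^ negates a character class; the second ^ is a literal caret, so [^\s.] is enough, and [^\s^.] additionally rejects a trailing caret.)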
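
    The effect of that final character class can be checked on its own. This is a minimal sketch: the $ anchor and the sample strings are assumptions added purely for the test, not part of the pattern used in this answer.

    ```php
    <?php
    // Illustrative only: each class is anchored with $ so that it is
    // tested against the last character of the sample string.
    $withCaret    = '/[^\s^.]$/'; // as written in the answer
    $withoutCaret = '/[^\s.]$/';  // without the second caret

    foreach (['page', 'page.', 'page^'] as $sample) {
        printf(
            "%-6s with caret: %d, without caret: %d\n",
            $sample,
            preg_match($withCaret, $sample),
            preg_match($withoutCaret, $sample)
        );
    }
    ```

    Both classes accept "page" and reject "page."; they differ only on "page^", which the version containing the second caret rejects.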

    This still caused an issue matching bbcode as I mentioned above, so I used preg_replace_callback() and create_function() to filter it out. The final create_function() looks like this:

    create_function('$match','
                    $match[0] = preg_replace("/\[\/?(.*?)\]/", "", $match[0]);
                    $match[0] = preg_replace("/\<\/?(.*?)\>/", "", $match[0]);
                    $m = trim(strtolower($match[0]));
                    $m = str_replace("http://", "", $m);
                    $m = str_replace("https://", "", $m);
                    $m = str_replace("ftp://", "", $m);
                    $m = str_replace("www.", "", $m);
    
                    if (strlen($m) > 25)
                    {
                        $m = substr($m, 0, 25) . "...";
                    }
    
                    return "<a href=\"$match[0]\" target=\"_blank\">$m</a>";
    '), $string);
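
    On newer PHP versions the same callback can be written as an anonymous function, since create_function() was deprecated in PHP 7.2 and removed in PHP 8.0. The following is a minimal sketch rather than the exact code from this answer: $pattern is a simplified stand-in (an assumption, not the pattern from the linked question), and the names $linkify, $url and $label are illustrative.

    ```php
    <?php
    // Hypothetical, simplified pattern: NOT the one from the linked question.
    // It matches a scheme followed by non-space characters and must end on
    // something other than whitespace or a dot.
    $pattern = '#(?:https?|ftp)://\S*[^\s.]#i';

    $linkify = function (array $match): string {
        // Strip BBCode and HTML tags that were caught inside the match.
        $url = preg_replace('/\[\/?(.*?)\]/', '', $match[0]);
        $url = preg_replace('/<\/?(.*?)>/', '', $url);

        // Build a short, lower-case label without the scheme or "www.".
        $label = trim(strtolower($url));
        $label = str_replace(['http://', 'https://', 'ftp://', 'www.'], '', $label);
        if (strlen($label) > 25) {
            $label = substr($label, 0, 25) . '...';
        }

        return '<a href="' . $url . '" target="_blank">' . $label . '</a>';
    };

    $string = 'http://mylink.com/page[/b]';
    echo preg_replace_callback($pattern, $linkify, $string), "\n";
    ```

    Note that a tag caught inside the match, such as the [/b] here, is removed from the output entirely, just as in the create_function() version. For untrusted input, passing $url and $label through htmlspecialchars() before building the link would be a sensible addition.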
    

    Tests so far are looking good, so I'm happy it is now solved.

    Thanks again, and I hope this helps someone else :)