Search code examples
phpregexhyperlinkreplacehref

PHP URL to Link with Regex


I know I've seen this done a lot in places, but I need something a little more different than the norm. Sadly When I search this anywhere it gets buried in posts about just making the link into an html tag link. I want the PHP function to strip out the "http://" and "https://" from the link as well as anything after the .* so basically what I am looking for is to turn A into B.

A: http://www.youtube.com/watch?v=spsnQWtsUFM
B: <a href="http://www.youtube.com/watch?v=spsnQWtsUFM">www.youtube.com</a>

If it helps, here is my current PHP regex replace function.

ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\" class=\"bwl\" target=\"_new\">\\0</a>", htmlspecialchars($body, ENT_QUOTES)));

It would probably also be helpful to say that I have absolutely no understanding in regular expressions. Thanks!

EDIT: When I entered a comment like this blahblah https://www.facebook.com/?sk=ff&ap=1 blah I get html like this<a class="bwl" href="blahblah https://www.facebook.com/?sk=ff&amp;ap=1 blah">www.facebook.com</a> which doesn't work at all as it is taking the text around the link with it. It works great if someone only comments a link however. This is when I changed the function to this

preg_replace("#^(.*)//(.*)/(.*)$#",'<a class="bwl" href="\0">\2</a>',  htmlspecialchars($body, ENT_QUOTES));

Solution

  • This is the simples and cleanest way:

    $str = 'http://www.youtube.com/watch?v=spsnQWtsUFM';
    preg_match("#//(.+?)/#", $str, $matches);
    
    $site_url = $matches[1];
    

    EDIT: I assume that the $str had been checked to be a URL in the first place, so I left that out. Also, I assume that all the URLs will contain either 'http://' or 'https://'. In case the url is formatted like this www.youtube.com/watch?v=spsnQWtsUFM or even youtube.com/watch?v=spsnQWtsUFM, the above regexp won't work!

    EDIT2: I'm sorry, I didn't realize that you were trying to replace all strings in a whole test. In that case, this should work the way you want it:

    $str = preg_replace('#(\A|[^=\]\'"a-zA-Z0-9])(http[s]?://(.+?)/[^()<>\s]+)#i', '\\1<a href="\\2">\\3</a>', $str);