I wrote this function to convert all URLs on one specific domain (mywebsite.com) to links, and to replace every other URL with @@@spam@@@.
function get_global_convert_all_urls($content) {
    $content = strtolower($content);
    $replace = "/(?:http|https)?(?:\:\/\/)?(?:www.)?(([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:\/.*)?/im";
    preg_match_all($replace, $content, $search);
    $total = count($search[0]);
    for($i=0; $i < $total; $i++) {
        $url = $search[0][$i];
        if(preg_match('/mywebsite.com/i', $url)) {
            $content = str_replace($url, '<a href="'.$url.'">'.$url.'</a>', $content);
        } else {
            $content = str_replace($url, '@@@spam@@@', $content);
        }
    }
    return $content;
}
The only problem I can't solve is that the regex does not stop at a space when there are two URLs on one line.
$content = "http://www.mywebsite.com/index.html http://www.others.com/index.html";
Result:
<a href="http://www.mywebsite.com/index.html http://www.others.com/index.html">http://www.mywebsite.com/index.html http://www.others.com/index.html</a>
How can I get the result below?
<a href="http://www.mywebsite.com/index.html">http://www.mywebsite.com/index.html</a> @@@spam@@@
I have tried adding (\s|$) to the end of the regex, but no luck:
/(?:http|https)?(?:\:\/\/)?(?:www.)?(([A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:\/.*)?(\s|$)/im
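Edited based on the change in your question.
The problem is the .* at the end of your regex: . matches any character, including a space, and * is greedy, so it swallows everything up to the end of the line. My suggestion is to replace it with a more precise expression. I cooked this up real quick, so you'll want to run some tests to verify your cases. =)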
$matches = null;
$returnValue = preg_match_all('!(?:http|https)?(?:\\:\\/\\/)?(?:www.)?(([A-Za-z0-9-]+\\.)*[A-Za-z0-9-]+\\.[A-Za-z]+)(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\\-\\._\\?\\,\\\'/\\\\\\+&%\\$#\\=~])*[^\\.\\,\\)\\(]!', 'mywebsite.com/index.html others.com/index.html', $matches);
Results in:
array (
  0 =>
  array (
    0 => 'mywebsite.com/index.html ',
    1 => 'others.com/index.html',
  ),
  1 =>
  array (
    0 => 'mywebsite.com',
    1 => 'others.com',
  ),
  2 =>
  array (
    0 => '',
    1 => '',
  ),
  3 =>
  array (
    0 => '',
    1 => '',
  ),
  4 =>
  array (
    0 => 'l',
    1 => 'm',
  ),
)
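Note that the first match above still carries a trailing space, because the final [^\.\,\)\(] accepts any character that is not one of those four. If that matters, a simpler route may be to keep the question's pattern and only swap the greedy .* for \S* (non-whitespace), then do the replacement in a single pass with preg_replace_callback instead of str_replace. The following is a minimal sketch of that idea, not a tested drop-in: the function name convert_urls_sketch is made up for illustration, the "mywebsite.com or any subdomain of it" rule is an assumption about what should count as your own site, and the htmlspecialchars call is an extra precaution that was not in the original function.

```php
<?php
// Hypothetical helper, named for illustration only.
function convert_urls_sketch(string $content): string
{
    // Same overall shape as the question's pattern, with two changes:
    //  - the dot in "www." is escaped so it only matches a literal dot
    //  - the path part is \S* instead of .*, so a match ends at whitespace
    $pattern = '~(?:https?://)?(?:www\.)?((?:[A-Za-z0-9-]+\.)*[A-Za-z0-9-]+\.[A-Za-z]+)(?:/\S*)?~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url  = htmlspecialchars($m[0], ENT_QUOTES); // safe to embed in an attribute
        $host = strtolower($m[1]);                   // host without scheme or "www."

        // Assumption: mywebsite.com and its subdomains are the allowed hosts.
        if (preg_match('~(^|\.)mywebsite\.com$~', $host)) {
            return '<a href="' . $url . '">' . $url . '</a>';
        }
        return '@@@spam@@@';
    }, $content);
}
```

Calling convert_urls_sketch('http://www.mywebsite.com/index.html http://www.others.com/index.html') should give the result asked for in the question. Two caveats: checking the captured host rather than the whole URL means something like notmywebsite.com is treated as spam, and, as with the original pattern, any bare word.word in ordinary text (for example a file name) is also treated as a URL.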