How can I replace the contents of a tag with its link?
$str = 'This <strong>string</strong> contains a <a href="/local/link.html">local link</a>
and a <a href="http://remo.te/link.com">remote link</a>';
$str = strip_tags($str,'<a>'); // strip out the <strong> tag
$str = ?????? // how can I strip out the local link anchor tag, but leave the remote link?
echo $str;
Desired output:
This string contains a local link and a <a href="http://remo.te/link.com">remote link</a>
Or, better yet, replace the remote link with its URL:
This string contains a local link and a http://remo.te/link.com
How can I achieve the final output?
To replace your remotely linked anchor with the URL:
<a href="(https?://[^"]+)">.*?</a>
$1
To remove the anchor around a local URL:
<a href="(?!https?://)[^"]+">(.*?)</a>
$1
Explanation:
Both expressions match <a href=", ">, and </a> literally. The first one will then match a remote URL (http, optional s, ://, and everything up to the closing ") in a capture group that we can reference with $1. The second expression will match anything that does not start with the protocol used previously, and then capture the actual text of the link into $1.
Please note that regular expressions aren't the best solution for parsing HTML, since HTML is not a regular language. However, your use case seems "simple" enough that a regular expression will do. As written, the patterns will not work with links like <a href=''></a> (single-quoted attribute) or <a href="" title=""></a> (extra attributes), but they can be expanded to allow for these cases (hence my note above about HTML not being regular).
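One way the patterns might be expanded to cover those cases is sketched below. It is only a sketch: unwrap_links is a hypothetical helper name, and the pattern assumes reasonably well-formed anchors (quoted href values, no nested anchors). It accepts either quote style and extra attributes before or after href, then decides per link whether to keep the URL or the link text.

```php
<?php
// Hypothetical helper, not part of the original patterns.
function unwrap_links(string $html): string
{
    // <a ...> with an href in single or double quotes, any other attributes
    // around it, then the link text up to the closing </a>.
    // Group 1 = quote character, group 2 = href value, group 3 = link text.
    $pattern = '~<a\b[^>]*?\shref=(["\'])(.*?)\1[^>]*>(.*?)</a>~is';

    $result = preg_replace_callback($pattern, function (array $m): string {
        $href = $m[2];
        $text = $m[3];
        // Remote links become their URL; everything else keeps only its text.
        return preg_match('~^https?://~i', $href) ? $href : $text;
    }, $html);

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

$str = 'This <strong>string</strong> contains a <a href="/local/link.html">local link</a> and a <a class="ext" href=\'http://remo.te/link.com\' title="x">remote link</a>';
echo unwrap_links(strip_tags($str, '<a>'));
// This string contains a local link and a http://remo.te/link.com
```

Using a callback keeps both cases in a single pass, so the order of the two replacements no longer matters.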
PHP
$str = 'This <strong>string</strong> contains a <a href="/local/link.html">local link</a> and a <a href="http://remo.te/link.com">remote link</a>';
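$str = strip_tags($str,'<a>'); // keep only the anchor tags
// Replace each remote link with its URL.
$str = preg_replace('~<a href="(https?://[^"]+)".*?>.*?</a>~', '$1', $str);
// Unwrap each local link, keeping only its text.
$str = preg_replace('~<a href="(?!https?://)[^"]+">(.*?)</a>~', '$1', $str);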
echo $str;
// This string contains a local link and a http://remo.te/link.com
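If the markup gets less predictable than this, the caveat about regular expressions and HTML suggests reaching for a parser instead. The following is a minimal sketch of that alternative, assuming PHP's DOM extension (ext-dom) is enabled; unwrap_links_dom is a hypothetical helper name, and the wrapper div and the XML encoding hint are common workarounds for DOMDocument rather than anything taken from the question.

```php
<?php
// Hypothetical helper; assumes ext-dom is available.
function unwrap_links_dom(string $html): string
{
    $doc = new DOMDocument();

    // Silence libxml warnings about fragment markup, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    // The wrapper <div> lets us serialize only our fragment afterwards;
    // the XML encoding hint makes libxml read the input as UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);

    // Copy the live node list to an array before modifying the tree.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a) {
        $href = $a->getAttribute('href');
        // Remote links become their URL; other links keep only their text.
        $replacement = preg_match('~^https?://~i', $href) ? $href : $a->textContent;
        $a->parentNode->replaceChild($doc->createTextNode($replacement), $a);
    }

    // Serialize the wrapper's children, i.e. the fragment's inner HTML.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$str = 'This string contains a <a href="/local/link.html">local link</a> and a <a title="x" href=\'http://remo.te/link.com\'>remote link</a>';
echo unwrap_links_dom($str);
```

Because the parser handles quoting and attribute order, no special cases are needed for them. Note that serializing text nodes escapes characters such as & and <, so the output is HTML, not plain text.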