I want to remove all links (<a href=""></a>) from the text, except for those whose href attribute is "site.com" (for example).
<a href="site.com">text</a>
<a href="google.com">text</a>
<a href="yandex.com">text</a>
That is, the last two links should be removed. Can you please tell me the correct regular expression for this (in Notepad++)?
First, the .* should be lazy (.*?), because otherwise it is greedy and will match more than necessary: everything from the first opening tag up to the last closing tag on the line.
<a href=".*?">.*?</a>
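To see the difference between the greedy and lazy versions, here is a minimal sketch in Python. It assumes Python's re module behaves like Notepad++'s regex engine for these simple patterns, and the sample line is made up for illustration.

```python
import re

# Two links on the same line (illustrative sample, based on the question).
text = '<a href="site.com">text</a> <a href="google.com">text</a>'

# Greedy: .* grabs as much as possible, so both links end up in one match.
greedy = re.findall(r'<a href=".*">.*</a>', text)

# Lazy: .*? stops at the first possible closing point, one match per link.
lazy = re.findall(r'<a href=".*?">.*?</a>', text)

print(len(greedy))  # 1
print(len(lazy))    # 2
```

The greedy pattern returns the whole line as a single match, which is why a replace would wipe out everything between the first and last link.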
Next, you can use a negative lookahead to prevent <a href="site.com">text</a> from matching, like this:
<a href="(?!site\.com">).*?">.*?</a>
If you replace with nothing, only <a href="site.com">text</a> will be left.
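As a rough check of the lookahead outside Notepad++, the following Python sketch runs the same idea on the three sample links from the question. It assumes Python's re module as a stand-in for Notepad++'s engine, and the dot in site.com is escaped here so it only matches a literal dot.

```python
import re

# The three sample links from the question, one per line.
text = (
    '<a href="site.com">text</a>\n'
    '<a href="google.com">text</a>\n'
    '<a href="yandex.com">text</a>'
)

# (?!site\.com">) rejects a match whenever the href value is exactly site.com.
pattern = re.compile(r'<a href="(?!site\.com">).*?">.*?</a>')

# Replacing with an empty string removes every link except the site.com one.
result = pattern.sub('', text)

print(repr(result))  # '<a href="site.com">text</a>\n\n'
```

The line breaks of the removed links stay behind, so the result contains two empty lines after the remaining link.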
If you want to keep the link text, wrap it in parentheses (a capture group) and reference the group in the replacement:
<a href="(?!site\.com">).*?">(.*?)</a>
And replace with $1.
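The capture-group replacement can be sketched in Python as well. Note that Python's re.sub writes the group reference as \1 rather than Notepad++'s $1; the link texts (Home, Search, Mail) are invented here so the effect is visible, and the dot in site.com is escaped to match only a literal dot.

```python
import re

# Illustrative input: different link texts make the result easier to read.
text = (
    '<a href="site.com">Home</a> '
    '<a href="google.com">Search</a> '
    '<a href="yandex.com">Mail</a>'
)

# Group 1 captures the text between the opening and closing tags.
pattern = re.compile(r'<a href="(?!site\.com">).*?">(.*?)</a>')

# \1 is Python's spelling of the first group (Notepad++ uses $1).
result = pattern.sub(r'\1', text)

print(result)  # <a href="site.com">Home</a> Search Mail
```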
Be sure to select "Regular expression" as the search mode. And if your links span multiple lines, check the ". matches newline" checkbox as well.
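The effect of ". matches newline" can be approximated in Python with the re.DOTALL flag. This is only a sketch under the assumption that the two options behave alike for this pattern; the multi-line link is a made-up example.

```python
import re

# A link whose text spans two lines (hypothetical example).
text = '<a href="google.com">multi\nline</a>'

regex = r'<a href="(?!site\.com">).*?">(.*?)</a>'

# Without DOTALL, "." stops at the newline, so the link is not matched.
without_flag = re.sub(regex, r'\1', text)

# With DOTALL, "." also matches newlines and the link is unwrapped.
with_flag = re.sub(regex, r'\1', text, flags=re.DOTALL)

print(without_flag == text)  # True
print(repr(with_flag))       # 'multi\nline'
```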