I want to remove references on Wikipedia with AutoWikiBrowser (.NET regex flavor), an automated editor that supports regexes, but I am facing a newbie problem with the tags.
For example, I want to remove all references containing example.com, e.g.
<ref>{{cite web|title=Bar|url=https://example.com/bar}}</ref>
I tried the basic regex <ref>.*?example.com.*?</ref> (replaced with nothing), but it also captures everything after the first <ref> tag encountered, e.g.:
<ref>{{cite web|title=Foo|url=https://zzz.com/foo}}</ref> blah-blah <ref>{{cite web|title=Bar|url=https://example.com/bar}}</ref>
I tried lookarounds with the tags, but the issue is it is not capturing the tags.
I am sorry to ask such a simple question, but I have been searching for the last hour to no avail. I speak English quite fluently, but not when it comes to technical terms...
You can use this regex, which will match a <ref> tag that includes example.com before the closing </ref>:
<ref>(?:(?!<\/ref>).)*example\.com.*?<\/ref>
This matches:
<ref> : the characters <ref>
(?:(?!<\/ref>).)* : any number of characters that do not start a closing </ref> tag (using a tempered greedy token)
example\.com : the characters example.com
.*? : a minimal number of characters
<\/ref> : the characters </ref>
Demo on regex101
Note: depending on your regex engine and its regex delimiters, you may not need the \ before the / in </ref>.
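To see the difference between the two patterns on the sample text from the question, here is a minimal sketch. It uses Python's re module rather than the .NET engine that AutoWikiBrowser uses, on the assumption that both engines treat this particular syntax (lazy quantifiers and negative lookahead) the same way. The variable names are only illustrative, and the / is left unescaped because Python has no regex delimiters.

```python
import re

# Sample text from the question: two references, only the second
# one points at example.com.
text = (
    "<ref>{{cite web|title=Foo|url=https://zzz.com/foo}}</ref> blah-blah "
    "<ref>{{cite web|title=Bar|url=https://example.com/bar}}</ref>"
)

# Naive pattern: the lazy .*? may still run across the first </ref>,
# because the match is anchored at the first <ref> it finds.
# (The dot in example.com is escaped here so it only matches a literal dot.)
naive = re.compile(r"<ref>.*?example\.com.*?</ref>")

# Tempered greedy token: each character is consumed only if it does
# not begin a closing </ref>, so the match cannot leave its own tag.
tempered = re.compile(r"<ref>(?:(?!</ref>).)*example\.com.*?</ref>")

naive_result = naive.sub("", text)
tempered_result = tempered.sub("", text)

print(repr(naive_result))     # both references and the text between them are removed
print(repr(tempered_result))  # only the example.com reference is removed
```

Note that . does not match a newline by default in either engine, so a reference that spans several lines would need the single-line option (RegexOptions.Singleline in .NET, re.DOTALL in Python).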