I'm looking for a regex that matches certain URLs:
I want to match any URL unless it includes the word "Koeln". URLs that contain the word "Karneval" should always match, regardless of whether they contain "Koeln" or not.
Example:
1) http://www.news.com/Report-Deutschland/Panorama/Deutschland/story.html
2) http://www.news.com/Koeln/Karneval/story.html
3) http://www.news.com/Koeln/Koelnaktuell/story.html
1) and 2) should match: 1) because it doesn't include "Koeln", and 2) because it includes "Karneval". 3) should not match because it includes "Koeln" but not "Karneval".
I have tried many different regexes using positive and negative lookaheads, but none of them has worked so far.
I plan to use the regex with the preg functions in PHP.
Not sure if this is the best approach here, but you can give this a shot and see if it works for you:
(http://.*?/Karneval.*$|http://www\.news\.com(?!/Koeln).*$)
I am basically just combining two expressions: one that matches /Karneval anywhere after http://, and one that matches only when /Koeln does not come directly after www.news.com.
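To see how the two alternatives behave, here is a minimal sketch that runs the pattern, unchanged, against the three example URLs from the question. It is written in Python rather than PHP purely for illustration; the helper name `matches` is an assumption of this sketch, and the pattern uses only constructs (alternation, lazy quantifier, negative lookahead) that PCRE also supports.

```python
import re

# The pattern from the answer, copied unchanged.
PATTERN = re.compile(
    r"(http://.*?/Karneval.*$|http://www\.news\.com(?!/Koeln).*$)"
)

# The three example URLs from the question.
urls = [
    "http://www.news.com/Report-Deutschland/Panorama/Deutschland/story.html",
    "http://www.news.com/Koeln/Karneval/story.html",
    "http://www.news.com/Koeln/Koelnaktuell/story.html",
]


def matches(url: str) -> bool:
    """Return True if the pattern matches starting at the beginning of url."""
    return PATTERN.match(url) is not None


for url in urls:
    print(matches(url), url)
```

Under these assumptions the sketch reports a match for 1) and 2) and no match for 3). Note that the second alternative only looks at the path segment directly after the host name, so it is tied to www.news.com and to "/Koeln" being the first segment.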
Here's a demo you can try: Regex101 Demo
Hopefully this works for you or at least points you in the right direction.
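If "Koeln" can appear anywhere in the URL and the host is not always www.news.com, a variant built from two lookaheads might be closer to the rule in the question. The following is only a sketch of that idea, again in Python for illustration: the names `GENERAL` and `allowed` and the fourth URL are hypothetical, not taken from the question. The pattern itself uses only lookaheads, so it is expected to carry over to preg_match, but that has not been verified here.

```python
import re

# Hypothetical variant, not the pattern from the answer:
# accept the URL if "Karneval" occurs anywhere in it,
# otherwise accept it only if "Koeln" occurs nowhere in it.
GENERAL = re.compile(r"^(?:(?=.*Karneval)|(?!.*Koeln)).*$")


def allowed(url: str) -> bool:
    """Return True if url satisfies the Koeln/Karneval rule."""
    return GENERAL.match(url) is not None


examples = [
    # The three URLs from the question.
    "http://www.news.com/Report-Deutschland/Panorama/Deutschland/story.html",
    "http://www.news.com/Koeln/Karneval/story.html",
    "http://www.news.com/Koeln/Koelnaktuell/story.html",
    # Made-up URL with "Koeln" deeper in the path.
    "http://www.news.com/sport/Koeln/story.html",
]

for url in examples:
    print(allowed(url), url)
```

The first two URLs are accepted and the last two are rejected. The made-up fourth URL is the case where this variant differs from the two-alternative pattern, which would accept it because "/Koeln" does not come directly after the host name.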