Tags: regex, google-analytics, regex-lookarounds, regex-group, regex-negation

Negative Lookahead with Regex to exclude phrases containing a particular word in Google Analytics and Google Search Console


Google recently launched a new feature in Google Search Console: regex filters for queries.

I would like to exclude input that contains any of a list of keywords.

With [^test one|test2], I can exclude an exact match.

I would like to find out how I can exclude all the phrases (strings) containing "test one" or "test2":

Here is a list of inputs and the expected result for each:

input               should match?
test one goes ok    no
test2               no
tomorrow            yes
test one            no
test2 tomorrow      no
goes ok             yes

This will be used in Google Search Console, whose guidelines state:

Regular expression filter: If you choose the Custom (regex) filter, you can filter by a regular expression (a wildcard match) for the selected item. You can use regular expression filters for page URLs and user queries. The RE2 syntax is used.

The default matching is "partial match", which means that your regular expression can match anywhere in the target string unless you use ^ or $ to require matching from the start or end of the string, respectively. Default matching is case-sensitive. You can specify "(?i)" at the beginning of your regular expression string for case-insensitive matches. Example: (?i)https. Invalid regular expression syntax will return no matches. Regular expression matching is tricky; try out your expression on a live testing tool, or read the full RE2 syntax guide.
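
The matching rules quoted above can be sketched in Python. This is only an illustration under an assumption: Python's re module is not RE2, but re.search also does a partial match by default, and ^, $ and (?i) behave the same way for these simple patterns. The helper name matches and the sample strings are made up for the example.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Partial match: the pattern may match anywhere in the text."""
    return re.search(pattern, text) is not None

# Partial match: "test" is found in the middle of the string.
print(matches(r"test", "my test query"))             # True
# Anchored with ^: the string would have to start with "test".
print(matches(r"^test", "my test query"))            # False
# Anchored with $: the string must end with "query".
print(matches(r"query$", "my test query"))           # True
# Matching is case-sensitive by default...
print(matches(r"https", "HTTPS://example.com"))      # False
# ...unless (?i) is placed at the start of the pattern.
print(matches(r"(?i)https", "HTTPS://example.com"))  # True
```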


Solution

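  • Use a negative lookahead to exclude any input that contains one of the unwanted terms:

    ^(?!.*(test one|test2)).*

    The lookahead (?!.*(test one|test2)) is evaluated at the start of the string and fails if either term occurs anywhere in the input; otherwise .* matches the whole input. Note that RE2, the syntax Search Console uses, does not support lookaheads, so this pattern only works in engines that do (for example PCRE, Python, or JavaScript).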
    

    See live demo.
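
    To check the pattern against the inputs listed in the question, here is a minimal Python sketch. It assumes an engine that supports lookaheads, which Python's re module does; the names PATTERN, CASES and should_keep are illustrative only.

```python
import re

# Pattern from the solution: succeeds only if neither term occurs anywhere.
PATTERN = re.compile(r"^(?!.*(test one|test2)).*")

# Inputs and expected results taken from the question.
CASES = {
    "test one goes ok": False,
    "test2": False,
    "tomorrow": True,
    "test one": False,
    "test2 tomorrow": False,
    "goes ok": True,
}

def should_keep(text: str) -> bool:
    """Return True when the input contains none of the excluded terms."""
    return PATTERN.search(text) is not None

for text, expected in CASES.items():
    result = should_keep(text)
    print(f"{text!r}: {'match' if result else 'no match'}")
    assert result == expected
```

    Each input is checked against the expected result from the question, so the loop raises an AssertionError if the pattern ever disagrees with the table.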