Regex to ignore multiple prefixes and match only if they don't have one of two specific hardcoded attributes?


I am looking for a Go (RE2) regex that matches unless the user types one of the following commands. All of them must be case-insensitive:

.i l
.i Latest
.info l
.info Latest
p!i l
p!i latest
p!info l
p!info latest

So in each of these cases there is one prefix and one attribute. The regex should match if the user types just the prefix without latest or l after it, or types anything other than latest or l after the prefix, including numbers or special characters.

I have jury-rigged this regex: (?i)\A\.i (?:L.|[^L]+L)

This regex somewhat works, but it only handles the .i prefix, only checks for an L, and does not account for numbers. I cannot wrap my head around how I could solve this. I wouldn't mind using multiple regexes, one for each prefix. I tried replacing the \.i with the other prefixes and the (?:L.|[^L]+L) part with the word LATEST, but that does not seem to work.

Thanks for the help : )


Solution

  • Go does not support lookarounds. Instead, you can spell out what you do allow to match.

    In this case, you allow any of the prefixes, optionally followed by a "word" other than "l" or "latest".

    One option could be:

    (?i)^(?:\.|p!)i(?:nfo)?\b(?:(?: +(?:la(?:t(?:e(?:s(?:t\S)?)?)?)?)| +(?:[^\sl]\S*|l[^\sa]\S*|la[^\st]\S*|lat[^\se]\S*|late[^\ss]\S*|lates[^\st]\S*|latest\S+))(?: +.*)?)?$
    

    In parts

    • (?i) Case insensitive modifier
    • ^ Start of string
    • (?:\.|p!)i Match .i or p!i
    • (?:nfo)?\b Optionally match nfo, followed by a word boundary
    • (?: Non capture group
      • (?: Non capture group
        • +(?:la(?:t(?:e(?:s(?:t\S)?)?)?)?) Match 1+ spaces and la, lat, late, lates, or latest followed by a single non-whitespace char
        • | Or
        • +(?:[^\sl]\S*|l[^\sa]\S*|la[^\st]\S*|lat[^\se]\S*|late[^\ss]\S*|lates[^\st]\S*|latest\S+) Match 1+ spaces followed by one of six variations that start like latest but differ from it at the next char (the negated class also excludes whitespace via \s), or match latest followed by at least one non-whitespace char. The lates[^\st] branch needs the trailing \S* so that a longer word such as latesxyz is accepted too
      • ) Close non capture group
      • (?: +.*)? Optionally match 1+ spaces and 0+ times any char except a newline
    • )? Close non capture group and make it optional
    • $ End of string
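
To see how the pattern behaves from Go, here is a minimal sketch using the standard regexp package. The pattern is the one from the answer, written with \S* after the lates[^\st] branch so that words such as latesxyz are accepted. The names infoCmd and isAllowed are hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// infoCmd holds the pattern from the answer. Assumption: the lates[^\st]
// branch carries a trailing \S* so that longer words such as "latesxyz"
// are accepted as well.
var infoCmd = regexp.MustCompile(`(?i)^(?:\.|p!)i(?:nfo)?\b(?:(?: +(?:la(?:t(?:e(?:s(?:t\S)?)?)?)?)| +(?:[^\sl]\S*|l[^\sa]\S*|la[^\st]\S*|lat[^\se]\S*|late[^\ss]\S*|lates[^\st]\S*|latest\S+))(?: +.*)?)?$`)

// isAllowed is a hypothetical helper. It reports whether input is one of
// the prefixes, either alone or followed by a word other than "l" or
// "latest" (compared case-insensitively).
func isAllowed(input string) bool {
	return infoCmd.MatchString(input)
}

func main() {
	inputs := []string{
		".i",             // prefix alone
		".i 5",           // prefix plus a number
		"p!info late",    // partial word is not "latest"
		".info latesxyz", // diverges from "latest" after "lates"
		".i l",           // excluded
		"P!INFO Latest",  // excluded, regardless of case
		".i latest",      // excluded
	}
	for _, in := range inputs {
		fmt.Printf("%-18q %v\n", in, isAllowed(in))
	}
}
```

With this sketch, the first four inputs are reported as matches and the last three are not. Compiling once with regexp.MustCompile at package level avoids recompiling the pattern for every message.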
