regex go regex-greedy

Fail the regex match at the end of the string in Go


I'm trying to test a string against this pattern: "At least one square bracket pair, wrapping 2 digits, followed by at least one character". For example, [11][22][33]dd should match while [11][22][33] shouldn't.

I've tried this regex: (\[\d{2}])+.+. However, when tested against [11][22][33], which should fail, it still matches. The first + quantifier matches only two groups, [11] and [22], while the remaining [33] is matched by .+.

I thought the "greedy" behaviour of the + quantifier would consume every segment matching the group it modifies; however, it seems the regex engine ranks "find an overall match if one exists" above "keep the quantifier greedy", which is not what I expected.
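The behaviour described above can be observed directly. This is a minimal sketch, assuming Go's standard regexp package; the helper name lastGroup is made up for illustration. It prints whether the original pattern matches and which repetition Group 1 ends up holding.

```go
package main

import (
	"fmt"
	"regexp"
)

// original is the pattern from the question, unchanged.
var original = regexp.MustCompile(`(\[\d{2}])+.+`)

// lastGroup (a hypothetical helper) returns the text held by Group 1 after a
// match, i.e. the last repetition kept by the + quantifier, or "" if the
// pattern does not match at all.
func lastGroup(s string) string {
	m := original.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	s := "[11][22][33]"
	fmt.Printf("matched: %t\n", original.MatchString(s)) // matched: true
	fmt.Printf("group 1: %q\n", lastGroup(s))            // group 1: "[22]"
}
```

Group 1 stops at [22] because .+ needs at least one character, so the quantifier keeps only two repetitions and leaves [33] to .+.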

How should I achieve my goal?

(This question is actually language-agnostic, though tagged with "golang" which is the language I'm currently using.)


Solution

  • You may use

    re := regexp.MustCompile(`(?:\[\d{2}])+(.*)`)
    match := re.FindStringSubmatch(s)
    if len(match) > 1 {
        return match[1] != ""
    }
    return false
    

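    The (?:\[\d{2}])+(.*) pattern matches one or more occurrences of [, two digits, and ], then captures zero or more characters other than line breaks into Group 1. Because .* can also match an empty string, the greedy + never has to give up a bracket pair for the overall match to succeed, so Group 1 holds only what actually follows the last pair. If a match is found (len(match) > 1), the function returns true when Group 1 is not empty (match[1] != "") and false otherwise.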

    See Go demo:

    package main
    
    import (
        "fmt"
        "regexp"
    )
    
    func main() {
        strs := []string{
            "[11][22][33]",
            "___[11][22][33]",
            "[11][22][33]____",
            "[11][22]____[33]",
        }
        for _, str := range strs {
            fmt.Printf("%q - %t\n", str, match(str))
        }
    }
    
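    // The pattern is compiled once at package level and reused for every call.
    var re = regexp.MustCompile(`(?:\[\d{2}])+(.*)`)
    
    // match reports whether the bracket pairs in s are followed by at least one character.
    func match(s string) bool {
        // On success, match[0] is the whole match and match[1] is Group 1.
        match := re.FindStringSubmatch(s)
        if len(match) > 1 {
            return match[1] != ""
        }
        return false
    }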
    

    Output:

    "[11][22][33]" - false
    "___[11][22][33]" - false
    "[11][22][33]____" - true
    "[11][22]____[33]" - true
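A possible variation on the same idea, sketched here under assumptions rather than taken from the answer: match only the run of bracket pairs and compare the end offset of the match with the length of the string, instead of capturing the tail. The names brackets and hasTrailingText are hypothetical. Unlike the (.*) version, this sketch also counts a line break after the last pair as "at least one character".

```go
package main

import (
	"fmt"
	"regexp"
)

// brackets matches only the run of [dd] pairs, nothing after it.
var brackets = regexp.MustCompile(`(?:\[\d{2}])+`)

// hasTrailingText (a hypothetical helper) reports whether the first run of
// [dd] pairs in s is followed by at least one more byte.
func hasTrailingText(s string) bool {
	loc := brackets.FindStringIndex(s) // loc[0] is the start, loc[1] the end offset
	if loc == nil {
		return false // no bracket pair at all
	}
	return loc[1] < len(s)
}

func main() {
	strs := []string{
		"[11][22][33]",
		"___[11][22][33]",
		"[11][22][33]____",
		"[11][22]____[33]",
	}
	for _, str := range strs {
		fmt.Printf("%q - %t\n", str, hasTrailingText(str))
	}
}
```

For the four sample strings this prints the same false, false, true, true results as the demo above.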