Search code examples
regexsecuritygoglob

Are there evil globs?


I've read about Evil RegExp and usually ensure a basic level of safety is in place when dealing with user input with regards to RegExp.

What I am unsure about is whether this issue is also present in Glob. I imagine it will come down to the individual implementations of Glob'ing' and in my particular instance I am using https://github.com/gobwas/glob/

I'd appreciate any advice available for how to test for this issue and potentially how to mitigate against it.


Solution

  • By "evil regex" I assume you mean a regex that falls victim to catastrophic backtracking.

    From what you're describing, it seems like you're using a glob library to avoid these "evil regexes". Globs are essentially a weaker version of regex.

    The thing that you're missing here is the fact that regexes don't have to be evil. This can be proven in plain Go, with no external libraries.

    Try running this Go program:

    package main
    
    import (
        "fmt"; "regexp"
    )
    
    func main() {
        reg := regexp.MustCompile(`^([^z]*?,){11}P`)
    
        txt := `1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18zP`
    
        fmt.Println(reg.ReplaceAllString(txt, ""))
    }
    

    You might wonder why this code doesn't measure how much time execution took. It's because it's not needed (and also because I don't know much Go).

    The regex will work in almost all regex flavors. You can try running it in Java, Perl or another similar flavor (I like using PCRE on https://regex101.com/#pcre), but the outcome will be one of two things:

    • A timeout
    • You get fed up with how long it's taking and stop the program

    Yes, that combination causes catastrophic backtracking in most regex flavors. But not Go. Why?

    Go doesn't use backtracking at all for its regexes, so it's not even a possibility. According to this site:

    In Go, we find an optimized regular expression engine. This runs in linear time, making complex patterns faster. It is located in the regexp package.

    Read more about the differences between backtracking and non-backtracking engines here.


    Considering the glob library (according to that GitHub link) appears faster than Go's regexps, performance shouldn't be a problem.