
Regex match of =~ returns the wrong result when the value starts with .*


package main

import "fmt"
import "regexp"

func main() {
    var sep = "=~"
    var filter = "exported_pod=~.*grafana.*"
    matched, _ := regexp.MatchString(sep+`\b`, filter)
    fmt.Println(matched)
}

In the snippet above, I want matched to be true when the filter string contains exactly =~.

I can't understand why it returns false.

It works as expected when the filter string is "exported_pod=~grafana.*", but fails when it is "exported_pod=~.*grafana.*". Please help me understand what's wrong here.


The actual problem is:

Split the string around any one of the separators =, =~, !=, !~.

In the example above, the result should be [ "exported_pod", ".*grafana.*" ].
The same split should work whichever of the listed separators the string contains.


Solution

  • From regex101:

    \b matches, without consuming any characters, immediately between a character matched by \w (a-z) and a character not matched by \w (in either order).
    It cannot be used to separate non words from words.

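    So \b cannot match here: in "exported_pod=~.*grafana.*" the ~ is followed by ., and both are non-word characters, so there is no word boundary after =~. In "exported_pod=~grafana.*" the ~ is followed by g, a word character, which is why that case works. (Regardless, a regexp might not be the best fit for this case.)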
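
    To see the boundary rule in action, here is a minimal sketch that runs the question's pattern =~\b against both filter strings. The helper name matchesWithBoundary is made up for this illustration.

    ```go
    package main

    import (
        "fmt"
        "regexp"
    )

    // matchesWithBoundary is a hypothetical helper: it reports whether s
    // contains "=~" immediately followed by a word boundary (\b).
    func matchesWithBoundary(s string) bool {
        return regexp.MustCompile(`=~\b`).MatchString(s)
    }

    func main() {
        // '~' is followed by 'g': non-word then word character, so \b matches.
        fmt.Println(matchesWithBoundary("exported_pod=~grafana.*")) // true

        // '~' is followed by '.': two non-word characters, so \b cannot match.
        fmt.Println(matchesWithBoundary("exported_pod=~.*grafana.*")) // false
    }
    ```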

    To simply test whether the string includes =~ (as in "How to check if a string contains a substring in Go"), use strings.Contains:

    fmt.Println(strings.Contains(filter, "=~")) // true
    

    As a complete program:

    package main
    
    import (
        "fmt"
        "strings"
    )
    
    func main() {
        var sep = "=~"
        var filter = "exported_pod=~.*grafana.*"
        matched := strings.Contains(filter, sep)
        fmt.Println(matched)
    }
    

    If you need to test for more than one separator, though, a regex can help. The [^=!~] on each side makes sure the separator is matched as a whole rather than as part of a longer run of those characters:

    package main
    
    import "fmt"
    import "regexp"
    
    func main() {
        var filter = "exported_pod=~.*grafana.*"
        matched, _ := regexp.MatchString(`[^=!~](=|=~|!=|!~)[^=!~]`, filter)
        fmt.Println(matched)
    }
    

    Using a regexp with a named capture group:

    [^=!~](?P<separator>=|=~|!=|!~)[^=!~]
           ^^^^^^^^^^^^^
    

    You can extract that separator with the (*regexp.Regexp).SubexpIndex method (Go 1.15+, Aug. 2020) and use it to split your original string:

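    package main

    import (
        "fmt"
        "regexp"
        "strings"
    )

    func main() {
        var filter = "exported_pod=~.*grafana.*"
        re := regexp.MustCompile(`[^=!~](?P<separator>=|=~|!=|!~)[^=!~]`)
        matches := re.FindStringSubmatch(filter)
        if matches == nil {
            // No separator found: indexing matches would panic.
            fmt.Println("no separator in", filter)
            return
        }
        separator := matches[re.SubexpIndex("separator")]
        // SplitN with n=2 splits at the first occurrence only, so a value
        // that happens to contain the separator stays in one piece.
        filtered := strings.SplitN(filter, separator, 2)
        fmt.Println(filtered)
    }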
    

    filtered is a slice holding the parts before and after the separator detected by the regexp (here =~), so this prints [exported_pod .*grafana.*].
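
    As a variation on the same idea, the sketch below captures the key, the separator, and the value in one match, so no second split step is needed. It assumes the key never contains =, ! or ~ and that the filter is a single line. The names filterRe and splitFilter are hypothetical, not part of any library.

    ```go
    package main

    import (
        "fmt"
        "regexp"
    )

    // filterRe is an illustrative pattern: key, then one of the four
    // separators, then the rest of the string as the value. The two-character
    // separators are listed first so "=~" is preferred over a bare "=".
    var filterRe = regexp.MustCompile(`^([^=!~]+)(=~|!~|!=|=)(.*)$`)

    // splitFilter is a hypothetical helper that returns the three parts of a
    // filter, with ok set to false when no separator is found.
    func splitFilter(filter string) (key, op, value string, ok bool) {
        m := filterRe.FindStringSubmatch(filter)
        if m == nil {
            return "", "", "", false
        }
        return m[1], m[2], m[3], true
    }

    func main() {
        key, op, value, ok := splitFilter("exported_pod=~.*grafana.*")
        fmt.Println(key, op, value, ok) // exported_pod =~ .*grafana.* true
    }
    ```

    Returning the separator alongside the two parts also tells the caller which operator was used, which a plain split discards.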