mgo regular expression doesn't work


I have some documents, each of which has a key "path" with a value like \A\, \B\, \A\C\, \A\C\D\, \A\E\, \A\E\F\.

I want to find the ones whose path has only one segment, so the result should be \A\ and \B\. I use the regular expression /^\\[^\\]*\\$/, which works fine in the MongoDB shell. But when I apply it in a Go program, it doesn't work.

Go code:

var nodeList []NodeEntry // NodeEntry would match every field of one document
err = c.Find(bson.M{"path": bson.M{"$regex": bson.RegEx{"^\\[^\\]*\\$", ""}}}).All(&nodeList)
fmt.Println(nodeList)

Output:

[]

This is strange. I then found out that any regex containing \\ produces an empty result.

So is this a bug in mgo?

(I don't know if it's inappropriate, but I've also posted this question on the mgo.users mailing list.)


Solution

  • In Go, the backslash (\) is the escape character in an interpreted string literal (enclosed in "..."), so "\\" denotes a single backslash. In your case, you want a raw string literal (enclosed in `...`), in which a backslash has no special meaning.

    Let's look at this piece of code:

    package main
    
    import "fmt"
    
    func main() {
        fmt.Println("^\\[^\\]*\\$")
        fmt.Println(`^\\[^\\]*\\$`)
    }
    

    Result:

    ^\[^\]*\$
    ^\\[^\\]*\\$
    

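    You can see that the second line is the regex string you want. The first one has every \\ collapsed to a single \, so the regex engine receives ^\[^\]*\$, which is a different pattern. To solve your problem, just enclose your regex string in backticks instead of quotes: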

    err = c.Find(bson.M{"path": bson.M{"$regex": bson.RegEx{`^\\[^\\]*\\$`, ""}}}).All(&nodeList)
    

    Go spec reference: http://golang.org/ref/spec#String_literals
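
    As a quick local check, here is a minimal sketch using Go's standard regexp package instead of mgo and MongoDB. Go's regexp uses RE2 syntax while MongoDB uses PCRE, so this only illustrates how the two literals differ as pattern text for this simple expression; it does not reproduce what the server does. The helper matching and the variable samplePaths are made up for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// samplePaths mirrors the path values from the question (hypothetical test data).
var samplePaths = []string{`\A\`, `\B\`, `\A\C\`, `\A\C\D\`, `\A\E\`, `\A\E\F\`}

// matching returns the paths that match the given pattern.
// It is a helper invented for this sketch, not part of mgo.
func matching(pattern string, paths []string) []string {
	re := regexp.MustCompile(pattern)
	var out []string
	for _, p := range paths {
		if re.MatchString(p) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	raw := `^\\[^\\]*\\$`         // pattern text seen by the engine: ^\\[^\\]*\\$
	interpreted := "^\\[^\\]*\\$" // pattern text seen by the engine: ^\[^\]*\$

	fmt.Println(matching(raw, samplePaths))         // prints [\A\ \B\]
	fmt.Println(matching(interpreted, samplePaths)) // prints []

	// An interpreted literal also works if every backslash is doubled once more.
	fmt.Println("^\\\\[^\\\\]*\\\\$" == raw) // prints true
}
```

    The last line shows the alternative to backticks: keep the double quotes and write four backslashes for each literal backslash the regex should match. The raw string is usually easier to read.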