There are some strings like:
.texta texti(
.textb textj(
textc textk(
package main

import (
	"fmt"
	"regexp"
)

func main() {
	text := ".texta texta(, .textb textb(, textc textc("
	pattern := `[^.,]\s*(\w+)\s+(\w+)\s*\(`
	re := regexp.MustCompile(pattern)
	matches := re.FindAllStringSubmatch(text, -1)
	for _, match := range matches {
		if len(match) > 1 {
			fmt.Println(match[1])
		}
	}
}
Why does the result contain "exta" and "extb"?
The goal is to get only "textc", excluding words that start with "." or ",".
If the pattern is \s*(\w+)\s+(\w+)\s*\( instead, the result is "texta", "textb" and "textc".
The problem is that your regex is not anchored to any specific position, while you expect \w+ to start matching at the beginning of a word. The [^,.] matches any character other than "." and ",", so it can also match a word character. Here it consumes the leading "t" of "texta", which leaves only "exta" for the first capture group. So you need to make sure the negated character class does not match a word character. If you also want to allow a match at the start of the string, you need to add an alternative.
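To make the diagnosis concrete, here is a minimal sketch that prints the whole match next to capture group 1 for the original pattern. The helper name submatches is hypothetical and exists only for this illustration; the pattern and input come from the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// submatches is a hypothetical helper for this illustration. For every match
// it returns the full matched text (index 0) and capture group 1 (index 1).
func submatches(pattern, text string) [][2]string {
	re := regexp.MustCompile(pattern)
	var out [][2]string
	for _, m := range re.FindAllStringSubmatch(text, -1) {
		out = append(out, [2]string{m[0], m[1]})
	}
	return out
}

func main() {
	text := ".texta texta(, .textb textb(, textc textc("
	// Original pattern from the question: [^.,] is free to consume a letter.
	for _, m := range submatches(`[^.,]\s*(\w+)\s+(\w+)\s*\(`, text) {
		fmt.Printf("full=%q group1=%q\n", m[0], m[1])
	}
	// Prints:
	// full="texta texta(" group1="exta"
	// full="textb textb(" group1="extb"
	// full=" textc textc(" group1="textc"
}
```

The full match starts at the "t" right after the dot, which shows that the negated class, not the capture group, took the first letter.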
You can use
pattern := `(?:[^.,\w]|^)\s*(\w+)\s+(\w+)\s*\(`
where (?:[^.,\w]|^) matches either a char other than ".", "," or a word char, or the position at the start of the string.
See the Go playground demo.
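For reference, a self-contained sketch of the corrected pattern applied to the input from the question. The function name leadingWords is an assumption made for this example, not something from the original post.

```go
package main

import (
	"fmt"
	"regexp"
)

// The prefix group accepts either a character that is not ".", "," or a word
// character, or the start of the string, so \w+ must begin at a word boundary.
var re = regexp.MustCompile(`(?:[^.,\w]|^)\s*(\w+)\s+(\w+)\s*\(`)

// leadingWords (hypothetical name) returns capture group 1 of every match.
func leadingWords(text string) []string {
	var words []string
	for _, m := range re.FindAllStringSubmatch(text, -1) {
		words = append(words, m[1])
	}
	return words
}

func main() {
	fmt.Println(leadingWords(".texta texta(, .textb textb(, textc textc("))
	// Prints: [textc]

	// The ^ alternative lets a pair at the very start of the string match too.
	fmt.Println(leadingWords("foo bar("))
	// Prints: [foo]
}
```

Note that the prefix group still consumes one character when it is not at the start of the string, so the full match (index 0) may include a leading space. Only the capture groups should be read.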