How to remove redundant spaces/whitespace from a string in Golang?


I was wondering how to remove:

  • All leading/trailing whitespace or new-line characters, null characters, etc.
  • Any redundant spaces within a string (e.g. "hello[space][space]world" would become "hello[space]world")

Is this possible with a single regex, with Unicode support for international space characters, etc.?


Solution

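  • You probably want both the \s shorthand character class and the \p{Zs} Unicode property. In Go's regexp package, \s covers only ASCII whitespace (tab, newline, form feed, carriage return, space), while \p{Zs} matches Unicode space separators such as the no-break space. However, the two steps cannot be done with a single regex replacement: they need two different replacement strings (an empty string at the edges, a single space inside), and ReplaceAllStringFunc passes only the whole matched string to its callback, so I see no way to check which group matched.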

    Thus, I suggest using two regexps:

    • ^[\s\p{Zs}]+|[\s\p{Zs}]+$ - matches all leading/trailing whitespace
    • [\s\p{Zs}]{2,} - matches runs of 2 or more whitespace characters inside the string

    Sample code:

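    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        input := "   Text   More here     "
        // Leading or trailing run of ASCII whitespace (\s) or Unicode space separators (\p{Zs}).
        reEdgeSpace := regexp.MustCompile(`^[\s\p{Zs}]+|[\s\p{Zs}]+$`)
        // Run of two or more whitespace characters anywhere in the string.
        reInnerSpace := regexp.MustCompile(`[\s\p{Zs}]{2,}`)
        // Trim both ends first, then collapse the inner runs to a single space.
        final := reEdgeSpace.ReplaceAllString(input, "")
        final = reInnerSpace.ReplaceAllString(final, " ")
        fmt.Println(final) // prints: Text More here
    }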
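
If this cleanup is needed in more than one place, the two regexps above can be wrapped in a small function. The following is only a sketch of that idea: the name normalizeSpace and the variable names edgeSpace and innerSpace are made up for illustration and are not part of any library. The regexps are compiled once at package level so that repeated calls do not recompile them, and the second sample input is an assumed test string containing an ideographic space (U+3000) and two no-break spaces (U+00A0) to exercise the \p{Zs} part.

```go
package main

import (
	"fmt"
	"regexp"
)

// Compiled once at package level; MustCompile panics at startup
// if a pattern is invalid.
var (
	// Leading or trailing run of whitespace.
	edgeSpace = regexp.MustCompile(`^[\s\p{Zs}]+|[\s\p{Zs}]+$`)
	// Run of two or more whitespace characters anywhere.
	innerSpace = regexp.MustCompile(`[\s\p{Zs}]{2,}`)
)

// normalizeSpace (hypothetical helper name) trims whitespace at both
// ends of s and collapses every inner run of two or more whitespace
// characters into a single ASCII space.
func normalizeSpace(s string) string {
	s = edgeSpace.ReplaceAllString(s, "")
	return innerSpace.ReplaceAllString(s, " ")
}

func main() {
	fmt.Printf("%q\n", normalizeSpace("   Text   More here     "))
	// U+3000 and U+00A0 are in the Zs category, so \p{Zs} matches them.
	fmt.Printf("%q\n", normalizeSpace("\u3000hello\u00a0\u00a0world\n"))
}
```

Note that, as with the original two regexps, a single non-space whitespace character inside the string (for example one tab between two words) is left untouched, because the inner pattern only matches runs of two or more.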