Tags: string, go, unicode, trim

Removing special chars from words


I was writing a function in Go to clean up individual words so that special characters at the beginning and end of each word are removed.

For example:

  • .-hello, -> hello
  • "back-to-back" -> back-to-back

I ended up with the following, which walks in from each end rune by rune and checks each one with unicode.IsLetter. It works fine, but I was wondering whether there is a better or more efficient way to do this. I experimented with strings.TrimLeft and strings.TrimRight, but then I have to define my own set of characters to remove. It would be nice to use a predefined set.

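// TrimWord strips leading and trailing non-letter runes from word.
// Requires: import "unicode"
func TrimWord(word []rune) string {
    prefix := 0         // index of the first letter
    suffix := len(word) // index just past the last letter

    // Skip non-letters at the start.
    for x := 0; x < len(word); x++ {
        if !unicode.IsLetter(word[x]) {
            prefix++
        } else {
            break
        }
    }

    // Skip non-letters at the end, without crossing prefix.
    for x := len(word) - 1; x >= 0; x-- {
        if suffix == prefix {
            break
        }
        if !unicode.IsLetter(word[x]) {
            suffix--
        } else {
            break
        }
    }
    return string(word[prefix:suffix])
}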
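
For comparison, the cutset approach mentioned above might look like the sketch below. The `cutset` constant and the `trimCutset` name are illustrative assumptions, not part of the original code. strings.Trim applies one cutset to both ends, so it stands in for a TrimLeft/TrimRight pair here.

```go
package main

import (
	"fmt"
	"strings"
)

// cutset is a hypothetical, hand-maintained list of characters to strip.
const cutset = `.,-"'!?;:()`

// trimCutset removes any leading and trailing characters found in cutset.
func trimCutset(word string) string {
	return strings.Trim(word, cutset)
}

func main() {
	fmt.Println(trimCutset(`.-hello,`))       // hello
	fmt.Println(trimCutset(`"back-to-back"`)) // back-to-back
	fmt.Println(trimCutset(`«hello»`))        // «hello» (guillemets are not in the cutset)
}
```

The last call shows the drawback: any character missing from the hand-written set is left in place.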

Solution

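  • Use strings.TrimFunc with a predicate that reports whether a rune is not a letter. It trims both ends in one call and takes a string directly, so no []rune conversion is needed:
    
    package main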
    
    import (
        "fmt"
        "strings"
        "unicode"
    )
    
    func trimWord(s string) string {
        return strings.TrimFunc(s, func(r rune) bool {
            return !unicode.IsLetter(r)
        })
    }
    
    func main() {
        fmt.Println(trimWord(`.-hello,`))       // -> hello
        fmt.Println(trimWord(`"back-to-back"`)) // -> back-to-back
    }
    

    Playground: https://go.dev/play/p/l1A4hBDvFfr

    Output:

    hello
    back-to-back
    

    From the documentation for package strings:

    func TrimFunc(s string, f func(rune) bool) string
    

    TrimFunc returns a slice of the string s with all leading and trailing Unicode code points c satisfying f(c) removed.
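
Because the predicate is just a function, it can be adjusted when letters alone are too strict. With !unicode.IsLetter, digits at the edges are trimmed too, so `(route66)` would become `route`. The sketch below keeps digits and also shows strings.TrimLeftFunc for trimming one side only. The names `notLetterOrDigit`, `trimWordKeepDigits`, and `trimLeading` are hypothetical, chosen for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// notLetterOrDigit reports whether r is neither a letter nor a digit.
func notLetterOrDigit(r rune) bool {
	return !unicode.IsLetter(r) && !unicode.IsDigit(r)
}

// trimWordKeepDigits trims both ends but stops at letters and digits.
func trimWordKeepDigits(s string) string {
	return strings.TrimFunc(s, notLetterOrDigit)
}

// trimLeading trims only the leading side, leaving the end untouched.
func trimLeading(s string) string {
	return strings.TrimLeftFunc(s, notLetterOrDigit)
}

func main() {
	fmt.Println(trimWordKeepDigits(`(route66)`)) // route66
	fmt.Println(trimLeading(`--42nd,`))          // 42nd,
}
```

strings.TrimRightFunc works the same way for the trailing side.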