
Checking a string contains only ASCII characters


Does Go have a built-in function, or is there a recommended approach, for checking whether a string contains only ASCII characters? What is the right way to do it?

From my research, one solution is to check whether any character is greater than 127.

func isASCII(s string) bool {
    for _, c := range s {
        if c > unicode.MaxASCII {
            return false
        }
    }

    return true
}

Solution

  • In Go, we care about performance, so let's benchmark your code:

    func isASCII(s string) bool {
        for _, c := range s {
            if c > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    BenchmarkRange-4    20000000    82.0 ns/op
    

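    A faster (better, more idiomatic) version indexes the string's bytes directly, which avoids unnecessary rune conversions. This is safe because in UTF-8 every byte of a non-ASCII character is 0x80 or greater: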

    func isASCII(s string) bool {
        for i := 0; i < len(s); i++ {
            if s[i] > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    BenchmarkIndex-4    30000000    55.4 ns/op
    

    ascii_test.go:

    package main
    
    import (
        "testing"
        "unicode"
    )
    
    func isASCIIRange(s string) bool {
        for _, c := range s {
            if c > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    func BenchmarkRange(b *testing.B) {
        str := ascii()
        b.ResetTimer()
        for N := 0; N < b.N; N++ {
            is := isASCIIRange(str)
            if !is {
                b.Fatal("notASCII")
            }
        }
    }
    
    func isASCIIIndex(s string) bool {
        for i := 0; i < len(s); i++ {
            if s[i] > unicode.MaxASCII {
                return false
            }
        }
        return true
    }
    
    func BenchmarkIndex(b *testing.B) {
        str := ascii()
        b.ResetTimer()
        for N := 0; N < b.N; N++ {
            is := isASCIIIndex(str)
            if !is {
                b.Fatal("notASCII")
            }
        }
    }
    
    func ascii() string {
        byt := make([]byte, unicode.MaxASCII+1)
        for i := range byt {
            byt[i] = byte(i)
        }
        return string(byt)
    }
    

    Output:

    $ go test ascii_test.go -bench=.
    BenchmarkRange-4    20000000    82.0 ns/op
    BenchmarkIndex-4    30000000    55.4 ns/op
    $
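
The benchmark only feeds both functions pure ASCII input, so it does not show that they agree on other inputs. The following is a minimal sketch, not part of the original answer, that runs both versions over a few sample strings, including a lone 0x80 byte (invalid UTF-8). The sample inputs and the `main` function are assumptions chosen for illustration; the two functions are copied from ascii_test.go above.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCIIRange decodes the string rune by rune (the version from the question).
// An invalid UTF-8 byte decodes to U+FFFD, which is greater than MaxASCII.
func isASCIIRange(s string) bool {
	for _, c := range s {
		if c > unicode.MaxASCII {
			return false
		}
	}
	return true
}

// isASCIIIndex inspects the raw bytes without decoding runes.
// Any byte above 0x7F means the string is not pure ASCII.
func isASCIIIndex(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative inputs: empty, plain ASCII, the highest ASCII byte,
	// a Latin-1 letter, a lone invalid byte, and multi-byte text.
	inputs := []string{"", "hello, world", "\x7f", "caf\u00e9", "\x80", "日本語"}
	for _, s := range inputs {
		fmt.Printf("%q range=%v index=%v\n", s, isASCIIRange(s), isASCIIIndex(s))
	}
}
```

Both functions should report the same result for every input, since a string is pure ASCII exactly when none of its bytes exceeds 0x7F.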