Parse hex string to image/color


How can I parse an RGB color in web color format (3 or 6 hex digits) into a Color from image/color? Does Go have a built-in parser for that? I want to be able to parse both the #XXXXXX and #XXX color formats. The color docs say nothing about it: https://golang.org/pkg/image/color/ but this task is very common, so I believe Go has some function for it (which I just didn't find).


Update: I created a small Go library based on the accepted answer: github.com/g4s8/hexcolor


Solution

  • Foreword: I released this utility (the 2. Fast solution) in github.com/icza/gox, see colorx.ParseHexColor().


    1. Elegant solution

    Here's another solution using fmt.Sscanf(). It is certainly not the fastest solution, but it is elegant. It scans right into the fields of a color.RGBA struct:

    func ParseHexColor(s string) (c color.RGBA, err error) {
        c.A = 0xff
        switch len(s) {
        case 7:
            _, err = fmt.Sscanf(s, "#%02x%02x%02x", &c.R, &c.G, &c.B)
        case 4:
            _, err = fmt.Sscanf(s, "#%1x%1x%1x", &c.R, &c.G, &c.B)
            // Double the hex digits:
            c.R *= 17
            c.G *= 17
            c.B *= 17
        default:
            err = fmt.Errorf("invalid length, must be 7 or 4")
        }
        return
    }
    

    Testing it:

    hexCols := []string{
        "#112233",
        "#123",
        "#000233",
        "#023",
        "invalid",
        "#abcd",
        "#-12",
    }
    for _, hc := range hexCols {
        c, err := ParseHexColor(hc)
        fmt.Printf("%-7s = %3v, %v\n", hc, c, err)
    }
    

    Output (try it on the Go Playground):

    #112233 = { 17  34  51 255}, <nil>
    #123    = { 17  34  51 255}, <nil>
    #000233 = {  0   2  51 255}, <nil>
    #023    = {  0  34  51 255}, <nil>
    invalid = {  0   0   0 255}, input does not match format
    #abcd   = {  0   0   0 255}, invalid length, must be 7 or 4
    #-12    = {  0   0   0 255}, expected integer
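
The multiplication by 17 in the short-format branch may deserve a word: a single hex digit d (0..15) has to become the byte whose two hex digits are both d, and d*17 = d*16 + d, which is the same as d<<4 | d because the two halves never overlap. A minimal sketch that checks this for all sixteen digits (the helper name doubleHexDigit is made up for illustration):

```go
package main

import "fmt"

// doubleHexDigit (hypothetical helper) expands a single hex digit (0..15)
// into the byte whose two hex digits are both that digit, e.g. 0xa -> 0xaa.
func doubleHexDigit(d uint8) uint8 {
	return d<<4 | d
}

func main() {
	for d := 0; d < 16; d++ {
		viaShift := doubleHexDigit(uint8(d))
		viaMul := uint8(d) * 17 // 15*17 = 255, so this never overflows a byte
		fmt.Printf("%x -> %02x (times 17: %02x)\n", d, viaShift, viaMul)
	}
}
```

Both columns agree for every digit, which is why the short form can reuse a plain multiplication.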
    

    2. Fast solution

    If performance matters, fmt.Sscanf() is a poor choice. The implementation has to parse the format string, parse the input according to it, and use reflection to store the results in the pointed-to values.

    Since the task is basically just parsing a hexadecimal value, we can do better than this. We don't even have to call into a general hex parsing library (such as encoding/hex); we can do it on our own. We don't have to treat the input as a string, or even as a series of runes; we can drop down to treating it as a series of bytes. Go stores string values as UTF-8 byte sequences in memory, and if the input is a valid color string, all of its bytes are in the range 0..127, where each character is encoded as exactly one byte. If that is not the case, the input is invalid, which we will detect, and the color we return in that case does not matter.

    Now let's see a fast implementation:

    var errInvalidFormat = errors.New("invalid format")
    
    func ParseHexColorFast(s string) (c color.RGBA, err error) {
        c.A = 0xff
    
        // Check the length first: indexing s[0] on an empty string would panic.
        if len(s) == 0 || s[0] != '#' {
            return c, errInvalidFormat
        }
    
        hexToByte := func(b byte) byte {
            switch {
            case b >= '0' && b <= '9':
                return b - '0'
            case b >= 'a' && b <= 'f':
                return b - 'a' + 10
            case b >= 'A' && b <= 'F':
                return b - 'A' + 10
            }
            err = errInvalidFormat
            return 0
        }
    
        switch len(s) {
        case 7:
            c.R = hexToByte(s[1])<<4 + hexToByte(s[2])
            c.G = hexToByte(s[3])<<4 + hexToByte(s[4])
            c.B = hexToByte(s[5])<<4 + hexToByte(s[6])
        case 4:
            c.R = hexToByte(s[1]) * 17
            c.G = hexToByte(s[2]) * 17
            c.B = hexToByte(s[3]) * 17
        default:
            err = errInvalidFormat
        }
        return
    }
    

    Testing it with the same inputs as in the first example, the output is (try it on the Go Playground):

    #112233 = { 17  34  51 255}, <nil>
    #123    = { 17  34  51 255}, <nil>
    #000233 = {  0   2  51 255}, <nil>
    #023    = {  0  34  51 255}, <nil>
    invalid = {  0   0   0 255}, invalid format
    #abcd   = {  0   0   0 255}, invalid format
    #-12    = {  0  17  34 255}, invalid format
    

    3. Benchmarks

    Let's benchmark these two solutions. The benchmarking code calls each of them with both the long and the short format. The error case is excluded.

    func BenchmarkParseHexColor(b *testing.B) {
        for i := 0; i < b.N; i++ {
            ParseHexColor("#112233")
            ParseHexColor("#123")
        }
    }
    
    func BenchmarkParseHexColorFast(b *testing.B) {
        for i := 0; i < b.N; i++ {
            ParseHexColorFast("#112233")
            ParseHexColorFast("#123")
        }
    }
    

    And here are the benchmark results:

    go test -bench . -benchmem
    
    BenchmarkParseHexColor-4         500000     2557 ns/op      144 B/op    9 allocs/op
    BenchmarkParseHexColorFast-4   100000000      10.3 ns/op      0 B/op    0 allocs/op
    

    As we can see, the "fast" solution is roughly 250 times faster and performs no allocations (unlike the "elegant" solution).