javascript string go bitwise-operators

Go returns a different sum than JavaScript when using bitwise operators


I tried to port a function from JavaScript to Go, but I've run into a strange problem. The function is meant to sum the ASCII codes of every character in a string.

Everything works as long as the string length is <= 6. For longer strings, Go returns different results.

Original from JS

function c(e) { // e is a string
    var t = 0;
    if (!e) // if e == ""
        return t;
    for (var n = 0; n < e.length; n++) {
        t = (t << 5) - t + e.charCodeAt(n);
        t &= t; // converts t to a 32-bit signed integer
    }
    return t;
}
c("Google") // returns 2138589785
c("Google1") // returns 1871773944

Port in Go

package main

import (
    "fmt"
)

func main() {
    fmt.Println(CountChars("Google")) // returns 2138589785
    fmt.Println(CountChars("Google1")) // returns 66296283384
}

func CharCodeAt(s string) int {
    return int([]rune(s)[0])
}

func CountChars(char string) int {
    var sum int = 0
    if char == "" {
        return sum
    }
    for x:=0; x<len(char); x++ {
        charToCode := string(char[x])
        sum = (sum << 5) - sum + CharCodeAt(charToCode)
        sum &= sum
    }
    return sum
}

Go playground

JS playground in playcode


Solution

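  • JavaScript's bitwise operators (such as << and &) work on 32-bit signed integers, while Go's int is architecture dependent: it may be 32 or 64 bits wide, and it is 64-bit on the Go Playground. Each iteration computes (t << 5) - t, which is t * 31, so the sum grows by about 5 bits per character. Starting from a 7-bit ASCII code, the seventh character pushes it to roughly 7 + 6*5 = 37 bits, which overflows 32 bits in JavaScript (where t &= t truncates the result back to 32 bits) but not yet in a 64-bit Go int.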

    Use explicit 32-bit integers (int32) to have the same output as in Javascript:

    func CountChars(char string) int32 {
        var sum int32 = 0
        if char == "" {
            return sum
        }
        for x := 0; x < len(char); x++ {
            sum = (sum << 5) - sum + int32(char[x])
            sum &= sum
        }
        return sum
    }
    

    This way output will be the same as that of Javascript (try it on the Go Playground):

    2138589785
    1871773944
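
To make the width difference concrete, here is a minimal sketch that runs the same loop with an explicit 64-bit and a 32-bit accumulator. The names hash64 and hash32 are hypothetical and used only for illustration. Signed integer arithmetic in Go wraps around on overflow, so truncating the 64-bit result to int32 should give the same value as the 32-bit loop:

```go
package main

import "fmt"

// hash64 mirrors the question's loop with an explicit 64-bit accumulator
// (hypothetical helper, equivalent to using int on a 64-bit platform).
func hash64(s string) (sum int64) {
	for i := 0; i < len(s); i++ {
		sum = (sum << 5) - sum + int64(s[i])
	}
	return
}

// hash32 is the same loop with a 32-bit accumulator that wraps on overflow.
func hash32(s string) (sum int32) {
	for i := 0; i < len(s); i++ {
		sum = (sum << 5) - sum + int32(s[i])
	}
	return
}

func main() {
	for _, s := range []string{"Google", "Google1"} {
		wide := hash64(s)
		// Columns: input, 64-bit result, 64-bit result truncated, 32-bit result.
		fmt.Println(s, wide, int32(wide), hash32(s))
	}
	// Output:
	// Google 2138589785 2138589785 2138589785
	// Google1 66296283384 1871773944 1871773944
}
```

For "Google1" the 64-bit value is the 66296283384 from the question, and its low 32 bits are the 1871773944 that JavaScript returns.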
    

    Also note that Go stores strings in memory as their UTF-8 byte sequences, and indexing a string (like char[x]) indexes its bytes, not its characters. This is fine in your example because every input character is encoded as a single byte, but you'll get a different result if the input contains multi-byte characters.

    To handle multi-byte characters properly, use a simple for range over the string: it yields the successive runes of the string, and since rune is an alias for int32, you get the code points you need with no conversion.

    The check for the empty string is also unnecessary: if the string is empty, the loop body is never executed. And sum &= sum is a no-op in Go once sum is an int32 (in JavaScript it is the step that truncates the value to 32 bits), so simply remove it.

    The simplified version:

    func CountChars(s string) (sum int32) {
        for _, r := range s {
            sum = (sum << 5) - sum + r
        }
        return
    }
    

    Testing it:

    fmt.Println(CountChars("Google 世界"))
    

    This outputs the same value as the JavaScript version (try this one on the Go Playground):

    -815903459
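
One caveat worth adding: JavaScript's charCodeAt returns UTF-16 code units, not code points. The two agree for characters in the Basic Multilingual Plane (which covers "世界"), but characters outside it, such as most emoji, are two code units in JavaScript and a single rune in Go. If such input matters, a sketch like the following, which iterates over UTF-16 code units via the standard unicode/utf16 package, should track the JavaScript behavior more closely. The function names hashUTF16 and hashRunes are hypothetical:

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// hashUTF16 sums over UTF-16 code units, which is what charCodeAt sees.
func hashUTF16(s string) (sum int32) {
	for _, u := range utf16.Encode([]rune(s)) {
		sum = (sum << 5) - sum + int32(u)
	}
	return
}

// hashRunes is the for range version from above, summing over code points.
func hashRunes(s string) (sum int32) {
	for _, r := range s {
		sum = (sum << 5) - sum + r
	}
	return
}

func main() {
	fmt.Println(hashUTF16("Google")) // 2138589785

	// BMP-only input: code units and code points coincide.
	fmt.Println(hashRunes("Google 世界") == hashUTF16("Google 世界")) // true

	// U+1F600 is one rune but two UTF-16 code units (a surrogate pair).
	fmt.Println(hashRunes("😀"), hashUTF16("😀")) // 128512 1772899
}
```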