Am I doing something wrong or is this a bug in Go's C compiler?


I'm porting xxhash from cgo to Go's native Plan 9 C compiler (6c), but I'm running into a rather weird problem.

The hash function works perfectly when called through cgo, but the "native" version returns the wrong hash.

I know enough C to get it working, but before reporting the issue, I want to make sure I'm not doing anything wrong.

gist

xxhash.go:

package main

import (
    "fmt"
    "unsafe"
)

//#include "xxhash_9p.c"
//import "C" //uncomment this and comment out the next line for the cgo version
func XXH32_test(in unsafe.Pointer, l uint32, seed uint32) uint32 //body is implemented in xxhash_9p.c


func GoXXH32(in []byte, seed uint32) (h uint32) {
    //omitted, full version in the gist above
}

func main() {
    b := []byte("ABCDEFGLAALSDLSD:LSDL:DL:DL:SDL:SL:DSL:DL:DSL:DL:{W{EOQWExzghp[[")
    fmt.Println(XXH32_test(unsafe.Pointer(&b[0]), uint32(len(b)), 0)) //comment this out and uncomment the next line for the cgo version
    //fmt.Println(C.XXH32_test(unsafe.Pointer(&b[0]), C.uint(len(b)), 0))
    fmt.Println(GoXXH32(b, 0)) //this is tested against the C implementation and it's the right hash.
}

xxhash_9p.c:

#define PRIME32_1   2654435761U
#define PRIME32_2   2246822519U
#define PRIME32_3   3266489917U
#define PRIME32_4    668265263U
#define PRIME32_5    374761393U

#define U32 unsigned int
typedef struct _U32_S { U32 v; } U32_S;
#define A32(x) (((U32_S *)(x))->v)

U32 ·XXH32_test(const void* input, U32 len, U32 seed) {
//static U32 XXH32_test(const void* input, U32 len, U32 seed) {
    const char* p = (const char*)input;
    const char* bEnd = p + len;
    U32 h32;

    #define XXH_get32bits(p) A32(p)
    #define XXH_rotl32(x,r) ((x << r) | (x >> (32 - r)))

    if (len>=16) {
        const char* const limit = bEnd - 16;
        U32 v1 = seed + PRIME32_1 + PRIME32_2;
        U32 v2 = seed + PRIME32_2;
        U32 v3 = seed + 0;
        U32 v4 = seed - PRIME32_1;
        do
        {
            v1 += XXH_get32bits(p) * PRIME32_2; v1 = XXH_rotl32(v1, 13); v1 *= PRIME32_1; p+=4;
            v2 += XXH_get32bits(p) * PRIME32_2; v2 = XXH_rotl32(v2, 13); v2 *= PRIME32_1; p+=4;
            v3 += XXH_get32bits(p) * PRIME32_2; v3 = XXH_rotl32(v3, 13); v3 *= PRIME32_1; p+=4;
            v4 += XXH_get32bits(p) * PRIME32_2; v4 = XXH_rotl32(v4, 13); v4 *= PRIME32_1; p+=4;
        } while (p<=limit);

        h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) + XXH_rotl32(v4, 18);
    }
    else
    {
        h32  = seed + PRIME32_5;
    }

    h32 += (unsigned long) len;
    while (p<=bEnd-4) {
        h32 += XXH_get32bits(p) * PRIME32_3;
        h32  = XXH_rotl32(h32, 17) * PRIME32_4 ;
        p+=4;
    }

    while (p<bEnd) {
        h32 += (*p) * PRIME32_5;
        h32 = XXH_rotl32(h32, 11) * PRIME32_1 ;
        p++;
    }

    h32 ^= h32 >> 15;
    h32 *= PRIME32_2;
    h32 ^= h32 >> 13;
    h32 *= PRIME32_3;
    h32 ^= h32 >> 16;
    return h32;
}

Run:

$ go build && ./nocgo #9p native
134316512
981225178
$ go build && ./nocgo #cgo
981225178
981225178

TL;DR:

A C function returns the wrong value when compiled with Go's 6c, but the exact same function returns the correct value when called through cgo.

//edit

I got a response on the issue: it's not going to be fixed, and the Plan 9 C toolchain will eventually go away.

From mi...@golang.org:

the C compiler will eventually go away. Plan for that, so don't rely on it.

Note the Plan 9 C compiler isn't fully ANSI compliant, and we're not going to fix bugs in it (because we control both the compiler and its input, we will just workaround its bugs).


Solution

  • After some digging, changing the function signature from

    U32 ·XXH32_test(const void* input, U32 len, U32 seed)
    

    to

    void ·XXH32_test(const unsigned char* input, U32 len, U32 seed, U32 *ret)
    

    and calling it like this:

    var u uint32
    XXH32_test(unsafe.Pointer(&b[0]), uint32(len(b)), 0, &u)
    

    returns the correct hash.

    I'm still not sure what's going on. The original signature should have worked, but I'm guessing the runtime is doing some magic behind the scenes.
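
Since the reply quoted in the question says the C compiler is going away, another route is to drop the C file and keep the hash in pure Go. Below is a minimal sketch that ports the C function from the question line by line. It is not the GoXXH32 from the gist; the name xxh32 and the helper round are made up for this example, and it assumes little-endian reads, which is what the A32 macro does on x86.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// Same constants as PRIME32_1..PRIME32_5 in xxhash_9p.c.
const (
	prime1 uint32 = 2654435761
	prime2 uint32 = 2246822519
	prime3 uint32 = 3266489917
	prime4 uint32 = 668265263
	prime5 uint32 = 374761393
)

// round mixes one 32-bit lane into an accumulator (one line of the do/while body).
func round(acc, lane uint32) uint32 {
	acc += lane * prime2
	return bits.RotateLeft32(acc, 13) * prime1
}

// xxh32 is a hypothetical pure-Go port of XXH32_test from the question.
func xxh32(in []byte, seed uint32) uint32 {
	p := in
	var h32 uint32

	if len(p) >= 16 {
		v1 := seed + prime1 + prime2
		v2 := seed + prime2
		v3 := seed
		v4 := seed - prime1
		// Consume 16 bytes per iteration, four little-endian lanes at a time.
		for len(p) >= 16 {
			v1 = round(v1, binary.LittleEndian.Uint32(p[0:4]))
			v2 = round(v2, binary.LittleEndian.Uint32(p[4:8]))
			v3 = round(v3, binary.LittleEndian.Uint32(p[8:12]))
			v4 = round(v4, binary.LittleEndian.Uint32(p[12:16]))
			p = p[16:]
		}
		h32 = bits.RotateLeft32(v1, 1) + bits.RotateLeft32(v2, 7) +
			bits.RotateLeft32(v3, 12) + bits.RotateLeft32(v4, 18)
	} else {
		h32 = seed + prime5
	}

	h32 += uint32(len(in))

	// Remaining full 4-byte words.
	for len(p) >= 4 {
		h32 += binary.LittleEndian.Uint32(p) * prime3
		h32 = bits.RotateLeft32(h32, 17) * prime4
		p = p[4:]
	}

	// Remaining tail bytes, treated as unsigned.
	for _, b := range p {
		h32 += uint32(b) * prime5
		h32 = bits.RotateLeft32(h32, 11) * prime1
	}

	// Final avalanche.
	h32 ^= h32 >> 15
	h32 *= prime2
	h32 ^= h32 >> 13
	h32 *= prime3
	h32 ^= h32 >> 16
	return h32
}

func main() {
	fmt.Printf("%08x\n", xxh32(nil, 0)) // prints 02cc5d05, the XXH32 of empty input with seed 0

	b := []byte("ABCDEFGLAALSDLSD:LSDL:DL:DL:SDL:SL:DSL:DL:DSL:DL:{W{EOQWExzghp[[")
	fmt.Println(xxh32(b, 0)) // same input as main() in the question
}
```

This takes a []byte instead of an unsafe.Pointer plus a length, so nothing crosses the Go/C boundary and the return value is an ordinary Go result.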