Go - How to test a siphash 2-4 function with known vectors


I have a sipHashDigest function that calculates a SipHash-2-4 digest with this Go library. As you can see, the method normalizes the key to 16 bytes, splits the randomBytes and calculates the digest. How can I test this function to check that the SipHash value is calculated correctly? Are there known test vectors I can use? I was using this to check the results, but my implementation does not produce the same values. What am I doing wrong?

Thanks for the support.

// normalizeKey ensures the key is exactly 16 bytes long by truncating or padding with zeros if necessary.
func normalizeKey(key string) []byte {
    const keySize = 16
    if len(key) > keySize {
        return []byte(key[:keySize])
    }
    if len(key) < keySize {
        paddedKey := make([]byte, keySize)
        copy(paddedKey, key)
        return paddedKey
    }
    return []byte(key)
}

func splitKey(key []byte) (uint64, uint64) {
    key0 := binary.LittleEndian.Uint64(key[:8])
    key1 := binary.LittleEndian.Uint64(key[8:])
    return key0, key1
}

func sipHashDigest(randomBytes []byte, key string) uint64 {
    normalizedKey := normalizeKey(key)
    key0, key1 := splitKey(randomBytes)
    return siphash.Hash(key0, key1, []byte(normalizedKey))
}

Solution

  • Please note that what you call normalizedKey is actually the message (which can be of any length), and what you call randomBytes is the 128-bit key.
    Good documentation of the Go siphash library can be found here.

    Wikipedia lists this C implementation as the reference implementation. It contains a file vectors.h with test vectors, e.g. the 64 test vectors vectors_sip64 for SipHash-2-4. More precisely, these are the expected hash values; the key and input values can be found in the file test.c:

    test vectors:

    key: 0x000102030405060708090a0b0c0d0e0f
    
     #   Hash                 Input
     0   0x310e0edd47db6f72   <empty>
     1   0xfd67dc93c539f874   0x00
    ... 
    16   0xdb9bc2577fcc2a3f   0x000102030405060708090a0b0c0d0e0f
    ...
    63   0x724506eb4c328a95   0x000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e
    

    The Go library you are using satisfies these test vectors (e.g. for the test cases 0, 1, 16, 63):

    package main
    
    import (
        "encoding/binary"
        "encoding/hex"
        "fmt"
    
        "github.com/dchest/siphash"
    )
    
    func main() {
        key, _ := hex.DecodeString("000102030405060708090a0b0c0d0e0f")
    
        message, _ := hex.DecodeString("")
        hash := sipHashDigest(key, message)
        fmt.Println(toBytes(hash)) // 310e0edd47db6f72
    
        message, _ = hex.DecodeString("00")
        hash = sipHashDigest(key, message)
        fmt.Println(toBytes(hash)) // fd67dc93c539f874
    
        message, _ = hex.DecodeString("000102030405060708090a0b0c0d0e0f")
        hash = sipHashDigest(key, message)
        fmt.Println(toBytes(hash)) // db9bc2577fcc2a3f
    
        message, _ = hex.DecodeString("000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e")
        hash = sipHashDigest(key, message)
        fmt.Println(toBytes(hash)) // 724506eb4c328a95
    }
    
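    // splitKey reads a 16-byte key as two little-endian 64-bit halves.
    func splitKey(key []byte) (uint64, uint64) {
        key0 := binary.LittleEndian.Uint64(key[:8])
        key1 := binary.LittleEndian.Uint64(key[8:])
        return key0, key1
    }
    
    // sipHashDigest computes SipHash-2-4 of message; randomBytes is the 16-byte key.
    func sipHashDigest(randomBytes []byte, message []byte) uint64 {
        key0, key1 := splitKey(randomBytes)
        return siphash.Hash(key0, key1, message)
    }
    
    // toBytes returns the hash as a hex string of its 8 bytes in little-endian order,
    // which is the byte order in which the test vectors above are listed.
    func toBytes(data uint64) string {
        dataBytes := make([]byte, 8)
        binary.LittleEndian.PutUint64(dataBytes, data)
        return hex.EncodeToString(dataBytes)
    }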
    

    Here, splitKey() and sipHashDigest() are essentially the functions you posted. For the test, sipHashDigest() takes the message directly as a byte slice.


    The online site you are using seems to accept only ASCII-encoded messages and keys. The following data can be used for a comparison:

    key := []byte("0123456789012345")
    message := []byte("The quick brown fox jumps over the lazy dog")
    hash := sipHashDigest(key, message)
    fmt.Println(toBytes(hash)) // 654cd7fbec56953a
    

    The result of the online site for the same data is 0x3a9556ecfbd74c65, which differs only in endianness: it is the same 8 bytes in reverse order.
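
    The relationship between the two representations can be shown with a short sketch. It assumes that the site prints the 64-bit hash as a number (most significant byte first), while toBytes() above prints its bytes in little-endian order. The helper names hexLittleEndian and hexBigEndian are made up for this illustration.

```go
package main

import (
    "encoding/binary"
    "encoding/hex"
    "fmt"
)

// hexLittleEndian renders h as hex with the least significant byte first,
// which is what toBytes() in the answer does.
func hexLittleEndian(h uint64) string {
    b := make([]byte, 8)
    binary.LittleEndian.PutUint64(b, h)
    return hex.EncodeToString(b)
}

// hexBigEndian renders h as hex with the most significant byte first,
// i.e. the way the number itself is usually written.
func hexBigEndian(h uint64) string {
    b := make([]byte, 8)
    binary.BigEndian.PutUint64(b, h)
    return hex.EncodeToString(b)
}

func main() {
    const h uint64 = 0x3a9556ecfbd74c65 // value reported by the online site
    fmt.Println(hexLittleEndian(h))     // 654cd7fbec56953a
    fmt.Println(hexBigEndian(h))        // 3a9556ecfbd74c65
}
```

    When comparing against an external tool, it is therefore worth checking first which of the two byte orders the tool displays.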


    Regarding the function normalizeKey():
    The purpose of normalizeKey() is not entirely clear to me. Since SipHash does not limit the length of the message, it is not necessary to apply normalizeKey() to the message (other requirements may make this necessary, but SipHash itself does not).
    If (as the name suggests) normalizeKey() is intended to generate a 16-byte key for SipHash from an arbitrary string: it is more secure to use a random 16-byte sequence as the key instead of a string (if the key material is a password, a key derivation function should be applied).
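
    As a rough sketch of the random-key suggestion, the following generates a 16-byte key with crypto/rand and rejects keys of the wrong length instead of truncating or padding them. The names newRandomKey and keyHalves are hypothetical; the strict length check is one possible design choice, not something SipHash or the library prescribes.

```go
package main

import (
    "crypto/rand"
    "encoding/binary"
    "errors"
    "fmt"
)

const keySize = 16 // SipHash uses a 128-bit key

// newRandomKey (hypothetical helper) returns 16 bytes from the operating
// system's cryptographically secure random number generator.
func newRandomKey() ([]byte, error) {
    key := make([]byte, keySize)
    if _, err := rand.Read(key); err != nil {
        return nil, err
    }
    return key, nil
}

// keyHalves (hypothetical helper) splits a key into two little-endian
// 64-bit halves and returns an error if the key is not exactly 16 bytes,
// rather than silently truncating or zero-padding it.
func keyHalves(key []byte) (uint64, uint64, error) {
    if len(key) != keySize {
        return 0, 0, errors.New("siphash key must be exactly 16 bytes")
    }
    k0 := binary.LittleEndian.Uint64(key[:8])
    k1 := binary.LittleEndian.Uint64(key[8:])
    return k0, k1, nil
}

func main() {
    key, err := newRandomKey()
    if err != nil {
        panic(err)
    }
    k0, k1, err := keyHalves(key)
    if err != nil {
        panic(err)
    }
    fmt.Printf("k0=%016x k1=%016x\n", k0, k1)
}
```

    The two halves can then be passed to siphash.Hash() in the same way as the result of splitKey().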