Search code examples
gobit-manipulationbitwise-operators

Is it possible to find duplicate characters in a random unicode string using the bitwise operations?


I was looking for a solution to find duplicate characters in a string and I was interested in a solution with bitwise operations.

I found such a variant with bitwise operations. But in it, the search occurs in the range a-z of the ASCII table.

func HasDuplicates(str string) (string, bool) {
    checker := 0
    for _, char := range str {
        val := char - 'a'
        fmt.Println(val)
        if (checker & (1 << val)) > 0 {
            fmt.Printf("'%c' is Duplicate\n", char)
            return str, false
        }
        checker |= 1 << val
    }
    return str, true
}

Is it possible to make a universal solution, like the example above, only for a random unicode string (hieroglyphs, emoji, etc.)?


Solution

  • Use a big.Int as a bitset:

    func HasDuplicates(str string) (string, bool) {
        var bits big.Int
        for _, char := range str {
            val := int(char)
            fmt.Println(val)
            if bits.Bit(val) != 0 {
                fmt.Printf("'%c' is Duplicate\n", char)
                return str, false
            }
            bits.SetBit(&bits, val, 1)
        }
        return str, true
    }
    

    https://go.dev/play/p/kS-OxYPts5G

    How efficient this is will depend on the implementation of big.Int, you're not in control of that the way you are when using bitwise operations on a simple integer.

    You could also use a map of booleans, although then it wouldn't be bitwise operations any more:

    func HasDuplicates(str string) (string, bool) {
        var bits = make(map[int]bool)
        for _, char := range str {
            val := int(char)
            fmt.Println(val)
            if bits[val] {
                fmt.Printf("'%c' is Duplicate\n", char)
                return str, false
            }
            bits[val] = true
        }
        return str, true
    }