Tags: swift, set, hashable

Swift: Using the ObjectIdentifier of a class for the Hashable protocol results in random behaviour in the Set.contains method. What is wrong with the code?


I have a small number of instances of a custom class stored in a set. I need to check if a certain element is contained in that set. The criteria for a match must be the object's ID, not its content.

For simplification assume a class with an integer var as the only property, and two different instances of that class, both holding the number 1.

Directly comparing those instances with == should return true, but when a reference to the first is stored in the set, asking whether the set contains a reference to the second should return false.

Therefore I use the ObjectIdentifier of the object to generate the hash required by the Hashable protocol.

It is my understanding that the .contains method of a Swift Set uses the hash value first, and that in case of a hash collision the == operator from Equatable is used as a fallback.

But in the following code, which can run in a playground, I get random results:

class MyClass: Hashable {
    var number: Int
    init(_ number: Int) {
        self.number = number
    }
    static func == (lhs: MyClass, rhs: MyClass) -> Bool {
        return lhs.number == rhs.number
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(self))
    }
}

var mySet: Set<MyClass> = []

let number1 = MyClass(1)
let secondNumber1 = MyClass(1)

number1 == secondNumber1        // true: integer values are equal, so are the wrapping classes
number1 === secondNumber1       // false: two different instances

mySet.insert(number1)

mySet.contains(number1)         // true
mySet.contains(secondNumber1)   // should be false but randomly changes between runs

If you run the above code in an Xcode Playground and manually restart playground execution, the last line gives different results from run to run. The desired behaviour is to get "false" every time.

So what would be the correct way to achieve the described behaviour?


Solution

  • Simply put, Set relies on both func hash(into hasher: inout Hasher) and ==, and the two must agree: whenever a == b is true, a and b must also produce the same hash. In your case, equality is value-based (dependent on self.number), whereas your hash is identity-based. That violates the Hashable contract.

    Your mySet.contains(secondNumber1) line gives inconsistent results because secondNumber1 might have a hash collision with number1. Whether a collision occurs varies between runs, because Swift seeds its hasher randomly per process to defend against hash-flooding denial-of-service attacks. If a collision does occur, your equality operator (==) falsely identifies number1 as a match for secondNumber1.

    Instead, what you could do is implement a wrapper struct that implements equality and hashing based on an object's identity. The object itself could have its own value-based equality and hash, for other purposes.

    struct IdentityWrapper<T: AnyObject> {
        let object: T
    
        init(_ object: T) { self.object = object }
    }
    
    extension IdentityWrapper: Equatable {
        static func == (lhs: IdentityWrapper, rhs: IdentityWrapper) -> Bool {
            return lhs.object === rhs.object
        }
    }
    
    extension IdentityWrapper: Hashable {
        func hash(into hasher: inout Hasher) {
            hasher.combine(ObjectIdentifier(self.object))
        }
    }
    

    Using the IdentityWrapper in a set requires you to manually wrap objects before interacting with the set. It's performant (since structs don't need any heap allocation, and the struct is most likely entirely inlined anyway), but it can be a little annoying. Optionally, you could implement a struct IdentitySet<T> which just wraps a Set<IdentityWrapper<T>> and tucks away the wrapping code.

    class MyClass: Hashable {
        var number: Int
    
        init(_ number: Int) {
            self.number = number
        }
    
        // Value-based equality
        static func == (lhs: MyClass, rhs: MyClass) -> Bool {
            return lhs.number == rhs.number
        }
    
        // Value-based hashing
        func hash(into hasher: inout Hasher) {
            hasher.combine(self.number)
        }
    }
    
    var mySet: Set<IdentityWrapper<MyClass>> = []
    
    let number1 = MyClass(1)
    let secondNumber1 = MyClass(1)
    
    number1 == secondNumber1        // true: integer values are equal, so are the wrapping classes
    number1 === secondNumber1       // false: two different instances
    
    mySet.insert(IdentityWrapper(number1))
    
    print(mySet.contains(IdentityWrapper(number1))) // true
    print(mySet.contains(IdentityWrapper(secondNumber1))) // false
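
The IdentitySet<T> suggested above could look roughly like the sketch below. This is only one possible shape, not a standard library type: the storage property, the choice of methods (insert, remove, contains, count) and the Counter class used for the demonstration are all assumptions made for illustration. IdentityWrapper is repeated from the answer so that the block compiles on its own.

```swift
// Identity-based wrapper, repeated from the answer so this block is self-contained.
struct IdentityWrapper<T: AnyObject>: Hashable {
    let object: T

    init(_ object: T) { self.object = object }

    // Equality and hashing agree: both are based on object identity.
    static func == (lhs: IdentityWrapper, rhs: IdentityWrapper) -> Bool {
        return lhs.object === rhs.object
    }

    func hash(into hasher: inout Hasher) {
        hasher.combine(ObjectIdentifier(object))
    }
}

// Hypothetical convenience type: hides the wrapping and unwrapping from callers.
struct IdentitySet<T: AnyObject> {
    private var storage: Set<IdentityWrapper<T>> = []

    init() {}

    var count: Int { return storage.count }

    // Returns true if the object was not already in the set.
    @discardableResult
    mutating func insert(_ object: T) -> Bool {
        return storage.insert(IdentityWrapper(object)).inserted
    }

    // Returns the removed object, or nil if it was not in the set.
    @discardableResult
    mutating func remove(_ object: T) -> T? {
        return storage.remove(IdentityWrapper(object))?.object
    }

    func contains(_ object: T) -> Bool {
        return storage.contains(IdentityWrapper(object))
    }
}

// Hypothetical demo class; it does not need to be Hashable itself.
final class Counter {
    var number: Int
    init(_ number: Int) { self.number = number }
}

let first = Counter(1)
let second = Counter(1)

var seen = IdentitySet<Counter>()
seen.insert(first)

let containsFirst = seen.contains(first)    // true: same instance
let containsSecond = seen.contains(second)  // false: equal content, different instance
print(containsFirst, containsSecond)
```

Because the wrapper holds a strong reference to each stored object, the object stays alive for as long as it is in the set, so its ObjectIdentifier cannot be reused by another instance in the meantime. If the set should also behave like a Collection, the sketch would need further conformances such as Sequence, which are left out here.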