
Set's contains method returns different values at different times


I was thinking about how Swift ensures uniqueness in a Set, because I had turned one of my types from Equatable to Hashable for free, so I came up with this simple Playground:

struct SimpleStruct: Hashable {
    let string: String
    let number: Int

    static func == (lhs: SimpleStruct, rhs: SimpleStruct) -> Bool {
        let areEqual = lhs.string == rhs.string
        print(lhs, rhs, areEqual)
        return areEqual
    }
}

var set = Set<SimpleStruct>()
let first = SimpleStruct(string: "a", number: 2)
set.insert(first)

So my first question was:

Will static func == be called every time I insert a new object into the set?

My question comes from this thought:

For an Equatable type, the only way to decide whether two objects are the same is to ask for the result of static func ==.

For a Hashable type, a faster way is to compare hashValues... but, as in my case, the default implementation will use both string and number, in contrast with the == logic.

So, to test how Set behaves, I just added a print statement.

I found that sometimes I get the print statement and sometimes I don't, as if the hashValue were sometimes not enough to make this decision... So the method isn't called every time. Weird...

So I tried adding two objects that are equal, wondering what the result of set.contains would be:

let second = SimpleStruct(string: "a", number: 3)
print(first == second) // returns true
set.contains(second)

And wonder of wonders, launching the playground a couple of times, I got different results, which might lead to unpredictable behavior... Adding

var hashValue: Int {
    return string.hashValue
}

gets rid of any unexpected results, but my doubt is:

Why, without the custom hashValue implementation, does == sometimes get called and sometimes not? Shouldn't Apple avoid this kind of unexpected behavior?



Solution

  • The synthesized implementation of the Hashable requirement uses all stored properties of a struct, in your case string and number. Your implementation of == is only based on the string:

    let first = SimpleStruct(string: "a", number: 2)
    let second = SimpleStruct(string: "a", number: 3)
    
    print(first == second) // true
    print(first.hashValue == second.hashValue) // false
    

    This is a violation of a requirement of the Hashable protocol:

    Two instances that are equal must feed the same values to Hasher in hash(into:), in the same order.

    and causes undefined behavior. (And because hash values are randomly seeded as of Swift 4.2, the behavior can differ on each program run.)

    What probably happens in your test is that the hash value of second is used to determine the “bucket” of the set in which the value would be stored. That may or may not be the same bucket in which first is stored. But that is an implementation detail: undefined behavior is undefined behavior, and it can cause unexpected results or even runtime errors.
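To make the bucket idea concrete, here is a toy model of a hash-based lookup. It is only a sketch and not how Swift's Set is actually implemented: ToyKey, toyHash and ToySet are hypothetical names, and toyHash is a deterministic stand-in for the randomly seeded hash, so the outcome is reproducible. Like the synthesized hash, it mixes in both properties, while == looks at string only.

```swift
// Toy model only: NOT the real Swift.Set implementation.
// ToyKey, toyHash and ToySet are hypothetical names for illustration.
struct ToyKey: Equatable {
    let string: String
    let number: Int

    // Equality ignores `number`, as in the question.
    static func == (lhs: ToyKey, rhs: ToyKey) -> Bool {
        return lhs.string == rhs.string
    }

    // Deterministic stand-in for the synthesized hash: it mixes in BOTH
    // properties, so it disagrees with ==.
    var toyHash: Int {
        return string.utf8.reduce(number) { ($0 &* 31) &+ Int($1) }
    }
}

struct ToySet {
    private var buckets: [[ToyKey]]

    init(bucketCount: Int) {
        buckets = Array(repeating: [], count: bucketCount)
    }

    private func bucketIndex(for key: ToyKey) -> Int {
        // Mask off the sign bit so the index is never negative.
        return (key.toyHash & Int.max) % buckets.count
    }

    mutating func insert(_ key: ToyKey) {
        let index = bucketIndex(for: key)
        if !buckets[index].contains(where: { $0 == key }) {
            buckets[index].append(key)
        }
    }

    func contains(_ key: ToyKey) -> Bool {
        // Only the elements of one bucket are ever compared with ==.
        return buckets[bucketIndex(for: key)].contains(where: { $0 == key })
    }
}

let first = ToyKey(string: "a", number: 2)
let second = ToyKey(string: "a", number: 3)

// With 8 buckets the two keys land in different buckets,
// so == is never asked and the lookup fails.
var spread = ToySet(bucketCount: 8)
spread.insert(first)
let foundWhenSpread = spread.contains(second)

// With a single bucket both keys share it, so == runs and matches.
var single = ToySet(bucketCount: 1)
single.insert(first)
let foundWhenSingle = single.contains(second)

print(foundWhenSpread, foundWhenSingle)
```

In this model the same pair of "equal" values is found or not found depending purely on where the hash happens to place them, which mirrors the run-to-run differences seen in the playground.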

    Implementing

    var hashValue: Int {
        return string.hashValue
    }
    

    or alternatively (starting with Swift 4.2, where hash(into:) is the recommended way to conform to Hashable)

    func hash(into hasher: inout Hasher) {
        hasher.combine(string)
    }
    

    fixes the rule violation, and therefore makes your code behave as expected.
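Putting the pieces together, a minimal sketch of the corrected type might look like the following. It reuses the names from the question and assumes that number is deliberately excluded from both equality and hashing.

```swift
struct SimpleStruct: Hashable {
    let string: String
    let number: Int

    // Equality is based on `string` only.
    static func == (lhs: SimpleStruct, rhs: SimpleStruct) -> Bool {
        return lhs.string == rhs.string
    }

    // Hash exactly the properties that == compares, so equal values
    // always feed the same data to the hasher.
    func hash(into hasher: inout Hasher) {
        hasher.combine(string)
    }
}

let first = SimpleStruct(string: "a", number: 2)
let second = SimpleStruct(string: "a", number: 3)

var set = Set<SimpleStruct>()
set.insert(first)

// Equal values now hash equally within a single program run.
let sameHash = first.hashValue == second.hashValue
// The lookup no longer depends on the random hash seed.
let found = set.contains(second)
// Inserting an equal element is rejected as a duplicate.
let inserted = set.insert(second).inserted

print(sameHash, found, inserted)
```

The hash values themselves still change between runs, but because equal values now always hash alike, contains gives the same answer every time.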