My app uses mutable sets of custom elements. Once I had a crash with error „Duplicate element found in Set. Elements may have been mutated after insertion.“
Searching for an explanation, I found this post, which I don’t fully understand.
My impression is that one should not modify an element of a set, since this would modify also the hash value of the set, so that further accesses might fail.
My questions:
EDIT:
To phrase it differently: Is it safe to modify the property of a custom element of a mutable set, without modifying the set itself?
The implementation of Swift sets is similar to that of dictionaries, which is described nicely in Exploring Swift Dictionary's Implementation. In particular, the element storage is a list of “buckets“ each of which can be occupied or not. When a new element is inserted into the set, its hash value is used to determine the initial bucket. If that bucket is occupied, a linear search for the next free bucket is done. Similarly, when searching an element in the set, the hash value is used for determining the initial bucket, and then a linear search is done until the element (or an unoccupied bucket) is found.
(The details can be found in the open source implementation, the most relevant source files are Set.swift, NativeSet.swift, SetStorage.swift and HashTable.swift.)
Mutating the hash value of an inserted element breaks the invariants of the set storage implementation: Locating an element via its initial bucket no longer works. And mutating other properties which affect equality can lead to multiple “equal” elements in the same bucket list.
Therefore I think it is safe to say that
After inserting an instance of a reference type into a set, the properties of that instance must not be modified in a way that affects its hash value or testing for equality.
Examples
First, this is only a problem for sets of a reference type. A set of a value type contains independent copies of the value, and modifying a property of that value after the insertion does not affect the set:
struct Foo: Hashable {
var x: Int
}
var set = Set<Foo>()
var foo = Foo(x: 1)
set.insert(foo)
print(set.map { $0.x }) // [1]
foo.x = 2
print(set.map { $0.x }) // [1]
set.insert(foo)
print(set.map { $0.x }) // [1, 2]
Instances of a reference type are “pointers” to the actual object storage, and modifying a property of that instance does not mutate the reference. It is therefore possible to modify a property of an instance after it has been inserted into a set:
class Bar: Hashable {
var x : Int
init(x: Int) { self.x = x }
static func == (lhs: Bar, rhs: Bar) -> Bool { return lhs.x == rhs.x }
func hash(into hasher: inout Hasher) { hasher.combine(x) }
}
var set = Set<Bar>()
let bar = Bar(x: 1)
set.insert(bar)
print(set.map { $0.x }) // [1]
bar.x = 2
print(set.map { $0.x }) // [2]
However, this leads to crashes easily, e.g. if we insert the same reference again:
set.insert(bar)
Fatal error: Duplicate elements of type 'Bar' were found in a Set. This usually means either that the type violates Hashable's requirements, or that members of such a set were mutated after insertion.
Here is another example, where the hash value is identical for all instances, but modifying a property which is used for equality testing leads to a set of two “equal” instances:
class Baz: Hashable {
var x : Int
init(x: Int) { self.x = x }
static func == (lhs: Baz, rhs: Baz) -> Bool { return lhs.x == rhs.x }
func hash(into hasher: inout Hasher) { }
}
var set = Set<Baz>()
let baz1 = Baz(x: 1)
set.insert(baz1)
let baz2 = Baz(x: 2)
set.insert(baz2)
baz1.x = 2
print(set.map { $0.x }) // [2, 2]
print(set.count) // 2
print(Set(Array(set)).count) // 1 😲