Search code examples
swiftperformancedictionarylogarithmindexed

Is Swift dictionary of indexed for performance? Even for exotic types (UUID)?


I want to construct some arrays that will remain in order to get fast searches. If I use something like this:

let dictionary: [Int:Int] = [:]
for i in 0 ..< 10000000 {
    dictionary[i] = 0
}

Would the query:

dictionary[n] == nil

be performed in logarithmic time?

If yes, is it the same for other types: Float, Double, String.

And finally, I need it to work with the UUID type, will it work?


Solution

  • Swift's Dictionary is implemented with a hash table, therefore lookups will typically be done in O(1) time, assuming a good hashing algorithm. As @rmaddy says, this therefore means that the type of key is mainly irrelevant – the only thing that really matters is whether it has a good implementation of hashValue, which you shouldn't have to worry about at all for standard library types.

    The worst case time complexity for a hash table lookup (assuming maximal hash collisions) depends on how the hash table resolves the collisions. In the case of Dictionary, two storage schemes can be used, native or Cocoa.

    From HashedCollections.swift.gyb:

    In normal usage, you can expect the backing storage of a Dictionary to be a NativeStorage.

    [...]

    Native storage is a hash table with open addressing and linear probing. The bucket array forms a logical ring (e.g., a chain can wrap around the end of buckets array to the beginning of it).

    Therefore the worst case lookup time complexity is O(n), as if each element has the same hash, potentially every element will have to be searched through in order to determine if a given element exists in the table.

    For Cocoa storage, a _CocoaDictionaryBuffer wrapper is used in order to wrap an NSDictionary. Unfortunately, I cannot find any documentation on how it's implemented, although it's Core Foundation counterpart CFDictionary's header file states that:

    The access time for a value in the dictionary is guaranteed to be at worst O(lg N) for any implementation, current and future

    (Which sounds like it uses a balanced binary tree in order to handle collisions.)