Is Set.contains really much faster than Array.contains? And why?


Many people say that Set.contains outperforms Array.contains, for example: https://www.hackingwithswift.com/forums/swift/why-is-contains-so-much-faster-with-sets-compared-to-arrays/13421

But if that's true, what is behind it?

In the case of a Set, as I understand it, the hash value is calculated first. Then the program iterates over a low-level unsorted array (not a Swift Array, but something like values laid out contiguously in memory) of unique hashes until it finds a match.

In the case of an Array, the program just iterates over the array until it finds a match.

So I guess that Set.contains outperforms Array.contains only when the Array contains lots of duplicates.

But if the Array consists of unique values, its contains method should outperform the Set's.

Also, comparing hashes should take many more resources than comparing Integers or other simple values (I know that Int is a struct in Swift, but in the end it should come down to comparing 8 bytes in memory on iOS, while a hash should take many more bytes).

Could you please explain whether I'm correct or not?

Also, behind the scenes Set may use some tricks to improve search performance, such as sorted low-level arrays and binary search, which would be much faster than linear search...

Thank you.


Solution

  • Then the program iterates over a low-level unsorted array

    No. A Set is more like a dictionary with the hash value as the key and the real value as the value: the hash determines where in the storage to look, so there is no scan over all stored hashes. contains called on a Set has O(1) complexity on average, which means the lookup time does not grow with the size of the Set.

    In contrast, an Array does iterate over its items: the complexity of contains is O(n), where n is the length of the array.

    Bottom line: a Set is much more performant than an Array with regard to contains, and the gap widens as the collection grows.
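
To make the difference concrete, here is a minimal sketch that counts how many `==` comparisons each approach needs for a lookup. `ToyHashSet`, `lookup`, and `linearLookup` are hypothetical names made up for this illustration, and the bucket layout is a deliberate simplification: it is an assumption for teaching purposes, not how the standard library's Set is actually implemented. The point is only that the hash picks one small bucket to inspect, while the array has to be scanned element by element.

```swift
// Toy hash set for illustration only (hypothetical type, NOT the real
// implementation of Swift's Set). Elements are spread over buckets by hash.
struct ToyHashSet<Element: Hashable> {
    private var buckets: [[Element]]

    init(_ elements: [Element], bucketCount: Int) {
        precondition(bucketCount > 0, "bucketCount must be positive")
        buckets = Array(repeating: [], count: bucketCount)
        for element in elements {
            let i = bucketIndex(for: element)
            // Keep elements unique, as a set would.
            if !buckets[i].contains(element) {
                buckets[i].append(element)
            }
        }
    }

    // Maps the element's hash (an Int, possibly negative) to 0..<buckets.count.
    private func bucketIndex(for element: Element) -> Int {
        let remainder = element.hashValue % buckets.count
        return remainder >= 0 ? remainder : remainder + buckets.count
    }

    // Reports whether `element` is present and how many == comparisons it took.
    // Only the single bucket selected by the hash is inspected.
    func lookup(_ element: Element) -> (found: Bool, comparisons: Int) {
        var comparisons = 0
        for candidate in buckets[bucketIndex(for: element)] {
            comparisons += 1
            if candidate == element { return (true, comparisons) }
        }
        return (false, comparisons)
    }
}

// Linear scan, which is what searching an unsorted array has to do.
func linearLookup<Element: Equatable>(
    _ element: Element, in array: [Element]
) -> (found: Bool, comparisons: Int) {
    var comparisons = 0
    for candidate in array {
        comparisons += 1
        if candidate == element { return (true, comparisons) }
    }
    return (false, comparisons)
}

let values = Array(0..<10_000)          // unique values, no duplicates
let toySet = ToyHashSet(values, bucketCount: 16_384)
let missing = -1                        // worst case: the value is absent

let arrayResult = linearLookup(missing, in: values)
let setResult = toySet.lookup(missing)

print("array comparisons:", arrayResult.comparisons)   // one per element
print("toy set comparisons:", setResult.comparisons)   // size of one bucket

// The standard library types agree on the answer; only the cost differs.
print(values.contains(missing), Set(values).contains(missing))
```

Even with an array of unique values, the linear scan compares against every element when the value is absent, while the toy set only looks inside one bucket. The exact bucket sizes vary from run to run because Swift seeds its hashing randomly per process. Note also that `hashValue` is an `Int`, so a hash is no bigger than the 8-byte integers mentioned in the question.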