I am working on a project in Swift, using a dictionary
.
This dictionary is of the type [String : [Posting]]
. I have around 200k different "terms" (keys) to insert in it, and for each term I have around 500 to 1000 objects to append in a list. I know it's a weird practice but I don't have the choice and I must deal with all those elements.
The issue is that this is very very slow, as the dictionary gets bigger. I tried switching to a NSMutableDictionary
, no luck.
My addTerm
function is called everytime I need to insert an element :
func addTerm(_ term: String, withId id: Int, atPosition position: Int) {
if self.map[term] == nil {
self.map[term] = [Posting]()
}
if self.map[term]!.last?.documentId == id {
self.map[term]!.last?.addPosition(position)
}
else {
self.map[term]!.append(Posting(withId: id, atPosition: position, forTerm: term))
}
}
EDIT: I realized that its not the dictionary that causes all this lag, but its actually the arrays it contains. Arrays re-allocate way too much when adding new elements, and the best I could was to replace them with ContiguousArray
.
This is fairly common performance trap, as also observed in:
The issue stems from the fact that the array you're mutating in the expression self.map[term]!.append(...)
is a temporary mutable copy of the underlying array in the dictionary's storage. This means that the array is never uniquely referenced and so always has its buffer re-allocated.
This situation will fixed in Swift 5 with the unofficial introduction of generalised accessors, but until then, one solution (as mentioned in both the above Q&As) is to use Dictionary
's subscript(_:default:)
which from Swift 4.1 can mutate the value directly in storage.
Although your case isn't quite a straightforward case of applying a single mutation, so you need some kind of wrapper function in order to allow you to have scoped access to your mutable array.
For example, this could look like:
class X {
private var map: [String: [Posting]] = [:]
private func withPostings<R>(
forTerm term: String, mutations: (inout [Posting]) throws -> R
) rethrows -> R {
return try mutations(&map[term, default: []])
}
func addTerm(_ term: String, withId id: Int, atPosition position: Int) {
withPostings(forTerm: term) { postings in
if let posting = postings.last, posting.documentId == id {
posting.addPosition(position)
} else {
postings.append(Posting(withId: id, atPosition: position, forTerm: term))
}
}
}
// ...
}