Can anyone help shed some light on why the below code consumes well over 100 MB of RAM during runtime?
public struct Trie<Element : Hashable> {
private var children: [Element:Trie<Element>]
private var endHere : Bool
public init() {
children = [:]
endHere = false
}
public init<S : SequenceType where S.Generator.Element == Element>(_ seq: S) {
self.init(gen: seq.generate())
}
private init<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
(children, endHere) = ([head:Trie(gen:gen)], false)
} else {
(children, endHere) = ([:], true)
}
}
private mutating func insert<G : GeneratorType where G.Element == Element>(var gen: G) {
if let head = gen.next() {
let _ = children[head]?.insert(gen) ?? { children[head] = Trie(gen: gen) }()
} else {
endHere = true
}
}
public mutating func insert<S : SequenceType where S.Generator.Element == Element>(seq: S) {
insert(seq.generate())
}
}
var trie = Trie<UInt32>()
for i in 0..<300000 {
trie.insert([UInt32(i), UInt32(i+1), UInt32(i+2)])
}
Based on my calculations total memory consumption for the above data structure should be somewhere around the following...
3 * count * sizeof(Trie<UInt32>)
Or –
3 * 300,000 * 9 = 8,100,000 bytes = ~8 MB
How is it that this data structure consumes well over 100 MB during runtime?
sizeof
reports only the static footprint on the stack, which the Dictionary
is just kind of a wrapper of the reference to its internal reference type implementation, and also the copy on write support. In other words, the key-value pairs and the hash table of your dictionary are allocated on the heap, which is not covered by sizeof
. This applies to all other Swift collection types.
In your case, you are creating three Trie
- and indirectly three dictionaries - every iteration of the 300000. I wouldn't be surprised if the 96-byte allocations mentioned by @Macmade is the minimum overhead of a dictionary (e.g. its hash bucket).
There might also be cost related to growing storage. So you may try to see if setting a minimumCapacity
on the dictionary would help. On the other hand, if you do not need a divergent path generated per iteration, you may consider an indirect enum as an alternative, e.g.
public enum Trie<Element> {
indirect case Next(Element, Trie<Element>)
case End
}
which should use less memory.