While refactoring some old code a cumbersome multilevel-map developed in-house was replaced by an std::undordered_multiset
.
The multilevel-map was something like [string_key1,string_val]
. A complex algorithm was applied to derive the keys from string_val
and resulted in duplicate string_val
being stored in the map but with different keys.
Eventually at some point of the application the multilevel-map was iterated to get the string_val
and its number of occurrences.
It replaced was an std::unordered_multilevelset
and string_val
are just inserted to it. It seems much simpler than having an std::map<std::string,int>
and checking-retrieving-updating the counter for every insertion.
What I want to do retrieve the number of occurrences of its inserted element, but I do not have the keys beforehands. So I iterate over the buckets but my program crashes upon creation of the string.
// hash map declaration
std::unordered_multiset<std::string> clevel;
// get element and occurences
for (size_t cbucket = clevel->bucket_count() - 1; cbucket != 0; --cbucket)
{
std::string cmsg(*clevel->begin(cbucket));
cmsg += t_str("times=") + \
std::to_string(clevel->bucket_size(cbucket));
}
I do not understand what is going on here, tried to debug it but I am somehow stack( overflown ?) :) . Program crashes in std::string cmsg(*it);
You should consider how multiset
actually works as a hashtable. For example reading this introduction you should notice that hash maps actually preallocate their internal buckets , and the number of buckets is optimized.
Therefore if you insert element "hello" , you will probably get a number of buckets already created, but only the one corresponding to hash("hello")
will actually have an element that you may dereference. The rest will be let's say invalid.
Dereferencing the iterator to the begin of every bucket results in SEGV which is your case here.
To remedy this situation you should check every time that begin is not past the end.
for (size_t cbucket = clevel->bucket_count() - 1; cbucket != 0; --cbucket)
{
auto it = clevel->begin(cbucket);
if (it != clevel->end(cbucket))
{
std::string cmsg(*it);
cmsg += t_str("times=") + \
std::to_string(clevel->bucket_size(cbucket));
}
}