I have a huge STL container of std::string
(hundreds of thousands of entries). I am using a vector
at the moment, but I am happy to change. The contents are sorted alphabetically, and are formed of alphabetical lower case characters a-z plus ñ
.
I am trying to implement a function which receives a const std::string& prefix
and returns a pair of iterators, one pointing at the first element beginning with such prefix and the other one pointing at the last one. If no strings match the criteria, return any two identical iterators. I need efficiency in the lookup because the vector is huge, so I want to make use of it being ordered for a binary search.
I think std::lower_bound
is what I am looking for, but I am missing a function to compare std::string
s which can deal with the Spanish ñ
.
In order to look up words by prefix, the correct structure is a trie. You can take a public implementation (there are several on the internet) and extend it to fit your needs.