I'm storing some data in a vector for various filepaths scanned, e.g. std::vector<CustomData> mDataVector. A search routine yields multiple indices in this vector where the CustomData matched some criterion, i.e. it spits out another vector of indices (e.g. 0, 8, 1 for three items at indices 0, 8 and 1 in that vector that matched the search criterion). This part cannot be changed, since this vector interfaces with an old API I can't change right now.
I also use a std::map<size_t, std::string> mIndexToFilepath to quickly retrieve the original filepath for an index returned by the search API:
// when inserting new data..
std::string filepath = "C:\\...";
CustomData data = getCustomDataFromFilepath(filepath);
mDataVector.push_back(data);
mIndexToFilepath.emplace(mDataVector.size() - 1, filepath);
And then when searching:
std::vector<size_t> matchingIndices = searchVector(mDataVector); // I can't change this API
for (auto i : matchingIndices) {
    // here I need the old filepath
    std::string original_filepath = mIndexToFilepath[i];
    // .. do something with original_filepath ..
}
Question: since these are consecutive indices, I thought of using a std::map instead of a std::unordered_map. Will this somehow yield better performance during the mIndexToFilepath[i] lookup? (Not sure actually, I guess it would be an O(log N) search vs. a constant-time search when using std::unordered_map?) Should I switch to std::unordered_map here?
Choosing between std::map and std::unordered_map comes down to whether you care more about ordering or about lookup speed.
If you need ordering, then you want std::map, at the cost of slower (logarithmic) lookup.
If you need fast lookup, then you want std::unordered_map, which gives average constant-time lookup at the cost of no guaranteed ordering when iterating.
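For your current usage, switching would just be a type change, since emplace and operator[] exist on both containers. A minimal sketch, keeping the member name from your question:

std::unordered_map<size_t, std::string> mIndexToFilepath; // average O(1) lookup by key
// insertion and lookup stay exactly as in your code:
mIndexToFilepath.emplace(mDataVector.size() - 1, filepath);
std::string original_filepath = mIndexToFilepath[i];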
However, in your case, since the elements have consecutive keys, why not just store the old file paths in a std::vector<std::string> whose elements align with your std::vector<CustomData>? That way you keep the same ordering, and the lookup is also constant time (and likely faster than a hash map, since you're just jumping to an index rather than hashing the key).
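Roughly something like this (mFilepaths is a new member name I'm introducing here; the other names come from your question):

std::vector<CustomData> mDataVector;
std::vector<std::string> mFilepaths; // mFilepaths[i] is the filepath for mDataVector[i]

// when inserting new data
std::string filepath = "C:\\...";
CustomData data = getCustomDataFromFilepath(filepath);
mDataVector.push_back(data);
mFilepaths.push_back(filepath); // both vectors stay the same size and aligned

// when searching
std::vector<size_t> matchingIndices = searchVector(mDataVector);
for (auto i : matchingIndices) {
    const std::string& original_filepath = mFilepaths[i]; // plain index access, no hashing
    // .. do something with original_filepath ..
}

The one thing to watch out for is that this only stays correct if you always push to both vectors together and never erase from the middle of mDataVector alone; otherwise the indices drift out of sync.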