I'm storing some data in a vector for various filepaths scanned, e.g. std::vector<CustomData> mDataVector. A search routine yields multiple indices in this vector where the CustomData matched some criterion, i.e. it spits out another vector of indices (e.g. 0, 8, 1 for three items at indices 0, 8 and 1 in that vector that matched the search criterion). This part cannot be changed, since this vector interfaces with an old API I can't change right now.
I also use a std::map<size_t, std::string> mIndexToFilepath to quickly retrieve the original filepath for an index returned by the search API:
// when inserting new data..
std::string filepath = "C:\\...";
CustomData data = getCustomDataFromFilepath(filepath);
mDataVector.push_back(data);
mIndexToFilepath.emplace(mDataVector.size() - 1, filepath);
And then when searching:
std::vector<size_t> matchingIndices = searchVector(mDataVector); // I can't change this API
for (auto i : matchingIndices) {
    // here I need the old filepath
    std::string original_filepath = mIndexToFilepath[i];
    // .. do something with original_filepath ..
}
Question: since these are consecutive indices, I thought of using a std::map instead of a std::unordered_map. Will this somehow yield better performance during the mIndexToFilepath[i] lookup? (Not sure actually, I guess it would be an O(log N) search vs. a constant-time search when using std::unordered_map?) Should I switch to std::unordered_map here?
Choosing between std::map and std::unordered_map comes down to whether you care more about ordering or about lookup speed.
If you need ordering, then you want std::map, at the cost of slower (logarithmic) lookup.
If you need fast lookup, then you want std::unordered_map, which gives average constant-time lookup at the cost of no guaranteed ordering when iterating.
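For your current usage, switching would just be a type change, since emplace and operator[] exist on both containers. A minimal sketch, keeping the member name from your question:

std::unordered_map<size_t, std::string> mIndexToFilepath; // average O(1) lookup by key
// insertion and lookup stay exactly as in your code:
mIndexToFilepath.emplace(mDataVector.size() - 1, filepath);
std::string original_filepath = mIndexToFilepath[i];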
However, in your case, since the elements have consecutive keys, why not just store the old file paths in a std::vector<std::string> whose elements align with your std::vector<CustomData>? That way you keep the same ordering, and the lookup is also constant time (and likely faster than a hash map, since you're just jumping to an index rather than hashing the key).
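Roughly something like this (mFilepaths is a new member name I'm introducing here; the other names come from your question):

std::vector<CustomData> mDataVector;
std::vector<std::string> mFilepaths; // mFilepaths[i] is the filepath for mDataVector[i]

// when inserting new data
std::string filepath = "C:\\...";
CustomData data = getCustomDataFromFilepath(filepath);
mDataVector.push_back(data);
mFilepaths.push_back(filepath); // both vectors stay the same size and aligned

// when searching
std::vector<size_t> matchingIndices = searchVector(mDataVector);
for (auto i : matchingIndices) {
    const std::string& original_filepath = mFilepaths[i]; // plain index access, no hashing
    // .. do something with original_filepath ..
}

The one thing to watch out for is that this only stays correct if you always push to both vectors together and never erase from the middle of mDataVector alone; otherwise the indices drift out of sync.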