Search code examples
c++unordered-mapset-intersection

Intersection of two std::unordered_map


I have two std::unordered_map

std::unordered_map<int, int> mp1;
std::unordered_map<int, int> mp2;

I need to find the intersection of key-value pairs and store it in another map of the form.

std::unordered_map<int, int> mp;

How can i do this??


Solution

  • You could use std::set_intersection to fill a new container containing the key, value pairs that exists in both maps. set_intersection needs the ranges to be sorted (which is exactly what you won't get from an unordered_map) so either, replace the unordered_maps with map or create temporary maps (or temporary std::set<std::pair<int, int>>s) before using set_intersection.

    I would recommend replacing your original unordered_maps with ordered maps for efficiency if you need intersections often:

    #include <algorithm>
    #include <iostream>
    #include <iterator>
    #include <map>
    #include <unordered_map>
    #include <vector>
    
    int main() {
        std::map<int, int> mp1 {{1,0}, {2,0}, {3,0}};
        std::map<int, int> mp2 {{0,0}, {2,0}, {3,0}};
    
        // this can be unordered unless you plan to use it in an intersection later:
        std::unordered_map<int, int> mp;
    
        std::set_intersection(
            mp1.begin(), mp1.end(),
            mp2.begin(), mp2.end(), 
            std::inserter(mp, mp.begin())
        );
    
        for(auto[key, val] : mp) {
            std::cout << key << ',' << val << '\n';
        }
    }
    

    Possible output:

    3,0
    2,0
    

    If you want to stay with unordered_maps and to not have to create temporary sets or maps, you can just replace set_intersection above with a manual filler:

        const auto& [min, max] = std::minmax(mp1, mp2,
                                             [](auto& a, auto& b) {
                                                 return a.size() < b.size();
                                             });
        for(auto& [key, value] : min) {               // iterate over the smallest map
            auto fi = max.find(key);                  // find in the bigger map
            if(fi != max.end() && fi->second == value)
                mp.emplace(key, value);               // add the pair if you got a hit
        }
    

    The reason for iterating over the smallest map is to keep the number of find operations down to a minimum. Consider a case where one map contains 1 element and the other 1000000 elements. You then want 1 lookup and not 1000000.

    A more generic solution could be making a function template out of it:

    template<
        class Key,
        class T,
        class Hash = std::hash<Key>,
        class KeyEqual = std::equal_to<Key>,
        class Allocator = std::allocator< std::pair<const Key, T> >
    >
    auto unordered_map_intersection(
        const std::unordered_map<Key,T,Hash,KeyEqual,Allocator>& mp1,
        const std::unordered_map<Key,T,Hash,KeyEqual,Allocator>& mp2)
    {
        std::unordered_map<Key,T,Hash,KeyEqual,Allocator> mp;
    
        const auto& [min, max] = std::minmax(mp1, mp2,
                                             [](auto& a, auto& b) {
                                                 return a.size() < b.size();
                                             });
        for(auto& [key, value] : min) {               // iterate over the smallest map
            auto fi = max.find(key);                  // find in the bigger map
            if(fi != max.end() && fi->second == value)
                mp.emplace(key, value);               // add the pair if you got a hit
        }
        return mp;
    }