Tags: c++, depth-first-search, unordered-map, topological-sort

Error in topological sort when representing the graph as an unordered_map<string, vector<string>>


I am implementing a topological sort in C++ where the graph is represented as an unordered_map<string, vector<string>>, and I get an error I cannot explain. It happens only when the 'current node' being visited does not exist as a key in the unordered_map (i.e., it has no outgoing edges). Instead of returning a 'correct' order, the loop in topSort stops early and the function returns only a small subset of the order.

The code returns: ML, AML, DL

Instead, a possible correct solution could be: LA, MT, MA, PT, ML, AML, DL

Can anyone explain why this happens?

The following is a small code snippet where the problem occurs:

// 0 -> white (node has not been visited)
// 1 -> grey (node is currently being visited)
// 2 -> black (node is completely explored)
bool topSortVisit(unordered_map<string, vector<string>>& graph,
        unordered_map<string, int>& visited, string node, vector<string>& result){

    if(visited[node] == 1) return false;
    if(visited[node] == 2) return true;

    // Mark current node as being visited.
    visited[node] = 1;
    // node might not have outgoing edges and therefore not in the
    // unordered_map (graph) as a key.
    for(auto neighbor : graph[node]){
        if(!topSortVisit(graph, visited, neighbor, result)) return false;
    }

    result.push_back(node);
    visited[node] = 2;
    return true;
}

vector<string> topSort(unordered_map<string, vector<string>>& graph){

    unordered_map<string, int> visited;
    vector<string> result;

    // Should visit all nodes with outgoing edges in the graph.
    for(auto elem : graph){
        string node = elem.first;
        bool acyclic = topSortVisit(graph, visited, node, result);
        if(!acyclic){
            cout << "cycle detected\n";
            return vector<string>{};
        }

    }

    reverse(result.begin(), result.end());
    return result;
}

And here is the code to reproduce everything:

#include<iostream>
#include<string>
#include<vector>
#include<unordered_map>
#include<algorithm>

using namespace std;

bool topSortVisit(unordered_map<string, vector<string>>& graph,
        unordered_map<string, int>& visited, string node, vector<string>& result){

    if(visited[node] == 1) return false;
    if(visited[node] == 2) return true;

    visited[node] = 1;
    for(auto neighbor : graph[node]){
        if(!topSortVisit(graph, visited, neighbor, result)) return false;
    }

    result.push_back(node);
    visited[node] = 2;
    return true;
}

vector<string> topSort(unordered_map<string, vector<string>>& graph){

    unordered_map<string, int> visited;
    vector<string> result;

    for(auto elem : graph){
        string node = elem.first;
        bool acyclic = topSortVisit(graph, visited, node, result);
        if(!acyclic){
            cout << "cycle detected\n";
            return vector<string>{};
        }

    }

    return result;
}


unordered_map<string, vector<string>> makeGraph(vector<pair<string, string>> courses){
    unordered_map<string, vector<string>> graph;

    for(auto p : courses){
        graph[p.first].push_back(p.second);
    }
    return graph;
}

int main(){

    vector<pair<string, string>> pairs;
    pairs.push_back(make_pair("LA", "ML"));
    pairs.push_back(make_pair("MT", "ML"));
    pairs.push_back(make_pair("MA", "PT"));
    pairs.push_back(make_pair("PT", "ML"));
    pairs.push_back(make_pair("ML", "DL"));
    pairs.push_back(make_pair("ML", "AML"));

    auto graph = makeGraph(pairs);
    vector<string> result = topSort(graph); // ML, AML, DL
    // A possible correct solution could be: LA, MT, MA, PT, ML, AML, DL


    for(string s : result){
        cout << s << " ";
    }
    cout << "\n";
}

Solution

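  • graph[node] inserts a default-constructed entry when node is not already a key, and inserting into an unordered_map invalidates all iterators into the map if the insertion triggers a rehash. topSortVisit therefore modifies graph while topSort is still iterating over it with for(auto elem : graph), which is undefined behavior; in your case it ends the loop early. (Incidentally, auto elem copies each key together with its vector<string>; use const auto& elem instead.) Pass your graph as const& to rule this out: operator[] is not available on a const map, so the compiler will force you to use find or at instead. Note that at throws std::out_of_range for a missing key, so nodes without outgoing edges need find (or a prior check with count) rather than at.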
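
As a minimal sketch of that suggestion (not the only possible fix), the two functions from the question could be rewritten as below. It assumes the same colouring scheme as the question, and the Graph alias is a name introduced here only for brevity. The graph is taken by const reference and looked up with find, so a node without outgoing edges is simply treated as having no neighbours and the map is never modified during the traversal.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Alias introduced for this sketch only; it is not part of the question's code.
using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// 0 -> white (not visited), 1 -> grey (being visited), 2 -> black (fully explored)
bool topSortVisit(const Graph& graph,
                  std::unordered_map<std::string, int>& visited,
                  const std::string& node,
                  std::vector<std::string>& result) {
    // Inserting into `visited` is safe: nothing iterates over that map.
    if (visited[node] == 1) return false;  // back edge, so there is a cycle
    if (visited[node] == 2) return true;   // already fully explored

    visited[node] = 1;

    // find() never inserts, so the const graph stays untouched.
    auto it = graph.find(node);
    if (it != graph.end()) {
        for (const std::string& neighbor : it->second) {
            if (!topSortVisit(graph, visited, neighbor, result)) return false;
        }
    }

    result.push_back(node);
    visited[node] = 2;
    return true;
}

// Returns a topological order, or an empty vector if a cycle is found.
std::vector<std::string> topSort(const Graph& graph) {
    std::unordered_map<std::string, int> visited;
    std::vector<std::string> result;

    for (const auto& entry : graph) {  // reference: no copies of the vectors
        if (!topSortVisit(graph, visited, entry.first, result)) return {};
    }

    // Nodes were appended in post-order; reversing puts prerequisites first.
    std::reverse(result.begin(), result.end());
    return result;
}
```

With the graph from the question this should visit all seven nodes. The exact order depends on the iteration order of the unordered_map, so any order in which every course comes after its prerequisites is valid. Another option along the same lines is to have makeGraph insert every node as a key, including those that only appear as the second element of a pair.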