Search code examples
c++algorithmgraphunion-find

Detection of cycle in Graph using find and Union


    int main()
    {

    char line[100];
    int N = 5;
    vector<int>adj[N];
    FILE *in = fopen("test.txt", "r");

    for (int i = 1; i <= N; i++) // Accepting the graph from file
    {
        fgets(line, 100, in);

        char *pch = strtok(line, "\t \n");
        int u = atoi(pch);

        pch = strtok(NULL, "\t \n");
        while (pch != NULL)
        {
            int v = atoi(pch);
            adj[u-1].push_back(v);
            pch = strtok(NULL, "\t \n");
        }

    }
        for( int i = 0; i < 5; i++ )  // printing the graph
        {
           for( int p = 0 ; p < adj[i].size(); p++ )
           {
                cout<< i+1 << " , "<< adj[i][p]<<endl;
           }
        }

        if (isCycle(adj))
             cout << endl << "graph contains cycle" ;
        else
             cout << endl << "graph  does not contain cycle" ;

        return 0;
    }

    int isCycle( vector<int> adj[] )
    {
        // Allocate memory for creating V subsets
        int *parent = (int*) malloc( 5 * sizeof(int) );

       // Initialize all subsets as single element sets
        memset(parent, -1, sizeof(int) * 5);
        for(int i = 0; i < 5; i++)
        {
           for( int p = 0 ; p < adj[i].size(); p++ )
           {    
                int x = find(parent,i);
                int y = find(parent, adj[i][p]-1);  // I think problem is here

                if (x == y)
                return 1;

            Union(parent, x, y);
            }
        }
        return 0;
    }   

    // A utility function to find the subset of an element i
    int find(int parent[], int i)
    {    
        if (parent[i] == -1)
            return i;
        return find(parent, parent[i]);
    }

    // A utility function to do union of two subsets
    void Union(int parent[], int x, int y)
    {
        int xset = find(parent, x);
        int yset = find(parent, y);
        parent[xset] = yset;
    }

test.txt file contains following inputs :

1 2 3
2 1 4 5
3 1
4 2 
5 2

First column contains the vertices ( 1 - 5 )

1 2 3 

Above row ( first row ) means , Node 1 is connected to Node 2 and Node 3

2 1 4 5 

Above row ( 2nd row ) means , Node 2 is connected to Node 1 , Node 4 and Node 5

Now here problem is , take any input it always says : Graphs contains cycle.( Though graph does not contain cycle ) Now in above input graph does not contain cycle but saying graph contains cycle. Where am I wrong ?? Can anyone help me ??


Solution

  • The problem is with your input, but first some background:


    Background on using Union-Find to discover Cycles

    The Union-Find algorithm expects an undirected graph.

    It essentially works as follows:

    • Create a set of edges which are basically pairs of node ids
      • e.g. (1,2), (2,3)
    • For each edge:
      • find the "parent" of the left-side (Find Portion)
      • find the "parent" of the right-side (Find Portion)
      • if the parents are identical, you have a cycle
      • otherwise, the parent of the left-side now equals the parent of the right side (Union Portion)

    "Parent": Is an arbitrary designation between two undirected nodes. Arbitrarily we say that one is the parent of the other, and not vice versa.

    • At first, no nodes have a parent (which the sentinel value of -1 is used for.
    • Then, as you iterate over edges, you will assign these parents
      • if a parent doesn't exist, a node is its own parent (0 is the parent of 0, 1 is the parent of 1, etc.)
      • after computing the parents for both sides of an edge (e.g. 1 and 2 for edge (1, 2) we will at first see that their parents are not the same (1's parent is 1 and 2's parent is 2).
      • At this point, we union the parents to make them the same
        • the parent of 1 becomes 2 and the parent of 2 remains 2
        • think of the "Union" portion as "unioning two subsets of nodes under a common parent", so the subsets 1 and 2 become (1, 2), whose parent is 2.

    However, the way your algorithm is written, it assumes that if we first receive edge (1, 2), then we will not later receive edge (2, 1). Your input disagrees. Therefore you have cycles.

    If you remove these redundant edges and provide input like:

    1 2 3
    2 4 5
    3 
    4  
    5
    

    It will work(I C++-ified the heck out of your code here, by the way). However, otherwise it will correctly report a cycle

    Your challenge

    Therefore is to take into account that your input is different than what your algorithm expects. That is, you should probably not create edges if one already exists.

    What I would recommend: - since the graph is undirected, store edges with the smaller id always on the left. You can maintain a sorted list of edges, and do not insert duplicate edges (use a std::set) to represent your edge list).

    The resulting code looks something like this (using cin for input):

    using edge_t = std::pair<int, int>;
    using edge_list_t = std::set<edge_t>;
    using parent_child_map_t = std::map<int, int>;
    
    // find the parent for an id
    int find(const parent_child_map_t& idToParent, int id)
    {    
        auto iter = idToParent.find(id);
        if (iter == idToParent.end())
            return id;
        return find(idToParent, iter->second);
    }
    
    // lhsId and rhsId are two sides to an edge
    // this Union function will determine who is the "parent"
    // arbitrarily choosing the rhsId's parent as lhsId's parent
    void ComputeUnion(parent_child_map_t* idToParent, int lhsId, int rhsId)
    {
        if (!idToParent)
            return;
    
        int xsubset = find(*idToParent, lhsId);
        int ysubset = find(*idToParent, rhsId);
        (*idToParent)[xsubset] = ysubset;
    }
    
    bool HasCycle(const edge_list_t& edges )
    { 
        // determine parents
        parent_child_map_t idToParent;
    
        for (auto&& nextEdge : edges)
        {
            int x = find(idToParent, nextEdge.first);
            int y = find(idToParent, nextEdge.second);
            if (x == y)
                return true;
            ComputeUnion(&idToParent, x, y);
        }
    
        return false;
    } 
    
    int main()
    {
        edge_list_t edges;
    
        std::string nextline;
        while(std::getline(std::cin, nextline))
        {
            std::istringstream nextLineStream(nextline);
            int id;
            nextLineStream >> id;
            int nextNeighbor;
            while(nextLineStream >> nextNeighbor)
            {
                int lhs = std::min(id, nextNeighbor);
                int rhs = std::max(id, nextNeighbor);
                edges.insert(std::make_pair(lhs, rhs));
            }
        }
    
        if (HasCycle(edges))
             std::cout << "Graph contains cycle\n";
        else
             std::cout << "Graph does not contain cycle\n";
    
         return 0;
    }
    

    And now it no longer reports a cycle in your input!

    But, if we provide it input like so (Notice the additional edge of (4,1)):

    1 2 3
    1 2 3
    2 1 4 5
    3 1
    4 2 1
    5 2
    

    Then it correctly reports a cycle!