Search code examples
c#graph-theory

Find all nodes that are a leaf in a graph


I have a function that determines if a node is a leaf.

I define a leaf as being a node that is not part of a cycle in the graph.

An example:

enter image description here

Leaf nodes are marked with red arrows, they do not belong to a cycle.

Before I find all the cycles in the graph, I first want to eliminate the need to check these leaf nodes to optimise my algorithm.

My current method does not traverse to find if a node is a leaf node as I am not sure how to go about it, I have a basic check that looks like this:

    private static bool IsLeafNode(Node node)
    {
        int TotalLeafs(Node node)
        {
            int j = 0;
            for (int i = 0; i < node.Nodes.Count; i++){
                Node n = node.Nodes[i];
                j += n.Nodes.Count == 1 ? 1 : 0; //if leaf add 1
            }
            return j;
        }

//has 1 connection OR all its connections lead to nodes that have 1 connection
        return node.ConnectionCount == 1 || TotalLeafs(node) == node.Nodes.Count;
    } 

The problem here is it does not consider the two leaf nodes that have 2 connections (but it is obvious still that they are leaf nodes).

How might I go about eliminating all the nodes that are leaf nodes?


Solution

  • You can do a backwards search by starting at the known leaves, and recursively checking their neighbor if they are leaves as well. But each time when you check the number of connections, you have to "remove" the nodes which you have already considered to be leaves. If the resulting number of connection is zero or one, you found a leaf. The algorithm can look like this:

    • Create a new empty Set of leaves named foundLeaves.
    • Find all the "real" leaves in your graph, which have only one connection. Put them in a Set of nodes named leavesToCheck.
    • While the leavesToCheck set is not empty:
      • Pick and remove one node from leavesToCheck.
      • Check the number of connections this node have, minus the nodes which are already in the foundLeaves set:
        • When the number of connections is zero or one: You found a leaf. Add the found leaf into foundLeaves and add the neighbor of that node (if there is one) to the leavesToCheck set to check in the future.
        • When the number of connections is greater or equal two: Do nothing.

    Eventually, the leavesToCheck set becomes empty. At that point the foundLeaves set contains all the nodes, which are considered leaves by your definition and have no cycles.

    Of course, you can use other data structures like lists or queues for the foundLeaves and leavesToCheck collections. If your overall goal/algorithm supports it that you can remove the leaf nodes from the graph, you can do that. That way you don't need the foundLeaves list.