Search code examples
c#algorithmcircular-referencecycle-detection

Most efficient way of finding circular references in list


Given the following list of redirects

[
    {
        "old": "a",
        "target": "b"
    },
    {
        "old": "b",
        "target": "c"
    },
    {
        "old": "c",
        "target": "d"
    },
    {
        "old": "d",
        "target": "a"
    },
    {
        "old": "o",
        "target": "n"
    },
    {
        "old": "n",
        "target": "b"
    },
    {
        "old": "j",
        "target": "x"
    },
    {
        "old": "whatever",
        "target": "something"
    }
]

Here we can see that the first item "a" should redirect to "b". If we follow the list we can see the following pattern:

a -> b
b -> c
c -> d
d -> a

So we would end up with a circular reference since "a" would end up pointing to "d" and "d" points to "a".

What would be the most efficient way of finding the circular references?

I've come up with the following algorithm in C#

var items = JsonConvert.DeserializeObject<IEnumerable<Item>>(json)
    .GroupBy(x => x.Old)
    .Select(x => x.First())
    .ToDictionary(x => x.Old, x => x.Target);
var circulars = new Dictionary<string, string>();
foreach (var item in items)
{
    var target = item.Value;
    while (items.ContainsKey(target))
    {
        target = items[target];

        if (target.Equals(item.Key))
        {
            circulars.Add(target, item.Value);
            break;
        }
    }
}

This will give me a list containing 4 items looking like this:

[
    {
        "old": "a",
        "target": "b"
    },
    {
        "old": "b",
        "target": "c"
    },
    {
        "old": "c",
        "target": "d"
    },
    {
        "old": "d",
        "target": "a"
    }
]

But im only interested in telling the user something like

"Hey you can't do that, It will be a circular reference because of "a" points to "b" which points to "c" which points to "d" which points to "a"

So, do you guys have any suggestions? Im sure that some other(better) algorithms exists for doing this... :)


Solution

  • While the generic graph-cycle-finding algorithms will work, your case is a bit special due to the "Old is unique, target is not" constraint. It effectively means, that each node can only have a single successor and thus it can at most be part of one cycle. Also, when DFS-Traversing the nodes, there won't be any fork, so an iterative DFS implementation becomes very easy.

    Given an arbitrary starting node, this function can find a cycle that is reachable from the starting node:

    /// <summary>
    /// Returns a node that is part of a cycle or null if no cycle is found
    /// </summary>
    static string FindCycleHelper(string start, Dictionary<string, string> successors, HashSet<string> stackVisited)
    {
        string current = start;
        while (current != null)
        {
            if (stackVisited.Contains(current))
            {
                // this node is part of a cycle
                return current;
            }
    
            stackVisited.Add(current);
    
            successors.TryGetValue(current, out current);
        }
    
        return null;
    }
    

    In order to maintain efficiency, it can be extended to early-return when an already checked node is reached (using previouslyVisited):

    /// <summary>
    /// Returns a node that is part of a cycle or null if no cycle is found
    /// </summary>
    static string FindCycleHelper(string start, Dictionary<string, string> successors, HashSet<string> stackVisited, HashSet<string> previouslyVisited)
    {
        string current = start;
        while (current != null)
        {
            if (previouslyVisited.Contains(current))
            {
                return null;
            }
            if (stackVisited.Contains(current))
            {
                // this node is part of a cycle
                return current;
            }
    
            stackVisited.Add(current);
    
            successors.TryGetValue(current, out current);
        }
    
        return null;
    }
    

    The following function is used to maintain consistency of the visited sets

    static string FindCycle(string start, Dictionary<string, string> successors, HashSet<string> globalVisited)
    {
        HashSet<string> stackVisited = new HashSet<string>();
    
        var result = FindCycleHelper(start, successors, stackVisited, globalVisited);
    
        // update collection of previously processed nodes
        globalVisited.UnionWith(stackVisited);
    
        return result;
    }
    

    It is called for each old node in order to check for cycles. When a cycle starting node is detected, the cycle information can be separately created:

    // static testdata - can be obtained from JSON for real code
    IEnumerable<Item> items = new Item[]
    {
        new Item{ Old = "a", Target = "b" },
        new Item{ Old = "b", Target = "c" },
        new Item{ Old = "c", Target = "d" },
        new Item{ Old = "d", Target = "a" },
        new Item{ Old = "j", Target = "x" },
        new Item{ Old = "w", Target = "s" },
    };
    var successors = items.ToDictionary(x => x.Old, x => x.Target);
    
    var visited = new HashSet<string>();
    
    List<List<string>> cycles = new List<List<string>>();
    
    foreach (var item in items)
    {
        string cycleStart = FindCycle(item.Old, successors, visited);
    
        if (cycleStart != null)
        {
            // cycle found, get detail information about involved nodes
            List<string> cycle = GetCycleMembers(cycleStart, successors);
            cycles.Add(cycle);
        }
    }
    

    Output your found cycles any way you want. For example

    foreach (var cycle in cycles)
    {
        Console.WriteLine("Cycle:");
        Console.WriteLine(string.Join(" # ", cycle));
        Console.WriteLine();
    }
    

    The implementation of GetCycleMembers is pretty easy - it depends on a correct starting node:

    /// <summary>
    /// Returns the list of nodes that are involved in a cycle
    /// </summary>
    /// <param name="cycleStart">This is required to belong to a cycle, otherwise an exception will be thrown</param>
    /// <param name="successors"></param>
    /// <returns></returns>
    private static List<string> GetCycleMembers(string cycleStart, Dictionary<string, string> successors)
    {
        var visited = new HashSet<string>();
        var members = new List<string>();
        var current = cycleStart;
        while (!visited.Contains(current))
        {
            members.Add(current);
            visited.Add(current);
            current = successors[current];
        }
        return members;
    }