Search code examples
c#circular-dependency

In C#, how to find chain of circular dependency?


This error usually occurs when one deployment project contains the project outputs of a second deployment project, and the second project contains the outputs of the first project.

I have a method that check circular dependency. In input, we have a dictionary that contains, for example, <"A", < "B", "C" >> and <"B", < "A", "D" >>, this means that A depends on B and C and we have circular dependency with A->B.

But usually we have a more complex situation, with a chain of dependency. How to modify this method to find a chain of dependency? For example, I want to have a variable that contains chain A->B->A, rather than class A has a conflict with class B.

private void FindDependency(IDictionary<string, IEnumerable<string>> serviceDependence)

Solution

  • A simple way to find cycles in a graph is to use a recursive depth-first graph-coloring algorithm in which nodes are marked as "visiting" or "visited". If, when visiting a node, you find it is already in the "visiting" state, you have a cycle. Nodes marked as "visited" can be skipped. For instance:

    public class DependencyExtensions
    {
        enum VisitState
        {
            NotVisited,
            Visiting,
            Visited
        };
    
        public static TValue ValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, TValue defaultValue)
        {
            TValue value;
            if (dictionary.TryGetValue(key, out value))
                return value;
            return defaultValue;
        }
    
        static void DepthFirstSearch<T>(T node, Func<T, IEnumerable<T>> lookup, List<T> parents, Dictionary<T, VisitState> visited, List<List<T>> cycles)
        {
            var state = visited.ValueOrDefault(node, VisitState.NotVisited);
            if (state == VisitState.Visited)
                return;
            else if (state == VisitState.Visiting)
            {
                // Do not report nodes not included in the cycle.
                cycles.Add(parents.Concat(new[] { node }).SkipWhile(parent => !EqualityComparer<T>.Default.Equals(parent, node)).ToList());
            }
            else
            {
                visited[node] = VisitState.Visiting;
                parents.Add(node);
                foreach (var child in lookup(node))
                    DepthFirstSearch(child, lookup, parents, visited, cycles);
                parents.RemoveAt(parents.Count - 1);
                visited[node] = VisitState.Visited;
            }
        }
    
        public static List<List<T>> FindCycles<T>(this IEnumerable<T> nodes, Func<T, IEnumerable<T>> edges)
        {
            var cycles = new List<List<T>>();
            var visited = new Dictionary<T, VisitState>();
            foreach (var node in nodes)
                DepthFirstSearch(node, edges, new List<T>(), visited, cycles);
            return cycles;
        }
    
        public static List<List<T>> FindCycles<T, TValueList>(this IDictionary<T, TValueList> listDictionary)
            where TValueList : class, IEnumerable<T>
        {
            return listDictionary.Keys.FindCycles(key => listDictionary.ValueOrDefault(key, null) ?? Enumerable.Empty<T>());
        }
    }
    

    Then, you could use it like:

            var serviceDependence = new Dictionary<string, List<string>>
            {
                { "A", new List<string> { "A" }},
                { "B", new List<string> { "C", "D" }},
                { "D", new List<string> { "E" }},
                { "E", new List<string> { "F", "Q" }},
                { "F", new List<string> { "D" }},
            };
            var cycles = serviceDependence.FindCycles();
            Debug.WriteLine(JsonConvert.SerializeObject(cycles, Formatting.Indented));
            foreach (var cycle in cycles)
            {
                serviceDependence[cycle[cycle.Count - 2]].Remove(cycle[cycle.Count - 1]);
            }
            Debug.Assert(serviceDependence.FindCycles().Count == 0);
    

    Update

    Your question has been updated to request the "most efficient algorithm" for finding cyclic dependencies. The code in the original answer is recursive, so there's a chance of a StackOverflowException for dependency chains thousands of levels deep. Here's a non-recursive version with an explicit stack variable:

    public static class DependencyExtensions
    {
        enum VisitState
        {
            NotVisited,
            Visiting,
            Visited
        };
    
        public static TValue ValueOrDefault<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, TKey key, TValue defaultValue)
        {
            TValue value;
            if (dictionary.TryGetValue(key, out value))
                return value;
            return defaultValue;
        }
    
        private static void TryPush<T>(T node, Func<T, IEnumerable<T>> lookup, Stack<KeyValuePair<T, IEnumerator<T>>> stack, Dictionary<T, VisitState> visited, List<List<T>> cycles)
        {
            var state = visited.ValueOrDefault(node, VisitState.NotVisited);
            if (state == VisitState.Visited)
                return;
            else if (state == VisitState.Visiting)
            {
                Debug.Assert(stack.Count > 0);
                var list = stack.Select(pair => pair.Key).TakeWhile(parent => !EqualityComparer<T>.Default.Equals(parent, node)).ToList();
                list.Add(node);
                list.Reverse();
                list.Add(node);
                cycles.Add(list);
            }
            else
            {
                visited[node] = VisitState.Visiting;
                stack.Push(new KeyValuePair<T, IEnumerator<T>>(node, lookup(node).GetEnumerator()));
            }
        }
    
        static List<List<T>> FindCycles<T>(T root, Func<T, IEnumerable<T>> lookup, Dictionary<T, VisitState> visited)
        {
            var stack = new Stack<KeyValuePair<T, IEnumerator<T>>>();
            var cycles = new List<List<T>>();
    
            TryPush(root, lookup, stack, visited, cycles);
            while (stack.Count > 0)
            {
                var pair = stack.Peek();
                if (!pair.Value.MoveNext())
                {
                    stack.Pop();                    
                    visited[pair.Key] = VisitState.Visited;
                    pair.Value.Dispose();
                }
                else
                {
                    TryPush(pair.Value.Current, lookup, stack, visited, cycles);
                }
            }
            return cycles;
        }
    
        public static List<List<T>> FindCycles<T>(this IEnumerable<T> nodes, Func<T, IEnumerable<T>> edges)
        {
            var cycles = new List<List<T>>();
            var visited = new Dictionary<T, VisitState>();
            foreach (var node in nodes)
                cycles.AddRange(FindCycles(node, edges, visited));
            return cycles;
        }
    
        public static List<List<T>> FindCycles<T, TValueList>(this IDictionary<T, TValueList> listDictionary)
            where TValueList : class, IEnumerable<T>
        {
            return listDictionary.Keys.FindCycles(key => listDictionary.ValueOrDefault(key, null) ?? Enumerable.Empty<T>());
        }
    }
    

    This should be reasonably efficient at N*log(N) + E where N is the number of nodes and E is the number of edges. The Log(N) comes from building the visited hash table and could be eliminated by making each node remember its VisitState. This seems reasonably performant; in the following test harness, the time to find 17897 cycles of average length 4393 in 10000 nodes with 125603 total dependencies is around 10.2 seconds:

    public class TestClass
    {
        public static void TestBig()
        {
            var elapsed = TestBig(10000);
            Debug.WriteLine(elapsed.ToString());
        }
    
        static string GetName(int i)
        {
            return "ServiceDependence" + i.ToString();
        }
    
        public static TimeSpan TestBig(int count)
        {
            var serviceDependence = new Dictionary<string, List<string>>();
            for (int iItem = 0; iItem < count; iItem++)
            {
                var name = GetName(iItem);
                // Add several forward references.
                for (int iRef = iItem - 1; iRef > 0; iRef = iRef / 2)
                    serviceDependence.Add(name, GetName(iRef));
                // Add some backwards references.
                if (iItem > 0 && (iItem % 5 == 0))
                    serviceDependence.Add(name, GetName(iItem + 5));
            }
    
            // Add one backwards reference that will create some extremely long cycles.
            serviceDependence.Add(GetName(1), GetName(count - 1));
    
            List<List<string>> cycles;
    
            var stopwatch = new Stopwatch();
            stopwatch.Start();
            try
            {
                cycles = serviceDependence.FindCycles();
            }
            finally
            {
                stopwatch.Stop();
            }
    
            var elapsed = stopwatch.Elapsed;
    
            var averageLength = cycles.Average(l => (double)l.Count);
            var total = serviceDependence.Values.Sum(l => l.Count);
    
            foreach (var cycle in cycles)
            {
                serviceDependence[cycle[cycle.Count - 2]].Remove(cycle[cycle.Count - 1]);
            }
            Debug.Assert(serviceDependence.FindCycles().Count == 0);
    
            Console.WriteLine(string.Format("Time to find {0} cycles of average length {1} in {2} nodes with {3} total dependencies: {4}", cycles.Count, averageLength, count, total, elapsed));
            Console.ReadLine();
            System.Environment.Exit(0);
    
            return elapsed;
        }
    }