Search code examples
c#linqfor-loopnested-loops

How to make nested for-loop more efficient?


I have the following code, which I would like to speed up:

foreach (var workflow in closedWorkflows)
{
    var w = relatedWorkflows.FirstOrDefault(r => r.Fingerprint != workflow.Fingerprint && r.TradeDate == workflow.TradeDate);
    if (w != null)
    {
        relatedWorkflows.Add(workflow);
    }
}

relatedWorkflows and closedWorkflows are list of same type of objects. I thought about creating lookup or dictionary with an anonymous class or tuple on Fingerprint and TradeDate, but one check is for equality and other one is for inequality. Would it be a good approach to create lookups on TradeDate, and then create lookups on Fingerprint for each lookup list of TradeDate ?

Updated:

A BIG thanks to @Dmitry Bychenko.

In my test the difference is 41486 ms vs 26ms !!!


Solution

  • The main problem with performance is with

    var w = relatedWorkflows.FirstOrDefault(...)
    

    we scan relatedWorkflows again and again, that's why we have O(closedWorkflows.Count * relatedWorkflows.Count) time complexity. We can get rid of relatedWorkflows.Count with the help of hash-based collections (dictionary and hashset), e.g.

    var dict = workflow
      .GroupBy(item => item.TradeDate, item => item.FingerPrint)
      .ToDictionary(group => group.Key, group => group.ToHashSet());
    
    foreach (var workflow in closedWorkflows) {
      // If we have a date, but NOT print we add it into relatedWorkflows
      if (dict.TryGetValue(workflow.TradeDate, out var prints) && 
          prints.Add(workflow.Fingerprint)) {
        relatedWorkflows.Add(workflow);
      }
    }