c# linq

Remove most of a list's items when they exist in another list (C#)


I have 2 lists:

pending = contains ID field and several others

posted = contains only ID field

I want to remove all of the records from pending if the ID field is in the posted list, which works with the following:

pending = pending.Where(p => !posted.Any(pp => pp.ID == p.ID)).ToList();

However, this does not perform well when, for example, pending has 600k records and 599,999 of them are in posted.

Is there a more efficient way to handle this?


Solution

  • There is a more efficient way.

    Right now, for each element in pending, you scan posted until a match is found. With n pending and m posted records that is up to n × m comparisons, which is why it gets slow at 600k records.

    Instead, copy the IDs from posted into a HashSet, which offers average O(1) lookups on the key you select. The traversal of pending then needs only a single Contains() call per element, so the total work is roughly n + m.

    // Build the ID lookup once, then filter pending with one hash lookup per element.
    var postedIds = new HashSet<int>(posted.Select(i => i.ID));
    pending = pending.Where(p => !postedIds.Contains(p.ID)).ToList();
    

    This should speed up your algorithm considerably.
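
For reference, here is a self-contained sketch of the same idea. The PendingItem and PostedItem classes, the Description property, and the PendingFilter helper are hypothetical names made up for illustration; the question only says that both lists carry an ID field, and the ID is assumed to be an int. The second method is a variant that was not part of the answer above: it uses List<T>.RemoveAll to shrink the existing list in place rather than building a new one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes: the question only states that both lists have an ID field.
public sealed class PendingItem
{
    public int ID { get; set; }
    public string Description { get; set; } = "";
}

public sealed class PostedItem
{
    public int ID { get; set; }
}

public static class PendingFilter
{
    // Returns a new list containing only the pending items whose ID is not in posted.
    public static List<PendingItem> RemovePosted(List<PendingItem> pending, List<PostedItem> posted)
    {
        // Build the lookup once: one pass over posted.
        var postedIds = new HashSet<int>(posted.Select(pp => pp.ID));

        // One hash lookup per pending item instead of a scan of posted.
        return pending.Where(p => !postedIds.Contains(p.ID)).ToList();
    }

    // Variant: removes the matching items from the existing list and
    // returns how many were removed.
    public static int RemovePostedInPlace(List<PendingItem> pending, List<PostedItem> posted)
    {
        var postedIds = new HashSet<int>(posted.Select(pp => pp.ID));
        return pending.RemoveAll(p => postedIds.Contains(p.ID));
    }
}
```

With the sizes from the question, this does one pass over posted to build the set plus about 600,000 hash lookups, instead of up to 600,000 × 599,999 ID comparisons. Whether the copying or the in-place version fits better depends on whether other code still holds a reference to the original pending list.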