Search code examples
listdelphiuniquedelphi-xe7spring4d

How to remove all duplicates from a list?


Consider this test app:

function RemoveDuplicates(const Input: IEnumerable<Integer>): IEnumerable<Integer>;
begin
  // How to implement this function?
end;

var
  Enumerable: IEnumerable<Integer>;
  UniqueEnumerable: IEnumerable<Integer>;
begin
  Enumerable := TCollections.CreateList<Integer>([1, 1, 2, 3, 3, 3, 4]);
  UniqueEnumerable := RemoveDuplicates(Enumerable);
  UniqueEnumerable.ForEach(
    procedure(const I: Integer)
    begin
      WriteLn(I);
    end);
  ReadLn;
end.

How can I implement the RemoveDuplicates function (this is called nub in Haskell)?


Solution

  • Use what is already there:

    uses
      Spring.Collections,
      Spring.collections.Extensions;
    
    function RemoveDuplicates(const Input: IEnumerable<Integer>): IEnumerable<Integer>;
    begin
      Result := TDistinctIterator<Integer>.Create(Input, nil);
    end;
    

    This supports lazy evaluation (means that Input is not getting processed before the resulting enumerable is getting processed). It is using a hashset (currently implemented as Dictionary) internally to keep track of the already found items (this happens inside the enumerator).

    Why is that important? Because any operation that does a complete enumeration might cause unwanted performance impact if Input involves other costly operations which may by far outweigh any benefits of other approaches of removing duplicates (like putting it into a list and sort that). Also an IEnumerable is not guaranteed to be finite.

    If between calling this function and enumerating the result the Input was changed that change is affecting the outcome of your enumeration whereas it would not be the case if you are not supporting lazy evaluation. If you are enumerating multiple times the outcome might be different (i.e. up-to-date) each time.