Consider this test app:
function RemoveDuplicates(const Input: IEnumerable<Integer>): IEnumerable<Integer>;
begin
// How to implement this function?
end;
var
Enumerable: IEnumerable<Integer>;
UniqueEnumerable: IEnumerable<Integer>;
begin
Enumerable := TCollections.CreateList<Integer>([1, 1, 2, 3, 3, 3, 4]);
UniqueEnumerable := RemoveDuplicates(Enumerable);
UniqueEnumerable.ForEach(
procedure(const I: Integer)
begin
WriteLn(I);
end);
ReadLn;
end.
How can I implement the RemoveDuplicates
function (this is called nub
in Haskell)?
Use what is already there:
uses
Spring.Collections,
Spring.collections.Extensions;
function RemoveDuplicates(const Input: IEnumerable<Integer>): IEnumerable<Integer>;
begin
Result := TDistinctIterator<Integer>.Create(Input, nil);
end;
This supports lazy evaluation (means that Input is not getting processed before the resulting enumerable is getting processed). It is using a hashset (currently implemented as Dictionary) internally to keep track of the already found items (this happens inside the enumerator).
Why is that important? Because any operation that does a complete enumeration might cause unwanted performance impact if Input
involves other costly operations which may by far outweigh any benefits of other approaches of removing duplicates (like putting it into a list and sort that). Also an IEnumerable is not guaranteed to be finite.
If between calling this function and enumerating the result the Input
was changed that change is affecting the outcome of your enumeration whereas it would not be the case if you are not supporting lazy evaluation. If you are enumerating multiple times the outcome might be different (i.e. up-to-date) each time.