Maybe I am missing the details here but I would expect that IEnumerable.Except() would work on Enumerables not concretely cast to a collection.
Let me explain with a simple example: I have a list of files on a directory and I want to exclude the files that start with a certain string.
var allfiles = Directory.GetFiles(@"C:\test\").Select(f => new FileInfo(f));
Getting both the files matching and those not matching would be a matter of identifying one of the two collections and then .Except()-ing on the full list, right?
var matching = allfiles.Where(f => f.Name.StartsWith("TOKEN"));
and
var notmatching = allfiles.Except(matching, new FileComparer());
Where FileComparer() is some class that compares the full path of the two files.
Well, unless I cast all three collections to a List, the last notmatching variable still gives me the full list of files after I .Except() on the matching collection. To be clear:
var allfiles = Directory.GetFiles(@"C:\test\").Select(f => new FileInfo(f));
var matching = allfiles.Where(f => f.Name.StartsWith("TOKEN"));
var notmatching = allfiles.Except(matching, new FileComparer());
does not exclude, while
var allfiles = Directory.GetFiles(@"C:\test\").Select(f => new FileInfo(f)).ToList();
var matching = allfiles.Where(f => f.Name.StartsWith("TOKEN")).ToList();
var notmatching = allfiles.Except(matching, new FileComparer()).ToList();
actually does what it says on the tin. What am I missing here? I can't understand why LINQ doesn't manipulate the collection not currently cast to a list.
For instance, the FileComparer does not even get called in the first case.
internal class FileComparer : IEqualityComparer<FileInfo>
{
public bool Equals(FileInfo x, FileInfo y)
{
return x == null ? y == null : (x.Name.Equals(y.Name, StringComparison.OrdinalIgnoreCase) && x.Length == y.Length);
}
public int GetHashCode(FileInfo obj)
{
return obj.GetHashCode();
}
}
The difference between the two is that without ToList, the deferred allfiles query is executed twice, producing different instances of FileInfo that will not pass reference equality.
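To see the double execution in isolation, here is a minimal sketch. Item is a hypothetical stand-in for FileInfo (a reference type that does not override Equals or GetHashCode), and DeferredDemo is just a name chosen for this illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FileInfo: a plain reference type
// that does not override Equals or GetHashCode.
public sealed class Item
{
    public string Name { get; }
    public Item(string name) { Name = name; }
}

public static class DeferredDemo
{
    // Enumerates the sequence twice and reports whether both passes
    // hand back the very same object as their first element.
    public static bool SameInstanceAcrossEnumerations(IEnumerable<Item> source)
    {
        return ReferenceEquals(source.First(), source.First());
    }

    public static void Main()
    {
        var names = new[] { "TOKEN_a.txt", "b.txt" };

        // Deferred: the Select lambda runs again on every enumeration.
        IEnumerable<Item> deferred = names.Select(n => new Item(n));

        // Materialized: the lambda runs once and the list keeps the instances.
        List<Item> materialized = names.Select(n => new Item(n)).ToList();

        Console.WriteLine(SameInstanceAcrossEnumerations(deferred));     // False
        Console.WriteLine(SameInstanceAcrossEnumerations(materialized)); // True
    }
}
```

In the question, allfiles is enumerated once through matching and once as the first argument of Except, so each pass sees its own set of FileInfo objects.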
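Your FileComparer implements GetHashCode incorrectly, as it simply returns the reference-based hash code of the FileInfo objects (FileInfo does not itself override GetHashCode). Except builds a hash-based set from the second sequence, so Equals is only consulted for items whose hash codes match. With distinct instances the hash codes differ, which is why your Equals is never called in the first case.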
Implementations are required to ensure that if the Equals(T, T) method returns true for two objects x and y, then the value returned by the GetHashCode(T) method for x must equal the value returned for y.
The solution is to implement GetHashCode based on the same definition of equality as Equals:
public int GetHashCode(FileInfo obj)
{
return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
}
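For completeness, a self-contained sketch of a comparer whose Equals and GetHashCode agree, used with Except on deferred queries and no ToList on the inputs. Entry, EntryComparer and ExceptDemo are hypothetical names; Entry stands in for FileInfo so the example runs without touching the disk, and its Length is simply derived from the name for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FileInfo.
public sealed class Entry
{
    public string Name { get; }
    public long Length { get; }
    public Entry(string name, long length) { Name = name; Length = length; }
}

public sealed class EntryComparer : IEqualityComparer<Entry>
{
    // Same rule as the question's comparer, with null handled on both sides.
    public bool Equals(Entry x, Entry y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Name, y.Name, StringComparison.OrdinalIgnoreCase)
            && x.Length == y.Length;
    }

    // Hash on Name only: entries that are equal share a name (ignoring case),
    // so they are guaranteed to share a hash code.
    public int GetHashCode(Entry obj)
    {
        return obj is null ? 0 : StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

public static class ExceptDemo
{
    public static List<string> NotMatching(IEnumerable<string> names, string prefix)
    {
        // Both queries are deferred, like allfiles and matching in the question.
        IEnumerable<Entry> all = names.Select(n => new Entry(n, n.Length));
        IEnumerable<Entry> matching =
            all.Where(e => e.Name.StartsWith(prefix, StringComparison.Ordinal));

        return all.Except(matching, new EntryComparer())
                  .Select(e => e.Name)
                  .ToList();
    }

    public static void Main()
    {
        var names = new[] { "TOKEN_a.txt", "b.txt", "TOKEN_c.txt" };
        Console.WriteLine(string.Join(", ", NotMatching(names, "TOKEN"))); // b.txt
    }
}
```

Hashing on fewer fields than Equals compares is allowed; it only costs a few extra Equals calls when names collide. Note also that for this particular task a second Where with the negated predicate would avoid the comparer entirely.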