I have two IEnumerable<FileInfo> lists that I would like to compare:
IEnumerable<FileInfo> list1 = dir1.GetFiles("*" + _ext1, SearchOption.AllDirectories);
IEnumerable<FileInfo> list2 = dir2.GetFiles("*" + _ext2, SearchOption.AllDirectories);
where _ext1 and _ext2 are different file extensions. For example:
string _ext1 = ".jpg";
string _ext2 = ".png";
So list1 will look something like:
file1.jpg
file2.jpg
file3.jpg
file4.jpg
file5.jpg
file6.jpg
and list2 will look like:
file1.png
file2.png
file4.png
I want to find everything in list1 that is not present in list2, comparing file names without their extensions. I have tried the following:
List<string> list1FileNames = list1.Select(f => Path.GetFileNameWithoutExtension(f.FullName)).ToList();
List<string> list2FileNames = list2.Select(f => Path.GetFileNameWithoutExtension(f.FullName)).ToList();
var setDiff = list1FileNames.Except(list2FileNames);
This works fine and returns the following (note that the file extensions are gone):
file3
file5
file6
However, what I really want is a list of FileInfo objects, not just the file name strings. I need other information such as the full path and the extension, so a list of file name strings will not do the job. How can I go about doing this?
If you're looking for speed, try this:
private IEnumerable<FileInfo> GetUniqueFilesWithoutExtension(IEnumerable<FileInfo> list1, IEnumerable<FileInfo> list2)
{
    // Collect the extension-less names from list2 for fast lookups.
    var d = new HashSet<string>();
    foreach (var fi in list2)
    {
        d.Add(Path.GetFileNameWithoutExtension(fi.FullName));
    }

    // Yield each file from list1 whose extension-less name is not in list2.
    foreach (var fi in list1)
    {
        if (!d.Contains(Path.GetFileNameWithoutExtension(fi.FullName)))
        {
            yield return fi;
        }
    }
}
Make a hash set of file names (sans extensions) from list2, then iterate through list1 and only return the items with file names (sans extensions) that don't appear in the hash set from list2. The yield return lets you consume your results as they're discovered in a streaming fashion, instead of having to wait for the whole list to be generated, if that matters to you.
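For comparison, here is a minimal sketch of the same hash-set idea written with LINQ, run against the sample file names from the question. The class name FileDiffSketch and the helper ExceptByBaseName are hypothetical names made up for this illustration. The case-insensitive comparer is an assumption (it suits typical Windows file systems) and is not part of the method above, so swap it out if your names are case-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileDiffSketch
{
    // Hypothetical helper: returns the files in `source` whose base name
    // (file name without extension) does not occur anywhere in `other`.
    public static IEnumerable<FileInfo> ExceptByBaseName(
        IEnumerable<FileInfo> source, IEnumerable<FileInfo> other)
    {
        // Assumption: base names are compared case-insensitively.
        // Note that this set is built immediately when the method is called.
        var seen = new HashSet<string>(
            other.Select(f => Path.GetFileNameWithoutExtension(f.Name)),
            StringComparer.OrdinalIgnoreCase);

        // Where is lazily evaluated, so matches are produced as the caller enumerates.
        return source.Where(f => !seen.Contains(Path.GetFileNameWithoutExtension(f.Name)));
    }

    public static void Main()
    {
        // FileInfo objects can be created for paths that do not exist on disk,
        // which is enough to exercise the name comparison.
        var list1 = new[] { "file1.jpg", "file2.jpg", "file3.jpg", "file4.jpg", "file5.jpg", "file6.jpg" }
            .Select(n => new FileInfo(n)).ToList();
        var list2 = new[] { "file1.png", "file2.png", "file4.png" }
            .Select(n => new FileInfo(n)).ToList();

        foreach (var fi in ExceptByBaseName(list1, list2))
        {
            Console.WriteLine(fi.Name); // prints file3.jpg, file5.jpg, file6.jpg (one per line)
        }
    }
}
```

Because the result is a sequence of FileInfo objects, properties such as FullName and Extension remain available on every item. One difference from the yield return version: here the hash set is built as soon as the method is called, and only the filtering over the first list is deferred.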