C# FileInfo - Find duplicate Files


I have a FileInfo array with about 200,000 entries. I need to find all files that share the same file name. For every duplicate I need the directory name and the file name, because I want to rename the files afterwards.

What I've tried already:

  • Comparing each entry against the whole list with two nested for loops. Bad idea: that is roughly 200,000 × 200,000 comparisons, which would take hours or even days.
  • Sorting with LINQ. I have not used LINQ before, so I am struggling to write the correct statement. Maybe someone can help me :)

Solution

  • Sounds like this should do it:

    var duplicateNames = files.GroupBy(file => file.Name)
                              .Where(group => group.Count() > 1)
                              .Select(group => group.Key);
    

    Now would be a very good time to learn LINQ. It's incredibly useful: time spent learning it (even just LINQ to Objects) will pay for itself very quickly.

    EDIT: Okay, if you want the original FileInfo objects for each group, just drop the Select call:

    var duplicateGroups = files.GroupBy(file => file.Name)
                               .Where(group => group.Count() > 1);
    
    // Replace with what you want to do
    foreach (var group in duplicateGroups)
    {
        Console.WriteLine("Files with name {0}", group.Key);
        foreach (var file in group)
        {
            Console.WriteLine("  {0}", file.FullName);
        }
    }
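
Since the question asks for both the directory name and the file name of every duplicate, here is a minimal sketch of how the grouping above could be wrapped in a helper. The class name `DuplicateFinder` and the method `FindDuplicates` are hypothetical and not part of the original answer. The sketch also assumes that names should be compared case-insensitively (as is typical on Windows file systems); drop the `StringComparer` arguments if case matters.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper class; the name is an assumption for illustration.
public static class DuplicateFinder
{
    // Groups the files by name (ignoring case, which is an assumption) and
    // keeps only the names that occur more than once. Like GroupBy itself,
    // this hashes each name once instead of comparing every pair of entries.
    public static Dictionary<string, List<FileInfo>> FindDuplicates(IEnumerable<FileInfo> files)
    {
        return files
            .GroupBy(file => file.Name, StringComparer.OrdinalIgnoreCase)
            // Skip(1).Any() is true for groups with at least two members.
            .Where(group => group.Skip(1).Any())
            .ToDictionary(
                group => group.Key,
                group => group.ToList(),
                StringComparer.OrdinalIgnoreCase);
    }

    // Prints the directory name and file name of every duplicate.
    public static void Print(Dictionary<string, List<FileInfo>> duplicates)
    {
        foreach (var pair in duplicates)
        {
            foreach (var file in pair.Value)
            {
                Console.WriteLine("{0} | {1}", file.DirectoryName, file.Name);
            }
        }
    }
}
```

Calling `DuplicateFinder.Print(DuplicateFinder.FindDuplicates(files))` with the existing `files` array would list each duplicate with its directory. `FileInfo.DirectoryName` and `FileInfo.Name` are the two values needed to build a new path for a later rename.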