c# asp.net-mvc linq dotnetzip

How to find duplicates in a list of strings of server path names


I am trying to find duplicates in a list of server path strings. My paths look like \\UTIR\STORAGE\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_1.pdf. I will have up to 50 of these, and I need to check the end of each path (\123_DEPOSIT_10-23-2015_1.pdf) to make sure there are no duplicates.

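List<string> manypaths = (List<string>)TempData["temp"];
var list = new List<string>();
foreach (var item in manypaths)
{
    // Contains compares the full path, so this only skips exact duplicates,
    // not paths that share a file name in different folders.
    if (!list.Contains(item))
    {
        list.Add(item);
    }
}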

I am using the DotNetZip library, and I have tried ContainsEntry, Contains, and everything else I have found online. When I add these files to a zip file, I get an error:

System.ArgumentException: 'An item with the same key has already been added.'

using (Ionic.Zip.ZipFile zip = new Ionic.Zip.ZipFile())
{
    zip.AddFiles(list, @"\");

    MemoryStream output = new MemoryStream();
    zip.Save(output);
    return File(output.ToArray(), "application/zip");
}

Solution

  • To get distinct paths by their last part, group the paths by that last part and take the first element of each group, as in the following code:

    List<string> distinctFiles = files
        .GroupBy(x => x.Split(new char[] { '\\' }).Last())
        .Select(x => x.First())
        .ToList();
    

    Or, using Path.GetFileName from System.IO:

    List<string> distinctFiles = files
        .GroupBy(x => Path.GetFileName(x))
        .Select(x => x.First())
        .ToList();
    

    For testing:

    List<string> files = new List<string>
    {
        @"\\UTIR\STORAGE\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_1.pdf",
        @"\\UTIR\STORAGE1\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_1.pdf",
        @"\\UTIR\STORAGE\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_11.pdf",
    };
    

    Note that the first and second paths have the same file name in different directories, so they count as duplicates.

    Result

    "\\UTIR\STORAGE\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_1.pdf"
    "\\UTIR\STORAGE\10-23-2015\DEPOSITS\123_DEPOSIT_10-23-2015_11.pdf"
    

    I hope you find this helpful.
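The question title asks how to find the duplicates, while the code above only drops them. The following is a minimal sketch that does both, under a few assumptions that are not in the original post: the class name PathDeduplicator and its method names are hypothetical, file names are compared ignoring case (as Windows file systems usually do), and the file name is extracted by hand rather than with Path.GetFileName, because Path.GetFileName only treats '\' as a separator on Windows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the names here are illustrative, not part of DotNetZip.
public static class PathDeduplicator
{
    // Returns everything after the last '\' or '/' in the path.
    public static string FileNameOf(string path)
    {
        int index = path.LastIndexOfAny(new[] { '\\', '/' });
        return index < 0 ? path : path.Substring(index + 1);
    }

    // Maps each file name that occurs more than once to all paths that use it.
    // Assumption: names are compared ignoring case.
    public static Dictionary<string, List<string>> FindDuplicates(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(path => FileNameOf(path), StringComparer.OrdinalIgnoreCase)
            .Where(group => group.Count() > 1)
            .ToDictionary(
                group => group.Key,
                group => group.ToList(),
                StringComparer.OrdinalIgnoreCase);
    }

    // Keeps the first path seen for each file name, preserving input order.
    public static List<string> KeepFirstPerFileName(IEnumerable<string> paths)
    {
        return paths
            .GroupBy(path => FileNameOf(path), StringComparer.OrdinalIgnoreCase)
            .Select(group => group.First())
            .ToList();
    }
}
```

With the question's code, the deduplicated list could then be passed to the existing call, for example zip.AddFiles(PathDeduplicator.KeepFirstPerFileName(manypaths), @"\"). FindDuplicates is useful if the duplicates should be reported or logged rather than silently skipped.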