Remove Duplicates from Datatable with LINQ without keeping a duplicated entry at all


I have a DataTable with several columns from which I want to remove all duplicates, like this:

    Dt1 = Dt1.AsEnumerable()
             .GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
             .Select(g => g.First())
             .CopyToDataTable();

However, the code above still leaves one entry per group of duplicates (the first one found) in the DataTable, because of the Select(g => g.First()) at the end of the LINQ query.

Is there a way to remove all duplicates and keep none?

Edit: An example of what the code does now and what it should do.

A DataTable with entries like these:

Name    Filesize    Filename
One     50          Fileone
Two     50          Fileone
Three   50          Filetwo
Four    50          Filethree

The LINQ query above removes line 2, because its Filename and Filesize are the same as in line 1. However, line 1 stays, because the query keeps the first entry of each group of duplicates.

I want both line 1 and line 2 removed from the DataTable.


Solution

  • Dt1 = Dt1.AsEnumerable()
             .GroupBy(r => new { filename = r.Field<string>("filename1"), filesize = r.Field<string>("filesizeinkb") })
             .Where(g => g.Count() == 1)
             .Select(g => g.First())
             .CopyToDataTable();
    

    That discards every group with more than one row, then takes the first (and only) row from each remaining group.
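
    As a rough, self-contained sketch of the same approach applied to the sample data from the question: the class and method names (DuplicateRemoval, KeepUniqueOnly, BuildSample) are made up for illustration, and the columns are assumed to be strings named "Name", "filesizeinkb" and "filename1", as the queries above suggest. One thing to watch for is that CopyToDataTable throws an InvalidOperationException when the sequence contains no rows, which happens here if every row has a duplicate, so the sketch falls back to an empty table with the same schema in that case.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DuplicateRemoval
{
    // Returns a new table holding only the rows whose (filename1, filesizeinkb)
    // pair occurs exactly once in the source table.
    public static DataTable KeepUniqueOnly(DataTable source)
    {
        List<DataRow> uniqueRows = source.AsEnumerable()
            .GroupBy(r => new
            {
                filename = r.Field<string>("filename1"),
                filesize = r.Field<string>("filesizeinkb")
            })
            .Where(g => g.Count() == 1)   // drop every group that has duplicates
            .Select(g => g.First())       // the single row left in each group
            .ToList();

        // CopyToDataTable throws InvalidOperationException on an empty sequence,
        // so return an empty table with the same columns instead.
        return uniqueRows.Count > 0 ? uniqueRows.CopyToDataTable() : source.Clone();
    }

    // Builds the four-row sample from the question (column names are assumptions).
    public static DataTable BuildSample()
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("filesizeinkb", typeof(string));
        table.Columns.Add("filename1", typeof(string));

        table.Rows.Add("One", "50", "Fileone");
        table.Rows.Add("Two", "50", "Fileone");
        table.Rows.Add("Three", "50", "Filetwo");
        table.Rows.Add("Four", "50", "Filethree");
        return table;
    }
}
```

    Calling DuplicateRemoval.KeepUniqueOnly(DuplicateRemoval.BuildSample()) should leave only the rows named Three and Four, since One and Two share the same file name and size.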