Search code examples
c#datatabledataview

Removal of Duplicate Rows from Data table Based on Multiple columns


I have data table which contains many duplicate rows i need to filter those rows from data table based upon multiple columns to get distinct rows in resultant data table....

Barcode Itemid PacktypeId

1      100      1

1      100      2

1      100      3

1      100      1

1      100      3

need only rows which contains packtypeid 1,2,3 remaining 4th and 5th row should be removed

I have tried using two methods but none didn't turns for better result

Data table contains more than 10 columns but unique column's is "Barcode", "ItemID", "PackTypeID"

Method-1:

 dt_Barcode = dt_Barcode.DefaultView.ToTable(true, "Barcode", "ItemID", "PackTypeID");

The above method filter's the rows but it returns columns only 3 column values i need entire 10 column values.

Method-2:
                   List<string> keyColumns = new List<string>();
                   keyColumns.Add("Barcode");
                   keyColumns.Add("ItemID");
                   keyColumns.Add("PackTypeID");   
           RemoveDuplicates(DataTable table, List<string> keyColumns)
            {
            var uniqueness = new HashSet<string>();
            StringBuilder sb = new StringBuilder();
            int rowIndex = 0;
            DataRow row;
            DataRowCollection rows = table.Rows;             
            int i = rows.Count;
            while (rowIndex < i)
            {
                row = rows[rowIndex];
                sb.Length = 0;
                foreach (string colname in keyColumns)
                {
                    sb.Append(row[colname]);
                    sb.Append("|");
                }

                if (uniqueness.Contains(sb.ToString()))
                {
                    rows.Remove(row);
                }
                else
                {
                    uniqueness.Add(sb.ToString());
                    rowIndex++;
                }
               }

The Above Method returns exception like there is no rows at position 5


Solution

  • Method 3:

    Instead of Trying above 2 methods i found this Linq Method something very useful

         dt_Barcode = dt_Barcode.AsEnumerable().GroupBy(r => new { ItemID = r.Field<Int64>("ItemID"), PacktypeId = r.Field<Int32>("PackTypeID") }).Select(g => g.First()).CopyToDataTable();