Search code examples
c#datasetlinq-to-dataset

Aggregate datatable with dynamic number of columns


I have a datatable with dynamic set of columns and want to aggregate the numeric based columns and keep the final rows into new datatable.

DataTable Sample:-

PartnerName   CreditCol  DebitCol  AmountCol ....
P1              10       20        30 
P2              1        2         3 
P3              3        1         10
P2              1        100       200

The desired output should be :-

PartnerName   CreditCol  DebitCol  AmountCol ....
P1              10       20        30 
P2              2        102       203 
P3              3        1         10

The main thing here is the column set and will be dynamic. Sometime, there could be two columns and sometimes it could be 20 cols. Please suggest the linq query or any other solution.


Solution

  • Here is a dynamic approach that should work for your requirement:

    var rows = table.AsEnumerable();
    var columns = table.Columns.Cast<DataColumn>();
    int i;  // used to check if a string column can be parsed to int
    string columnToGroup = "partnername";
    DataColumn colToGroup = columns.First(c => c.ColumnName.Equals(columnToGroup, StringComparison.OrdinalIgnoreCase));
    var colsToSum = columns
         .Where(c => c != colToGroup &&
             (c.DataType == typeof(int) ||
             (c.DataType == typeof(string)
             && rows.All(r => int.TryParse(r.Field<string>(c), out i)))));
    var columnsToSum = new HashSet<DataColumn>(colsToSum);
    
    DataTable tblSum = table.Clone(); // empty table, same schema
    foreach (var group in rows.GroupBy(r => r[colToGroup]))
    {
        DataRow row = tblSum.Rows.Add();
        foreach(var col in columns)
        {
            if (columnsToSum.Contains(col))
            {
                int sum;
                if (col.DataType == typeof(int))
                    sum = group.Sum(r => r.Field<int>(col));
                else
                    sum = group.Sum(r => int.Parse(r.Field<string>(col)));
                row.SetField(col.ColumnName, sum);
            }
            else
                row[col.ColumnName] = group.First()[col];
        }
    }
    

    Tested with your sample data here:

    var table = new System.Data.DataTable();
    table.Columns.Add("PartnerName", typeof(string));
    table.Columns.Add("CreditCol", typeof(int));
    table.Columns.Add("DebitCol", typeof(string));
    table.Columns.Add("AmountCol", typeof(int));
    table.Rows.Add("P1", 10, "20", 30);
    table.Rows.Add("P2", 1, "2", 3);
    table.Rows.Add("P3", 3, "1", 10);
    table.Rows.Add("P2", 1, "100", 200);
    

    Result:

    PartnerName   CreditCol    DebitCol    AmountCol
    P1            10             20         30
    P2            2              102        203
    P3            3              1          10