c# linq aggregate

How can I perform an arbitrary aggregate method on an arbitrary column in a DataTable in C#?


That question is terrible; I know. If I could word it better, I might be able to find a solution. Here's the detail:

Assume I have a DataTable with three columns, named "one", "two", and "three", all of type int. If I want the sum of the third column, that's easy enough:

var sum = DataTable.AsEnumerable().ToList().Sum(row => row["three"] as int?).Value;

Since I'm working on infrastructure-level stuff and subsets of rows, it's better to think of it as a method on a separate class.

var sum = GetSummary();
...
public class calculator {
    List<DataRow> data;
    public int GetSummary() { return this.data.Sum(row => row["three"] as int?).Value; }
}

The hard part is that I need to allow the caller to specify the aggregate method and the column on which to perform it.

var desire = <get the average of the column called "two">
var result = GetAggregation(DataTable.AsEnumerable().ToList(), desire);
...
public class calculator {
    List<DataRow> data;
    public int GetAggregation(List<DataRow> data, <something> actionToPerform) { ... }
}

What is my desire? Is it a Func()? Something else entirely? What would it look like in the examples here?

The best I've been able to come up with is this:

public class calculator {
    List<DataRow> data;
    // T must be a value type so that T? means Nullable<T>.
    public T? DoAggregation<T>(string column, Func<IEnumerable<T?>, T?> desire) where T : struct {
        // Field<T?> maps DBNull to null; it does not convert between numeric types.
        var subset = data.Select(row => row.Field<T?>(column)).ToArray();
        return desire(subset);
    }
}

public class tester {
    public void DoTest() {
        var mycalc = new calculator();
        // Works only if column "two" is actually stored as double.
        var avgResult = mycalc.DoAggregation<double>("two", desire: (data) => data.Average());
        var sumResult = mycalc.DoAggregation<int>("three", desire: (data) => data.Sum());
    }
}

It's a step forward, but it has its own issues. Since I need to store these desires, my calculator needs a collection of type List<Func<IEnumerable<T>, T>> ... but then I can't have desires of different types (int vs decimal). And how would I handle something like <add column one to column three on each row and take the average across all rows>?


Solution

  • Though not as generic as possible, I've found a solution that works for me. This component is used to render reports, so every value it produces ends up as a string. The last paragraph of my question worried about returning different types, but that turns out not to be an issue.

    With that said, I've ended up with:

    public void AddDetailComponent(Func<DataRow, string> toPerform) {}
    

    Per @NetMage's comment, this component will be used by other programmers. I have to trust that they're capable of writing functions that return strings. If I can't trust that, I need to look at who's being hired.

    So typical usage would look like this:

    report.AddDetailComponent((data) =>
    {
        var second = (int?)data[DataTableColumns.SecondColumnName];
        var value = (bool)data[DataTableColumns.FirstColumnName] ? second * 2 : second / 2;
        return value.ToString();
    });
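
The same "everything ends up as a string" idea can be stretched to cover the aggregates from the question. What follows is only a sketch under assumptions, not the component described above: the `Report` class, its `AddSummaryComponent` and `RenderSummaries` members, and the `Demo` helper are hypothetical names invented for illustration. Only the `AddDetailComponent(Func<DataRow, string>)` signature comes from the answer. Because each delegate returns a string, int and double results can sit in one list, and a delegate is free to combine several columns per row.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;
using System.Linq;

// Hypothetical report class; only AddDetailComponent's signature is from the answer.
public class Report
{
    // Per-row components: each turns one DataRow into a string.
    private readonly List<Func<DataRow, string>> details =
        new List<Func<DataRow, string>>();

    // Assumed extension: aggregates receive all rows and also return a string,
    // so results of different numeric types can share one collection.
    private readonly List<Func<IEnumerable<DataRow>, string>> summaries =
        new List<Func<IEnumerable<DataRow>, string>>();

    public void AddDetailComponent(Func<DataRow, string> toPerform)
    {
        details.Add(toPerform);
    }

    public void AddSummaryComponent(Func<IEnumerable<DataRow>, string> toPerform)
    {
        summaries.Add(toPerform);
    }

    // Runs every stored aggregate over the same set of rows.
    public List<string> RenderSummaries(IEnumerable<DataRow> rows)
    {
        var list = rows.ToList();
        return summaries.Select(summary => summary(list)).ToList();
    }
}

public static class Demo
{
    // Builds the three-int-column table described in the question, with two sample rows.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("one", typeof(int));
        table.Columns.Add("two", typeof(int));
        table.Columns.Add("three", typeof(int));
        table.Rows.Add(1, 2, 3);
        table.Rows.Add(4, 5, 6);
        return table;
    }

    public static void Main()
    {
        var report = new Report();

        // Average of column "two" (a double result).
        report.AddSummaryComponent(rows =>
            rows.Average(r => r.Field<int>("two")).ToString(CultureInfo.InvariantCulture));

        // Sum of column "three" (an int result).
        report.AddSummaryComponent(rows =>
            rows.Sum(r => r.Field<int>("three")).ToString(CultureInfo.InvariantCulture));

        // Add column "one" to column "three" on each row, then average across rows.
        report.AddSummaryComponent(rows =>
            rows.Average(r => r.Field<int>("one") + r.Field<int>("three"))
                .ToString(CultureInfo.InvariantCulture));

        foreach (var line in report.RenderSummaries(Demo.BuildSampleTable().AsEnumerable()))
        {
            Console.WriteLine(line);
        }
    }
}
```

For the two sample rows, this prints 3.5, 9, and 7. The trade-off is the one the answer already accepts: the component never sees a typed result, so formatting and null handling are left to whoever writes each delegate.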