c# linq csv performance-testing

LINQ Method - Optimization


I'm reading a CSV file, splitting each row into columns, then grouping the rows into a new class.

It looks clunky, and I'm wondering whether there is a simpler way, for instance one that doesn't select the rows into the class first:

EDIT: To clarify, I'm trying to get TimesheetHours summed, grouped by all the other columns.

var rowList = csvFile.Rows.Select(row => row.Split(','))
    .Select(cols => new UtilisationRow {
        UploadId = savedUpload.Id,
        FullName = cols[0],
        TimesheetWorkDateMonthYear = Convert.ToDateTime(cols[1]),
        TimesheetTaskJobnumber = cols[2],
        TimesheetWorktype = cols[3],
        TimesheetHours = Convert.ToDouble(cols[4]),
        TimesheetOverhead = cols[5]
    })
    .GroupBy(d => new {
        d.FullName,
        d.TimesheetWorkDateMonthYear,
        d.TimesheetTaskJobnumber,
        d.TimesheetWorktype,
        d.TimesheetOverhead
    })
    .Select(g => new UtilisationRow {
        FullName = g.First().FullName,
        TimesheetWorkDateMonthYear = g.First().TimesheetWorkDateMonthYear,
        TimesheetTaskJobnumber = g.First().TimesheetTaskJobnumber,
        TimesheetWorktype = g.First().TimesheetWorktype,
        TimesheetHours = g.Sum(s => s.TimesheetHours),
        TimesheetOverhead = g.First().TimesheetOverhead
    })
    .ToList();

Many thanks, Lee.


Solution

  • There are two problems in your code. First, you call First() repeatedly on each group, when the same values are already available from the group's Key. Second, you use UtilisationRow in the first Select, where an anonymous type is enough for the intermediate record:

    var rowList = csvFile.Rows.Select(row => row.Split(','))
        .Select(cols => new {
            UploadId = savedUpload.Id,
            FullName = cols[0],
            TimesheetWorkDateMonthYear = Convert.ToDateTime(cols[1]),
            TimesheetTaskJobnumber = cols[2],
            TimesheetWorktype = cols[3],
            TimesheetHours = Convert.ToDouble(cols[4]),
            TimesheetOverhead = cols[5]
        })
        .GroupBy(d => new {
            d.FullName,
            d.TimesheetWorkDateMonthYear,
            d.TimesheetTaskJobnumber,
            d.TimesheetWorktype,
            d.TimesheetOverhead
        })
        .Select(g => new UtilisationRow {
            FullName = g.Key.FullName,
            TimesheetWorkDateMonthYear = g.Key.TimesheetWorkDateMonthYear,
            TimesheetTaskJobnumber = g.Key.TimesheetTaskJobnumber,
            TimesheetWorktype = g.Key.TimesheetWorktype,
            TimesheetHours = g.Sum(s => s.TimesheetHours),
            TimesheetOverhead = g.Key.TimesheetOverhead
        })
        .ToList();
    

    Now the "pipeline" of your method looks pretty clean:

    • The first Select parses each row into a temporary record
    • GroupBy bundles records that share the same key into a group
    • The final Select produces one record of the required type per group, taking the grouped fields from the key and summing the hours.
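
If you want to try the pipeline in isolation, here is a minimal, self-contained sketch. Several things in it are my assumptions rather than anything from your code: the shape of UtilisationRow (property types guessed from the Convert calls), an int UploadId, the hypothetical helper name Summarise, the shortened intermediate property names, the made-up sample rows, and parsing with InvariantCulture so the result does not depend on the machine's regional settings. It also uses the GroupBy overload that takes a result selector, which folds the last Select into the grouping step, and it carries UploadId through to the result, which I'm guessing is what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Stand-in for the UtilisationRow class from the question (assumed shape).
public class UtilisationRow
{
    public int UploadId { get; set; }
    public string FullName { get; set; } = "";
    public DateTime TimesheetWorkDateMonthYear { get; set; }
    public string TimesheetTaskJobnumber { get; set; } = "";
    public string TimesheetWorktype { get; set; } = "";
    public double TimesheetHours { get; set; }
    public string TimesheetOverhead { get; set; } = "";
}

public static class Program
{
    // Hypothetical helper: sums the hours for each distinct combination
    // of the other columns.
    public static List<UtilisationRow> Summarise(IEnumerable<string> rows, int uploadId)
    {
        return rows
            .Select(row => row.Split(','))
            // Parse into a temporary anonymous record.
            .Select(cols => new
            {
                FullName = cols[0],
                Date = DateTime.Parse(cols[1], CultureInfo.InvariantCulture),
                Job = cols[2],
                Worktype = cols[3],
                Hours = double.Parse(cols[4], CultureInfo.InvariantCulture),
                Overhead = cols[5]
            })
            // Group by every column except the hours; the result selector
            // receives the key and the group's items and builds the output row.
            .GroupBy(
                d => new { d.FullName, d.Date, d.Job, d.Worktype, d.Overhead },
                (key, items) => new UtilisationRow
                {
                    UploadId = uploadId,
                    FullName = key.FullName,
                    TimesheetWorkDateMonthYear = key.Date,
                    TimesheetTaskJobnumber = key.Job,
                    TimesheetWorktype = key.Worktype,
                    TimesheetHours = items.Sum(i => i.Hours),
                    TimesheetOverhead = key.Overhead
                })
            .ToList();
    }

    public static void Main()
    {
        // Made-up sample rows; the first two share every column except the hours.
        var rows = new[]
        {
            "Ann Lee,2015-03-01,J100,Dev,2.5,No",
            "Ann Lee,2015-03-01,J100,Dev,1.5,No",
            "Bob Ray,2015-03-01,J200,Test,4,Yes"
        };

        foreach (var r in Summarise(rows, 1))
            Console.WriteLine($"{r.FullName} {r.TimesheetTaskJobnumber} {r.TimesheetHours}");
    }
}
```

With the sample rows above, the two "Ann Lee" rows collapse into one record with 4 hours, and the "Bob Ray" row stays as it is. Note that a plain Split(',') will break on quoted fields that contain commas, so this only suits simple CSV files like the one in the question.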