Search code examples
entity-frameworklinqinner-joinaggregate-functions

EF Linq query with join, group and sum


Trying to construct a LINQ query that performs a simple inner join, groups the data and sums two of the columns. From the examples I've seen it looks fairly straightforward but I must have missed something along the way.

public class Employee
{
  public int Id { get; set; }
  public int Name { get; set; }
}

public class Inventory
{
  public int Id { get; set; }
  public int EmployeeId { get; set; }
  public decimal OffSite { get; set; }
  public decimal OnSite { get; set; }
}

public class InventoryTotal
{
  public int EmployeeId { get; set; }
  public string EmployeeName { get; set; }
  public decimal EmployeeOffSite { get; set; }
  public decimal EmployeeOnSite { get; set; }
}

The query I have created looks like this

var result = from a in db.Inventory                            
             join b in db.Employee on a.EmployeeId equals b.Id
             group new { a, b } by a.EmployeeId into c                            
             select new InventoryTotal
             {
               EmployeeId = c.Key,
               EmployeeName = c.Name,
               EmployeeOffSite = c.Sum(d => d.a.OffSite),
               EmployeeOnSite = c.Sum(d => d.a.OnSite)
             };

One issue appears to be with the Name column, the only value I want to obtain from the join with Employee. I would like to understand how to properly access that column and better understand how to construct this query as a whole.

EmployeeName = c.Name is not valid, nor are a few other combination I've tried.


Solution

  • So you have two tables: Employees and Inventories. There is a one-to-many relation between these two: Every Employee has zero or more Inventories; every Inventory is the Inventory of exactly one Employee, namely the Employee that the foreign key EmployeeId refers to.

    Requirement: from every Employee get his Id and Name, and the total of all his OffSite and OnSite inventories.

    Since you are using entity framework, there are three methods to do this. One is to do the (Group-)Join yourself, the other is to let entity framework do the (Group-)Join, and finally, the most intuitive part is to use the virtual ICollection<Inventory.

    Do the GroupJoin yourself

    Whenever you have a one-to-many relation, like Schools with their Students, Customers with their Orders, or Employees with their Inventories, and you want to start at the "one" side, consider to use one of the overloads of Queryable.GroupJoin.

    On the other hand, if you want to start on the "Many" side, if you want the Student with the School he attends, the Order with the Customer who placed the order, consider to use Queryable.Join

    You want to fetch "Employees with (some information about) their Inventories, so we'll use a GroupJoin. I'll use the overload of GroupJoin with a parameter resultSelector, so we can specify what we want as result.

    var inventoryTotals = dbContext.Employees.GroupJoin(dbContext.Inventories,
    
    employee => employee.Id,            // from every Employee take the primary key
    inventory => inventory.EmployeeId,  // from every Inventory take the foreign key
    
    // parameter resultSelector: from every Employee, and all Inventories that have a foreign
    // key that refers to this Employee, make one new
    (employee, inventoriesOfThisEmployee) => new InventoryTotal
    {
        EmployeeId = employee.Id,
        EmployeeName = employee.Name,
    
        EmployeeOffSite = inventoriesOfThisEmployee
            .Select(inventory => inventory.OffSite).Sum(),
        EmployeeOnSite = inventoriesOfThisEmployee
            .Select(inventory => inventory.OnSite).Sum(),
    });
    

    Let Entity Framework do the GroupJoin

    This one feels a bit more natural, for every Employee we Select one InventoryTotal, as requested.

    var inventoryTotals = dbContext.Employees.Select(employee => new InventoryTotal
    {
        // Select the Employee properties that you want.
        EmployeeId = employee.Id,
        EmployeeName = employee.Name,
    
        // Get the inventories of this Employee:
        EmployeeOffSite = dbContext.Inventories
            .Where(inventory => inventory.EmployeeId == employee.Id)
            .Select(inventory => inventory.OffSite).Sum(),
    
        EmployeeOnSite = dbContext.Inventories
            .Where(inventory => inventory.EmployeeId == employee.Id)
            .Select(inventory => inventory.OnSite).Sum(),
    });
    

    Use the virtual ICollections

    This one feels the most natural. It is also very easy to unit test your usage without a real database.

    If you've followed the entity framework conventions, you will have classes similar to:

    public class Employee
    {
        public int Id { get; set; }
        public int Name { get; set; }
        ... // other properties
    
        // Every Employee has zero or more Inventories (one-to-many)
        public ICollection<Inventory> Inventories {get; set;}
    }
    
    public class Inventory
    {
        public int Id { get; set; }
        public decimal OffSite { get; set; }
        public decimal OnSite { get; set; }
        ... // other properties
    
        // Every Inventory is the Inventory of exactly one Employee, using foreign key
        public int EmployeeId { get; set; }
        public virtual Employee Employee {get; set;}  
    }
    

    This is enough for entity framework to detect the tables, the columns of the tables and the relations with the tables (one-to-many, many-to-many, ...). Only if you want to deviate from the conventions: different identifiers for tables and columns, non-default column types etc Attributes or fluent API is needed.

    In Entity framework the columns of the tables are represented by the non-virtual properties. The virtual properties represent the relations between the tables.

    The foreign key is a column in the table, hence it is non-virtual. The Inventory has no Employee column, hence property Employee is virtual.

    Once you've defined the virtual ICollection, the query is simple:

    Requirement: from every Employee get his Id and Name, and the total of all his OffSite and OnSite inventories.

    var inventoryTotals = dbContext.Employees.Select(employee => new InventoryTotal
    {
        // Select the Employee properties that you want.
        EmployeeId = employee.Id,
        EmployeeName = employee.Name,
    
        EmployeeOffSite = employee.Inventories
            .Select(inventory => inventory.OffSite).Sum(),
    
        EmployeeOnSite = employee.Inventories
            .Select(inventory => inventory.OnSite).Sum(),    
    });
    

    Simple comme bonjour!