Search code examples
c#linqentity-framework-core

How to flatten a nested entity using Linq (opposite of SelectMany)?


I am trying to convert SQL to Linq that flattens two related tables into one record per parent table row (opposite of SelectMany).

My first attempt is below, but the resulting SQL is different although the result is the same. The example uses Product and ProductImage where a single Product can have multiple ProductImages.

Is my Ling correct and the SQL produced is equivalent, or what Linq can produce the SQL below?

Tables:

  • Product (productid int, name varchar(50))

  • ProductImage (productid int, imageid int, imagepath varchar(450))

    • ProductImage.productid is a foreign key pointing to Product
    • ProductImage.imageid is numbered 1,2,3,4 for each ProductImage

SQL select to flatten the first three photos of a product

select 
    p.productid, p.name,
    i1.imagepath as i1imagepath,
    i2.imagepath as i2imagepath,
    i3.imagepath as i3imagepath 
from 
    product as p
join 
    productimage as i1 on p.productid = i1.productid
join 
    productimage as i2 on p.productid = i2.productid
join 
    productimage as i3 on p.productid = i3.productid
where 
    i1.imageid = 1 and
    i2.imageid = 2 and
    i3.imageid = 3

First attempt at Linq to convert this query:

Product.Select( p => new { p.productid, p.name,
    i1imagepath = p.ProductImage.AsQueryable()
                 .Where(i => i.imageid == 1)
                 .Select(i => i.imagepath).FirstOrDefault(),
    i2imagepath = p.ProductImage.AsQueryable()
                 .Where(i => i.imageid == 2)
                 .Select(i => i.imagepath).FirstOrDefault(),
    i3imagepath = p.ProductImage.AsQueryable()
                 .Where(i => i.imageid == 3)
                 .Select(i => i.imagepath).FirstOrDefault()
    }).Take(1).Dump();  // LinqPad Dump()

but it produces this SQL instead of above which looks less efficient (a side note, the actual execution numbers for the below SQL are nearly half of the original SQL above.)

select top 1 p.productid, p.name,
    (select top 1 p0.imagepath
     from productimage as p0
     where p.productid = p0.productid and
           p0.imageid = 1) as i1imagepath,
    (select top 1 p1.imagepath
     from productimage as p1
     where p.productid = p1.productid and
           p1.imageid = 2) as i2imagepath,
    (select top 1 p2.imagepath
     from productimage as p2
     where p.productid = p2.productid and
           p2.imageid = 3) as i3imagepath
from product

Solution

  • Use navigation properties. Product should have a ProductImages collection set up.

    public virtual ICollection<ProductImage> ProductImages { get; } = [];
    

    This is how you leverage EF to handle related entities rather than explicit Joins like in SQL. EF manages all of the table joining automatically and you just build Linq expressions through the relations. Explicit joins are an exception to the norm where you have unofficial or unconventional relationships that cannot be expressed with FKs.

    From there you just use projection:

    var product = _context.Products
        .Where(p => p.ProductId == productId)
        .Select(p => new 
        {
            p.ProductId,
            p.Name,
            ImagePath1 = p.ProductImages.FirstOrDefault(pi => pi.ImageId == 1).ImagePath,
            ImagePath2 = p.ProductImages.FirstOrDefault(pi => pi.ImageId == 2).ImagePath,
            ImagePath3 = p.ProductImages.FirstOrDefault(pi => pi.ImageId == 3).ImagePath
       }.Single();
    

    Normally with a Linq expression in memory you would need to handle the possibility of #null on the FirstOrDefault i.e. p.ProductImages.FirstOrDefault(pi => pi.ImageId == 1)?.ImagePath but when EF translates this down to SQL it will handle the possibility of null results (I.e. if there is no image #3 associated with the product) and it may complain if you explicitly add the short=hand null check. "?.". (Disclaimer: I haven't tried ?. with EF Core 9 yet to see if that behaviour has changed)

    Alternatively a cleaner structure would be to project the images to a set, even if you just want the top 3:

    var product = _context.Products
        .Where(p => p.ProductId == productId)
        .Select(p => new 
        {
            p.ProductId,
            p.Name,
            TopThreeImagePaths = p.ProductImages
               .OrderBy(pi => pi.ImageId)
               .Select(pi => pi.ImagePath)
               .Take(3)
               .ToList()
       }.Single();
    

    This would product an object for the product containing a collection of 0-3 image path strings.