Search code examples
c#joinlinq-to-sqllinq-group

LINQ to SQL join when there aren't results


Given the following database structure alt text http://dl.dropbox.com/u/26791/tables.png

I'm trying to write a LINQ query that will return images grouped by tags it's associated with. So far I've got this:

var images = from img in db.Images
    join imgTags in db.ImageTags on img.idImage equals imgTags.idImage
    join t in db.Tags on imgTags.idTag equals t.idTag
    where img.OCRData.Contains(searchText.Text)
    group img by new { t.TagName } into aGroup
    select new
    {
        GroupName = aGroup.Key.TagName,
        Items = from x in aGroup
        select new ImageFragment()
        {
             ImageID = x.idImage,
             ScanDate = x.ScanTime
        }
    };

Which works great. However, I also want to return Images that do not have any tags associated with them in a group of "(Untagged)" or something. I can't wrap my head around how I would do this without inserting a default tag for every image and that seems like generally not a very good solution.


Solution

  • It's a little tricky, but you can do it in one big query if you have the ability to instantiate new ImageTag and Tag instances for linq to work with. Essentially, when you're doing an outer join, you have to use the into keyword with the DefaultIfEmpty(...) method to deal with the "outer join gaps" (e.g., when the right side of the joined key is null in a typical SQL left outer join).

    var images = from img in db.Images
         join imgTags in db.ImageTags on img.idImage equals imgTags.idImage
         into outerImageRef 
         from outerIR in outerImageRef.DefaultIfEmpty(new ImageTag() { idImage = img.idImage, idTag = -1 })
         join t in db.Tags on imgTags.idTag equals t.idTag
         into outerRefTags 
         from outerRT in outerRefTags.DefaultIfEmpty(new Tag(){ idTag=-1, TagName ="untagged"})
         group img by outerRT.TagName into aGroup
         select new {
             GroupName = aGroup.Key,
             Items = from x in aGroup
                 select new ImageFragment() {
                     ImageID = x.idImage,
                     ScanDate = x.ScanTime
                 }
         };
    

    Hopefully the above compiles since I don't have your exact environment, I built my solution using my own data types and then converted it to your question's description. Basically the key parts are the extra into and DefaultIfEmpty lines that essentially help add the extra "rows" into the massively joined table that's in memory if you're thinking about it in the traditional sql sense.

    However, there's a more readable solution that doesn't require the in memory instantiation of linq entities (you'll have to convert this one yourself to your environment):

    //this first query will return a collection of anonymous types with TagName and ImageId,
    // essentially a relation from joining your ImageTags x-ref table and Tags so that
    // each row is the tag and image id (as Robert Harvey mentioned in his comment to your Q)
    var tagNamesWithImageIds = from tag in Tags
           join refer in ImageTags on tag.IdTag equals refer.IdTag
           select new {
               TagName = tag.Name,
               ImageId = refer.IdImage
           };
    //Now we can get your solution by outer joining the images to the above relation
    // and filling in the "outer join gaps" with the anonymous type again of "untagged"
    // and then joining that with the Images table one last time to get your grouping and projection.
    var images = from img in Images
         join t in tagNamesWithImageIds on img.IdImage equals t.ImageId
         into outerJoin
         from o in outerJoin.DefaultIfEmpty(new { TagName = "untagged", ImageId = img.IdImage })
         join img2 in Images on o.ImageId equals img2.IdImage
         group img2 by o.TagName into aGroup
         select new {
             TagName = aGroup.Key,
             Images = aGroup.Select(i => i.Data).ToList() //you'll definitely need to replace this with your code's logic. I just had a much simpler data type in my workspace.
         };
    

    Hope that makes sense. Of course, you can always just set your application to tag everything by default w/ "untagged" or do some much simpler LINQ queries to create a list of image id's that are not present in your ImageTag table, and then union or something.