Search code examples
linqsql-server-2008linq-to-sqlc#-4.0linq-to-entities

Recommend a fitting LINQ provider to me (SQL server, complex queries)


I've been using LINQ to SQL & to entities for a while and am overall very happy with them. However i know of their limitations and one in particular is becoming a large issue for me. When you do a complex nested query in the form of

MyContext.SomeTable
.Select(item=>new{
    item.SomeProperty1,
    item.SomeProperty2,
    item.NavigationProperty1
        .Select(nav1=> new {// retrieve some properties}), // This triggers a single query as long as don't have more than one subquery
    item.NavigationProperty2
        .Select(nav2=> new {// retrieve some properties}) // This triggers one query PER ROW in the original query
});

The providers i have tested are LINQ TO SQL / LINQ TO entities (and even worse, devart LINQConnect that fares worse and generates 1 per row on the first navigation property)

What i get now that is generated(pseudocode):

select t1.a,t1.b,t2.c,t2.d from mytable as t1
join navproperty1table as t2

and 1 millions (if there is 1 million results in the first set) of queries like this: select t3.e,t3.f from navproperty2table as t3 where id = X (X changing on X query to next element returned by first query)

What i want:

select t1.a,t1.b,t2.c,t2.d,t3.e,t3.f from mytable as t1 
join navproperty1table as t2
join navproperty2table as t3

Now of course if there were 3 rows in the original table it wouldn't be an issue, but i have 10s of thousands to millions of rows in my tables "and" i need a much much much more complex query in a single select (i want to get a complex graph at once). Think 20 + tables with 3-6 levels of nesting accessing an additional 2-5 tables each.

My SQL server can perfectly cope with it, i don't care for the bandwidth either, it's on an instance linked by a gigabit connection, i can't get that data in deferred manner, i actually "use" all of it immediately so it's not just laziness. Right now for performance reasons i had to split the query in many small queries and join them manually on the LINQ to object size, which gives some really nasty code for whoever maintains it but was the only actual solution i had, so overall including all the small queries and final joining, I'm at over 600 lines of unsplitable code in a single method that is totally unmaintainable.

Are there actually "any" LINQ providers production ready today before i go and evaluated them all that work in such a mindset or am i better off coding and commercializing my own? (I'm very surprised that they don't all work that way actually, i can't see a single instance where you'd be better off with the foreach case and the ones i've tried that claim to get rid of n+1 with loadwith, don't get rid of it as they still do n+1 queries but just batch it in a single call, 1 round trip & n+1 queries isn't satisfying when 1 is 10 000 then 10 000 000 and then 10 000 000 000)

  • (note that I'm speculating on what exactly triggers this, but it isn't the question, no matter what triggers this "exactly" I'm sure to hit it in my current context)

PS: Note that I'm running .NET 4.0 full profile on a windows server 2008 or higher and on SQL server 2008 or higher, a provider that doesn't support anything else would be fine, i have zero requirements for migration, portability, lower .net versions, lower sql server support etc. Migrating to even more recent versions is an option if required. I also don't have any prerequisites for modeling or advanced features, the DB is already there, i only want to query tables, so something with no modeling / views / DML / stored procedure / functions support is fine, my one and only requirement is sensible SQL generation on complex queries and object graphs

EDIT: for clarification here is an actual example of the issue on a DB everyone can get, adventureworks

Querying employees for each contact

Contacts
.Select(cont=>new 
{
    cont.EmailAddress,
    cont.EmailPromotion,
    Employees = cont.Employees
        .Select(emp=>new
        {
            emp.Gender,
            emp.HireDate
        }).ToList()
}).ToList()

Generates

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]

FROM [Person].[Contact] AS [t0] LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID] ORDER BY [t0].[ContactID], [t1].[EmployeeID]

Now querying just vendors for each Contact Contacts .Select(cont=>new { cont.EmailAddress, cont.EmailPromotion, Vendors = cont.VendorContacts.Select(vend=>new { vend.ContactTypeID, vend.ModifiedDate }).ToList() }).ToList()

still ok:

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[ContactTypeID], [t1].[ModifiedDate], (
SELECT COUNT(*)
FROM [Purchasing].[VendorContact] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value]

FROM [Person].[Contact] AS [t0] LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t1] ON [t1].[ContactID] = [t0].[ContactID] ORDER BY [t0].[ContactID], [t1].[VendorID]

Now querying both at once (triggers X row query)

Contacts
.Select(cont=>new 
{
    cont.EmailAddress,
    cont.EmailPromotion,
    Employees = cont.Employees
        .Select(emp=>new
        {
            emp.Gender,
            emp.HireDate
        }).ToList(),
    Vendors = cont.VendorContacts.Select(vend=>new
    {
        vend.ContactTypeID,
        vend.ModifiedDate
    }).ToList()
}).ToList()

Generates the ugly and slow (not pasting it all for obvious reasons but you get the point):

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], (
SELECT COUNT(*)
FROM [HumanResources].[Employee] AS [t2]
WHERE [t2].[ContactID] = [t0].[ContactID]
) AS [value], [t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
ORDER BY [t0].[ContactID], [t1].[EmployeeID]
GO

-- Region Parameters
DECLARE @x1 Int = 1
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

 -- Region Parameters
DECLARE @x1 Int = 2
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 3
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 4
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 5
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 6
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 7
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 8
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 9
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

-- Region Parameters
DECLARE @x1 Int = 10
-- EndRegion
SELECT [t0].[ContactTypeID], [t0].[ModifiedDate]
FROM [Purchasing].[VendorContact] AS [t0]
WHERE [t0].[ContactID] = @x1
GO

What i expect / would like to see generated:

SELECT [t0].[EmailAddress], [t0].[EmailPromotion], [t1].[Gender], [t1].[HireDate], [t2].[ContactTypeID], [t2].[ModifiedDate] ,[t0].[ContactID]
FROM [Person].[Contact] AS [t0]
LEFT OUTER JOIN [HumanResources].[Employee] AS [t1] ON [t1].[ContactID] = [t0].[ContactID]
LEFT OUTER JOIN [Purchasing].[VendorContact] AS [t2] ON [t2].[ContactID] = [t0].[ContactID]
GO

Solution

  • I've found a provider that seems to handle my core problem (sane generation of SQL vs generating millions of statements for subqueries), not sure if it is a good fit yet as this depends on their answers.

    http://www.llblgen.com/tinyforum/Messages.aspx?ThreadID=20658&StartAtMessage=0&#116494

    Any other providers i should know of? If i managed to miss this one untill now there may be others and i'd be most happy to compare them. What i have now is

    Totally fails my requirement for no 1 query per row issue: - linq to SQL - linq to Entities - devart linqconnect

    Seems to work - llblgen

    Untested / need feedback - Telerik OpenAccess - NHibernate - Mindscape lightspeed

    any others i should know off?