Search code examples
c#postgresqllinqef-core-6.0

LINQ for "Outer Apply" / "Left Outer Join" kinda Query for PostgreSQL


I have a problem with my linq query, it performs really slow. And when I debug and see the translated query, I see what the problem is. My LinQ isn't really make the query I want.

Here's the LinQ I've got now:

from doc in context.document
from lasthistory in context.documenthistory
    .Where(x => x.documentid == doc.id)
    .OrderByDescending(x => x.actiondatetime)
    .Take(1)
    .DefaultIfEmpty()
where lasthistory.actiondatetime >= periodFrom
    && lasthistory.actiondatetime < periodTo.AddDays(1)
select new
{
    id = doc.id,
    lastactionby = lasthistory.actionby,
    lastactiondatetime = lasthistory.actiondatetime
}

and here's the translated query from this linq

SELECT d.id, t0.actionby AS lastactionby, t0.actiondatetime AS lastactiondatetime
FROM dbo.document AS d
LEFT JOIN (
    SELECT t.actionby, t.actiondatetime, t.documentid
    FROM (
        SELECT d0.actionby, d0.actiondatetime, d0.documentid, ROW_NUMBER() OVER(PARTITION BY d0.documentid ORDER BY d0.actiondatetime DESC) AS row
        FROM dbo.documenthistory AS d0
    ) AS t
    WHERE t.row <= 1
) AS t0 ON d.id = t0.documentid
WHERE (t0.actiondatetime >= @__periodFrom_1) AND (t0.actiondatetime < @__AddDays_2)

It tries to select all data in documenthistory first, that's the reason my query performs really slow.

And I already update the LinQ now to be like this:

from doc in context.document
let lasthistory = context.documenthistory
    .Where(x => x.documentid == doc.id)
    .OrderByDescending(x => x.actiondatetime)
    .FirstOrDefault()
where lasthistory.actiondatetime >= periodFrom
    && lasthistory.actiondatetime < periodTo.AddDays(1)
select new
{
    id = doc.id,
    lastactionby = lasthistory.actionby,
    lastactiondatetime = lasthistory.actiondatetime
}

It performs better than the first one, but the translated query kinda "ugly" with so many subqueries

SELECT d.id, 
    (
        SELECT d2.actionby 
        FROM dbo.documenthistory AS d2 
        WHERE d2.documentid = d.id 
        ORDER BY d2.actiondatetime DESC 
        LIMIT 1
    ) AS lastactionby, 
    (
        SELECT d3.actiondatetime 
        FROM dbo.documenthistory AS d3 
        WHERE d3.documentid = d.id 
        ORDER BY d3.actiondatetime DESC 
        LIMIT 1
    ) AS lastactiondatetime
FROM dbo.document AS d
WHERE ((SELECT d0.actiondatetime FROM dbo.documenthistory AS d0 WHERE d0.documentid = d.id ORDER BY d0.actiondatetime DESC LIMIT 1) >= @__periodFrom_1)
    AND ((SELECT d1.actiondatetime FROM dbo.documenthistory AS d1 WHERE d1.documentid = d.id ORDER BY d1.actiondatetime DESC LIMIT 1) < @__AddDays_2)

the query I want is like this:

SELECT d.id, t.actionby AS lastactionby, t.actiondatetime AS lastactiondatetime
FROM dbo.document AS d
LEFT JOIN LATERAL (
    SELECT d0.actionby, d0.actiondatetime
    FROM dbo.documenthistory AS d0
    WHERE d0.documentid = d.id
    ORDER BY d0.actiondatetime DESC
    FETCH FIRST 1 ROW ONLY
) t ON true
WHERE (t.actiondatetime >= @__periodFrom_1) AND (t.actiondatetime < @__AddDays_2)

Is there a way to do this in Linq? Or maybe Linq with result similar to this query?

Thanks in advance!


Solution

  • One workaround which come to mind is to use Distinct. In this case, it is a little bit difficult to build INNER JOIN with Window function. DefaultIfEmpty() removed, according to filter you need CROSS APPLY

    var query =
        from doc in context.document
        from lasthistory in context.documenthistory
            .Where(x => x.documentid == doc.id)
            .OrderByDescending(x => x.actiondatetime)
            .Take(1)
            .Select(x => new { x.actionby, x.actiondatetime })
            .Distinct()
        where lasthistory.actiondatetime >= periodFrom
            && lasthistory.actiondatetime < periodTo.AddDays(1)
        select new
        {
            id = doc.id,
            lastactionby = lasthistory.actionby,
            lastactiondatetime = lasthistory.actiondatetime
        };
    

    From other side you can limit scanned range of records

    var documents = context.document
        .Where(doc = context.documenthistory.Any(h => h.documentid == doc.id 
            && h.actiondatetime >= periodFrom
            && h.actiondatetime < periodTo.AddDays(1))
        );
    
    var query =
        from doc in documents
        from lasthistory in context.documenthistory
            .Where(x => x.documentid == doc.id)
            .OrderByDescending(x => x.actiondatetime)
            .Take(1)
        where lasthistory.actiondatetime >= periodFrom
            && lasthistory.actiondatetime < periodTo.AddDays(1)
        select new
        {
            id = doc.id,
            lastactionby = lasthistory.actionby,
            lastactiondatetime = lasthistory.actiondatetime
        };