Search code examples
postgresqlselectviewpsql

Postgresql Query vs View execution speed issue


I have a few tables that I join to produce a result data set I need (as shown below).

Query:

    select distinct on (e."Col1", e."Col2")
        s."SrcId",
        s."CreatedTime",
        s."ColM",
        s."ColN",
        m."ColX" as "m_ColX",
        m."ColY" as "m_ColY",
        n."ColZ1" as "n.ColZ1",
        n."ColZ2" as "n.ColZ2",
        e."Col1",
        e."Col2"
    from schema1."Table1" s
    left join schema1."Table2" e on s."SrcId" = e.sid and date_trunc('second',s."CreatedTime")=date_trunc('second',e."ReportedTime")
    left join schema1."Table3" m on m."Sid"=s."SrcId" and m."IsActive"=TRUE
    left join schema1."Table4" n on n."Sid"=s."SrcId" and n."IsActive"=TRUE;

When I execute the query directly with a where clause (shown below), the results are almost instantaneous.

select distinct on (e."Col1", e."Col2")
        s."SrcId",
        s."CreatedTime",
        s."ColM",
        s."ColN",
        m."ColX" as "m_ColX",
        m."ColY" as "m_ColY",
        n."ColZ1" as "n.ColZ1",
        n."ColZ2" as "n.ColZ2",
        e."Col1",
        e."Col2"
    from schema1."Table1" s
    left join schema1."Table2" e on s."SrcId" = e.sid and date_trunc('second',s."CreatedTime")=date_trunc('second',e."ReportedTime")
    left join schema1."Table3" m on m."Sid"=s."SrcId" and m."IsActive"=TRUE
    left join schema1."Table4" n on n."Sid"=s."SrcId" and n."IsActive"=TRUE
    where s."SrcId"=10 and s."CreatedTime" between '2024-10-10T00:00:00.000Z' and '2024-10-10T10:10:10.000Z'

But if I use the same query in a view, the results take way longer. Assume the view is called view1 for simplicity and I invoke it as below:

select * from schema1.view1 where "SrcId"=10 and "CreatedTime" between '2024-10-10T00:00:00.000Z' and '2024-10-10T10:10:10.000Z'

Can someone point out why this is the case and if there is anything I can do to improve/optimize calling the view so it results in a similar performance?

The first thought that comes to mind is the view might be taking time to correlate the where clause fields with the fields in the actual tables to figure out which ones to map them to. Could that be the culprit (or worse could that be skewing my results as well?) If so, what is a more efficient way to pass the where clause?


Solution

  • The queries are not equivalent:

    • in the query that runs fast for you, the WHERE condition is executed before the DISTINCT, and it can probably filter rows efficiently

    • in the slow query, the WHERE condition is applied after the DISTINCT, which prevents such filtering and may give you a different query result

    My recommendation is to lay off the DISTINCT in the view definition. If you really need it, write it into the query that uses the view.