Search code examples
sqlpostgresqlcommon-table-expressionquery-performance

Joining more than one common table expression (CTE) is taking too long in SQL (query will not end) is there something wrong with my code?


I am running a query with two CTEs that join and then a query that uses the joined CTEs. However this query was running for several hours yesterday and never completed. Queries where I am usually complete within the minute.

This is the query I am runnning:

WITH cte_second_try AS (
SELECT primary_key as timing,
MIN(timedate) AS second_try,
FROM table t1
WHERE timedate > (SELECT MIN(timedate) FROM table t2 WHERE t1.primary_key = t2.primary_key group by primary_key)
AND criteria1='x' AND timedate BETWEEN 'x' AND 'x' GROUP BY 1),

cte_timing AS (
SELECT MIN(timedate) AS first_try, MIN(second_try) AS second_try,
FROM table t3
LEFT JOIN cte_second_try ON t3.primary_key = cte_second_try.timing
WHERE criteria1='x' AND timedate BETWEEN 'x' AND 'x' group by 1)

SELECT id, t4.primary_key, var2, var3, var4, var5 etc
 CASE
 WHEN var2 <> 'string'
 THEN 0
 WHEN cte_timing.first_try = t4.timedate
 THEN 1
 WHEN cte_timing.second_try = t4.timedate
 THEN 2
 ELSE 3 END
FROM table t4
 LEFT JOIN cte_timing ON t4.primary_key = cte_timing.primary_key
WHERE criteria1='x' AND timedate BETWEEN 'x' AND 'x'
ORDER BY timedate;

Any suggestions as to what is causing this bottleneck?

Furthermore this is my first time joining CTEs so there might be something wrong in how I have joined the CTEs. The general idea is that I find the second attempt at a task in the first CTE and the first attempt at a task in the second CTE and then reference both via the second in the query at the end.

I am using PostgreSQL


Solution

  • I have the answer to this question.

    I have just heard back from a tech guy and it is because of indexing.

    By adding an index the cost of the query went from:

    cost=151491121951.08 to cost=820386.91

    You can do:

    EXPLAIN SELECT primary_key as timing,
    MIN(timedate) AS second_try,
    FROM table t1
    WHERE timedate > (SELECT MIN(timedate) FROM table t2 WHERE t1.primary_key = t2.primary_key group by primary_key)
    AND criteria1='x' AND timedate BETWEEN 'x' AND 'x' GROUP BY 1
    

    to get the cost of the query using EXPLAIN