I am running a query with two CTEs that join, followed by a final query that uses them. Yesterday this query ran for several hours and never completed, whereas my queries usually finish within a minute.
This is the query I am running:
WITH cte_second_try AS (
SELECT primary_key AS timing,
MIN(timedate) AS second_try
FROM table t1
WHERE timedate > (SELECT MIN(timedate) FROM table t2 WHERE t1.primary_key = t2.primary_key GROUP BY primary_key)
AND criteria1='x' AND timedate BETWEEN 'x' AND 'x' GROUP BY 1),
cte_timing AS (
SELECT t3.primary_key, MIN(timedate) AS first_try, MIN(second_try) AS second_try
FROM table t3
LEFT JOIN cte_second_try ON t3.primary_key = cte_second_try.timing
WHERE criteria1='x' AND timedate BETWEEN 'x' AND 'x' GROUP BY 1)
SELECT id, t4.primary_key, var2, var3, var4, var5, -- etc.
CASE
WHEN var2 <> 'string'
THEN 0
WHEN cte_timing.first_try = t4.timedate
THEN 1
WHEN cte_timing.second_try = t4.timedate
THEN 2
ELSE 3 END
FROM table t4
LEFT JOIN cte_timing ON t4.primary_key = cte_timing.primary_key
WHERE criteria1='x' AND timedate BETWEEN 'x' AND 'x'
ORDER BY timedate;
Any suggestions as to what is causing this bottleneck?
Furthermore, this is my first time joining CTEs, so there might be something wrong with how I have joined them. The general idea is that the first CTE finds the second attempt at a task, the second CTE finds the first attempt, and the final query then references both through the second CTE.
I am using PostgreSQL.
I now have the answer to this question.
I have just heard back from a tech guy, and the problem was indexing.
By adding an index, the estimated cost of the query went from cost=151491121951.08 to cost=820386.91.
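For anyone hitting the same problem, here is a minimal sketch of the kind of index that can help; the exact definition is an assumption on my part, based on the correlated MIN(timedate)-per-primary_key lookup in the query above ("table" stands in for the real table name):

-- Assumed index; the real columns depend on your schema.
-- A composite index on (primary_key, timedate) lets the planner answer the
-- correlated MIN(timedate) subquery with an index scan per key instead of
-- a sequential scan for every outer row.
CREATE INDEX idx_table_pk_timedate ON "table" (primary_key, timedate);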
You can do:
EXPLAIN SELECT primary_key AS timing,
MIN(timedate) AS second_try
FROM table t1
WHERE timedate > (SELECT MIN(timedate) FROM table t2 WHERE t1.primary_key = t2.primary_key GROUP BY primary_key)
AND criteria1='x' AND timedate BETWEEN 'x' AND 'x' GROUP BY 1
to get the planner's estimated cost of the query using EXPLAIN.
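As a quick sanity check (again just a sketch; the index from above and the key value are placeholders), you can also run EXPLAIN on the correlated lookup on its own and confirm the plan uses the index. The two numbers in cost=startup..total are the planner's estimated start-up and total cost in its own units; EXPLAIN ANALYZE additionally runs the query and reports actual timings.

-- Placeholder key value; substitute a real primary_key.
-- With the composite index in place this should show an Index Scan
-- (or Index Only Scan) instead of a Seq Scan, with a much lower cost.
EXPLAIN
SELECT MIN(timedate)
FROM "table" t2
WHERE t2.primary_key = 42;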