What is the best way to tune the below SQL for PostgreSQL which seems to be very costly? Will creating a temporary table gives optimal cost?
UPDATE table1
SET id = qry.crmId
FROM (
SELECT b.id AS crmId, a.row
FROM table1 AS a INNER JOIN table2 AS b ON lower(a.email) = lower(b.email) AND b.id = (
SELECT MIN(id)
FROM table2
WHERE email = b.email AND email IS NOT NULL AND
created = (
SELECT MIN(created)
FROM table2
WHERE email = b.email
)
LIMIT 1
)
WHERE a.email IS NOT NULL AND b.id IS NOT NULL AND a.id IS NULL
) AS qry
WHERE table1.row = qry.row;
If you phrase the query like this:
update table1 t11
set id = (select id
from table1 t12
where t12.email = t11.email and
t12.id is not null
order by t12.created
limit 1
)
where id is null and email is not null;
Then it can take advantage of indexes on table1(id)
and table1(email, created, id)
.
You might want to add a check such as and exists (select 1 from table1 t12 where t12.email = t11.email and t12.email is not null and t12.id is not null)
.