Search code examples
sqlpostgresqlpostgresql-performance

Tuning a query in PostgreSQL


What is the best way to tune the below SQL for PostgreSQL which seems to be very costly? Will creating a temporary table gives optimal cost?

UPDATE table1
SET    id = qry.crmId
FROM   (
    SELECT b.id AS crmId, a.row
    FROM   table1 AS a INNER JOIN table2 AS b ON lower(a.email) = lower(b.email) AND b.id = (
                SELECT MIN(id)
                FROM   table2
                WHERE  email = b.email AND email IS NOT NULL AND
                created = (
                            SELECT MIN(created)
                            FROM   table2
                            WHERE  email = b.email
                            )
                LIMIT 1
                )
    WHERE a.email IS NOT NULL AND b.id IS NOT NULL AND a.id IS NULL
    ) AS qry
WHERE  table1.row = qry.row;

Solution

  • If you phrase the query like this:

    update table1 t11
        set id = (select id
                  from table1 t12
                  where t12.email = t11.email and
                        t12.id is not null
                  order by t12.created
                  limit 1
                 )
        where id is null and email is not null;
    

    Then it can take advantage of indexes on table1(id) and table1(email, created, id).

    You might want to add a check such as and exists (select 1 from table1 t12 where t12.email = t11.email and t12.email is not null and t12.id is not null).