I have 2 tables where I copy file name from one table to another in an update operation. Using INNER JOIN makes the query run in 22 seconds when there are just ~4000 rows. Using a WHERE clause allows it to run it in about 200 milliseconds. How and why is this happening, does the INNER JOIN result in additional looping?
Example 1 using INNER JOIN - Takes 22 seconds when table a has about 4k records.
UPDATE table_a SET file_name = tmp.file_name FROM
(
SELECT b.customer_id, b.file_name, b.file_id FROM table_b AS b WHERE b.status = 'A'
) tmp
INNER JOIN table_a AS a
ON tmp.customer_id=a.customer_id AND tmp.file_id=a.file_id;
Example 2 using WHERE runs in about 200 ms.
UPDATE table_a AS a SET file_name = tmp.file_name FROM
(
SELECT b.customer_id, b.file_name, b.file_id FROM table_b AS b WHERE b.status = 'A'
) tmp
WHERE tmp.customer_id=a.customer_id AND tmp.file_id=a.file_id;
The queries are doing totally different things. The first is updating every row in table_a
with the expression. I am guessing that there are even multiple updates on the same row.
The two table_a
s in the first version are two different references to the table. The effect is a cross join
because you have no conditions combining them.
The second method is the correct syntax for what you want to do in Postgres.