This has probably been asked before, but googling for keywords like "IN" doesn't work very well.
This is my query:
UPDATE tblCustomer SET type = 2
WHERE idcustomer
IN (SELECT fidcustomer1
FROM tblorder
UNION
SELECT fidcustomer2
FROM tblorder
)
To break it down: I want to set the type (just an int) of all customers to 2 for all customers that appear in the order-table, in one of either column.
On my test data, none of these tables contain more than a few hundred rows, yet the query runs for many minutes (even without the UNION, that doesn't seem to make much of a difference), apparently re-doing the inner query once per row in customer. I could obviously rewrite it into a single SELECT DISTINCT(id), followed by a few hundred single-row updates and do the logic in whatever language I use for my ODBC-access, but that's just a hack.
How can I rewrite this properly?
Addendum: The table I want to update contains a lot of relatively large BYTEA blobs, a few MB per row. They are set to Storage External or Extended, but I wonder if that could make the sequential scans slow. All updates seem to take a long time, not just this one.
-------------------------------
-- Use two EXISTS:
-------------------------------
UPDATE tblCustomer tc
SET type = 2
WHERE EXISTS (
SELECT *
FROM tblorder ex
WHERE ex.fidcustomer1 = tc.idcustomer
)
OR EXISTS (
SELECT *
FROM tblorder ex
WHERE ex.fidcustomer2 = tc.idcustomer
);
-------------------------------
-- or combine the two EXISTS::
-------------------------------
UPDATE tblCustomer tc
SET type = 2
WHERE EXISTS (
SELECT *
FROM tblorder ex
WHERE ex.fidcustomer1 = tc.idcustomer
OR ex.fidcustomer2 = tc.idcustomer
);
My gut feeling is that the first version (with two separate exists) will perform better, because the executor could short-circuit if one of the existss would yield True
. That would avoid the removal-of-duplicates phase (and probably sorting), which is inherent to the UNION construct.