I have 2 tables, around 300M rows each
CREATE TABLE AllIDs
(
ID int NULL,
isUsed BOOLEAN NULL
)
CREATE TABLE UsedIDs
(
ID int NULL
)
I want to find all the IDs that are not used, that is, all the ID rows in AllIDs that do not appear in UsedIDs
I have tried the below and a few similar variations but the query runs indefinitely (16+ hours) even with a Medium warehouse.
UPDATE AllIDs t
SET t.IsUsed = 0
from AllIDs
where not exists (select 1 FROM UsedIDs f
WHERE t.ID = f.ID )
Any ideas how this can be done efficiently?
I believe the from
clause in the update statement is unnecessary and results in a a more complex query plan, including an unnecessary Cartesian join.
The following may be more performant:
UPDATE allids t
SET t.isused = 0
WHERE NOT EXISTS (
SELECT 1
FROM usedids f
WHERE t.id = f.id
)