Trying to figure out the most efficient way write a query to keep a table up to date.
For example I have 3 tables and let's call them TableA
, TableB
and TableC
TableA
is the one needs to be kept up to date with unique codes. All 3 tables have a column named Code
.
The query must look inside TableB
and TableC
to find unique codes that do not exist in TableA
and insert them into TableA
.
Note that both TableB
and TableC
tables have ~55k records in each and TableA
~1k
. Number of records in tables rarely changes. Max 1k records a year in total.
First option I came up with the following but didn't like the execution plan:
INSERT INTO TableA (Code)
SELECT DISTINCT Code FROM TableB WHERE Code NOT IN(SELECT Code FROM TableA)
UNION
SELECT DISTINCT Code FROM TableC WHERE Code NOT IN(SELECT Code FROM TableA)
Also looked into this link which shows LEFT OUTER JOIN
option. But couldn't figure out to join 3 tables to get codes that is missing in TableA
but exist in either TableB
Or
TableC
.
Here is one option that UNION ALL
both tables, then filters out with NOT EXISTS
and a correlated subquery.
INSERT INTO TableA (code)
SELECT DISTINCT code
FROM (SELECT code FROM TableB UNION ALL SELECT code FROM TableC) t
WHERE NOT EXISTS (SELECT 1 FROM TableA ta WHERE ta.code = t.code)
You do want an index on TableA(code)
to speed up the subquery.
I am wondering whether deduplicating first, using UNION
would offer better performance:
INSERT INTO TableA (Code)
SELECT Code
FROM (SELECT code FROM TableB UNION SELECT code FROM TableC) t
WHERE NOT EXISTS (SELECT 1 FROM TableA ta WHERE ta.code = t.code)