I have a requirement to compare with one table and update another table if there are mismatches. Table LEGACY_CUST_INFO has correct CUST_PRIVILEGE_NUMBER. So, I need to findout the wrong CUST_PRIVILEGE_NUMBER in second table CUST_INFO and update with the right CUST_PRIVILEGE_NUMBER from first table LEGACY_CUST_INFO. ROW_ID in both the tables match are an exact match. Both tables have approximately 20 million records.
Below are the two table structures
Table 1 Name : LEGACY_CUST_INFO
ROW_ID CUST_PRIVILEGE_NUMBER
Table 2 Name : CUST_INFO
ROW_ID CUST_PRIVILEGE_NUMBER
I am getting performance issues when I check with the below statements. Is there a better way to create a table with mismatch records and update the column CUST_PRIVILEGE_NUMBER in CUST_INFO table ? Below is my code and logic.
CREATE TABLE EXACT_RECORDS
AS
(SELECT A.* FROM LEGACY_CUST_INFO A, CUST_INFO B WHERE A.ROW_ID = B.ROW_ID AND A.CUST_PRIVILEGE_NUMBER = B.CUST_PRIVILEGE_NUMBER);
DELETE FROM LEGACY_CUST_INFO WHERE ROW_ID IN (SELECT ROW_ID FROM EXACT_RECORDS);
UPDATE CUST_INFO B SET B.CUST_PRIVILEGE_NUMBER = (SELECT A.CUST_PRIVILEGE_NUMBER FROM LEGACY_CUST_INFO A WHERE A.ROW_ID = B.ROW_ID);
I know this logic is very basic. Could you help in better way so that 20 millions records are updated in less than 30 minutes ?
I suggest to using a merge statement containing a query, returning prepared not matched records for update.
merge into cust_info dst
using (
select l.row_id, l.cust_privilege_number
from legacy_cust_info lci
inner join cust_info ci on ci.row_id = lci.row_id
where ci.cust_privilege_number != lci.cust_privilege_number
) src
on (dst.row_id = src.row_id)
when matched then
update set dst.cust_privilege_number = src.cust_privilege_number;
Also consider create indexes for your tables:
CREATE INDEX legacy_cust_info#rid#cpn#idx ON LEGACY_CUST_INFO (row_id, cust_privilege_number);
CREATE INDEX cust_info#row_id#idx ON CUST_INFO (row_id);
UPD. You can estimate the time by running the nested select without merging its contents into the destination table:
select count(1) from (
select l.row_id, l.cust_privilege_number
from legacy_cust_info lci
inner join cust_info ci on ci.row_id = lci.row_id
where ci.cust_privilege_number != lci.cust_privilege_number
)
UPD2. To considering NULLs in ci.cust_privilege_number
simply add alternative condition into the query:
where (
ci.cust_privilege_number != lci.cust_privilege_number
or (
ci.cust_privilege_number is null
and lci.cust_privilege_number is not null
)
)