I have this code, and it is taking 8 to 9 hours for 450k rows of data to complete. How to improve performance in that case? tab_a
has 10M of data.
Declare
cursor cur_dup is
Select col_a,col_b,count(1) from tab_a group by col_a,col_b having count(1) > 1;
Begin
for cur_dup_rec in cur_dup loop
update tab_a
set qty = 0
where col_a = cur_dup_rec.col_a and col_b = cur_dup_rec.col_b
and trunc(u_created_date) <> (Select max(trunc(u_created_date)) from tab_a
where col_a = cur_dup_rec.col_a and col_b = cur_dup_rec.col_b);
update tab_a
set (u_modified_date) = trunc(sysdate)
where col_a = cur_dup_rec.col_a and col_b = cur_dup_rec.col_b;
for cur_dup2_rec in (Select distinct item,customer,location,requested_date
from tab_a
where col_a = cur_dup_rec.col_a and col_b = cur_dup_rec.col_b) loop
update tab_a
set (u_modified_date) = trunc(sysdate)
where item = cur_dup2_rec.item and customer = cur_dup2_rec.customer and location = cur_dup2_rec.location
and to_char(requested_date,'DD-MON-YYYY') = cur_dup2_rec.requested_date;
end loop;
end loop;
End;
Stop using loops and use a single SQL statement.
For the first two UPDATE
s you can use something like (untested because we do not have the CREATE TABLE
or INSERT
statements for your sample data):
MERGE INTO tab_a dst
USING (
SELECT COUNT(1) OVER (PARTITION BY col_a, colb) AS cnt,
MAX(u_created_date) OVER (PARTITION BY col_a, colb) AS max_u_created_date
FROM tab_a
) src
ON (dst.ROWID = src.ROWID AND src.cnt > 1)
WHEN MATCHED THEN
UPDATE
SET u_modified_date = TRUNC(SYSDATE),
qty = CASE
WHEN TRUNC(dst.u_modified_date) = TRUNC(src.max_u_created_date)
THEN dst.qty
ELSE 0
END;
I'm unsure what you are trying to achieve with the final UPDATE
in the nested loop but you almost certainly don't want to use nested loops and want to convert it to an SQL statement.