We have a load plan in ODI. We get a repeating error in some of our scenarios due to duplicate records in the I$ table. Our current workaround is to manually run the following script every time the load plan fails.
DELETE FROM adw12_dw.I$_1558911580_4
WHERE (EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID) IN
      (SELECT EFFECTIVE_FROM_DT,
              DATASOURCE_NUM_ID,
              INTEGRATION_ID
         FROM adw12_dw.I$_1558911580_4
        GROUP BY EFFECTIVE_FROM_DT,
                 DATASOURCE_NUM_ID,
                 INTEGRATION_ID
       HAVING COUNT(1) > 1)
  AND ROWID NOT IN
      (SELECT MIN(ROWID)
         FROM adw12_dw.I$_1558911580_4
        GROUP BY EFFECTIVE_FROM_DT,
                 DATASOURCE_NUM_ID,
                 INTEGRATION_ID
       HAVING COUNT(1) > 1);

COMMIT;
Is there a way to automate the deletion of duplicate records in the Integration table?
If you have duplicates in the source, the best approach would be to handle them in the logic of the mapping.
What could work is to add an Expression component with a ROW_RANK column that uses an analytic function to rank the duplicates: row_number() over (partition by EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID order by ROWID).
You can then add a Filter component with the condition ROW_RANK = 1, as illustrated in the sketch below.
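For illustration only, here is a rough sketch of the SQL that such an Expression plus Filter combination would amount to. I am using the I$ table from your question as a stand-in for the source datastore; the actual statement ODI generates depends on your KM and mapping design.

-- Hedged sketch: ROW_NUMBER() ranks rows within each natural-key group,
-- and the outer filter keeps only the first row of every group.
SELECT *
  FROM (SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY EFFECTIVE_FROM_DT,
                                               DATASOURCE_NUM_ID,
                                               INTEGRATION_ID
                                  ORDER BY ROWID) AS ROW_RANK
          FROM adw12_dw.I$_1558911580_4 s)
 WHERE ROW_RANK = 1;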
If you prefer to delete the duplicates after they have been inserted, you can edit the IKM and add a delete step before the step that loads the target table.
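A minimal sketch of what such a step could contain is shown below, assuming an Oracle target. The table name is hard-coded here for readability; in a real IKM step you would normally resolve the I$ table name through the substitution API (for example odiRef.getTable("L", "INT_NAME", "W")) rather than typing it, which is an assumption about how you would wire it into your KM.

-- Hedged sketch of a "remove duplicates from flow table" step for the IKM.
-- Keeps one arbitrary row (the one with the smallest ROWID) per natural key
-- and deletes the rest; in a KM step the table name would come from odiRef.
DELETE FROM adw12_dw.I$_1558911580_4
 WHERE ROWID NOT IN
       (SELECT MIN(ROWID)
          FROM adw12_dw.I$_1558911580_4
         GROUP BY EFFECTIVE_FROM_DT,
                  DATASOURCE_NUM_ID,
                  INTEGRATION_ID);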
You could also divide the integration into three different steps: