sql oracle-database duplicates oracle-data-integrator

Delete Duplicate in I$ table in ODI

We have a load plan in ODI. We get a repeating error in some on our scenarios due to duplicate records in the I% table. What we do is manually run the script every time the load plan fails.

DELETE FROM adw12_dw. I$_1558911580_4
      WHERE     (EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID) IN
                    (  SELECT EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                         FROM adw12_dw . I$_1558911580_4
                     GROUP BY EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                       HAVING COUNT (1) > 1)
            AND ROWID NOT IN
                    (  SELECT MIN (ROWID)
                         FROM adw12_dw . I$_1558911580_4
                     GROUP BY EFFECTIVE_FROM_DT,
                              DATASOURCE_NUM_ID,
                              INTEGRATION_ID
                       HAVING COUNT (1) > 1)
                
                commit;

Is there a way to automate the deletion of duplicate records in the Integration table?

Solution

If you have duplicates in the source, best would be to handle that in the logic of the mapping. What could work is to add an expression component to add a row_rank column using an analytical function to rank the duplicates : row_number() over (partition by EFFECTIVE_FROM_DT, DATASOURCE_NUM_ID, INTEGRATION_ID order by ROWID). You can then add a filter with the condition row_rank = 1.

If you prefer to do a delete after inserting, you can edit the IKM and add the delete step before loading the target table.

You could also divide the integration in 3 different steps :

a mapping that would load a staging table instead of your final target table, with the duplicate
an ODI procedure that would perform the delete to remove the duplicates in the staging table
a mapping that would load the data from the staging area to the target table