IBM DataStage: Cannot drop duplicate records with the Lookup stage


I'm trying to match the records in an xls input file against the records in the database. If a record from the xls file matches a record in the database, it should not be inserted (to prevent duplication). If it does not match, it should be inserted, since that means the record does not exist yet. This is my connection and the details.

general lookup setting

The problem is that no matter how I set the Lookup Failure option, the Lookup stage feeds the records from the reference db that match csv_rec into the target db. This creates duplicates instead of preventing them. How can I insert a main-input record into the target only when it has no match in the reference db, and skip it when there is a match? I'm new to this, so I'm very confused.


Solution

  • Direct the stream output of the Lookup stage into a Copy stage with no output, so that matched records are discarded. Add a Reject link from the Lookup stage to the ODBC stage. This link carries the main-input records that weren't found in the reference (target) table, which are exactly the records you want to insert. Note that the Reject link is only populated when the Lookup Failure property of the Lookup stage is set to Reject.
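
DataStage jobs are built graphically, so there is no job code to show here. As a rough illustration of the routing logic only, the Python sketch below mimics what the Lookup stage does with a Reject link: matched rows go to the stream output (discarded, like the dead-end Copy stage) and unmatched rows go to the reject output (inserted). The function name `lookup_with_reject`, the `id` key column, and the sample rows are all hypothetical and are not taken from the original job.

```python
def lookup_with_reject(main_rows, reference_rows, key):
    """Split main_rows into (matched, rejected) by whether key exists in reference_rows.

    matched  ~ stream output of the Lookup stage
    rejected ~ Reject link (Lookup Failure = Reject)
    """
    # Build the set of keys already present in the reference (target) table.
    reference_keys = {row[key] for row in reference_rows}

    matched, rejected = [], []
    for row in main_rows:
        if row[key] in reference_keys:
            matched.append(row)    # already in the database: do not insert
        else:
            rejected.append(row)   # not found: this row should be inserted
    return matched, rejected


# Hypothetical sample data: rows from the xls file and rows already in the database.
xls_rows = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]
db_rows = [
    {"id": 1, "name": "alice"},
    {"id": 3, "name": "carol"},
]

matched, rejected = lookup_with_reject(xls_rows, db_rows, key="id")

# Only the rejected rows are sent on to the target table.
to_insert = rejected
print(to_insert)  # [{'id': 2, 'name': 'bob'}]
```

In this sketch, `matched` plays the role of the stream link feeding the Copy stage, and `to_insert` plays the role of the Reject link feeding the ODBC stage.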