I look for a solution to perform SSIS lookup in Pentaho Data Integration. I'll try to explain with an exemple : I have two tables A and B. Here , data in table A : 1 2 3 4 5 Here , data in table B: 3 4 5 6 7 After my process : All rows in A and not in B ==> will be insert to B All rows in B and not in A ==> will be deleted to A So , here my final Table B : 3 4 5 1 2 someone can help me please ?
There is indeed a step that does this, but it doesn't do it alone. It's the Merge rows(diff)
step and it has some requirements. In your case, A is the "compare" table and B is the "reference" table.
First of all, both inputs (rows from A and B in your case, Dev and Prod in mine) need to be sorted by a key value. In the step you specify the key fields to match on, and then the value fields to compare. The step adds a field to the output (by default called 'flagfield'). After comparing each row, this field is given one of four values: "new", "changed", "deleted", or "identical". Note in my example below I have explicit sort steps. That's because the sorting scheme of my database is not compatible with PDI's, and for this step to work, your data must be in PDI's sort order. You may not need these.
You can follow this with a Synchronize after merge
step to apply the identified changes. In this step you specify the flagfield and the values that correspond to insert, update, and delete. FYI these are specified on the "Advanced" tab, and they must be filled out for the step to work.
For a very small table like your example, I would favor just a truncate and full load with a Table output
step, but if your tables are large and the number of changes relatively small (<= ~25%) and replication is not available, this step is usually the way to go.