Search code examples
etlpentahodata-warehousepentaho-spoonpentaho-data-integration

Inserting new records based on the date in pentaho


i wanted to asks if this is possible in pentaho

I have a data looks like this yesterday 26/02/2021 , START_DATE TAKEN FROM SYSDATE

ID|CODE|NAME|MARKS|START_DATE|END_DATE
1|A01|KUMAR|90|26/02/2021|NULL
2|A02|RAJESH|80|26/02/2021|NULL
3|A03|SINGH|70|26/02/2021|NULL

So let's say that SINGH with CODE A03, he get a better marks(90) on 27/02/2021, so it will look like this

ID|CODE|NAME|MARKS|START_DATE|END_DATE
1|A01|KUMAR|90|26/02/2021|NULL
2|A02|RAJESH|80|26/02/2021|NULL
3|A03|SINGH|70|26/02/2021|27/02/2021
4|A03|SINGH|90|27/02/2021|NULL

As SINGH has different values in his MARKS, the old record of SINGH will meet it's END_DATE and SINGH new record with his latest MARKS will has START_DATE on the day the record being inserted and his END_DATE will still be a null.So the job will see if there are any differences in it's values, then if there are a some differences,it means the row has an 'updated' values, and the latest ones from the source that will be insert. Can I implement this in pentaho?

My Pentaho Jobs


Solution

  • You can do this easily using 'Update' and 'Insert/Update' step in only one transformation in Pentaho data integration.

    enter image description here

    You can see the sample from Here