I have a table result_slalom where data is populated via ETL Jobs of Pentaho. When ETL runs for the first time it creates version-1.
Now, if data is changed after new calculations it becomes version-2.
I need to make changes only in the Calculation Version -2 and no more than 2 versions should be there in the table result_slalom. ( Version-1 and Version-2 )
So the logic is : Check if data exists in table
o
When data exists and existing version is 1, then set the version of new data=2
--> Insert new dataset
o When data exists and existing version is 2, then set the version of new data=2
--> Update existing dataset
o When no data exists, then set version = 1
--> Insert new dataset
How do I make my Pentaho formula for this logic?
currently it is:
if([VersionInDB]=1;[Calculationversion];[VersionInDB]+1)
The dimension lookup/update
is a step which does exactly that.
In addition it has validity dates : at the time version2 is created, version1 receives a end-date of now and version2 receives a start-date of now. It make it easy to retrieve historical information with a where date between start-date and end-date
. Plus, you have a single button that writes the create/alter table
and create index
for you.
The other neat solution is to put a trigger on the tables.
Forget to reinvent the wheel in that direction. Although I usually like to inventing the wheel, it is one case on which redeveloping the logic will lead you in an infinite number of tests and bugs.