I am using Pentaho Data Integration to do an SCD Type 1 transformation. I use the Combination lookup/update step to generate the surrogate key value on insert. The commit size is 100000 and the cache size is 99999. My source table has 19763 rows, but every time I run the job to load data into the destination dimension table, the Combination lookup/update step processes only 10000 of the 19763 rows.
How can I get it to process all 19763 records in the source table?
Finally, I found the answer, and it's simple. Go to Edit -> Settings -> Miscellaneous -> "Nr of rows in rowset" and change it from 10000 to at least the number of records coming from the source. For me, the value was set to 10000, so only 10000 records were being written to my destination dimension table. I changed it to a million and now all 19763 records arrive in my destination table.
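If you would rather change the same setting outside the Spoon dialog, here is a rough Python sketch. It assumes the transformation is saved as a .ktr XML file and that the "Nr of rows in rowset" value is stored in a `size_rowset` element under `info`; please check that against your own file first. The sample XML, the transformation name, and the `set_rowset_size` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a .ktr file; a real file contains many more elements.
# Assumption: the rowset size is stored at <transformation>/<info>/<size_rowset>.
SAMPLE_KTR = """<transformation>
  <info>
    <name>load_dimension</name>
    <size_rowset>10000</size_rowset>
  </info>
</transformation>"""


def set_rowset_size(ktr_xml: str, new_size: int) -> str:
    """Return the transformation XML with size_rowset set to new_size."""
    root = ET.fromstring(ktr_xml)
    node = root.find("./info/size_rowset")
    if node is None:
        # Fail loudly instead of guessing where the setting should go.
        raise ValueError("no <info>/<size_rowset> element found")
    node.text = str(new_size)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    updated = set_rowset_size(SAMPLE_KTR, 1000000)
    # Read the value back to confirm the change took effect.
    print(ET.fromstring(updated).findtext("./info/size_rowset"))
```

For a real file you would read the .ktr contents into a string, pass it through the helper, and write the result to a copy so the original stays untouched.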