pentaho, kettle, pentaho-data-integration

Kettle: execute for every n rows


In a job, you can put a transformation in the flow and set it to execute once for every input row.

Is there a way to have the transformation execute for every n rows?

I want to be able to pass a batch of n rows to a transformation.

What is the best (any, really) way to do this?

Thanks.


Solution

  • Yes. Have your job call transformation A, which acts as a parent/controller, and have A call transformation B through a Transformation Executor step. On that step's 'Row grouping' tab you can specify how many rows are passed to transformation B on each execution, as shown.

    Your inner transformation (B) must start with a step that can receive the rows from A; I have used a 'Get rows from result' step for this in some of my work.

    [Screenshot: the 'Row grouping' tab of the Transformation Executor step]
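
To make the row-grouping behaviour concrete, here is a rough Python sketch of the same idea. It is not Kettle code and uses no Pentaho API: `batches`, `run_for_every_n_rows`, and the `inner` callback are hypothetical names standing in for transformation A's grouping and for transformation B. It also assumes that a final, smaller batch is still passed on when the row count is not a multiple of n.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

Row = TypeVar("Row")
Result = TypeVar("Result")


def batches(rows: Iterable[Row], n: int) -> Iterator[List[Row]]:
    """Yield consecutive groups of up to n rows (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        # Assumption: a trailing partial batch is still handed over.
        yield batch


def run_for_every_n_rows(
    rows: Iterable[Row],
    n: int,
    inner: Callable[[List[Row]], Result],
) -> List[Result]:
    """Play the role of controller A: call `inner` (standing in for B) once per batch."""
    return [inner(batch) for batch in batches(rows, n)]


if __name__ == "__main__":
    # Seven rows in groups of three: the stand-in for B runs three times.
    print(run_for_every_n_rows(range(7), 3, len))
```

Here the inner function sees a whole list of rows per call, much as transformation B would read its batch through 'Get rows from result'.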