pentaho pentaho-spoon pentaho-data-integration

How to execute a Job Executor step X times


Introduction

To keep it simple, imagine a transformation that receives 4 input rows from a Data Grid step. The stream passes through a Job Executor step, which references a simple job containing a Write Log component.

Expectations

I would like the simple job to execute 4 times, which should produce 4 log messages.

Results

It turns out that the Job Executor step launches the simple job only once instead of 4 times: I get only one log message.

Hints

The documentation of the Job Executor step states the following:

By default the specified job will be executed once for each input row.

This is configured in the "Row grouping" tab, with the following field:

The number of rows to send to the job: after every X rows the job will be executed and these X rows will be passed to the job.
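To make the quoted behaviour concrete, here is a minimal sketch in plain Python of how I read that setting. It is not Pentaho code: `simulate_job_executor` and `group_size` are hypothetical names, and the handling of leftover rows at the end of the stream is an assumption on my part, not something the quoted documentation states.

```python
from typing import List, Sequence


def simulate_job_executor(rows: Sequence[dict], group_size: int = 1) -> List[List[dict]]:
    """Return the batches of rows that would each trigger one job run."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    batches: List[List[dict]] = []
    buffer: List[dict] = []
    for row in rows:
        buffer.append(row)
        # After every `group_size` rows, the job runs once with those rows.
        if len(buffer) == group_size:
            batches.append(buffer)
            buffer = []
    # Assumption: rows left over at the end of the stream trigger one final run.
    if buffer:
        batches.append(buffer)
    return batches


rows = [{"id": i} for i in range(1, 5)]  # 4 rows, like the Data Grid input
print(len(simulate_job_executor(rows, group_size=1)))  # prints 4
print(len(simulate_job_executor(rows, group_size=2)))  # prints 2
```

With the default of one row per group, 4 input rows should therefore mean 4 job runs, which is what I expected to see in the log.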


Solution

  • Answer

    The step actually works correctly: an input of X rows executes the job behind the "Job Executor" step X times. I simply wasn't able to see it in the logs.

    To verify it, I added a simple transformation inside the "Job Executor" step that writes to a text file. When I checked this file, it showed that the "Job Executor" had indeed been executed X times.

    Research

    To understand why I didn't get X log messages after the "Job Executor" had run X times, I added a "Wait for" component to the initial simple job. Adding a two-second wait finally allowed me to see X log messages appear during the execution.

    The trick to see every log

    I hope this helps, because it's pretty tricky. Please feel free to provide further details.
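For anyone who wants to repeat the text-file check from the answer without opening the file by hand, a small script along these lines could count the executions. This is only a sketch: it assumes the inner transformation appends exactly one non-empty line per job run, and `count_executions` and the file name `job_runs.txt` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Union


def count_executions(output_file: Union[str, Path]) -> int:
    """Count non-empty lines, assuming one line is written per job run."""
    text = Path(output_file).read_text(encoding="utf-8")
    return sum(1 for line in text.splitlines() if line.strip())


# Demo with a stand-in file; point this at the real output file instead.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "job_runs.txt"  # hypothetical file name
    sample.write_text("run\nrun\nrun\nrun\n", encoding="utf-8")
    print(count_executions(sample))  # prints 4
```

If the count matches the number of input rows, the job ran once per row even when the log shows fewer messages.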