Search code examples
pentahokettle

Value from last row of output file and specify into variable


I have a ETL (pentaho) which give an excel file output from following steps.

Transformation 1:

  • Table input (has got SQL statement with created > DATEVALUE ORDER BY created ASC)
  • Sort rows
  • Excel output

Now how can I read last row of the excel output (created column) value and store into text file? So I can make sure when the job re-run then SQL statement created date is grater than text file stored value.

Transformation 1:

  • Table input (SQL statement like created > (get the value from text file) ORDER BY created ASC)
  • Sort rows
  • Excel output

What would be the simplest way of achieving this?


Solution

  • You can save last row of the data stream, which matches to last row written to Excel, using a combination of Group by and Text file output which you can apped right after your Excel output step:

    • Group by step: Set Last value in Type column of Aggregates tab. Take your date field as a Subject and give it some Name e.g. last_date.
    • Text file output step: Write last_date into a file.

    You trasformation would then start by a step which reads the last_date from file (Text file input) and passes it to the Table input step where it is used as a parameter of your SQL query.

    enter image description here