Search code examples
pentahopentaho-spoonpentaho-data-integration

Pentaho Spoon get -1 day formula of the next data entry


hope you can help me solve this data processing. I have no idea which step can solve this. Is it javascript? or is there a simple formula that can solve this

I have a stream that has date column and I want to have another column which will get the (date - 1) of each date entry. Please see the details below:

AuditDate
9/16/2021
9/17/2021
10/07/2021
10/17/2021
09/26/2021

This is will be the expected output

AuditDate Column       Date -1 column

9/16/2021              9/16/2021
9/17/2021             10/06/2021
10/07/2021            10/16/2021
10/17/2021            09/25/2021
09/26/2021            Latest Date

Thanks!


Solution

  • You have the Analytic query step, with the LEAD operation to retrieve the next (or the second, third... next) row. You have to provide the rows ordered, so it can find the row you are looking for, I don't know if in the example you provided it was intended that the 09/26/2021 date was the last date, so Date -1 column for 10/17/2021 is before the audit date. If it's so, you can skip the Sort rows in my example. When I use the Analytic query step I have a Sort rows step previously.

    Once you have the value of AuditDate in next row, you calculate the day before. Because trying to calculate the previous day on a null date gave me error, I have also added a If field value is null step to give to the LEAD date an "impossible" value I can quickly identify in case I need to return it to a null value after applying the Calculator step.

    This is a screenshot of the transformation: Transformation example