
How to apply a row limit in Talend


Using tDBInput in Talend I'm able to read data from a SQL table, and after that I'm using tMap to filter the rows. The number of rows that remain after filtering varies from run to run, as I keep changing the data in the SQL table.

Now the scenario is this: if more than 80% of the rows pass the tMap filter, the data should go to the tDBOutput component; otherwise the process should not move any further.

E.g., suppose my table has 1000 rows and only 600 remain after filtering through tMap (60%); then the process should stop, or the flow should be redirected to a tLogRow component. On the other hand, if the table has 100 rows and 95 remain after filtering through tMap, which is more than 80%, the data should go to tDBOutput.

Through my research I learned that Talend has a "global variable called NB_LINE which you can use to get the number of rows written to the component's file or table", but I'm not able to figure out how to use this information to achieve the above use case.

Please let me know how to build the flow for the above scenario.


Solution

  • A global variable is the way to go, I think, but this variable is only available in "AFTER" mode, meaning the subjob has to finish before you can read the number of lines processed.

    Useful components in this case are the pair tHashInput/tHashOutput (hidden by default, but you can activate them through the "Palette" menu in Project Settings). With them you can cache your data and split the work across multiple subjobs.

    • one subjob to filter the data and store the result in the cache
    • then, with "Run if" links carrying your threshold condition, you can route to the desired output.
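
    The threshold condition itself is plain Java, since Talend conditions are Java boolean expressions evaluated against globalMap. The sketch below is only an illustration of that check outside Talend: the keys "tDBInput_1_NB_LINE" and "tHashOutput_1_NB_LINE" assume the default component labels and assume both components publish NB_LINE after their subjob ends, so adjust them to the names in your own job. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class RowThresholdCheck {

    // Returns true when the share of rows that survived the filter is
    // strictly greater than the threshold (0.80 means 80%).
    static boolean passesThreshold(Map<String, Object> globalMap,
                                   String totalKey,
                                   String keptKey,
                                   double threshold) {
        Integer total = (Integer) globalMap.get(totalKey);
        Integer kept = (Integer) globalMap.get(keptKey);
        // Missing counters or an empty input table: do not continue.
        if (total == null || kept == null || total == 0) {
            return false;
        }
        // Cast to double so the division is not truncated to an integer.
        return (double) kept / total > threshold;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap; the key names are assumptions.
        Map<String, Object> globalMap = new HashMap<>();

        globalMap.put("tDBInput_1_NB_LINE", 1000);
        globalMap.put("tHashOutput_1_NB_LINE", 600);
        System.out.println(passesThreshold(globalMap,
                "tDBInput_1_NB_LINE", "tHashOutput_1_NB_LINE", 0.80)); // false (60%)

        globalMap.put("tDBInput_1_NB_LINE", 100);
        globalMap.put("tHashOutput_1_NB_LINE", 95);
        System.out.println(passesThreshold(globalMap,
                "tDBInput_1_NB_LINE", "tHashOutput_1_NB_LINE", 0.80)); // true (95%)
    }
}
```

    Inside a "Run if" link the same check would be written as a single expression, for example `((Integer) globalMap.get("tHashOutput_1_NB_LINE")) * 100.0 / ((Integer) globalMap.get("tDBInput_1_NB_LINE")) > 80`, with the negated condition on a second link that leads to tLogRow.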
