I am working on a Talend transformation process (we are using Talend 6.4), and I don't know how to implement the current requirement.
I have an input consisting of:
- Two columns that are my group keys (Account and Product), but are not unique (the same Account x Product pair can appear in multiple rows)
- A criterion column (Contract end date), which will help me decide which row I want to keep for each group
- Some "tail" data that need to be passed to the following step of the processing (the contract number)
The rule to implement is:
- Keep only one record per group
- The selected record must be one with no end date or, if all records in the group have an end date, the one with the latest end date
- In case of a tie, any one of the tied records can be selected
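To pin the rule down, here is a minimal plain-Java sketch of the result I expect. It is not Talend code, and the class, field, and method names (`KeepOnePerGroup`, `Row`, `keepOnePerGroup`) are made up for illustration. I assume the end date is a nullable `LocalDate` and that Account and Product are never null.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeepOnePerGroup {

    // Hypothetical row type: the field names are illustrative, not a real schema.
    static final class Row {
        final String account;
        final String product;
        final LocalDate endDate;      // null means "no end date"
        final String contractNumber;  // "tail" data, passed through unchanged

        Row(String account, String product, LocalDate endDate, String contractNumber) {
            this.account = account;
            this.product = product;
            this.endDate = endDate;
            this.contractNumber = contractNumber;
        }
    }

    // A null end date ranks highest; otherwise a later date ranks above an earlier one.
    static final Comparator<Row> PREFERENCE =
            Comparator.comparing((Row r) -> r.endDate,
                    Comparator.nullsLast(Comparator.<LocalDate>naturalOrder()));

    // Keeps the preferred row of each (account, product) group, in first-seen group order.
    static List<Row> keepOnePerGroup(List<Row> rows) {
        Map<List<String>, Row> best = new LinkedHashMap<>();
        for (Row row : rows) {
            List<String> key = Arrays.asList(row.account, row.product);
            // On a tie the row already stored is kept, which is allowed by the rule.
            best.merge(key, row, (current, candidate) ->
                    PREFERENCE.compare(candidate, current) > 0 ? candidate : current);
        }
        return new ArrayList<>(best.values());
    }
}
```

Calling `KeepOnePerGroup.keepOnePerGroup(rows)` returns one `Row` per Account x Product pair, with the contract number carried along untouched.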
See the transformation applying those rules on some dummy data:

My first idea was to do the following:
- sort by Account, Product, End_date descending (nulls first)
- "select first" in each group
but I don't know Talend well enough to tell whether a component exists for the second step.
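In plain Java, the two steps I have in mind would look roughly like the sketch below. Again, this is only an illustration of the logic, not a Talend job: `SortThenFirst`, `Contract`, and `firstOfEachGroup` are hypothetical names, and I assume that within a group the rows are ordered with null end dates first and then the latest date first, so that the first row of each group is the one to keep.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortThenFirst {

    // Hypothetical row type; account and product are assumed to be non-null.
    static final class Contract {
        final String account;
        final String product;
        final LocalDate endDate;  // null means "no end date"
        final String number;

        Contract(String account, String product, LocalDate endDate, String number) {
            this.account = account;
            this.product = product;
            this.endDate = endDate;
            this.number = number;
        }
    }

    // Step 1: Account, then Product, then End_date with nulls first and latest date next.
    static final Comparator<Contract> SORT_ORDER =
            Comparator.comparing((Contract c) -> c.account)
                    .thenComparing((Contract c) -> c.product)
                    .thenComparing((Contract c) -> c.endDate,
                            Comparator.nullsFirst(Comparator.<LocalDate>reverseOrder()));

    // Step 2: walk the sorted rows and keep a row only when its group key changes.
    static List<Contract> firstOfEachGroup(List<Contract> input) {
        List<Contract> sorted = new ArrayList<>(input);  // leave the caller's list untouched
        sorted.sort(SORT_ORDER);

        List<Contract> kept = new ArrayList<>();
        Contract previous = null;
        for (Contract c : sorted) {
            boolean newGroup = previous == null
                    || !previous.account.equals(c.account)
                    || !previous.product.equals(c.product);
            if (newGroup) {
                kept.add(c);
            }
            previous = c;
        }
        return kept;
    }
}
```

What I am looking for is the Talend equivalent of the second step, i.e. something that emits only the first row of each Account x Product group after the sort.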

Regards,
Pierre