
kettle sample rows for each type


I have a set of rows with the fields "rowId", "type", and "value". On output I need a set of 10 sample rows for each "type". How can I do that? "type" has approx. 100 distinct values, and they keep changing, so a switch is not a good option.


Solution

  • Well, I've figured out a workaround for this situation. I split the transformation into two parts. The first part collects all the data into a temp table, finds the unique types, and copies them to the result.

    The second part runs once for every input row (each row carries one type) and collects the rows of that type from the temp table. That way you need no grouping to do a stratified sample.
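
Outside of Kettle, the same two-part idea can be sketched in Python. This is only an illustration, not the actual transformation: the in-memory SQLite database stands in for the temp table, and the names `staging` and `sample_rows_per_type` are made up for this example. It assumes that picking rows at random within each type is acceptable, and a type with fewer than 10 rows simply returns all of them.

```python
import sqlite3


def sample_rows_per_type(rows, sample_size=10):
    """Return up to sample_size randomly chosen rows for each distinct type.

    rows is an iterable of (row_id, type, value) tuples.
    """
    # Part 1: load everything into a temp table and find the unique types.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (row_id INTEGER, type TEXT, value TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    types = [t for (t,) in conn.execute("SELECT DISTINCT type FROM staging")]

    # Part 2: run one query per type, so no grouping logic is needed.
    samples = []
    for type_value in types:
        cursor = conn.execute(
            "SELECT row_id, type, value FROM staging "
            "WHERE type = ? ORDER BY RANDOM() LIMIT ?",
            (type_value, sample_size),
        )
        samples.extend(cursor.fetchall())

    conn.close()
    return samples
```

Because the list of types is read from the data itself, new or disappearing types need no change to the code, which is the reason a hard-coded switch was ruled out in the question.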