
Transpose thousands rows to columns in Pentaho


I have the dataset:

*(screenshot of the source dataset)*

What I need is to have all accounts for each concat group in one field, separated by commas. I was able to achieve this with the Row Denormaliser step and some regex. That works fine with a few accounts, but now I have a case with more than 10K accounts. How can I achieve it?

*(screenshot of the desired output)*


Solution

  • Neither the Row Denormaliser step nor the Concat Fields step (which would handle the comma-separated concatenation for this task) allows dynamic field names, as far as I can tell. So one unorthodox way to deal with a large number of possible values in the denormalisation and concatenation is simply to specify them all. For example, a field in the Row Denormaliser step is defined as

    <field>
        <field_name/>
        <key_value/>
        <target_name>field_1</target_name>
        <target_type>None</target_type>
        <target_format/>
        <target_length>-1</target_length>
        <target_precision>-1</target_precision>
        <target_decimal_symbol/>
        <target_grouping_symbol/>
        <target_currency_symbol/>
        <target_null_string/>
        <target_aggregation_type>-</target_aggregation_type>
    </field>
    

    So you could write a script that prints this template once per field and paste its output inside the `<fields>` element of the transformation's XML.

    Note: This is not fit for production. This is a solution if you need to do a task once or maybe twice to import some data. I wouldn't want to maintain an ETL process where this was deployed. A proper solution probably involves a custom step or an external script. I will gladly be proven wrong on this.
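The template-printing script mentioned above can be sketched in plain Python. This is only an illustration: the field count and the `field_{i}` naming convention mirror the XML sample above, and you would paste the output into the transformation file by hand.

```python
# Hypothetical generator for Row Denormaliser <field> entries.
# Prints one <field> element per target column, numbered field_1..field_n,
# matching the XML template shown above.
TEMPLATE = """\
    <field>
        <field_name/>
        <key_value/>
        <target_name>field_{i}</target_name>
        <target_type>None</target_type>
        <target_format/>
        <target_length>-1</target_length>
        <target_precision>-1</target_precision>
        <target_decimal_symbol/>
        <target_grouping_symbol/>
        <target_currency_symbol/>
        <target_null_string/>
        <target_aggregation_type>-</target_aggregation_type>
    </field>"""


def generate_fields(n):
    """Return the XML for n denormaliser fields as a single string."""
    return "\n".join(TEMPLATE.format(i=i) for i in range(1, n + 1))


if __name__ == "__main__":
    # Print a small sample; bump the count to 10000 for the real case.
    print(generate_fields(3))
```

Redirect the output to a file, then replace the `<field>` entries inside the transformation's `<fields>` element with it.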
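For the external-script route, the whole task collapses to a group-and-join, which plain Python handles comfortably at 10K+ rows. A minimal sketch, assuming each input row is a dict with `group` and `account` keys (the column names are illustrative, not from the original dataset):

```python
def concat_accounts(rows):
    """Collect all accounts per group into one comma-separated field.

    rows: iterable of dicts with "group" and "account" keys.
    Returns {group: "acct1,acct2,..."} preserving input order.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for row in rows:
        groups.setdefault(row["group"], []).append(row["account"])
    return {g: ",".join(accounts) for g, accounts in groups.items()}


if __name__ == "__main__":
    sample = [
        {"group": "A", "account": "101"},
        {"group": "A", "account": "102"},
        {"group": "B", "account": "201"},
    ]
    for group, accounts in concat_accounts(sample).items():
        print(group, accounts)
```

With a real file you would feed `csv.DictReader` into `concat_accounts` and write the result back out, sidestepping the denormaliser entirely.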