Search code examples
pentahokettlepentaho-spoon

Generate a new row for a set of fields of the input row (and generate a query for each new row)


We have a .csv file that has information about the migration flows of people across districts in a city. We are creating a transformation that loads data from a .csv file to a database (2 tables):

each row has the following information:
- field 1: Name of the origin district
- field 2 (name of the field = name of the destination district): Value of the field = number of people that have changed from origin district to this destination district
This repeats for each destination district.

Suppose there are 20 districts so the total number of fields is 21

We want a step that generates the following output (transform data structure): A new row with the following structure:

  • Field 1: Name of the origin district
  • Field 2: Name of the destination district
  • Field 3: Number of people that has changed from district "Field 1" to district "Field 2"

So the output of this step must contain 20x20 rows.We will then insert the 400 rows in the following database table: enter image description here

We can not find any transformation step that can generate this new data structure. We will try the javascript step to manually implement a loop for each origin district and then generate the insert into the database table for each new row. enter image description here enter image description here


Solution

  • To move columns from where they are listen in columns in one row (pivoted tables), to have one row pr. column, and a key column you should use Row Normaliser.