Search code examples
csvone-to-manypentahordbmskettle

What is the best practice in Pentaho DI(Kettle) for converting a csv into a One to Many Relationship


I am new to Pentaho DI(formerly Kettle) and I am tasked with taking CSV data and populating tables in an RDBMS that are in a "One to Many" relationship.

Here is an example of how the CSV data is structured:

OrderID,CustomerName, Date,Total Cost,LineItemNumber, LineItemDesc,LineItemQty,LineItemCost
101655,Mary Smith,2016-02-08,6.25,1|2|3|4,Lettuce|Tomatoes|Green Onions|Cucumbers,1|2|4|2,1.00|2.50|0.75|2.00

NOTE: the multiple values are seperated by pipes(|)

I need to convert it into the following tables:

enter image description here

I have been looking into several ways of doing this but didn't want to miss something simple due to my inexperience. Any advise would be greatly appreciated.

I have come up with a solution but it looks awfully complicated.

My Solution


Solution

  • Try using the Split fields to Rows STEP in PDI. Give Delimiter as | (pipe) and give the column/field that you want to split.

    Check the Pentaho wiki for more.

    Hope this helps :)