Tags: mysql, csv, pentaho, etl, kettle

Using Pentaho Kettle, how do I load data into an output table from a CSV file, with an intermediary step that reads from an input table?


I have a CSV file containing data that I need to load into a database table using Pentaho ETL Kettle. Before loading the rows into the table, I need to build each row to be inserted from values coming from the CSV file plus values collected from another input table (which needs data from the CSV as a parameter to retrieve the wanted info). I'm a bit lost about which steps to use to build my transformation. Can anyone help?


Solution

  • I think the easiest solution is this:

    • first, read the CSV file with a "CSV file input" (or "Text file input") step
    • then use a "Database join" step to fetch the additional columns from the input table, with a query something like:

      SELECT col1, col2, col3
      FROM input_table
      WHERE common_column = ?
      

      where the ? placeholder is replaced, row by row, by the value of a field from the incoming stream that you specify as a parameter inside this step.

    • then send your data to an output step, such as a "Table output" step.

    Something like this:

    [Screenshot: Read CSV and add columns from a table]

    This is not necessarily the most performant solution; it depends on your data. Alternatively, you can use a "Database lookup" step, or a "Table input" step followed by a join (a set-based sketch of that alternative is shown below).
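
    For reference, if you go the "Table input" route, a rough set-based sketch of what the transformation achieves is the SQL below: bulk-load the CSV into a staging table first, then insert into the target with a single join. All table and column names apart from input_table and common_column are illustrative assumptions, not from the original question.

      -- Hypothetical equivalent of the transformation, assuming the CSV
      -- rows have already been loaded into a staging table csv_staging.
      -- output_table and the csv_value_* columns are placeholder names.
      INSERT INTO output_table (csv_value_1, csv_value_2, col1, col2, col3)
      SELECT s.csv_value_1,
             s.csv_value_2,
             i.col1,
             i.col2,
             i.col3
      FROM csv_staging AS s
      JOIN input_table AS i
        ON i.common_column = s.common_column;

    Loading the CSV into a staging table first trades the row-by-row lookups of the "Database join" step for one set-based statement, which is usually faster on large files but adds an extra staging step to the transformation.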