Search code examples
sql-serverparameterspentahokettle

Use a field from a previous hop as a parameter in a Table input using Pentaho Kettle


I'm using Pentaho Kettle in order to recreate the data blending process for putting together various data sources to a single data set.

At one point I'm using a Row Filter to check if certain fails are null (in this case length and weight). Right after the true part of the filter I'm using a Table input (MSSQL) to run a query and try to find additional data in a database. Is there a way I can use a field from the Row Filter (the foreign key patient_id, in my case) to use it in the Table input to write down a query like

SELECT field1, field2, ... 
FROM my_table WHERE patient_id in ( 
    reference_to_the_patient_id_field_from_the_row_filter
)

I've been trying around with variables, but they seem to be environment only, not context. Also the option Replace variables in script doesn't seem to do much for me.


Solution

  • Table Input is a row generator step, not an intermediate step. If you will use option "Insert data from step" it replace actual question marks with values, and will send built query to database. Which have really pour performance and error prone (what is value from previous step is null or empty string, or it might even contains sql injection?)

    DatabaseJoin step fits better in such place. It uses PreparedStatement, this is much faster.

    You can change sql query to pass parameters

    select field1, field2 form my_table join (values(?)) as t(patient_id) on t.patient_id = my_table.patient_id
    

    You might consider to use DatabaseLookup step, since there is only one table is used to extract data. It has well performance, and provide cache mechanism.