Tags: database, variables, connection, etl, kettle

Using variable DB connections in multiple identical Kettle Jobs/Transformations


I've read through many related topics here, but don't seem to find a solution. Here's my scenario:

  • I have multiple identical customer databases
  • I use ETL to fill special tables within these databases in order to use as a source for PowerBI reports
  • Instead of copying (and thus maintaining) the ETLs for each customer, I want to pass the DB connection to the Jobs/Transformations dynamically

My plan is to create a text file of DB connections for each Customer:

cust1, HOST_NAME, DATABASE_NAME, USER_NAME, PASSWORD
cust2, HOST_NAME, DATABASE_NAME, USER_NAME, PASSWORD
...and so on.
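Assuming a file in that format (the path `connections.txt` and the field order are illustrative, not fixed by the question), a small shell helper could pull out one customer's row before handing the values to kitchen:

```shell
#!/bin/sh
# Look up one customer's connection details in the connections file.
# Assumed file format (one row per customer, comma-separated):
#   cust_id, HOST_NAME, DATABASE_NAME, USER_NAME, PASSWORD
CONN_FILE="connections.txt"   # illustrative path
CUSTOMER="cust2"              # in a real wrapper this would be $1

# Strip spaces after commas, then split the matching row into fields.
row=$(grep "^${CUSTOMER}," "$CONN_FILE" | sed 's/, */,/g')
DB_HOST=$(echo "$row" | cut -d, -f2)
DB_NAME=$(echo "$row" | cut -d, -f3)
DB_USER=$(echo "$row" | cut -d, -f4)
DB_PASS=$(echo "$row" | cut -d, -f5)

echo "host=$DB_HOST db=$DB_NAME user=$DB_USER"
```

The extracted values can then be passed to the job as named parameters, so the lookup logic lives entirely outside the KJB/KTR files.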

The Host will stay the same always.

The jobs will be started monthly using Pentaho kitchen in a linux box.
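For the monthly kitchen run, one option is a wrapper script per customer driven by cron. A sketch of what the invocation could look like, using kitchen's `-param:` named-parameter syntax (the job path, parameter names, and schedule are assumptions for illustration):

```shell
#!/bin/sh
# Sketch: build the kitchen call with named parameters for one customer.
# Paths and parameter names (DB_HOST, DB_NAME, ...) are illustrative.
KITCHEN=/opt/pentaho/data-integration/kitchen.sh
JOB=/opt/etl/master.kjb

CMD="$KITCHEN -file=$JOB \
  -param:DB_HOST=myhost \
  -param:DB_NAME=cust2_db \
  -param:DB_USER=etl_user \
  -param:DB_PASS=secret"

# A monthly crontab entry (02:00 on the 1st) might look like:
#   0 2 1 * * /opt/etl/run_etl.sh cust2 >> /var/log/etl_cust2.log 2>&1

echo "$CMD"   # echoed instead of executed in this sketch
```

The named parameters must also be declared (with the same names) on the job's Parameters tab so kitchen can inject them.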

So when I run a job for a specific customer, I want to tell the job to use that customer's DB connection (e.g. cust2) from the connection file.

Any help is much appreciated. Cheers & Thanks, Heiko


Solution

  • I run a similar scenario daily at work. We use batch files with named parameters for each client, so the same package of KJB/KTRs runs for different clients based entirely on those parameters.

    What you want to do is set variables on a master job that are then used throughout the entire execution.

    As for your question directly: in the database connection dialog you can use those variables in the Host Name and Database Name fields. Personally, we have the same user/password set on every client DB, so we don't pass those as variables for each connection; we only send the host name and database name via the named parameters. We also have a fixed scheduled run that executes the routine for every database; for that we use a job entry of type "Execute for each input row".
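    To make this concrete: assuming named parameters such as DB_HOST and DB_NAME were declared on the master job (the names here are illustrative), the fields in the connection dialog would simply contain Kettle variable syntax, which is resolved at run time:

    ```
    Host Name:      ${DB_HOST}
    Database Name:  ${DB_NAME}
    User Name:      ${DB_USER}
    Password:       ${DB_PASS}
    ```

    Because the variables are only substituted when the job runs, the "Test" button in Spoon may not resolve them unless the variables are also set in the development environment (e.g. in kettle.properties).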