I currently have a transformation setup with 2 table inputs and one Merge Rows (Diff), the SQL select statement in both table inputs are constant, they are not changing except for the table name. So I have:
select * from THIS_WILL_CHANGE
I have around 100 tables and I don't want to manually enter the table names every iteration, especially because this is automation...
What is the best way to achieve this? Is there any way to read like a CSV file with all the table names and loop that way? Any help is appreciated..
This is something I've had to do before too!
You can do this with a variable and a job which executes once for each row of the previous step.
- Create a parent job to host these steps
- Create a transformation which gets the table names from 'somewhere' eg. CSV file, or database query - a select on all_tables for tables with the same column names might be a nice way to do this for all time...
- In this same transformation, use copy rows to result step to push the data back to the job
- Create a new 'sub job', which executes once for each row, and has a hop from the 'get data' step in the main job
- In the sub job, create two transformations, one to set the variable from the results field, and one to do your select
- In your select query, check the box 'substitute variables' and place your variable with the same name as your set variables step into your SQL as ${yourVariableHere}
I've put this in an image below, which hopefully helps you.
data:image/s3,"s3://crabby-images/65afb/65afbda77fdcb1e061f638c0b370b59c064a3e44" alt="enter image description here"