Search code examples
ssispackagedataflowtask

Multiple data flows for 1000+ tables ssis


Hi I have created a ssis package which loads source to target table and for each table I have to create a data flow, duplicate, rinse and repeat. It’ll take quite a bit of time to do 1000 tables and I was wondering if there was a faster way to created these data flows?


Solution

  • Automating SSIS package creation is embarrassingly easy with Biml (Business Intelligence Markup Language).

    1. Install the free (registration required) BimlExpress
    2. Create 1 package as "normal" and then right click and use the Reverse Engineer option to have BimlExpress create the raw Biml that describes the package you built
    3. Automate the pattern however it makes sense.

    BimlScript is a free, community oriented patterns and learning site. It has an excellent learning tutorial on the matter

    • Cathrine Wilhelmsen has great resources
    • Stairway to Biml is an excellent series by Andy Leonard
    • Replicate-o-Matic is a short example of how I can use a query to make the the #3 "automate" the pattern. It queries all the tables in a database and creates a package per table with a basic table source to table destination.

    The final bit of advice I'd offer is: do not make 1 package with 1000+ data flows in it. Every time you go to open the package, run it, etc, you will spend a long time validating the metadata of each and every source and sink.

    Instead, create a package per entity to transfer (worker) and then create an Orchestrator package that runs the workers. The orchestrator can be as simple as hard coding every package in serial or something more interesting like a table of packages to be run, when they were last run and dynamically invokes them based on expected run time with N workers running in parallel.