Search code examples
azure-data-factory

How can we use a single Azure Data Factory Copy activity to query multiple tables in an on prem database and send payload to logic app to parse


We have this requirement in the project wherein we were using a parent-child pipeline design to dynamically generate a set of queries for each table in a database and then run a for each loop to pass these queries to child pipelines ( one instance of child pipeline per table to be queried) and these child pipelines pass the data to logic hub end point to publish that to Event hubs.

Parent pipeline:

enter image description here

Child Pipeline:

enter image description here

These pipelines run every 2 minutes. There is good amount of cost associated with these runs and we wanted to reduce the number of orchestration runs.

The proposed solution is to - Run a single query at the source and bundle all result set into a single set - parse it out in logic app and then publish ( all entities have different structures).

Is there a way to handle this scenario ?


Solution

  • It might not be possible to achieve your requirement only using a single copy activity because of the following reasons.

    • You are dealing with multiple tables and single copy activity cannot deal with this.
    • There are set of queries for each table, and you cannot execute query for each table using single copy activity.
    • With your proposed solution, it might be possible with a single query but again whatever that you are doing (multiple iterations for multiple queries) in ADF, you need to do that again in logic apps and you might need an extra intermediate staging location which is target for copy activity and source for the logic app.

    So, your current solution is the better approach than your proposed solution. But you can make some changes as per your pipeline structure.

    • Here, even though you are using a child pipeline, you are not using another ForEach in the child pipeline. You can avoid the nested pipeline by giving the child pipeline logic inside the parent pipeline for loop itself.
    • As you are dealing with one query at a time (only single for loop needed), give the same activities of child pipeline inside the parent pipeline for loop. So, one iteration of for loop per table to be queried in the parent pipeline.