azure, azure-databricks, azure-synapse, azure-data-factory

Azure Data Flow: source query pushdown


My data flow job has a Synapse database as both source and sink.

In the data flow, I use a source query with joins and transformations to extract data from the Synapse database.

As I understand it, a data flow spins up a Databricks cluster under the hood to execute the data flow code.

My question: will the source query I use in the data flow be executed on the Synapse database or on the Databricks cluster?


Solution

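  • A data flow requires a compute context, which is Spark. When you use a query in the source transformation, the Spark cluster is what submits that query, but the query itself is pushed down to the database engine (here, Synapse) for resolution. The joins and transformations written in the source query therefore run in the database, and Spark receives the result.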
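
To make the idea of pushdown concrete, here is a minimal sketch in plain Python. It is only an analogy, not how Data Factory is implemented: an in-memory SQLite database stands in for Synapse, the Python process stands in for the Spark cluster, and the table names, data, and the `run_source_query` helper are all hypothetical. The point it illustrates is that the compute side only submits the query text, while the join and aggregation are carried out by the database engine, which returns just the result rows.

```python
import sqlite3


def run_source_query(conn, query):
    """Submit the query text to the database engine and return only the result rows.

    Hypothetical helper: it plays the role of the compute side, which does
    not perform the join itself.
    """
    return conn.execute(query).fetchall()


# In-memory SQLite stands in for the Synapse database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10, 25.0), (2, 10, 75.0), (3, 20, 40.0);
    INSERT INTO customers VALUES (10, 'Ada'), (20, 'Lin');
    """
)

# The "source query": the join and aggregation are expressed in SQL, so the
# database engine resolves them rather than the caller.
source_query = """
    SELECT c.name, SUM(o.amount) AS total
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
"""

rows = run_source_query(conn, source_query)
print(rows)  # [('Ada', 100.0), ('Lin', 40.0)]
```

Only two aggregated rows reach the caller even though the underlying tables hold more rows. In the same spirit, the source query in a data flow lets the database do the heavy lifting before the data arrives in Spark.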