azure-data-factory

Difference between dataflow and flowlet in ADF


I'd like to know the difference between these two in Azure Data Factory. At a glance they look the same. Also, how can flowlets be reused?


Solution

  • A data flow, also called a mapping data flow, reads data from inputs (known as sources) and writes it to outputs (known as sinks). Between these two steps, the data can be transformed using the many "tools" available to you. Flowlets are one of these tools.

    For the most part, flowlets are very similar to data flows: they have inputs, can transform the data, and then output it. The main difference is that while flowlets can output data, they don't use sinks, so you can't write a flowlet's output to a data store without using the flowlet inside a data flow. Flowlets can also be thought of as being "unaware of the outside world", meaning they can only read data that is already in memory. Data flows, on the other hand, get data from external sources using datasets.

    The easiest way to think of flowlets is as subroutines of data flows: small, reusable transformations that reduce the amount of repetition in your data flows. That is also what makes them reusable: the same flowlet can be added to more than one data flow.
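
To make the subroutine analogy concrete, here is a rough sketch in plain Python. It is only an analogy, not ADF's API or its data flow script: `clean_names` and `run_data_flow` are made-up names, and the CSV streams just stand in for datasets. The "flowlet" only sees rows that are handed to it, while the "data flow" owns the source and the sink.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO

Row = dict[str, str]


def clean_names(rows: Iterable[Row]) -> Iterator[Row]:
    """'Flowlet' analogue (hypothetical): rows in, rows out.

    It has no idea where the rows came from or where they will go.
    """
    for row in rows:
        # Trim whitespace and normalise capitalisation of the "name" column.
        yield {**row, "name": row["name"].strip().title()}


def run_data_flow(source: TextIO, sink: TextIO) -> None:
    """'Data flow' analogue (hypothetical): source -> transformations -> sink."""
    reader = csv.DictReader(source)       # source: reads external data
    fieldnames = reader.fieldnames or []  # column names from the header row
    cleaned = clean_names(reader)         # reuse the flowlet as one step
    writer = csv.DictWriter(sink, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()                  # sink: writes the result out
    writer.writerows(cleaned)


if __name__ == "__main__":
    src = io.StringIO("id,name\n1,  ada lovelace \n2,GRACE HOPPER\n")
    out = io.StringIO()
    run_data_flow(src, out)
    print(out.getvalue())
```

Any other "data flow" that has a `name` column could call `clean_names` in the same way, which is the kind of reuse flowlets give you. On its own, `clean_names` can't read or write anything, just as a flowlet can't without a data flow around it.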