Using an incremental ID instead of a datetime as the watermark for copying data in an Azure Data Factory pipeline


I'm able to incrementally load data from a source Azure MSSQL DB to a sink Azure MSSQL DB using a timestamp. I would now like to load the data incrementally using an incremental ID in the source database instead of a timestamp. Is this possible?

I need to run the Copy Data activity only once a day, so I would also like to store the batch ID of each Copy Data run in a batch_details table.

I'm using ADF v2.

I'm new to Azure. How do I do it?


Solution

  • What you are asking for is essentially what this tutorial from the official documentation explains: https://learn.microsoft.com/en-us/azure/data-factory/tutorial-incremental-copy-overview

    The tutorial uses a datetime value as the watermark, but any column whose value keeps increasing as rows are added will do, so an incremental ID works too.

    To run the pipeline once a day, use a trigger: https://learn.microsoft.com/en-us/azure/data-factory/concepts-pipeline-execution-triggers#schedule-trigger

    Hope this helped!
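
The watermark logic described above can be sketched outside ADF to show what each step does when the watermark is an ID. This is a minimal illustration using Python's built-in sqlite3 module, not an ADF pipeline: the names source_table, sink_table, watermark, and the columns of batch_details are assumptions made up for the example (only the batch_details table name comes from the question). In ADF, the same steps would typically map to two Lookup activities, a Copy activity whose source query filters on the ID range, and a final step that updates the stored watermark.

```python
import sqlite3


def setup():
    """Create an in-memory database with hypothetical source, sink,
    watermark, and batch_details tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE source_table (id INTEGER PRIMARY KEY, payload TEXT);
        CREATE TABLE sink_table (id INTEGER PRIMARY KEY, payload TEXT);
        CREATE TABLE watermark (
            table_name TEXT PRIMARY KEY,
            watermark_value INTEGER
        );
        CREATE TABLE batch_details (
            batch_id INTEGER PRIMARY KEY AUTOINCREMENT,
            from_id INTEGER,
            to_id INTEGER,
            rows_copied INTEGER
        );
        INSERT INTO watermark VALUES ('source_table', 0);
        """
    )
    return conn


def incremental_copy(conn):
    """Copy rows with id above the stored watermark, advance the watermark,
    and record the run. Returns (batch_id, rows_copied)."""
    cur = conn.cursor()

    # Step 1: look up the old watermark (the last id copied so far).
    old_id = cur.execute(
        "SELECT watermark_value FROM watermark WHERE table_name = 'source_table'"
    ).fetchone()[0]

    # Step 2: look up the new watermark (the current maximum id in the source).
    # COALESCE keeps the old value if the source table is empty.
    new_id = cur.execute(
        "SELECT COALESCE(MAX(id), ?) FROM source_table", (old_id,)
    ).fetchone()[0]

    # Step 3: copy only the rows in the range (old_id, new_id].
    cur.execute(
        "INSERT INTO sink_table (id, payload) "
        "SELECT id, payload FROM source_table WHERE id > ? AND id <= ?",
        (old_id, new_id),
    )
    rows_copied = cur.rowcount

    # Step 4: store the new watermark and log the batch.
    cur.execute(
        "UPDATE watermark SET watermark_value = ? WHERE table_name = 'source_table'",
        (new_id,),
    )
    cur.execute(
        "INSERT INTO batch_details (from_id, to_id, rows_copied) VALUES (?, ?, ?)",
        (old_id, new_id, rows_copied),
    )
    batch_id = cur.lastrowid
    conn.commit()
    return batch_id, rows_copied


if __name__ == "__main__":
    conn = setup()
    conn.executemany(
        "INSERT INTO source_table (payload) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    print(incremental_copy(conn))  # first daily run
    conn.execute("INSERT INTO source_table (payload) VALUES ('d')")
    print(incremental_copy(conn))  # second daily run copies only the new row
```

Bounding the copy by both the old and the new watermark means rows inserted while the copy is running are left for the next run rather than being skipped. An ID watermark only picks up newly inserted rows; updates to rows that were already copied are not detected, which is one reason the tutorial's datetime column can still be preferable when rows change after insertion.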