Search code examples
sql-serverssisetlcdc

How do you reload incremental data using SQL Server CDC?


I haven't been able to find documentation/an explanation on how you would reload incremental data using Change Data Capture (CDC) in SQL Server 2014 with SSIS.

Basically, on a given day, if your SSIS incremental processing fails and you need to start again. How do you stage the recently changed records again?


Solution

  • So I did figure out how to do this in SSIS.

    I record the min and max LSN number everytime my SSIS package runs in a table in my data warehouse.

    If I want to reload a set of data from the CDC source to staging, in the SSIS package I need to use the CDC Control Task and set it to "Mark CDC Start" and in the text box labelled "SQL Server LSN to start...." I put the LSN value I want to use as a starting point.

    I haven't figured out how to set the end point, but I can go into my staging table and delete any data with an LSN value > then my endpoint.

    You can only do this for CDC changes that have not been 'cleaned up' - so only for data that has been changed within the last 3 days.

    As a side point, I also bring across the lsn_time_mapping table to my data warehouse since I find this information historically useful and it gets 'cleaned up' every 4 days in the source database.