Search code examples
sql-serverchange-tracking

SQL Server change tracking - see only the last updated columns on the row?


I am struggling to understand how to use Change tracking for a simple task (not so simple it looks like) of tracking individual column changes to the same row?

Consider the following example:

I have a table:

CREATE TABLE [dbo].[Products](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Name] [nvarchar](max) NULL,
    [Manufacturer] [nvarchar](max) NULL,
    [Country] [nvarchar](max) NULL,
 CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([Id] ASC)

Now I insert a row:

insert into Products ([Name],[Manufacturer],[Country])
values ('phone', 'sony', 'japan')

I add change tracking:

ALTER DATABASE [Test]
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 90 MINUTES, AUTO_CLEANUP = ON);

ALTER TABLE [dbo].[Products]
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

I run 1st update:

update Products set Manufacturer = 'Sega' where Name = 'phone'

I check 2 column updates:

select 
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Manufacturer', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?', *
FROM
    CHANGETABLE(CHANGES dbo.Products, 0) AS CT
order by sys_change_version;

select 
    CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Country', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?', *
FROM
    CHANGETABLE(CHANGES dbo.Products, 0) AS CT
order by sys_change_version;

This is what I get back: enter image description here

So far so good, as you can see only the Manufacturer column is reported as changed as expected.

However if I now try to run a 2nd update on the same row for a different column:

update Products set Country = 'USA' where Name = 'phone'

and run the same query above to check columns that have been updated, I get back that BOTH! columns have been updated: enter image description here

I would expect that the 2nd time I run update the change tracking would only show me the 2nd column as the one that has changed, not both of them!

It gets worse. If I then try and separate the updates by their SYS_CHANGE_VERSION number, that is supposed to track separate updates, I realise that both of these updates have been assigned the same number!:

select * from CHANGETABLE(CHANGES dbo.Products, 0) as changes

enter image description here

Which looks like a bug to me!

I don't want to trigger processing for Manufacturer column when I am updating a Country column on the row.

I scoured the internet but couldn't find any obvious solution to this problem. The obvious solution of cleaning table manually also seems not so straightforward and comes with its own can of worms.

Obviously change tracking has been around for a long time, so it is either I am missing something, or I am very surprised that bug has not been fixed (or even encountered by many people)

Am I missing something obvious here? Or does anyone knows how to make change tracking work for identifying column changes ONE AT A TIME like I imagine they were supposed to work?


Solution

  • Here's a little script demostrating change tracking.

    create procedure spTrackChanges
    AS
    BEGIN
    
        begin tran
        
        declare @synchronization_version bigint;
        declare @last_synchronization_version bigint;
    
        -- Obtain the current synchronization version. This will be used next time that changes are obtained.   
        select  @synchronization_version = CHANGE_TRACKING_CURRENT_VERSION();
        -- Get version ID to get changes SINCE
        select  @last_synchronization_version = change
        from    tracking
        where   tablename = 'products'
        -- Return changes
        SELECT     CHANGE_TRACKING_IS_COLUMN_IN_MASK(COLUMNPROPERTY(OBJECT_ID('dbo.Products'),'Manufacturer', 'ColumnId'), CT.SYS_CHANGE_COLUMNS) 'Changed?',*
        FROM    CHANGETABLE(CHANGES Products, @last_synchronization_version) AS CT;
        -- Set new value
        update t
        set change = @synchronization_version
        from tracking t
        where tablename = 'Products'
        
        commit tran;
    
    END
    

    I create a separate procedure to do the change tracking "thing", because it's a repetitive task.

    ALTER DATABASE [YourDB]
    SET CHANGE_TRACKING = ON
    (CHANGE_RETENTION = 90 MINUTES, AUTO_CLEANUP = ON);
    
    if object_id('Products') IS NOT NULL
        drop table Products;
    if object_id('Tracking') IS NOT NULL
    drop table Tracking;
    
    create table Tracking (tableName sysname PRIMARY KEY, change INT)
    -- Initiate
    insert into tracking (tableName, change)
    select 'Products', 0
    
    CREATE TABLE [dbo].[Products](
        [Id] [int] IDENTITY(1,1) NOT NULL,
        [Name] [nvarchar](max) NULL,
        [Manufacturer] [nvarchar](max) NULL,
        [Country] [nvarchar](max) NULL,
     CONSTRAINT [PK_Products] PRIMARY KEY CLUSTERED ([Id] ASC)
    )
    
    ALTER TABLE [dbo].[Products]
    ENABLE CHANGE_TRACKING
    WITH (TRACK_COLUMNS_UPDATED = ON);
    
    -- Seed with value
    insert into Products ([Name],[Manufacturer],[Country])
    values ('phone', 'sony', 'japan');
    
    -- Run procedure that gets initial sync
    exec sptrackchanges; -- returns 0 as changed
    
    update Products set Manufacturer = 'Sega' where Name = 'phone';
    
    exec sptrackchanges; -- returns 1 as changed
    
    
    update Products set Country = 'USA' where Name = 'phone';
    
    exec sptrackchanges; -- returns 0 as changed
    
    exec sptrackchanges; -- returns nothing, no changes
    

    Change tracking relies on the "current version" value, which needs to be advanced all the time. For this reason, one needs to store the value in some table, i've implemented a simple per table tracking by storing it in the tracking table.

    Every time changes occur, you get them by using the previous value you obtain by calling CHANGE_TRACKING_CURRENT_VERSION, these changes can affect both multiple rows and multiple columns.

    By using the CHANGE_TRACKING_IS_COLUMN_IN_MASK you can obtain which columns were affected by the changes.

    I don't quite understand why this doesn't work for you, since you:

    • are able to obtain specific columns involved in changes since last
    • update of one column doesn't generate update of another column, unless there has been change to other columns since last processing.
    • there's no duplicate processing of same changes occuring at all

    If you're only interested in changes occuring to specific columns you can do the CHANGE_TRACKING_IS_COLUMN_IN_MASK thing and it should work

    There is no difference here if multiple changes are collapsed into one or not, unless you're really interested in knowing that column X was changed Y number of times - but this is seldom needed.

    The collapse changes example is:

    update Products set Manufacturer = 'Apple' where Name = 'phone';
    update Products set Manufacturer = 'M$' where Name = 'phone';
    update Products set Country = 'USA' where Name = 'phone';
    
    exec sptrackchanges; -- returns 1 as changed