Search code examples
sql-servert-sqlsql-server-2008sql-server-2000performance

Most efficient method to detect column change in MS SQL Server


Our system runs on SQL Server 2000, and we are in the process of preparing for an upgrade to SQL Server 2008. We have a lot of trigger code where we need to detect a change in a given column and then operate on that column if it has changed.

Obviously SQL Server provides the UPDATE() and COLUMNS_UPDATED() functions, but these functions only tell you which columns have been implicated in the SQL statement, not which columns have actually changed.

To determine which columns have changed, you need code similar to the following (for a column that supports NULLs):

IF UPDATE(Col1)
    SELECT @col1_changed = COUNT(*) 
    FROM Inserted i
        INNER JOIN Deleted d ON i.Table_ID = d.Table_ID
    WHERE ISNULL(i.Col1, '<unique null value>') 
            != ISNULL(i.Col1, '<unique null value>')

This code needs to be repeated for every column you are interested in testing. You can then check the 'changed' value to determine whether or not to perform expensive operations. Of course, this code is itself problematic, as it only tells you that at least one value in the column has changed over all the rows that were modified.

You can test individual UPDATE statements with something like this:

UPDATE Table SET Col1 = CASE WHEN i.Col1 = d.Col1 
          THEN Col1 
          ELSE dbo.fnTransform(Col1) END
FROM Inserted i
    INNER JOIN Deleted d ON i.Table_ID = d.Table_ID

... but this doesn't work well when you are needing to invoke a stored procedure. In those cases you have to fall back on other approaches as far as I can tell.

My question is whether anyone has insight (or, better yet, hard data) as to what the best/cheapest approach is to the problem of predicating a database operation in a trigger on whether a particular column value in a modified row has actually changed or not. Neither of the methods above seem ideal, and I was wondering if a better method exists.


Solution

  • Although HLGEM gave some good advice above, it wasn't exactly what I needed. I've done quite a bit of testing over the past few days, and I figured I'd at least share the results here given that it looks like no more information will be forthcoming.

    I set up a table that was effectively a narrower subset (9 columns) of one of our system's primary tables, and populated it with production data so that it was as deep as our production version of the table.

    I then duplicated that table, and on the first one wrote a trigger that attempted to detect every individual column change, and then predicated each column update on whether the data in that column had actually changed or not.

    For the second table, I wrote a trigger that used extensive conditional CASE logic to do all the updates to all the columns in a single statement.

    I then ran 4 tests:

    1. A single-column update to a single row
    2. A single-column update to 10000 rows
    3. A nine-column update to a single row
    4. A nine-column update to 10000 rows

    I repeated this test for both indexed and non-indexed versions of the tables, and then repeated the whole thing on SQL 2000 and SQL 2008 servers.

    The results I got were fairly interesting:

    The second method (one single update statement with hairy CASE logic in the SET clause) was uniformly better-performing than the individual change detection (to a greater or lesser extent depending on the test) with the single exception of a single-column change affecting many rows where the column was indexed, running on SQL 2000. In our particular case we don't do many narrow, deep updates like this, so for my purposes the single-statement approach is definitely the way to go.


    I'd be interested in hearing other people's results of similar types of tests, to see whether my conclusions are as universal as I suspect they are or whether they are specific to our particular configuration.

    To get you started, here is the test script I used -- you'll obviously need to come up with other data to populate it with:

    create table test1
    ( 
        t_id int NOT NULL PRIMARY KEY,
        i1 int NULL,
        i2 int NULL,
        i3 int NULL,
        v1 varchar(500) NULL,
        v2 varchar(500) NULL,
        v3 varchar(500) NULL,
        d1 datetime NULL,
        d2 datetime NULL,
        d3 datetime NULL
    )
    
    create table test2
    ( 
        t_id int NOT NULL PRIMARY KEY,
        i1 int NULL,
        i2 int NULL,
        i3 int NULL,
        v1 varchar(500) NULL,
        v2 varchar(500) NULL,
        v3 varchar(500) NULL,
        d1 datetime NULL,
        d2 datetime NULL,
        d3 datetime NULL
    )
    
    -- optional indexing here, test with it on and off...
    CREATE INDEX [IX_test1_i1] ON [dbo].[test1] ([i1])
    CREATE INDEX [IX_test1_i2] ON [dbo].[test1] ([i2])
    CREATE INDEX [IX_test1_i3] ON [dbo].[test1] ([i3])
    CREATE INDEX [IX_test1_v1] ON [dbo].[test1] ([v1])
    CREATE INDEX [IX_test1_v2] ON [dbo].[test1] ([v2])
    CREATE INDEX [IX_test1_v3] ON [dbo].[test1] ([v3])
    CREATE INDEX [IX_test1_d1] ON [dbo].[test1] ([d1])
    CREATE INDEX [IX_test1_d2] ON [dbo].[test1] ([d2])
    CREATE INDEX [IX_test1_d3] ON [dbo].[test1] ([d3])
    
    CREATE INDEX [IX_test2_i1] ON [dbo].[test2] ([i1])
    CREATE INDEX [IX_test2_i2] ON [dbo].[test2] ([i2])
    CREATE INDEX [IX_test2_i3] ON [dbo].[test2] ([i3])
    CREATE INDEX [IX_test2_v1] ON [dbo].[test2] ([v1])
    CREATE INDEX [IX_test2_v2] ON [dbo].[test2] ([v2])
    CREATE INDEX [IX_test2_v3] ON [dbo].[test2] ([v3])
    CREATE INDEX [IX_test2_d1] ON [dbo].[test2] ([d1])
    CREATE INDEX [IX_test2_d2] ON [dbo].[test2] ([d2])
    CREATE INDEX [IX_test2_d3] ON [dbo].[test2] ([d3])
    
    insert into test1 (t_id, i1, i2, i3, v1, v2, v3, d1, d2, d3)
    -- add data population here...
    
    insert into test2 (t_id, i1, i2, i3, v1, v2, v3, d1, d2, d3)
    select t_id, i1, i2, i3, v1, v2, v3, d1, d2, d3 from test1
    
    go
    
    create trigger test1_update on test1 for update
    as
    begin
    
    declare @i1_changed int,
        @i2_changed int,
        @i3_changed int,
        @v1_changed int,
        @v2_changed int,
        @v3_changed int,
        @d1_changed int,
        @d2_changed int,
        @d3_changed int
    
    IF UPDATE(i1)
        SELECT @i1_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.i1,0) != ISNULL(d.i1,0)
    IF UPDATE(i2)
        SELECT @i2_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.i2,0) != ISNULL(d.i2,0)
    IF UPDATE(i3)
        SELECT @i3_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.i3,0) != ISNULL(d.i3,0)
    IF UPDATE(v1)
        SELECT @v1_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.v1,'') != ISNULL(d.v1,'')
    IF UPDATE(v2)
        SELECT @v2_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.v2,'') != ISNULL(d.v2,'')
    IF UPDATE(v3)
        SELECT @v3_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.v3,'') != ISNULL(d.v3,'')
    IF UPDATE(d1)
        SELECT @d1_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.d1,'1/1/1980') != ISNULL(d.d1,'1/1/1980')
    IF UPDATE(d2)
        SELECT @d2_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.d2,'1/1/1980') != ISNULL(d.d2,'1/1/1980')
    IF UPDATE(d3)
        SELECT @d3_changed = COUNT(*) FROM Inserted i INNER JOIN Deleted d
            ON i.t_id = d.t_id WHERE ISNULL(i.d3,'1/1/1980') != ISNULL(d.d3,'1/1/1980')
    
    if (@i1_changed > 0)
    begin
        UPDATE test1 SET i1 = CASE WHEN i.i1 > d.i1 THEN i.i1 ELSE d.i1 END
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.i1 != d.i1
    end
    
    if (@i2_changed > 0)
    begin
        UPDATE test1 SET i2 = CASE WHEN i.i2 > d.i2 THEN POWER(i.i2, 1.1) ELSE POWER(d.i2, 1.1) END
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.i2 != d.i2
    end
    
    if (@i3_changed > 0)
    begin
        UPDATE test1 SET i3 = i.i3 ^ d.i3
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.i3 != d.i3
    end
    
    if (@v1_changed > 0)
    begin
        UPDATE test1 SET v1 = i.v1 + 'a'
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.v1 != d.v1
    end
    
    UPDATE test1 SET v2 = LEFT(i.v2, 5) + '|' + RIGHT(d.v2, 5)
    FROM test1
        INNER JOIN inserted i ON test1.t_id = i.t_id
        INNER JOIN deleted d ON i.t_id = d.t_id
    
    if (@v3_changed > 0)
    begin
        UPDATE test1 SET v3 = LEFT(i.v3, 5) + '|' + LEFT(i.v2, 5) + '|' + LEFT(i.v1, 5)
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.v3 != d.v3
    end
    
    if (@d1_changed > 0)
    begin
        UPDATE test1 SET d1 = DATEADD(dd, 1, i.d1)
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.d1 != d.d1
    end
    
    if (@d2_changed > 0)
    begin
        UPDATE test1 SET d2 = DATEADD(dd, DATEDIFF(dd, i.d2, d.d2), d.d2)
        FROM test1
            INNER JOIN inserted i ON test1.t_id = i.t_id
            INNER JOIN deleted d ON i.t_id = d.t_id
        WHERE i.d2 != d.d2
    end
    
    UPDATE test1 SET d3 = DATEADD(dd, 15, i.d3)
    FROM test1
        INNER JOIN inserted i ON test1.t_id = i.t_id
        INNER JOIN deleted d ON i.t_id = d.t_id
    
    end
    
    go
    
    create trigger test2_update on test2 for update
    as
    begin
    
        UPDATE test2 SET
            i1 = 
                CASE
                WHEN ISNULL(i.i1, 0) != ISNULL(d.i1, 0)
                THEN CASE WHEN i.i1 > d.i1 THEN i.i1 ELSE d.i1 END
                ELSE test2.i1 END,
            i2 = 
                CASE
                WHEN ISNULL(i.i2, 0) != ISNULL(d.i2, 0)
                THEN CASE WHEN i.i2 > d.i2 THEN POWER(i.i2, 1.1) ELSE POWER(d.i2, 1.1) END
                ELSE test2.i2 END,
            i3 = 
                CASE
                WHEN ISNULL(i.i3, 0) != ISNULL(d.i3, 0)
                THEN i.i3 ^ d.i3
                ELSE test2.i3 END,
            v1 = 
                CASE
                WHEN ISNULL(i.v1, '') != ISNULL(d.v1, '')
                THEN i.v1 + 'a'
                ELSE test2.v1 END,
            v2 = LEFT(i.v2, 5) + '|' + RIGHT(d.v2, 5),
            v3 = 
                CASE
                WHEN ISNULL(i.v3, '') != ISNULL(d.v3, '')
                THEN LEFT(i.v3, 5) + '|' + LEFT(i.v2, 5) + '|' + LEFT(i.v1, 5)
                ELSE test2.v3 END,
            d1 = 
                CASE
                WHEN ISNULL(i.d1, '1/1/1980') != ISNULL(d.d1, '1/1/1980')
                THEN DATEADD(dd, 1, i.d1)
                ELSE test2.d1 END,
            d2 = 
                CASE
                WHEN ISNULL(i.d2, '1/1/1980') != ISNULL(d.d2, '1/1/1980')
                THEN DATEADD(dd, DATEDIFF(dd, i.d2, d.d2), d.d2)
                ELSE test2.d2 END,
            d3 = DATEADD(dd, 15, i.d3)
        FROM test2
            INNER JOIN inserted i ON test2.t_id = i.t_id
            INNER JOIN deleted d ON test2.t_id = d.t_id
    
    end
    
    go
    
    -----
    -- the below code can be used to confirm that the triggers operated identically over both tables after a test
    select top 10 test1.i1, test2.i1, test1.i2, test2.i2, test1.i3, test2.i3, test1.v1, test2.v1, test1.v2, test2.v2, test1.v3, test2.v3, test1.d1, test1.d1, test1.d2, test2.d2, test1.d3, test2.d3
    from test1 inner join test2 on test1.t_id = test2.t_id
    where 
        test1.i1 != test2.i1 or 
        test1.i2 != test2.i2 or
        test1.i3 != test2.i3 or
        test1.v1 != test2.v1 or 
        test1.v2 != test2.v2 or
        test1.v3 != test2.v3 or
        test1.d1 != test2.d1 or 
        test1.d2 != test2.d2 or
        test1.d3 != test2.d3
    
    -- test 1 -- one column, one row
    update test1 set i3 = 64 where t_id = 1000
    go
    update test2 set i3 = 64 where t_id = 1000
    go
    
    update test1 set i3 = 64 where t_id = 1001
    go
    update test2 set i3 = 64 where t_id = 1001
    go
    
    -- test 2 -- one column, 10000 rows
    update test1 set v3 = LEFT(v3, 50) where t_id between 10000 and 20000
    go
    update test2 set v3 = LEFT(v3, 50) where t_id between 10000 and 20000
    go
    
    -- test 3 -- all columns, 1 row, non-self-referential
    update test1 set i1 = 1000, i2 = 2000, i3 = 3000, v1 = 'R12345123', v2 = 'Happy!', v3 = 'I am v3!!!', d1 = '1/1/1985', d2 = '1/1/1988', d3 = NULL
    where t_id = 3000
    go
    update test2 set i1 = 1000, i2 = 2000, i3 = 3000, v1 = 'R12345123', v2 = 'Happy!', v3 = 'I am v3!!!', d1 = '1/1/1985', d2 = '1/1/1988', d3 = NULL
    where t_id = 3000
    go
    
    -- test 4 -- all columns, 10000 rows, non-self-referential
    update test1 set i1 = 1000, i2 = 2000, i3 = 3000, v1 = 'R12345123', v2 = 'Happy!', v3 = 'I am v3!!!', d1 = '1/1/1985', d2 = '1/1/1988', d3 = NULL
    where t_id between 30000 and 40000
    go
    update test2 set i1 = 1000, i2 = 2000, i3 = 3000, v1 = 'R12345123', v2 = 'Happy!', v3 = 'I am v3!!!', d1 = '1/1/1985', d2 = '1/1/1988', d3 = NULL
    where t_id between 30000 and 40000
    go
    
    -----
    
    drop table test1
    drop table test2