Search code examples
sqlsql-serversql-server-2008-r2

Cannot use UPDATE with OUTPUT clause when a trigger is on the table


I'm performing an UPDATE with OUTPUT query:

UPDATE BatchReports
SET IsProcessed = 1
OUTPUT inserted.BatchFileXml, inserted.ResponseFileXml, deleted.ProcessedDate
WHERE BatchReports.BatchReportGUID = @someGuid

This statement is well and fine; until a trigger is defined on the table. Then my UPDATE statement will get the error 334:

The target table 'BatchReports' of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause

Now this problem is explained in a blog post by the SQL Server team -- UPDATE with OUTPUT clause – Triggers – and SQLMoreResults:

The error message is self-explanatory

And they also give solutions:

The application was changed to utilize the INTO clause

Except I cannot make head nor tail of the entirety of the blog post.

So let me ask my question: What should I change my UPDATE to so that it works?

See also


Solution

  • Given the kludge needed to make UPDATE with OUTPUT work in SQL Server 2008 R2, I changed my query from:

    UPDATE BatchReports  
    SET IsProcessed = 1
    OUTPUT inserted.BatchFileXml, inserted.ResponseFileXml, deleted.ProcessedDate
    WHERE BatchReports.BatchReportGUID = @someGuid
    

    to:

    SELECT BatchFileXml, ResponseFileXml, ProcessedDate FROM BatchReports
    WHERE BatchReports.BatchReportGUID = @someGuid
    
    UPDATE BatchReports
    SET IsProcessed = 1
    WHERE BatchReports.BatchReportGUID = @someGuid
    

    Basically I stopped using OUTPUT. This isn't so bad as Entity Framework itself uses this very same hack!

    Hopefully 2012 2014 2016 2018 2019 2020 will have a better implementation.


    Update: using OUTPUT is harmful

    The problem we started with was trying to use the OUTPUT clause to retrieve the "after" values in a table:

    UPDATE BatchReports
    SET IsProcessed = 1
    OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
    WHERE BatchReports.BatchReportGUID = @someGuid
    

    That then hits the well-know limitation ("won't-fix" bug) in SQL Server:

    The target table 'BatchReports' of the DML statement cannot have any enabled triggers if the statement contains an OUTPUT clause without INTO clause

    Workaround Attempt #1

    So we try something where we will use an intermediate TABLE variable to hold the OUTPUT results:

    DECLARE @t TABLE (
       LastModifiedDate datetime,
       RowVersion timestamp, 
       BatchReportID int
    )
      
    UPDATE BatchReports
    SET IsProcessed = 1
    OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
    INTO @t
    WHERE BatchReports.BatchReportGUID = @someGuid
    
    SELECT * FROM @t
    

    Except that fails because you're not allowed to insert a timestamp into the table (even a temporary table variable).

    Workaround Attempt #2

    We secretly know that a timestamp is actually a 64-bit (aka 8 byte) unsigned integer. We can change our temporary table definition to use binary(8) rather than timestamp:

    DECLARE @t TABLE (
       LastModifiedDate datetime,
       RowVersion binary(8), 
       BatchReportID int
    )
      
    UPDATE BatchReports
    SET IsProcessed = 1
    OUTPUT inserted.LastModifiedDate, inserted.RowVersion, inserted.BatchReportID
    INTO @t
    WHERE BatchReports.BatchReportGUID = @someGuid
    
    SELECT * FROM @t
    

    And that works, except that the value are wrong.

    The timestamp RowVersion we return is not the value of the timestamp as it existed after the UPDATE completed:

    • returned timestamp: 0x0000000001B71692
    • actual timestamp: 0x0000000001B71693

    That is because the values OUTPUT into our table are not the values as they were at the end of the UPDATE statement:

    Event Timestamp
    UPDATE statement starting 2
    row is modified 2
    timestamp is updated 2 ⇒ 3
    OUTPUT retrieves new timestamp (3) 3
    trigger starts 3
    trigger modifies row 3
    timestamp is updated 3 ⇒ 4
    trigger completes 4
    UPDATE statement completed 4
    OUTPUT returns 3 (the incorrect value) 4

    This means:

    • When the UPDATE statement completes the timestamp is 4
    • but the OUTPUT returns a timestamp of 3 (the wrong value)

    We do not get the timestamp as it exists at the end of the UPDATE statement (4). Instead we get the timestamp as it was in the indeterminate middle of the UPDATE statement (3).

    We do not get the correct timestamp

    The same is true of any trigger that modifies any value in the row. The OUTPUT will not OUTPUT the value as of the end of the UPDATE.

    This means you cannot trust OUTPUT to return any correct values ever.

    What SQL Server should do

    Event Timestamp
    UPDATE statement starting 2
    row is modified 2
    timestamp is updated 2 ⇒ 3
    trigger starts 3
    trigger modifies row 3
    timestamp is updated 3 ⇒ 4
    trigger completes 4
    OUTPUT retrieves new timestamp (4) 4
    UPDATE statement completed 4
    OUTPUT returns 4 (the correct value) 4

    This painful reality is documented in the BOL:

    Columns returned from OUTPUT reflect the data as it is after the INSERT, UPDATE, or DELETE statement has completed but before triggers are executed.

    How did Entity Framework solve it?

    The .NET Entity Framework uses rowversion for Optimistic Concurrency. The EF depends on knowing the value of the timestamp as it exists after they issue an UPDATE.

    Since you cannot use OUTPUT for any important data, Microsoft's Entity Framework uses the same workaround that I do:

    Workaround #3 - Final - Do not use OUTPUT clause

    In order to retrieve the after values, Entity Framework issues:

    UPDATE [dbo].[BatchReports]
    SET [IsProcessed] = @0
    WHERE (([BatchReportGUID] = @1) AND ([RowVersion] = @2))
    
    SELECT [RowVersion], [LastModifiedDate]
    FROM [dbo].[BatchReports]
    WHERE @@ROWCOUNT > 0 AND [BatchReportGUID] = @1
    

    Don't use OUTPUT.

    Yes it suffers from a race condition, but that's the best SQL Server can do.

    What about INSERTs

    Do what Entity Framework does:

    SET NOCOUNT ON;
    
    DECLARE @generated_keys table([CustomerID] int)
    
    INSERT Customers (FirstName, LastName)
    OUTPUT inserted.[CustomerID] INTO @generated_keys
    VALUES ('Steve', 'Brown')
    
    SELECT t.[CustomerID], t.[CustomerGuid], t.[RowVersion], t.[CreatedDate]
    FROM @generated_keys AS g
       INNER JOIN Customers AS t
       ON g.[CustomerGUID] = t.[CustomerGUID]
    WHERE @@ROWCOUNT > 0
    

    Again, they use a SELECT statement to read the row, rather than placing any trust in the OUTPUT clause.

    Update: Entity Framework now broken-by-default

    In 2022, Entity Framework added a pre-mature optimization bug.

    • in the same way you have to disable any use of OUTPUT in T-SQL
    • you have to disable any use of OUTPUT by Entity Framework

    Microsoft explains how to undo the bug added by #27372:

    modelBuilder.Entity<Blog>().ToTable(tb => tb.UseSqlOutputClause(false));
    

    With the down-side that you have to retroactively apply it to every table you have, or ever will have.