Search code examples
sql-servert-sqlrecursive-query

Is recursive variable self-assignment valid?


I found some T-SQL code, and i'm trying to figure out if it is:

  • an elegant solution that i never knew about
  • a horrible abomination that happens to work, and should absolutely be removed
  • or somewhere in between.

Standardize Names

We have a requirement to try to cleanup names:

  • BobRobert
  • BillWilliam
  • DickRichard

So a table was created to hold the value and the value to replace it with. It holds the above words (as well as more esoteric, domain-specific, industry specific, mappings):

FillerWords:

Value ReplaceWith
dick Richard
bill William
bob Robert
ted William

And the code I saw used to perform the replacment was either genius or madness:

DECLARE @firstName varchar(200) = 'bill';
SELECT @firstName = REPLACE(@firstName, FillerWords.Value, FillerWords.ReplaceWith) FROM FillerWords;
SELECT @firstName;

Which gives:

(no column name)
----------------
William

1 row(s) affected

Even more impressive is that it can do multiple replacements at once:

DECLARE @firstName varchar(200) = 'teddickbobbill';
SELECT @firstName = REPLACE(@firstName, FillerWords.Value, FillerWords.ReplaceWith) FROM FillerWords;
SELECT @firstName;

Which gives:

(no column name)
----------------
WilliamRichardRobertWilliam

1 row(s) affected

It works, but is it valid?

How does it work?

If you eliminate the variable assignment during the select:

SELECT REPLACE(@firstName, fw.Value, fw.ReplaceWith) FROM #FillerWords fw

you see that it is performing the REPLACE on every row in the table:

(No column name)
teddickRobertbill
tedRichardbobbill
Williamdickbobbill
teddickbobWilliam

And so what happens is that each time a row is evaluated, the @firstName variable is updated:

@firstName Value ReplaceWith new value of @firstName
teddickbobbill dick Richard tedRichardbobbill
tedRichardbobbill bill William tedRichardbobWilliam
tedRichardbobWilliam bob Robert tedRichardRobertWilliam
tedRichardRobertWilliam ted William WilliamRichardRobertWilliam

So it seems to rely on a quirk of variable assignment, that you can keep revising the same variable row-after-row; it's almost like a cursor operation.

Is that style of T-SQL programming supported? I know recursive common-table-expressions (CTEs) do it all the time; but is it allowed here?

I know the code's success relies on the order in which rows are processed, which is generally not guaranteed in SQL. This could lead to inconsistent results, especially if the underlying data or the SQL Server's query execution plan changes. I do not care about this, here, in this question.

One Possible Answer

I asked one oratrice mechanique d'analyse cardinal, and it said:

The code exploits a specific behavior of T-SQL where a variable can be updated in a SELECT statement. However, this is not standard SQL behavior and may not be intuitive or maintainable. While this works in current versions of SQL Server, there's no guarantee that such behavior will remain consistent or supported in future versions, as it's not a documented feature for such purposes.

Is setting variables resursively not an accepted pattern?


Solution

  • This falls into the same bucket as the old string aggregation technique which sometimes worked and sometimes didn't (dependent on execution plan) and which has never been a supported technique.

    The documentation explicitly warns against "Antipattern use of recursive variable assignment" archive

    Your code:

    SELECT @firstName = REPLACE(@firstName, ...) 
    FROM
    

    Is certainly of the warned against structure:

    SELECT @Var = <expression containing @Var>
    FROM
    

    Definitely not something that should be in production code that you care about as it is not guaranteed to work.

    In the case the string concatenation approach "failed" then the resultant value of the variable just contained the result of a single assignment so I would expect there is some possibility of getting a plan that only effectively does a single REPLACE too.