I have been getting an error in a previously working stored procedure called by an SSRS report and I have traced it down to a LIKE statement in a scalar function that is called by the stored procedure, in combination with a 7000+ NVARCHAR(MAX) string. It is something similar to:
Msg 8152, Level 16, State 10, Line 14 String or binary data would be truncated.
I can reproduce it with the following code:
DECLARE @name1 NVARCHAR(MAX) = ''
DECLARE @name2 NVARCHAR(MAX) = ''
DECLARE @count INT = 4001
WHILE @count > 0
BEGIN
SET @name1 = @name1 + 'a'
SET @name2 = @name2 + 'a'
SET @count = @count - 1
END
SELECT LEN(@name1)
IF @name1 LIKE @name2
PRINT 'OK'
What's the deal? Is there anyway around this limitation, or is it there for good reason? Thanks.
You can also reproduce it without the terrible loop:
DECLARE @name1 NVARCHAR(MAX), @name2 NVARCHAR(MAX);
SET @name1 = REPLICATE(CONVERT(NVARCHAR(MAX), N'a'), 4000);
SET @name2 = @name1;
IF @name1 LIKE @name2
PRINT 'OK';
SELECT @name1 += N'a', @name2 += N'a';
IF @name1 LIKE @name2
PRINT 'OK';
Result:
OK
Msg 8152, Level 16, State 10, Line 30
String or binary data would be truncated.
In any case, the reason is clearly stated in the documentation for LIKE
(emphasis mine):
match_expression [ NOT ] LIKE pattern [ ESCAPE escape_character ]
...
pattern
Is the specific string of characters to search for in match_expression, and can include the following valid wildcard characters. pattern can be a maximum of 8,000 bytes.
And 8,000 bytes is used up by 4,000 Unicode characters.
I would suggest that comparing the first 4,000 characters is probably sufficient:
WHERE column LIKE LEFT(@param, 4000) + '%';
I can't envision any scenario where you want to compare the whole thing; how many strings contain the same first 4000 characters but then character 4001 is different? If that really is a requirement, I guess you could go to the great lengths identified in the Connect item David pointed out.
A simpler (though probably much more computationally expensive) workaround might be:
IF CONVERT(VARBINARY(MAX), @name1) = CONVERT(VARBINARY(MAX), @name2)
PRINT 'OK';
I suggest that it would be far better to fix the design and stop identifying rows by comparing large strings. Is there really no other way to identify the row you're after? This is like finding your car in the parking lot by testing the DNA of all the Dunkin Donuts cups in all the cup holders, rather than just checking the license plate.