I want to count the characters in an ntext field. Following Pinal Dave's advice, I am using datalength. But this function seems to double the value I am looking for. When I copy the value in the field into word and count the characters, I get 1502. But when I do
select datalength(result) from myTable
I get a value of 3004 characters.
Why?
Unicode is two bytes per character. Your NText
field is a Unicode string. DataLength()
returns the number of bytes required to store a field, Len()
returns the number of characters.
From Len()
: "Returns the number of characters of the specified string expression, excluding trailing blanks." DataLength
does not exclude trailing blanks. For Unicode strings you can use DataLength( UnicodeStringExpression ) / DataLength( N'#' )
to get the length in characters.
In general DataLength( Left( Coalesce( StringExpression, '#' ), 1 ) )
will return the number of bytes per character since Coalesce
returns a value based on data type precedence where Unicode strings are higher precedence than byte-string types (char
and varchar
).
declare @Foo as VarChar(10) = 'Foo and ';
declare @Bar as NVarChar(10) = N'Bar and ';
select @Foo as [@Foo],
Len( @Foo ) as [Len (trimmed)], DataLength( @Foo ) as [DataLength (bytes)],
DataLength( Left( Coalesce( @Foo, '#' ), 1 ) ) as BytesPerCharacter,
DataLength( @Foo ) / DataLength( Left( Coalesce( @Foo, '#' ), 1 ) ) as 'Characters';
select @Bar as [@Bar],
Len( @Bar ) as [Len (trimmed)], DataLength( @Bar ) as [DataLength (bytes)],
DataLength( Left( Coalesce( @Bar, '#' ), 1 ) ) as BytesPerCharacter,
DataLength( @Bar ) / DataLength( Left( Coalesce( @Bar, '#' ), 1 ) ) as 'Characters';