Search code examples
sqlsql-serverdatalength

why does sql server's datalength function double the length of my field?


I want to count the characters in an ntext field. Following Pinal Dave's advice, I am using datalength. But this function seems to double the value I am looking for. When I copy the value in the field into word and count the characters, I get 1502. But when I do

select datalength(result) from myTable 

I get a value of 3004 characters.

Why?


Solution

  • Unicode is two bytes per character. Your NText field is a Unicode string. DataLength() returns the number of bytes required to store a field, Len() returns the number of characters.

    From Len(): "Returns the number of characters of the specified string expression, excluding trailing blanks." DataLength does not exclude trailing blanks. For Unicode strings you can use DataLength( UnicodeStringExpression ) / DataLength( N'#' ) to get the length in characters.

    In general DataLength( Left( Coalesce( StringExpression, '#' ), 1 ) ) will return the number of bytes per character since Coalesce returns a value based on data type precedence where Unicode strings are higher precedence than byte-string types (char and varchar).

    declare @Foo as VarChar(10) = 'Foo and ';
    declare @Bar as NVarChar(10) = N'Bar and ';
    
    select @Foo as [@Foo],
      Len( @Foo ) as [Len (trimmed)], DataLength( @Foo ) as [DataLength (bytes)],
      DataLength( Left( Coalesce( @Foo, '#' ), 1 ) ) as BytesPerCharacter,
      DataLength( @Foo ) / DataLength( Left( Coalesce( @Foo, '#' ), 1 ) ) as 'Characters';
    
    select @Bar as [@Bar],
      Len( @Bar ) as [Len (trimmed)], DataLength( @Bar ) as [DataLength (bytes)],
      DataLength( Left( Coalesce( @Bar, '#' ), 1 ) ) as BytesPerCharacter,
      DataLength( @Bar ) / DataLength( Left( Coalesce( @Bar, '#' ), 1 ) ) as 'Characters';