
Why does the address difference between Unicode characters look incorrect here?


Look at the code below:

#include <Windows.h>
#include <tchar.h>
#include <stdio.h>
int main()
{
    TCHAR szStr[] = TEXT("C++中文你好");
    printf("sizeof(szStr) = %zu\n", sizeof(szStr)); //16
    LPTSTR lp = _tcschr(szStr, TEXT('好'));
    _tprintf(TEXT("szStr = %p, lp = %p \n"), szStr, lp); //szStr = 0000001F9F51FAA8, lp = 0000001F9F51FAB4
    _tprintf(TEXT("difference= %td"), lp - szStr); //6
}

TCHAR is interpreted as UTF-16 here, because each character in szStr occupies two bytes. However, the address difference does not seem to reflect that: there are 6 characters between the start of the array and the Chinese character '好', yet the difference is 6, not 12. Could someone explain the reason to me?


Solution

  • Subtracting two pointers gives you the difference in elements, not in bytes.

    From the C11 standard, 6.5.6 Additive operators, Paragraph 9:

    When two pointers are subtracted, ... the result is the difference of the subscripts of the two array elements ...

    When you want to calculate the difference in bytes, you can either multiply the result by the size of one element or cast both pointers to a character pointer type (char *, unsigned char * or signed char *) before subtracting.

    _tprintf(TEXT("difference elements= %td\n"), lp - szStr); //6
    _tprintf(TEXT("difference size    = %zu\n"), (lp - szStr)*sizeof *lp); //12, the product has type size_t
    _tprintf(TEXT("difference bytes   = %td\n"), (char*)lp - (char*)szStr); //12
    

    The last 2 should always output the same value.
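
The rule above can also be sketched without any Windows headers. This is only a minimal illustration, assuming a 2-byte element type: uint16_t stands in for a UTF-16 TCHAR, and the helper names diff_elements and diff_bytes are hypothetical, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Distance between two pointers into the same array, counted in elements.
   This is what plain pointer subtraction yields. */
ptrdiff_t diff_elements(const uint16_t *first, const uint16_t *last)
{
    return last - first;
}

/* The same distance counted in bytes: both pointers are converted to
   character pointers before subtracting, so each "element" is one byte. */
ptrdiff_t diff_bytes(const uint16_t *first, const uint16_t *last)
{
    return (const char *)last - (const char *)first;
}
```

For an array declared as uint16_t units[8], diff_elements(units, units + 6) gives 6 while diff_bytes(units, units + 6) gives 12, which matches the 0xC gap between the two addresses printed in the question.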