I thought strcmp was supposed to return a positive number if the first string was larger than the second string. But this program:
#include <stdio.h>
#include <string.h>

int main(void)
{
    char A[] = "A";
    char Aumlaut[] = "Ä";

    printf("%i\n", A[0]);
    printf("%i\n", Aumlaut[0]);
    printf("%i\n", strcmp(A, Aumlaut));
    return 0;
}
prints 65, -61 and -1.
Why? Is there something I'm overlooking?
I thought that maybe saving the file as UTF-8 would influence things, because the Ä consists of 2 chars there. But saving in an 8-bit encoding and making sure that both strings have length 1 doesn't help; the end result is the same.
What am I doing wrong?
Using GCC 4.3 under 32 bit Linux here, in case that matters.
The strcmp and similar comparison functions treat the bytes in the strings as unsigned chars, as specified by the standard in section 7.24.4, point 1 (was 7.21.4 in C99):
The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared.
(emphasis mine).
The reason is probably that such an interpretation maintains the ordering between code points in the common encodings, while interpreting them as signed chars doesn't.