
Mystery with strcmp output: how does strcmp actually compare strings?


I want to know why strcmp() returns different values when it is called more than once in the same function. The program is below. I understand why the first call prints -6, but why does the second call print -1?

#include<stdio.h>
#include<string.h>
int main()
{
    char a[10] = "aa";
    char b[10] = "ag";
    printf("%d\n",strcmp(a, b));
    printf("%d\n",strcmp("aa","ag"));
    return 0;
}

The output it produces is:

[sxxxx@bhlingxxx test]$ gcc -Wall t51.c
[sxxxx@bhlingxxx test]$ ./a.out
    -6
    -1

Why does the second strcmp() return -1? Is the compiler responsible? If so, what exact optimization does it perform?


Solution

  • The C standard says the following regarding the return value of strcmp:

    Section 7.24.4.2p3:

    The strcmp function returns an integer greater than, equal to, or less than zero, accordingly as the string pointed to by s1 is greater than, equal to, or less than the string pointed to by s2

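    So as long as the result fits that description, it complies with the C standard. Only the sign of the return value is specified, not its magnitude, so -6 and -1 are both valid results for this comparison, and the compiler is free to optimize the call as long as the sign is preserved.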

    If we look at the assembly code:

    .loc 1 7 0
    leaq    -32(%rbp), %rdx
    leaq    -48(%rbp), %rax
    movq    %rdx, %rsi
    movq    %rax, %rdi
    call    strcmp
    movl    %eax, %esi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    .loc 1 8 0
    movl    $-1, %esi      # result of strcmp is precomputed!
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    

    In the first case, arrays are passed to strcmp, so a call to strcmp and a call to printf are generated. In the second case, however, both arguments are string literals. The compiler sees this and computes the result itself at compile time, optimizing out the actual call to strcmp, and passes the hardcoded value -1 to printf.