Will compilers optimize "strnlen(mystring, 32) > 2" to stop looping as soon as the running length exceeds 2?

Do modern compilers (or has this perhaps been in place since C89) substitute short-circuiting code when evaluating conditional expressions like the one below?

char mystring[32] = "this is a long line";
if((strnlen(mystring, 32)) > 2)
{
    return 1;
}

That is, does the compiler take the right operand into account while processing strnlen(...), so that strnlen(...) breaks out the moment the running length of the C string exceeds the right operand of the outer comparison (2 in this case)?

Would it have mattered if I hadn't preassigned the string length?
Would it have mattered if I had removed the parentheses from the IF inner expression?
Would it have mattered if I had switched the operands and the operator to a <?

Solution

Maybe, depending on the compiler. Let's look at some examples, compiled with gcc 13.2.0 and clang 17.0.1, both at optimization level -O3 and with extensions enabled (note that strnlen is POSIX, not standard C).

#include <string.h>  /* declares strnlen when POSIX/GNU extensions are enabled */

int p() {
    char mystring[32] = "this is a long line";
    return strnlen(mystring, 32) > 2;
}

Both clang and gcc optimize this to mov eax, 1; ret. This is because they know the behavior of strnlen and the contents of mystring, so they can compute the return value at compile time instead of calling the function at runtime. (In gcc this is implemented via __builtin_strnlen.)

If the strnlen function isn't a known builtin, but can be inlined:

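#include <stddef.h>  /* size_t */

/* static inline: a plain inline definition in C may leave an undefined
   reference at link time if the call is not actually inlined */
static inline size_t my_strnlen(char const* s, size_t n) {
    for (size_t i = 0; i != n; ++i)
        if (s[i] == 0)
            return i;   /* terminator found before the bound */
    return n;           /* no terminator within the first n bytes */
}
int p() {
    char mystring[32] = "this is a long line";
    return my_strnlen(mystring, 32) > 2;
}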

Here clang optimizes to mov eax, 1 but gcc emits a loop.
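
As a sanity check that my_strnlen really is a stand-in for strnlen, here is a minimal sketch that compares it against min(strlen(s), n), which is the value POSIX specifies for strnlen on a null-terminated string. The matches_reference helper is hypothetical (it is not part of the snippets above), and my_strnlen is repeated without inline so that the snippet stands alone:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same loop as above, repeated so this snippet compiles by itself. */
size_t my_strnlen(char const* s, size_t n) {
    for (size_t i = 0; i != n; ++i)
        if (s[i] == 0)
            return i;
    return n;
}

/* Hypothetical checker: returns 1 if my_strnlen(s, n) equals
   min(strlen(s), n) for every n in 0..max_n, else 0.
   s must be null-terminated. */
int matches_reference(char const* s, size_t max_n) {
    assert(s != NULL);
    size_t len = strlen(s);
    for (size_t n = 0; n <= max_n; ++n) {
        size_t expected = len < n ? len : n;
        if (my_strnlen(s, n) != expected)
            return 0;
    }
    return 1;
}
```

Calling matches_reference(mystring, 32) on the buffer from the question should return 1.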

Finally, for an unknown predicate that is marked pure to tell the optimizer that it has no side effects:

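#include <stddef.h>  /* size_t */

/* pure: the result depends only on the argument (and global state);
   calling f has no side effects */
__attribute__((pure)) int f(char);
static inline size_t my_strnlen_f(char const* s, size_t n) {
    for (size_t i = 0; i != n; ++i)
        if (f(s[i]))
            return i;
    return n;
}
int p() {
    char mystring[32] = "this is a long line";
    return my_strnlen_f(mystring, 32) > 2;
}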

gcc again emits a loop; clang emits some rather clumsy code (what's up with ebx?) that nevertheless shows that it knows that f needs to be called no more than 3 times, with the character codes of the first 3 characters. The string itself is optimized out entirely:

p:                                      # @p
        push    rbx
        mov     edi, 116 # 't'
        call    f@PLT
        xor     ebx, ebx
        test    eax, eax
        je      .LBB2_1
.LBB2_3:
        mov     eax, ebx
        pop     rbx
        ret
.LBB2_1:
        mov     edi, 104 # 'h'
        call    f@PLT
        test    eax, eax
        jne     .LBB2_3
        mov     edi, 105 # 'i'
        call    f@PLT
        xor     ebx, ebx
        test    eax, eax
        sete    bl
        mov     eax, ebx
        pop     rbx
        ret
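
Since the results above differ between compilers, the early exit can also be written by hand instead of being left to the optimizer. The following is a minimal sketch; longer_than is a hypothetical helper name, not a library function. It reads at most limit + 1 bytes, which is the same work that strnlen(s, limit + 1) > limit would do, since strnlen never examines more bytes than its bound:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the null-terminated string s has
   more than `limit` characters before its terminator, else 0.
   Stops after examining at most limit + 1 bytes. */
int longer_than(char const* s, size_t limit) {
    assert(s != NULL);
    for (size_t i = 0; i <= limit; ++i)
        if (s[i] == 0)
            return 0;   /* terminator within the first limit + 1 bytes */
    return 1;           /* limit + 1 non-null bytes seen */
}
```

For the question's example, longer_than(mystring, 2) answers the same question as strnlen(mystring, 32) > 2 while looking at no more than the first three characters.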

Would it have mattered if I hadn't preassigned the string length?

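No. In that case the compiler sizes the array from the string literal (string length + 1 for the terminator), so the buffer would be 20 bytes instead of 32. The string is still null-terminated, so strnlen stops at the terminator before the bound of 32 comes into play.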
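
A small sketch of the size difference, assuming a C11 compiler for static_assert; the two function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Bound omitted: the array is sized from the initializer. */
size_t bytes_without_bound(void) {
    char mystring[] = "this is a long line";
    static_assert(sizeof mystring == 20, "19 characters plus the terminator");
    return sizeof mystring;
}

/* Bound given: the remaining bytes are zero-initialized. */
size_t bytes_with_bound(void) {
    char mystring[32] = "this is a long line";
    static_assert(sizeof mystring == 32, "explicit bound wins");
    return sizeof mystring;
}
```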

Would it have mattered if I had removed the parentheses from the IF inner expression?

No. The optimizer works on an intermediate representation of the program in which such syntactic details no longer exist.

Would it have mattered if I had switched the operands and the operator to a <?

Almost certainly not. The optimizer is capable of understanding that strnlen(...) > 2 and 2 < strnlen(...) are equivalent.
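
To make the last two answers concrete, here is a sketch of the three spellings side by side. The function names are made up, and len stands in for the value returned by strnlen(mystring, 32). All three denote the same comparison, so one would expect a compiler to generate the same code for each, although that is not guaranteed by the language:

```c
#include <stddef.h>

/* Original form, with the redundant parentheses. */
int with_parens(size_t len)    { return (len) > 2; }

/* Parentheses removed. */
int without_parens(size_t len) { return len > 2; }

/* Operands swapped and the operator changed to <. */
int swapped(size_t len)        { return 2 < len; }
```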