Tags: c++, c, integer-overflow

In intsafe.h of win32api, why is the code for unsigned representation of the absolute value of an integer so strange?


I'm not familiar with C/C++, but I know that this code computes the unsigned absolute value. What is strange is why the Microsoft coder adds 1 to the number, negates it, and then adds 1 again. Why can't it just be negated and then converted to an unsigned integer?

//intsafe.h
    if (llMultiplicand < 0)
    {
        //
        // Avoid negating the most negative number.
        //
        ullMultiplicand = ((ULONGLONG)(- (llMultiplicand + 1))) + 1;
    }
    else
    {
        ullMultiplicand = (ULONGLONG)llMultiplicand;
    }
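
To see what the expression actually computes, here is a minimal standalone sketch (my own test program, not part of intsafe.h, using standard types rather than ULONGLONG and assuming a 64-bit long long) that applies the same +1 / negate / +1 pattern to LLONG_MIN:

    #include <limits.h>
    #include <stdio.h>

    int main(void) {
        long long ll = LLONG_MIN;  /* the most negative value, which cannot be negated directly */
        /* same pattern as intsafe.h: add 1, negate, convert, then add 1 back */
        unsigned long long ull = ((unsigned long long)(-(ll + 1))) + 1;
        printf("%lld -> %llu\n", ll, ull);  /* -9223372036854775808 -> 9223372036854775808 */
        return 0;
    }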

For example, for an integer with only 4 bits, the minimum value is -8, which is "1000" in binary. I know that negating it still gives the bit pattern "1000", but if that is converted to an unsigned integer at this point, its unsigned absolute value (8) is already obtained.

I thought it might be a performance issue, but after running the comparison below in C several times, I found it hard to see any difference. It is also strange that the position of the functions in the source file affects the overall time: for example, declaring and defining abs1 above abs2, or the other way around, changes the results even though the call sites are identical. I cannot understand this.

#include <stdio.h>
#include <time.h>

unsigned int abs1(int x) {
    if (x < 0) {
        return (unsigned int)(-x);
    }
    return (unsigned int)x;
}
unsigned int abs2(int x) {
    if (x < 0) {
        return (unsigned int)(-(x + 1) + 1);
    }
    return (unsigned int)x;
}


int main() {
    int numIterations = 100000000;
    clock_t start, end;
    double cpu_time_used;

    // Test abs1
    start = clock();
    for (int i = 0; i < numIterations; i++) {
        abs1(-5); // Replace -5 with your test input
    }
    end = clock();
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("Time taken by abs1: %f seconds\n", cpu_time_used);

    // Test abs2
    start = clock();
    for (int i = 0; i < numIterations; i++) {
        abs2(-5); // Replace -5 with your test input
    }
    end = clock();
    cpu_time_used = ((double)(end - start)) / CLOCKS_PER_SEC;
    printf("Time taken by abs2: %f seconds\n", cpu_time_used);

    return 0;
}
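
As a side note on the timing code above: with a constant argument and an unused return value, an optimizing compiler may inline abs1(-5) / abs2(-5) and delete the loop bodies entirely, and without optimization the measurements are easily dominated by code-layout noise, which could explain why the order of the function definitions appears to matter. Below is a sketch of a variant (my own, under those assumptions) that keeps the calls alive:

    #include <stdio.h>
    #include <time.h>

    /* Same two functions as above, written compactly. */
    unsigned int abs1(int x) { return x < 0 ? (unsigned int)(-x) : (unsigned int)x; }
    unsigned int abs2(int x) { return x < 0 ? (unsigned int)(-(x + 1) + 1) : (unsigned int)x; }

    /* Times one function; the volatile sink and the varying argument make it
       much harder for the compiler to delete the calls. */
    static double time_abs(unsigned int (*f)(int), int iterations) {
        volatile unsigned long long sink = 0;
        clock_t start = clock();
        for (int i = 1; i <= iterations; i++) {
            sink += f(-i);
        }
        clock_t end = clock();
        (void)sink;
        return (double)(end - start) / CLOCKS_PER_SEC;
    }

    int main(void) {
        int numIterations = 100000000;
        printf("Time taken by abs1: %f seconds\n", time_abs(abs1, numIterations));
        printf("Time taken by abs2: %f seconds\n", time_abs(abs2, numIterations));
        return 0;
    }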

===============

Okay, I understand now that the fundamental reason is that many behaviors in the C/C++ standard are undefined. A sufficiently standard-conforming library should not rely on any undefined behavior, because undefined behavior cannot be reasoned about portably, even if it has specific, predictable behavior on certain implementations. I have also learned that some compilers optimize on the assumption that undefined behavior never occurs, which makes the results genuinely unpredictable. This is different from many newer languages, which are defined mainly by a single official implementation.
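
To illustrate the point about optimization, here is a small, hypothetical example (not related to intsafe.h) where a compiler is allowed to remove an overflow check entirely, precisely because signed overflow is undefined:

    #include <limits.h>
    #include <stdio.h>

    /* Because signed overflow is undefined, the compiler may assume x + 1 never
       wraps around, and is therefore allowed to fold this function to "return 0;". */
    int next_is_smaller(int x) {
        return x + 1 < x;
    }

    int main(void) {
        /* May print 1 (wrap-around happened to occur) or 0 (check optimized away),
           depending on the compiler and optimization level. */
        printf("%d\n", next_is_smaller(INT_MAX));
        return 0;
    }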

===============


Solution

  • The code in the if branch (taken when llMultiplicand is negative) attempts to find the arithmetic negation of llMultiplicand.

    (unsigned long long int)-llMultiplicand isn't used as it could result in undefined behaviour for the largest negative value.


    Imagine if the long long int type was 8 bits long.[1] Its range on a two's complement machine[2] would be -128 to +127. Trying to negate the most negative value (-128) would result in an unrepresentable value (+128). This is an overflow. The same thing happens if long long int is larger.

    Overflows from signed integer arithmetic are undefined behaviour.[3] The code is written as it is to avoid this undefined behaviour.

    Let's go back to our example of negating -128 on a machine with an 8-bit long long int.

    Naïve approach:

      (unsigned long long int)-llMultiplicand
    ⇒ (unsigned long long int)-(-128ll)
    ⇒ XXX Undefined behaviour.
    

    Approach taken:

      ((unsigned long long int)(-(llMultiplicand + 1))) + 1
    ⇒ ((unsigned long long int)(-((-128ll) + 1))) + 1
    ⇒ ((unsigned long long int)(-(-127ll))) + 1
    ⇒ ((unsigned long long int)(127ll)) + 1
    ⇒ 127ull + 1
    ⇒ 128ull
    

    Note that a good compiler will generate a simple negation for this code when the CPU provides a negation instruction with the desired behaviour. So while it looks more complicated, there is no performance penalty. (An alternative formulation that relies only on unsigned arithmetic is sketched after the footnotes below.)


    1. It can't legally be that small, but I just want to work with manageable and familiar numbers.
    2. Such as the x86 and the x86-64.
    3. All unsigned integer arithmetic is performed with a modulo equal to one more than the largest number representable by the result type, so overflows aren't possible with unsigned integers.
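
    As footnote 3 hints, there is also an alternative formulation (my own sketch, not what intsafe.h uses) that avoids signed overflow by converting first and then negating in unsigned arithmetic, where wrap-around is well defined:

      #include <limits.h>
      #include <stdio.h>

      /* Convert first (well defined: the value is reduced modulo 2^64 for a
         64-bit type), then negate in the unsigned domain (also well defined). */
      unsigned long long unsigned_abs(long long v) {
          if (v < 0)
              return 0ULL - (unsigned long long)v;
          return (unsigned long long)v;
      }

      int main(void) {
          printf("%llu\n", unsigned_abs(LLONG_MIN));  /* 9223372036854775808 */
          printf("%llu\n", unsigned_abs(-5LL));       /* 5 */
          return 0;
      }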