
The computation in "int64var = int32var * int32var" does not overflow as expected. Why?


I strongly believe that there is something strange going on, so I want to pose this question.

#include <time.h>
#include <stdint.h>

// shall return a monotonically increasing time in microseconds
int64_t getMonotonicTime() {
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);

   int64_t ret;
   ret = ts.tv_sec * 1000000; // #1 HERE
   ret += ts.tv_nsec / 1000;
   return ret;
}

The problematic line is ts.tv_sec * 1000000, which overflows on systems where time_t and int are both 32 bits wide (which happens to be the case on my system) whenever ts.tv_sec is greater than 2147.
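
For reference, the overflow can be avoided by widening one operand before the multiplication, so that the whole product is computed in 64 bits. A minimal sketch (the function name is mine):

#include <time.h>
#include <stdint.h>

// Sketch: widen tv_sec to int64_t before multiplying, so the product
// cannot overflow a 32-bit int even when time_t is 32 bits wide.
int64_t getMonotonicTimeFixed(void) {
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}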

When I wrote test cases for this bug, I found that the Intel icc compiler did not overflow, even though I verified that its time_t is indeed also 32 bits wide, and the result did not change even with optimizations disabled. GCC overflowed as expected.

Undefined behavior aside, what could be the reason that icc does not overflow? What could be the rationale of the Intel developers? Or is it simply a misunderstanding on my part?


Solution

  • Looking at this simple code:

    long f(int a,int b){
      return a*b;
    }
    

    We see three different assembly sequences. GCC generates:

    movl    %edi, %eax
    imull   %esi, %eax
    cltq
    

    clang:

    imull   %esi, %edi
    movslq  %edi, %rax
    

    intel:

    movslq    %esi, %rsi
    movslq    %edi, %rdi
    imulq     %rsi, %rdi
    movq      %rdi, %rax
    

    Basically, you can multiply two 32-bit numbers (imull) and then sign-extend the result, or you can sign-extend the operands and then multiply them as 64-bit numbers (imulq). In the second case you should in principle keep only the low 32 bits and sign-extend them, but that is unnecessary: the only cases where it would matter are those where the 32-bit multiplication overflowed, which is undefined behavior. This optimization (dropping the final truncation and sign-extension) is precisely what you observed.
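
    To make the difference concrete, here is a small sketch (my own illustration, not part of the original answer) of the two lowerings written out in C. The two functions can only disagree when the 32-bit multiplication overflows, which is undefined behavior, so the compiler is free to pick either instruction sequence:

    #include <stdio.h>

    // 32-bit multiply (imull), then sign-extend the result (cltq/movslq).
    // Undefined behavior if a*b does not fit in an int.
    long narrow_then_widen(int a, int b) {
        int p = a * b;
        return (long)p;
    }

    // Sign-extend the operands first, then do a 64-bit multiply (imulq).
    long widen_then_multiply(int a, int b) {
        return (long)a * (long)b;
    }

    int main(void) {
        int a = 100000, b = 100000;                 // the product needs 34 bits
        printf("%ld\n", narrow_then_widen(a, b));   // typically a wrapped value
        printf("%ld\n", widen_then_multiply(a, b)); // 10000000000 on LP64
        return 0;
    }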