c · linux · gettimeofday

Program stops unexpectedly when I use gettimeofday() in an infinite loop


I've written code to ensure that each iteration of a while(1) loop takes a specific amount of time (in this example 10000 µs, which equals 0.01 seconds). The problem is that this code works pretty well at the start but somehow stops after less than a minute. It's as if there were some limit on accessing the Linux time. For now, I am using a boolean variable so that this time calculation runs only once instead of on every iteration of the infinite loop. Since performance varies over time, it would be good to measure the computation time of each loop. Is there any other way to accomplish this?

#include <sys/time.h>   /* gettimeofday(), struct timeval */
#include <unistd.h>     /* usleep() */

void some_function(){
    struct timeval tstart, tend;
    long diff;
    while (1){
        gettimeofday (&tstart, NULL);
        ...
        Some computation
        ...
        gettimeofday (&tend, NULL);
        /* elapsed time of the computation, in microseconds */
        diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
        /* sleep for whatever is left of the 10000 µs period */
        usleep(10000-diff);
    }
}

Solution

  • Well, the computation you make to get the difference is wrong:

    diff = (tend.tv_sec - tstart.tv_sec)*1000000L+(tend.tv_usec - tstart.tv_usec);
    

    You are mixing different integer types and missing that tv_usec can be an unsigned quantity, which you are subtracting from another unsigned quantity, so the subtraction can overflow.... after that, you get as a result a full second plus a quantity that is around 4.0E09 usec. This is some 4000 sec., or more than an hour.... approximately. It is better to check whether there is a borrow (that is, whether tend.tv_usec is smaller than tstart.tv_usec) and, in that case, to subtract one from the seconds difference and add 1000000 to the microseconds difference, so you get a proper positive value, as in the sketch below.
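
    Something like this, for illustration (a minimal sketch of the idea only, assuming diff is a signed long holding microseconds; the casts just force the microsecond subtraction to happen in signed arithmetic):

    long sec  = tend.tv_sec - tstart.tv_sec;
    long usec = (long) tend.tv_usec - (long) tstart.tv_usec;  /* may be negative */

    if (usec < 0) {        /* borrow: the microsecond part went negative */
        sec  -= 1;
        usec += 1000000;   /* pay the borrowed second back in microseconds */
    }

    diff = sec * 1000000L + usec;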

    I don't know the implementation you are using for struct timeval, but most probably tv_sec is a time_t (which can even be 64 bit) while tv_usec is normally just an unsigned 32 bit value, as it is never going to go beyond 1000000.

    Let me illustrate... suppose your computation has taken 100 ms, and this happens to occur in the middle of a second.... you have

    tstart.tv_sec = 123456789; tstart.tv_usec = 123456;
    tend.tv_sec = 123456789; tend.tv_usec = 223456;  
    

    when you subtract, it leads to:

    tv_sec = 0; tv_usec = 100000;
    

    but let's suppose you have done your computation while the second changes

    tstart.tv_sec = 123456789; tstart.tv_usec = 923456;
    tend.tv_sec =  123456790; tend.tv_usec = 23456;
    

    the time difference is again 100 ms, but now, when you calculate your expression, you get 1000000 (one full second) for the first part; then, after subtracting the second part, you get 23456 - 923456, which wraps around to 4294067296 because of the unsigned overflow. So you end up calling usleep(4295067296), that is about 4295 s, or 1 h 11 m more. I think you have not had enough patience to wait for it to finish... but this is something that can be happening to your program, depending on how struct timeval is defined.
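
    Just to see that wraparound in isolation, here is a tiny standalone demo (my own illustration, assuming the 32 bit unsigned tv_usec discussed above):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t start_usec = 923456;  /* tstart.tv_usec from the example */
        uint32_t end_usec   = 23456;   /* tend.tv_usec from the example */

        /* unsigned subtraction cannot go negative: it wraps modulo 2^32 */
        printf("%lu\n", (unsigned long)(end_usec - start_usec));  /* prints 4294067296 */
        return 0;
    }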

    A proper way to make the carry work is to reorder the summation so that all the additions are done first and the subtraction last (adding the end microseconds and subtracting the start microseconds at the end). This forces the conversion to a signed integer type when signed and unsigned values are mixed, and prevents a negative overflow in the unsigned operands.

    diff = (tend.tv_sec - tstart.tv_sec) * 1000000 + tend.tv_usec - tstart.tv_usec;
    

    which is parsed as

    diff = (((tend.tv_sec - tstart.tv_sec) * 1000000) + tend.tv_usec) - tstart.tv_usec;
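
    Putting it all together, a corrected version of the loop could look like this (just a sketch of my own, not your exact code; it also calls usleep() only when the computation took less than the 10000 µs budget, because passing a negative value to usleep()'s unsigned parameter would cause the very same kind of oversleeping):

    #include <sys/time.h>   /* gettimeofday(), struct timeval */
    #include <unistd.h>     /* usleep() */

    void some_function(void)
    {
        struct timeval tstart, tend;
        long diff;                      /* elapsed time, in microseconds, signed */

        while (1) {
            gettimeofday(&tstart, NULL);
            /* ... some computation ... */
            gettimeofday(&tend, NULL);

            /* additions first, subtraction last, so nothing goes negative
               while the (possibly unsigned) tv_usec fields are involved */
            diff = (tend.tv_sec - tstart.tv_sec) * 1000000L
                   + tend.tv_usec - tstart.tv_usec;

            /* sleep only for whatever is left of the 10000 µs period */
            if (diff < 10000)
                usleep(10000 - diff);
        }
    }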