Tags: c, linux, timer, gettimeofday

Why does a periodic timerfd Linux timer sometimes expire a little earlier than expected?


I am using a Linux timer, specifically a timerfd, which I set to expire periodically, for instance every 200 ms.

I noticed, however, that the timer sometimes seems to expire a little earlier than the timeout I set.

In particular, I'm using the following C code to perform a simple test:

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <poll.h>
#include <unistd.h>
#include <inttypes.h>
#include <sys/timerfd.h>
#include <sys/time.h>

#define NO_FLAGS_TIMER 0
#define NUM_TESTS 10

// Function to compute the difference between two struct timeval values.
// The operation performed is: out = out - in
static inline int timevalSub(struct timeval *in, struct timeval *out) {
    time_t original_out_tv_sec=out->tv_sec;

    if (out->tv_usec < in->tv_usec) {
        int nsec = (in->tv_usec - out->tv_usec) / 1000000 + 1;
        in->tv_usec -= 1000000 * nsec;
        in->tv_sec += nsec;
    }

    if (out->tv_usec - in->tv_usec > 1000000) {
        int nsec = (out->tv_usec - in->tv_usec) / 1000000;
        in->tv_usec += 1000000 * nsec;
        in->tv_sec -= nsec;
    }

    out->tv_sec-=in->tv_sec;
    out->tv_usec-=in->tv_usec;

    // '1' is returned when the result is negative
    return original_out_tv_sec < in->tv_sec;
}

// Function to create a timerfd and set it with a periodic timeout of 'time_ms', in milliseconds
int timerCreateAndSet(struct pollfd *timerMon,int *clockFd,uint64_t time_ms) {
    struct itimerspec new_value;
    time_t sec;
    long nanosec;

    // Create monotonic (increasing) timer
    *clockFd=timerfd_create(CLOCK_MONOTONIC,NO_FLAGS_TIMER);
    if(*clockFd==-1) {
        return -1;
    }

    // Convert time, in ms, to seconds and nanoseconds
    sec=(time_t) ((time_ms)/1000);
    nanosec=1000000*time_ms-sec*1000000000;
    new_value.it_value.tv_nsec=nanosec;
    new_value.it_value.tv_sec=sec;
    new_value.it_interval.tv_nsec=nanosec;
    new_value.it_interval.tv_sec=sec;

    // Fill pollfd structure
    timerMon->fd=*clockFd;
    timerMon->revents=0;
    timerMon->events=POLLIN;

    // Start timer
    if(timerfd_settime(*clockFd,NO_FLAGS_TIMER,&new_value,NULL)==-1) {
        close(*clockFd);
        return -2;
    }

    return 0;
}


int main(void) {
    struct timeval tv,tv_prev,tv_curr;
    int clockFd;
    struct pollfd timerMon;
    unsigned long long junk;

    gettimeofday(&tv,NULL);
    timerCreateAndSet(&timerMon,&clockFd,200); // 200 ms periodic expiration time

    tv_prev=tv;

    for(int a=0;a<NUM_TESTS;a++) {
        // No error check on poll() just for the sake of brevity...
        // The final code should contain a check on the return value of poll()
        poll(&timerMon,1,-1);
        (void) read(clockFd,&junk,sizeof(junk));

        gettimeofday(&tv,NULL);
        tv_curr=tv;
        if(timevalSub(&tv_prev,&tv_curr)) {
            fprintf(stdout,"Error! Negative timestamps. The test will be interrupted now.\n");
            break;
        }
        printf("Iteration: %d - curr. timestamp: %lu.%lu - elapsed after %f ms - real est. delta_t %f ms\n",a,tv.tv_sec,tv.tv_usec,200.0,
            (tv_curr.tv_sec*1000000+tv_curr.tv_usec)/1000.0);

        tv_prev=tv;
    }

    return 0;
}

After compiling it with gcc:

gcc -o timertest_stackoverflow timertest_stackoverflow.c

I'm getting the following output:

Iteration: 0 - curr. timestamp: 1583491102.833748 - elapsed after 200.000000 ms - real est. delta_t 200.112000 ms
Iteration: 1 - curr. timestamp: 1583491103.033690 - elapsed after 200.000000 ms - real est. delta_t 199.942000 ms
Iteration: 2 - curr. timestamp: 1583491103.233687 - elapsed after 200.000000 ms - real est. delta_t 199.997000 ms
Iteration: 3 - curr. timestamp: 1583491103.433737 - elapsed after 200.000000 ms - real est. delta_t 200.050000 ms
Iteration: 4 - curr. timestamp: 1583491103.633737 - elapsed after 200.000000 ms - real est. delta_t 200.000000 ms
Iteration: 5 - curr. timestamp: 1583491103.833701 - elapsed after 200.000000 ms - real est. delta_t 199.964000 ms
Iteration: 6 - curr. timestamp: 1583491104.033686 - elapsed after 200.000000 ms - real est. delta_t 199.985000 ms
Iteration: 7 - curr. timestamp: 1583491104.233745 - elapsed after 200.000000 ms - real est. delta_t 200.059000 ms
Iteration: 8 - curr. timestamp: 1583491104.433737 - elapsed after 200.000000 ms - real est. delta_t 199.992000 ms
Iteration: 9 - curr. timestamp: 1583491104.633736 - elapsed after 200.000000 ms - real est. delta_t 199.999000 ms

I would expect the real time difference, estimated with gettimeofday(), never to be less than 200 ms (also because of the time needed to clear the event with read()), yet some values are a bit less than 200 ms, such as 199.942000 ms.

Do you know why I'm observing this behavior?

Could it be that, since I'm using gettimeofday(), tv_prev is sometimes taken a bit late (due to some varying delay in calling read() or gettimeofday() itself) while tv_curr, in the next iteration, is not, producing an estimated time of less than 200 ms even though the timer actually expires precisely every 200 ms?

Thank you very much in advance.


Solution

  • This is related to process scheduling. The timer is indeed quite precise and signals the timeout every 200 ms, but your program will not register the expiration until it actually gets control back. That means that the time you get from the gettimeofday() call can reflect a somewhat later moment. When you subtract such delayed values, you can get a result either bigger or smaller than 200 ms.
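    If this explanation is correct, each wakeup should lag the ideal schedule by a non-negative amount, even though the difference between two consecutive wakeups can dip below 200 ms. The following minimal sketch (not the code from the question) checks this by comparing every wakeup against t0 + k * 200 ms on the same CLOCK_MONOTONIC clock the timer uses; error handling is omitted for brevity:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>
    #include <unistd.h>
    #include <poll.h>
    #include <sys/timerfd.h>

    int main(void) {
        struct itimerspec its = {
            .it_value    = {.tv_sec = 0, .tv_nsec = 200000000L},
            .it_interval = {.tv_sec = 0, .tv_nsec = 200000000L}
        };
        struct timespec t0, now;
        uint64_t expirations;
        int fd = timerfd_create(CLOCK_MONOTONIC, 0);

        clock_gettime(CLOCK_MONOTONIC, &t0);
        timerfd_settime(fd, 0, &its, NULL);

        for (int k = 1; k <= 10; k++) {
            struct pollfd p = {.fd = fd, .events = POLLIN};

            poll(&p, 1, -1);
            (void) read(fd, &expirations, sizeof(expirations));

            clock_gettime(CLOCK_MONOTONIC, &now);
            // Lag of this wakeup with respect to the ideal schedule
            // t0 + k * 200 ms; it should never be negative, while the
            // deltas between consecutive wakeups may well be below 200 ms.
            double lag_ms = (now.tv_sec - t0.tv_sec) * 1e3
                            + (now.tv_nsec - t0.tv_nsec) / 1e6
                            - k * 200.0;
            printf("Expiration %2d: wakeup lag = %+f ms\n", k, lag_ms);
        }

        close(fd);
        return 0;
    }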

    How can you estimate the time between the actual expiration of the timer and your call to gettimeofday()? It is related to the process scheduling quantum. This quantum has a default value set by RR_TIMESLICE in include/linux/sched/rt.h, and you can check it on your system like this:

    #include <sched.h>
    #include <sys/types.h>
    #include <unistd.h>
    #include <stdio.h>
    
    int main(void) {
        struct timespec tp;
        if (sched_rr_get_interval(getpid(), &tp)) {
          perror("Cannot get scheduler quantum");
        } else {
          printf("Scheduler quantum is %f ms\n", (tp.tv_sec * 1e9 + tp.tv_nsec) / 1e6); 
        }
    }
    

    Output on my system:

    Scheduler quantum is 4.000000 ms
    

    So, you may need to wait for the scheduling quantum of another process to finish before you get control back and can read the current time. On my system this can lead to a deviation of approximately ±4 ms from the expected 200 ms. After performing almost 7000 iterations, I get the following distribution of the registered waiting times:

    [Figure: distribution of waiting times]

    As you can see, most of the times lie within ±2 ms of the expected 200 ms. The minimum and maximum times among all iterations were 189.992 ms and 210.227 ms respectively:

    ~$ sort times.txt | head
    189.992000
    190.092000
    190.720000
    194.402000
    195.250000
    195.746000
    195.847000
    195.964000
    196.256000
    196.420000
    ~$ sort times.txt | tail
    203.746000
    203.824000
    203.900000
    204.026000
    204.273000
    205.625000
    205.634000
    208.974000
    210.202000
    210.227000
    ~$
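    For reference, here is a sketch of how such a times.txt could be produced; it reuses timerCreateAndSet() and the variables from the question's code, and just prints one measured delay per line so the output can be redirected to a file and sorted as shown above:

    for (int a = 0; a < 7000; a++) {
        poll(&timerMon, 1, -1);
        (void) read(clockFd, &junk, sizeof(junk));

        gettimeofday(&tv_curr, NULL);
        // Print the measured delay, in ms, one value per line, so the
        // program can be run as: ./timertest_stackoverflow > times.txt
        printf("%f\n", (tv_curr.tv_sec - tv_prev.tv_sec) * 1e3
                       + (tv_curr.tv_usec - tv_prev.tv_usec) / 1e3);
        tv_prev = tv_curr;
    }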
    

    Deviations bigger than 4 ms are caused by rare situations in which the program needed to wait for several quanta, not just one.
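    If you want to double-check that scheduling is indeed the culprit, or to reduce this jitter, one option is to run the program under a real-time policy such as SCHED_FIFO, so that it preempts ordinary SCHED_OTHER tasks as soon as the timer fires. A minimal sketch, assuming the process has root privileges or CAP_SYS_NICE (the helper name and the priority value 10 are arbitrary choices):

    #include <sched.h>
    #include <stdio.h>

    // Hypothetical helper: switch the calling process (pid 0) to the
    // SCHED_FIFO real-time policy, so it gets the CPU ahead of ordinary
    // SCHED_OTHER tasks as soon as it becomes runnable.
    static int go_realtime(void) {
        struct sched_param sp = {.sched_priority = 10};

        if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1) {
            perror("sched_setscheduler");
            return -1;
        }
        return 0;
    }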