
CentOS8 real-time FIFO process not receiving XCPU signal


Background

I have a timing-sensitive program in mind that needs a real-time scheduling policy. What the program does is beside the point. Before writing it, I wanted to test a few differences between real-time and "normal" scheduling (for instance, I noticed that nanosleep is about 10 times more precise under real-time FIFO scheduling at priority 99), and of course the SIGXCPU signal, which is supposed to be sent to a real-time process when it exceeds its soft limit on CPU time consumed in one go. I am not receiving it, though, even though my process burns CPU time in an infinite loop.

Environment

I am using CentOS 8 hosted on Vultr: 1 core, 512 MB RAM, with the latest kernel and packages. Without my real-time process running, top shows 2-3 running processes (systemd is the main one) and more than 80 sleeping. The active processes all seem to use the normal scheduling policy with priority shown as 20, which is the default for normal tasks.
The code of my real-time process is as follows:

#define _GNU_SOURCE

// most of these are useless yes, were used before for testing
// and I just did not care to remove them, but that shouldn't change anything, right

#include <sched.h>
#include <unistd.h>
#include <sys/types.h>

#include <stdio.h>
#include <stdatomic.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <pthread.h>
#include <time.h> // clock_gettime, struct timespec
#include <errno.h>

static uint64_t GetTimeoutTime(const uint64_t nanoseconds) {
  struct timespec tp = { .tv_sec = 0, .tv_nsec = 0 };
  (void) clock_gettime(CLOCK_MONOTONIC, &tp);
  // current monotonic time in nanoseconds, plus the requested offset
  return (uint64_t)(tp.tv_sec) * 1000000000 + (uint64_t)(tp.tv_nsec) + nanoseconds;
}

static int siga(int signum, void (*handler)(int)) {
  // sa_mask is left out on purpose: members not named in a compound literal are zero-initialized
  return sigaction(signum, &((struct sigaction){ .sa_handler = handler, .sa_flags = 0 }), NULL);
}

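static void xcpu(int sig) {
  (void) sig;
  // write() is async-signal-safe, puts() is not
  static const char msg[] = "xcpu called\n";
  (void) write(STDOUT_FILENO, msg, sizeof(msg) - 1);
  (void) sched_yield(); // to not get killed (replaces the nonstandard pthread_yield)
}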

int main() {
  uint64_t g = 0;
  uint64_t t1, t2;
  int err = siga(SIGXCPU, xcpu);
  if(err != 0) {
    puts("e");
    printf("%d\n", err);
  }
  if(sched_setscheduler(getpid(), SCHED_FIFO, &((struct sched_param){ .sched_priority = 99 })) != 0) {
    puts("err");
  }
  while(1) {
    t1 = GetTimeoutTime(0); // just some stuff to make the process busy
    t2 = GetTimeoutTime(0) - t1; // I was using this code before, thus left it there
    g += t2;
  }
  printf("avg %lf\n", (double)(g) / 10000.0); // just to make it seem as g is not useless
  return 0;
}

The result: the process constantly takes above 90% of the CPU and keeps running, and no signal is ever received. I kept the program running for about 15 minutes and nothing happened at all; the process was not killed. FIFO scheduling is not supposed to take a thread off the CPU while it is running, right? That is what Round Robin does, so I don't understand what causes this. Is my thread being put to sleep without my knowing?
Would DEADLINE scheduling with the deadline times set to values just below 2^63 (2^63 - 1, 2^63 - 2, 2^63 - 3) work better than the current FIFO solution? I would just like to have most of the CPU to myself most of the time, since nothing but my own process is going to use the CPU anyway. (The only difference is that real-time scheduling policies give some perks; one of them, the increased precision of nanosleep, I described at the beginning. Are there other perks?)


Solution

  • Alright, I found the answer.
    The problem lay in the soft and hard RLIMIT_RTTIME limits. I thought they were quite low by default, but when I double-checked, both turned out to be numbers on the order of 2^63, i.e. effectively unlimited. After lowering the soft limit to 1e6 and the hard limit to 1e7 (the unit is microseconds, so 1 s and 10 s), my process started receiving SIGXCPU every second. I checked and lowered the limits with the getrlimit and setrlimit functions (see their man page for more info).
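
For reference, here is a minimal sketch of checking and lowering the limit as described above. The helper names print_rttime_limit and set_rttime_limit are made up for illustration and are not part of the original program; the values 1e6 and 1e7 are the ones mentioned in the answer. It assumes Linux with glibc, where RLIMIT_RTTIME is expressed in microseconds.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: print the current RLIMIT_RTTIME limits (microseconds). */
int print_rttime_limit(void) {
  struct rlimit lim;
  if (getrlimit(RLIMIT_RTTIME, &lim) != 0) {
    perror("getrlimit");
    return -1;
  }
  printf("RLIMIT_RTTIME soft=%llu hard=%llu\n",
         (unsigned long long) lim.rlim_cur,
         (unsigned long long) lim.rlim_max);
  return 0;
}

/* Hypothetical helper: set the soft and hard RLIMIT_RTTIME limits (microseconds).
 * Note that an unprivileged process can lower the hard limit but not raise it again. */
int set_rttime_limit(rlim_t soft_us, rlim_t hard_us) {
  struct rlimit lim = { .rlim_cur = soft_us, .rlim_max = hard_us };
  if (setrlimit(RLIMIT_RTTIME, &lim) != 0) {
    perror("setrlimit");
    return -1;
  }
  return 0;
}
```

Calling set_rttime_limit(1000000, 10000000) before sched_setscheduler in main would presumably reproduce the setup from the answer: a 1 s soft limit and a 10 s hard limit on CPU time consumed without a blocking system call.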