c++ c kernel benchmarking inline-assembly

Role of preempt_disable/enable and raw_local_irq_save/restore in benchmarking

The following paper by Intel (link) describes a method for accurately benchmarking code. The core of the benchmark reads as follows (see page 31):

preempt_disable();
raw_local_irq_save(flags);

asm volatile ( 
    "CPUID\n\t" 
    "RDTSC\n\t" 
    "mov %%edx, %0\n\t" 
    "mov %%eax, %1\n\t": "=r" (cycles_high), "=r" (cycles_low):: "%rax", "%rbx", "%rcx", "%rdx"
);

/*call the function to measure here*/ 

asm volatile( 
    "CPUID\n\t" 
    "RDTSC\n\t" 
    "mov %%edx, %0\n\t"
    "mov %%eax, %1\n\t": "=r" (cycles_high1), "=r" (cycles_low1):: "%rax", "%rbx", "%rcx", "%rdx"
); 

raw_local_irq_restore(flags);  
preempt_enable();

I was wondering:

What do raw_local_irq_save and raw_local_irq_restore do?
What do preempt_disable and preempt_enable do?
What is their role in that specific context?
What would be the consequence of removing them from the benchmarking code? Would it prevent correct benchmarking? What could go wrong?

Solution

In the paper you linked, section 2.2 implements the actual kernel module, and the code there carries explanatory comments -

preempt_disable(); /*we disable preemption on our CPU*/

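This is a Linux kernel function that stops the scheduler from preempting the current task, so the CPU will not context-switch to a different process while the benchmark runs. It does not by itself stop interrupts.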

The second call -

raw_local_irq_save(flags); /*we disable hard interrupts on our CPU*/  
/*at this stage we exclusively own the CPU*/

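This is another Linux kernel function. It saves the current interrupt state in flags and disables maskable interrupts on the local CPU only. Other CPUs are unaffected, and non-maskable interrupts (NMIs) can still arrive.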

Together, these two calls mean that neither the scheduler nor a maskable hardware interrupt can disturb the CPU until the benchmark is done. This gives the measured code exclusive use of the processor and of resources such as the caches and TLBs. That matters because any interrupt handler or other task that ran between the two timestamp reads would add its own cycles to the result and could evict the benchmark's cache and TLB entries.

The other two functions, as their names suggest, undo this once the benchmark is done: raw_local_irq_restore puts back the interrupt state that was saved in flags, and preempt_enable re-enables preemption.
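The pairing of the four calls can be illustrated with a small user-space toy model. This is not kernel code: the struct and every model_* function below are hypothetical stand-ins invented for illustration, and they only mimic the bookkeeping (a preemption counter and a saved interrupt state), assuming the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model, NOT kernel code: every name here is hypothetical. */
struct model_cpu {
    int  preempt_count;  /* greater than 0 means preemption is disabled */
    bool irqs_enabled;   /* stands in for the CPU's interrupt-enable flag */
};

/* Models preempt_disable(): calls nest, so a counter is used. */
static void model_preempt_disable(struct model_cpu *cpu)
{
    cpu->preempt_count++;
}

/* Models preempt_enable(): undoes exactly one earlier disable. */
static void model_preempt_enable(struct model_cpu *cpu)
{
    assert(cpu->preempt_count > 0);
    cpu->preempt_count--;
}

/* Models raw_local_irq_save(flags): remember the state, then disable. */
static bool model_irq_save(struct model_cpu *cpu)
{
    bool flags = cpu->irqs_enabled;
    cpu->irqs_enabled = false;
    return flags;
}

/* Models raw_local_irq_restore(flags): put back the saved state.
 * It does not blindly enable interrupts. */
static void model_irq_restore(struct model_cpu *cpu, bool flags)
{
    cpu->irqs_enabled = flags;
}

/* True while, in this model, nothing maskable can disturb the measurement. */
static bool model_owns_cpu(const struct model_cpu *cpu)
{
    return cpu->preempt_count > 0 && !cpu->irqs_enabled;
}
```

The point of the saved flags value is that the restore step returns the CPU to whatever state it was in before. If interrupts were already off when the benchmark started, they stay off afterwards.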

As for what happens if these calls are removed: the benchmark still runs, but an interrupt or a context switch can land in the middle of the measured region, and you can get very high variance in your measurements.
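One way to see that variance is to repeat the measurement many times and summarize the results. The sketch below is plain C, not part of the paper's module; tsc_join, cycle_stats and summarize are hypothetical helper names. It assumes each sample is the difference between an end and a start timestamp, each built from the cycles_high and cycles_low halves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Join the two 32-bit halves that RDTSC leaves in EDX (high) and EAX (low). */
static uint64_t tsc_join(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/* Summary of a set of elapsed-cycle samples (hypothetical helper type). */
struct cycle_stats {
    uint64_t min;
    uint64_t max;
    double   mean;
    double   variance;  /* population variance, in cycles squared */
};

/* Compute min, max, mean and variance over n samples (n must be > 0). */
static struct cycle_stats summarize(const uint64_t *samples, size_t n)
{
    assert(n > 0);

    struct cycle_stats s = { samples[0], samples[0], 0.0, 0.0 };
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (samples[i] < s.min)
            s.min = samples[i];
        if (samples[i] > s.max)
            s.max = samples[i];
        sum += (double)samples[i];
    }
    s.mean = sum / (double)n;

    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)samples[i] - s.mean;
        sq += d * d;
    }
    s.variance = sq / (double)n;

    return s;
}
```

A single sample that was hit by an interrupt shows up as a max far above min and a large variance, while an undisturbed run keeps min and max close together.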