How can you ensure that interrupt latency will not exceed a certain value when other variables and factors, like the hardware, may be involved?
Hardware latency is predictable. It doesn't have to be constant, but it is definitely bounded. For example, interrupt entry usually takes 12 cycles, but sometimes it may take 15 cycles.
RTOS latency is predictable too. It is also not constant, but you can be certain that, for example, the RTOS never blocks interrupts for longer than 1000 cycles at a time. Usually it will block them for much shorter periods, but never longer than stated.
As long as your application doesn't do something strange (like a while (1); in the thread with the highest possible priority), the worst-case latency of the whole system is the sum of the worst-case hardware latency and the worst-case RTOS latency.
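To make the arithmetic concrete, here is a minimal sketch of that bound in C. The function names and the 100 MHz clock used in the note below are assumptions for illustration only. The 15 and 1000 cycle figures are the example numbers from above, not values from any particular chip or RTOS.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on interrupt latency in CPU cycles: the worst-case hardware
 * entry time plus the longest interval for which the RTOS keeps
 * interrupts masked. Both inputs are bounds, not typical values. */
uint32_t worst_case_latency_cycles(uint32_t hw_max_cycles,
                                   uint32_t rtos_max_masked_cycles)
{
    return hw_max_cycles + rtos_max_masked_cycles;
}

/* Convert a cycle count to nanoseconds for a given core clock (in Hz).
 * Rounds up, so the result is still a safe upper bound. */
uint64_t cycles_to_ns(uint32_t cycles, uint32_t cpu_hz)
{
    assert(cpu_hz > 0);
    return ((uint64_t)cycles * 1000000000ULL + cpu_hz - 1) / cpu_hz;
}
```

With the example numbers, worst_case_latency_cycles(15, 1000) gives 1015 cycles, and cycles_to_ns(1015, 100000000) gives 10150 ns on the assumed 100 MHz core, so roughly 10 µs in the worst case.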
The important fact here is that using a real-time operating system to write your application is not the only requirement for the application to be real-time as well. In your application you still have to ensure that the real-time constraints are not violated. The main job of an RTOS is to NOT get in your way of doing that, so it must not introduce random or unpredictable delays.
Generally the most important of the "predictable" things in an RTOS is that the highest-priority thread that is not blocked is the one executing. Period. In a GPOS (like the one on your desktop computer, tablet or smartphone) this is not true, because the scheduler actively prevents low-priority threads from starving by allowing them to run for some time, even if there are more important things to do right now. This makes the behaviour of the application unpredictable: one day it may react within 10 µs, while on another day it may take 10 s, because the scheduler decided it was a great moment to save the logs to the hard drive or maybe do some garbage collection.
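The "highest-priority ready thread always runs" rule can be sketched in a few lines of C. This is an illustrative toy and not the scheduler of any real RTOS: thread_t and pick_next are hypothetical names, and a larger number is assumed to mean higher priority.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread descriptor. */
typedef struct {
    int  priority;  /* assumption: larger value = more important */
    bool ready;     /* false while the thread is blocked */
} thread_t;

/* Return the index of the highest-priority ready thread, or -1 if every
 * thread is blocked. Among equal priorities the lowest index wins. There
 * is no fairness or aging: a ready high-priority thread always beats a
 * lower-priority one. */
int pick_next(const thread_t *threads, size_t count)
{
    assert(threads != NULL || count == 0);
    int best = -1;
    for (size_t i = 0; i < count; ++i) {
        if (!threads[i].ready)
            continue;
        if (best < 0 || threads[i].priority > threads[best].priority)
            best = (int)i;
    }
    return best;
}
```

A GPOS scheduler would add extra inputs to this decision (time slices, aging, fairness), and those extra inputs are what make its reaction time hard to bound. Real RTOS kernels typically use per-priority ready lists or bitmaps instead of a linear scan, so that the selection itself takes bounded time.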
Put another way: for an RTOS the worst-case latency is in the range of microseconds, maybe single milliseconds. For a GPOS the maximum latency would probably be something like dozens of seconds.