I was reading the book Designing Data-Intensive Applications by Martin Kleppmann, where I came across a quote about SLAs.
For example, percentiles are often used in service level objectives (SLOs) and service level agreements (SLAs), contracts that define the expected performance and availability of a service. An SLA may state that the service is considered to be up if it has a median response time of less than 200 ms and a 99th percentile under 1 s (if the response time is longer, it might as well be down), and the service may be required to be up at least 99.9% of the time. These metrics set expectations for clients of the service and allow customers to demand a refund if the SLA is not met.
My question is: how can a service have a 99th percentile response time of less than 1 s but a 50th percentile (median) of 200 ms?
If I am understanding that sentence correctly, it says that at least 50% of users will experience a latency of 200 ms or less, but 99% of users will experience at least 1 s. Then shouldn't the median latency also be less than 1 s?
Sorry if this sounds like a dumb question, but could someone explain what that sentence means?
This comes from the definition of the median:
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value.
So if the distribution of response times is skewed toward the "faster" ones, the median can be 200 ms while the 99th percentile is still under 1 second. For example, let's consider the following 100 requests sorted by response time:
| 1 | 2 | .. | 49 | 50 | 51 | .. | 99 | 100 |
|---|---|---|---|---|---|---|---|---|
| 100ms | 120ms | .. | 199ms | 200ms | 200ms | .. | 999ms | 2sec |
Here we have a median of 200 ms ((50th + 51st)/2 = (200 + 200)/2 = 200 ms) and a 99th percentile under 1 second: the 99th value is 999 ms, and only the single slowest request (2 s) exceeds 1 s.
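To make this concrete, here is a minimal Python sketch (standard library only) that fills in the table above, with made-up values for the elided ".." positions, and computes both statistics:

```python
import statistics

# Hypothetical 100 response times (ms) matching the table above;
# the values for the ".." positions are made up for illustration.
response_times = (
    [100, 120]                           # requests 1-2
    + [121 + i for i in range(46)]       # requests 3-48: 121..166 ms
    + [199, 200, 200]                    # requests 49-51
    + [210 + i * 16 for i in range(47)]  # requests 52-98: 210..946 ms
    + [999, 2000]                        # requests 99-100
)
assert len(response_times) == 100

# Median = average of the 50th and 51st values: (200 + 200) / 2 = 200 ms
median = statistics.median(response_times)

# 99th percentile: with 100 sorted samples, take the 99th value (index 98).
# (Real monitoring systems usually interpolate; this is the simplest choice.)
p99 = response_times[98]

print(f"median = {median} ms")  # 200.0 ms, as in the table
print(f"p99    = {p99} ms")     # 999 ms, under 1 second
```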
> but 99% of the users will experience at least 1 sec
This is not "at least"; it is "at most", or more precisely "less than": a 99th percentile under 1 s means that 99% of requests complete in less than 1 second.
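Continuing the hypothetical sample from the sketch above, you can verify this reading directly by counting what fraction of requests fall under each threshold:

```python
# Fraction of requests at or below each SLA threshold (same sample as above):
at_or_under_200ms = sum(t <= 200 for t in response_times) / len(response_times)
under_1s = sum(t < 1000 for t in response_times) / len(response_times)

print(f"{at_or_under_200ms:.0%} of requests took <= 200 ms")  # 51%
print(f"{under_1s:.0%} of requests took < 1 s")               # 99%
```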