I was reading the book Designing Data-Intensive Applications by Martin Kleppmann, where I came across a quote about SLAs.
For example, percentiles are often used in service level objectives (SLOs) and service level agreements (SLAs), contracts that define the expected performance and availability of a service. An SLA may state that the service is considered to be up if it has a median response time of less than 200 ms and a 99th percentile under 1 s (if the response time is longer, it might as well be down), and the service may be required to be up at least 99.9% of the time. These metrics set expectations for clients of the service and allow customers to demand a refund if the SLA is not met.
My question is: how can a service have a 99th percentile response time of less than 1 s but a 50th percentile (median) of 200 ms?
If I am understanding that sentence correctly, it says that at least 50% of users will experience a latency of 200 ms or less, but 99% of users will experience at least 1 s. Then shouldn't the median latency also be less than 1 s?
Sorry if this sounds like a dumb question, but could someone explain what that sentence means?
This comes from the definition of the median:
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value.
So if the distribution of response times is skewed toward the "faster" ones, the median can be 200 ms while the 99th percentile is still under 1 second. For example, let's consider the following 100 requests sorted by response time:
| 1 | 2 | .. | 49 | 50 | 51 | .. | 99 | 100 |
|---|---|---|---|---|---|---|---|---|
| 100ms | 120ms | .. | 199ms | 200ms | 200ms | .. | 999ms | 2sec |
Here we have a median of 200 ms ((50th + 51st)/2 = (200 + 200)/2 = 200 ms) and a 99th percentile under 1 second: the 99th value is 999 ms, and only the single slowest request (2 s) exceeds 1 s.
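To make this concrete, here is a minimal Python sketch (standard library only) that fills in the table above, with made-up values for the elided ".." positions, and computes both statistics:

```python
import statistics

# Hypothetical 100 response times (ms) matching the table above;
# the values for the ".." positions are made up for illustration.
response_times = (
    [100, 120]                           # requests 1-2
    + [121 + i for i in range(46)]       # requests 3-48: 121..166 ms
    + [199, 200, 200]                    # requests 49-51
    + [210 + i * 16 for i in range(47)]  # requests 52-98: 210..946 ms
    + [999, 2000]                        # requests 99-100
)
assert len(response_times) == 100

# Median = average of the 50th and 51st values: (200 + 200) / 2 = 200 ms
median = statistics.median(response_times)

# 99th percentile: with 100 sorted samples, take the 99th value (index 98).
# (Real monitoring systems usually interpolate; this is the simplest choice.)
p99 = response_times[98]

print(f"median = {median} ms")  # 200.0 ms, as in the table
print(f"p99    = {p99} ms")     # 999 ms, under 1 second
```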
> but 99% of the users will experience at least 1 sec
This is not "at least"; it is "at most", or more precisely "less than": a 99th percentile under 1 s means that 99% of requests complete in less than 1 second.
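Continuing the hypothetical sample from the sketch above, you can verify this reading directly by counting what fraction of requests fall under each threshold:

```python
# Fraction of requests at or below each SLA threshold (same sample as above):
at_or_under_200ms = sum(t <= 200 for t in response_times) / len(response_times)
under_1s = sum(t < 1000 for t in response_times) / len(response_times)

print(f"{at_or_under_200ms:.0%} of requests took <= 200 ms")  # 51%
print(f"{under_1s:.0%} of requests took < 1 s")               # 99%
```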