Time to live is defined as follows: "Time to live (TTL) refers to the amount of time or “hops” that a packet is set to exist inside a network before being discarded by a router. TTL is also used in other contexts, including CDN caching and DNS caching."
I'm attempting to model the ideal TTL that will allow most packets to survive long enough to avoid being discarded by a router.
Taking a sample of the time each packet exists, in seconds, results in the following dataset: {1,2,3,4,2,5,6,7,8,9}
The bin edges I set are: {3,5,10}
I expect to extract the following output:
3 -> {1,2,3}
5 -> {1,2,3,4,5}
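In code form, a minimal sketch of the mapping I expect (assuming each bin edge collects the distinct lifetimes less than or equal to it):
packet_exist_times = [1, 2, 3, 4, 2, 5, 6, 7, 8, 9]
bin_edges = [3, 5, 10]
# For each edge, keep the distinct lifetimes that do not exceed it
expected = {b: sorted({t for t in packet_exist_times if t <= b}) for b in bin_edges}
print(expected)  # {3: [1, 2, 3], 5: [1, 2, 3, 4, 5], 10: [1, 2, 3, 4, 5, 6, 7, 8, 9]}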
From the above I deduce:
Setting a TTL of 3 seconds will result in packets {4,5,6,7,8,9} being discarded.
Setting a TTL of 5 seconds will result in packets {5,6,7,8,9} being discarded.
Is my logic correct?
Converting this logic to Jupyter notebook code results in:
%reset -f
import datetime
import json
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
packet_exist_times = [1, 2, 3, 4, 2, 5, 6, 7, 8, 9]  # packet lifetime samples, in seconds
packet_times_bins = [3, 5, 10]                        # bin edges passed to np.histogram
print(packet_exist_times)
hist, bins = np.histogram(packet_exist_times, bins=packet_times_bins)
print(hist)
print(bins)
which produces output :
[1, 2, 3, 4, 2, 5, 6, 7, 8, 9]
[2 5]
[ 3 5 10]
Displaying the histogram:
plt.hist(packet_exist_times, bins='auto')
renders the corresponding histogram plot.
From the numpy histogram (https://numpy.org/doc/stable/reference/generated/numpy.histogram.html), how can I extract the information needed to model the ideal TTL so that most packets are not discarded?
Let's say you know the exact distribution of your variable; for example, we know that the variable follows a Weibull distribution with specific parameters.
Then it is as simple as taking the PPF (percent point function, i.e. the inverse CDF) of this distribution at a defined threshold (e.g. setting the threshold to 95% means we accept losing 5% of packets).
With Python you can model it as follows:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
X = stats.exponweib(a=3.05, c=0.95)  # exponentiated Weibull with known parameters
xcut = X.ppf(0.95)                   # 4.408962640288286
In the resulting plot, the yellow area (0.95 units) represents observations below the threshold xcut, accounting for 95% of the population.
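If you want to reproduce that figure, a minimal matplotlib sketch could look like the following (same distribution as above; the axis labels are only placeholders):
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

X = stats.exponweib(a=3.05, c=0.95)
xcut = X.ppf(0.95)

t = np.linspace(0, X.ppf(0.999), 500)
plt.plot(t, X.pdf(t))                                      # density of the assumed distribution
plt.fill_between(t, X.pdf(t), where=t <= xcut, alpha=0.5)  # shaded area = 0.95 of the mass
plt.axvline(xcut, linestyle='--')                          # the 95% threshold
plt.xlabel('Packet lifetime')
plt.ylabel('Density')
plt.show()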
If you don't know the distribution (most probably the case) but you do have data, either as a collection of the number of router hops required to reach a target or already in the form of a histogram, then we can assess this quantity as well.
First we create a trial dataset and compute its histogram:
X = stats.exponweib(a=3.05, c=0.95, loc=10, scale=2) # Router hop
x = X.rvs(size=10000).round() # Rounded as it is an integral quantity
bins = np.arange(x.min(), x.max()+1) # Unitary bins
hist = np.histogram(x, bins=bins) # Histogram as a (counts, bin_edges) tuple
# [ 128, 1368, 2045, 1840, 1507, 1089, 738, 437, 286, 197, 142, 77, 63, 36, 19, 8, 5, 2, 8, 3, 1, 0, 1]
# [10., 11., 12., 13., 14., 15., 16., 17., 18., 19., 20., 21., 22., 23., 24., 25., 26., 27., 28., 29., 30., 31., 32., 33.]
Then we create a random variable from this histogram and extract its PPF at 95% as we did previously:
dist = stats.rv_histogram(hist)
xcut = dist.ppf(0.95) # 19.132978723404246
In this case, setting the TTL to 20 will allow at least 95% of all packets to live until they reach their target. To see why the threshold is 20, have a look at the CDF:
dist.cdf(19) # 0.9475000000000001
dist.cdf(20) # 0.9663000000000002
Setting the TTL to 20 will keep 96.6% of packets, while setting it to 19 will keep only 94.8%.
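For completeness, here is the same recipe applied to the small sample from the question (a sketch only; with ten observations and assumed unit-width bins the estimate is rough, but it shows the mechanics):
import numpy as np
from scipy import stats

packet_exist_times = [1, 2, 3, 4, 2, 5, 6, 7, 8, 9]  # sample from the question
edges = np.arange(min(packet_exist_times), max(packet_exist_times) + 2)  # unit-width bins
hist = np.histogram(packet_exist_times, bins=edges)

dist = stats.rv_histogram(hist)
xcut = dist.ppf(0.95)     # 9.5 for this sample
ttl = int(np.ceil(xcut))  # smallest integer TTL keeping at least 95% of packets -> 10
print(ttl, dist.cdf(ttl))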