A normal distribution has a kurtosis of 3. With an increase in outliers in the distribution, the tails become "fat" and the kurtosis increases beyond 3.
How do I generate a random distribution between two numbers with kurtosis greater than 3 (preferably around 5-7)?
Imports
import numpy as np
import scipy.stats
from scipy.stats import kurtosis
from random import gauss
Random Uniform between 0.01-0.10
# Random Uniform Distribution
runif = np.random.uniform(0.01, 0.10, 10000)
kurtosis(runif, fisher=False)
1.8124891901330156
Random Normal between 0.01-0.10
lower = 0.01
upper = 0.10
mu = (upper)/2
sigma = 0.01
N = 10000
retstats = scipy.stats.truncnorm.rvs((lower-mu)/sigma,(upper-mu)/sigma,loc=mu,scale=sigma,size=N)
mean = .05
stdev = .01 # 99.73% chance the sample will fall in your desired range
values = [gauss(mean, stdev) for _ in range(10000)]
kurtosis(values, fisher=False)
3.015004351756201
Random Normal with fat-tails between 0.01-0.10
???
A normal distribution always has a kurtosis of 3. A uniform distribution has a kurtosis of 9/5. Long-tailed distributions have a kurtosis higher than 3. Laplace, for instance, has a kurtosis of 6. [Note that reference tables typically list excess kurtosis, which equals actual kurtosis minus 3.] See the table here: http://mathworld.wolfram.com/KurtosisExcess.html
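As a quick check of the figures above, here is a minimal sketch that reads the theoretical values from SciPy. It assumes the frozen-distribution method `stats(moments="k")`, which reports excess (Fisher) kurtosis, so 3 is added back to get the values quoted here. The dictionary name `theoretical` is just an illustrative choice.

```python
from scipy.stats import laplace, norm, uniform

# stats(moments="k") returns the theoretical *excess* kurtosis,
# so add 3 to recover the "actual" (Pearson) kurtosis.
theoretical = {}
for name, dist in [("uniform", uniform), ("normal", norm), ("laplace", laplace)]:
    excess = float(dist.stats(moments="k"))
    theoretical[name] = excess + 3
    print(f"{name}: excess = {excess:.2f}, kurtosis = {theoretical[name]:.2f}")
```

This is the same convention as the `fisher` argument of `scipy.stats.kurtosis`: `fisher=True` (the default) gives excess kurtosis, while `fisher=False` gives the value on the scale where a normal distribution is 3.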
Cutting off the tails of a normal distribution symmetrically about its mean only reduces the kurtosis, so it is impossible to get a kurtosis higher than 3 that way. To generate a distribution with a limited range and high kurtosis, you need to start with a long-tailed (not normal) distribution and ensure that the cut has a minimal effect on its tails. Colloquially, you'll need a very spiky distribution. I produce one below using Laplace with a small exponential decay parameter.
import numpy as np
from scipy.stats import kurtosis
min_range = 0.01
max_range = 0.10
midpoint = (max_range + min_range)/2
samples = 10000
def filter_tails(x):
    return x[(x >= min_range) & (x <= max_range)]
runif = np.random.uniform(min_range, max_range, samples)
value = kurtosis(filter_tails(runif), fisher=False)
print(f"uniform kurtosis = {value}")
sigma = 0.01
rnorm = np.random.normal(midpoint, sigma, samples)
value = kurtosis(filter_tails(rnorm), fisher=False)
print(f"gaussian kurtosis = {value}")
exponential_decay = 0.001
rlaplace = np.random.laplace(midpoint, exponential_decay, samples)
value = kurtosis(filter_tails(rlaplace), fisher=False)
print(f"laplace kurtosis = {value}")
Running the script, I get:
uniform kurtosis = 1.8011863970680828
gaussian kurtosis = 3.0335178694177785
laplace kurtosis = 5.76290423111418
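To see how much the cut matters, here is a rough sketch that varies the Laplace scale while keeping the range fixed. Note that it swaps the filtering above for redrawing: out-of-range samples are drawn again instead of dropped, which gives the same truncated distribution but keeps the sample count fixed. The helper name `bounded_laplace`, the seed, the sample size, and the three scale values are all my own assumptions for illustration.

```python
import numpy as np
from scipy.stats import kurtosis


def bounded_laplace(low, high, scale, size, rng):
    """Draw `size` Laplace samples centered on the midpoint of [low, high],
    redrawing any sample that falls outside the range (hypothetical helper)."""
    loc = (low + high) / 2
    out = rng.laplace(loc, scale, size)
    outside = (out < low) | (out > high)
    while outside.any():
        # Redraw only the out-of-range entries until all are inside.
        out[outside] = rng.laplace(loc, scale, outside.sum())
        outside = (out < low) | (out > high)
    return out


rng = np.random.default_rng(0)  # arbitrary fixed seed for reproducibility
results = {}
for scale in (0.001, 0.005, 0.02):
    x = bounded_laplace(0.01, 0.10, scale, 100_000, rng)
    results[scale] = kurtosis(x, fisher=False)
    print(f"scale = {scale}: kurtosis = {results[scale]:.2f}")
```

With a scale of 0.001 the range edges sit 45 scale widths from the midpoint, so the cut removes almost nothing and the kurtosis should stay near the Laplace value of 6. With a scale of 0.02 the edges are only about 2.25 scale widths away, and the kurtosis should drop below 3.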