python optimization parameters statistics estimation

My statistical parameter estimation in Python is taking too much time. How do I optimize my code to run faster?


I know that Python isn't the fastest language, but I am trying to estimate parameters on simulated samples and compute statistics of the estimates, and it takes a lot of time. Is there any way to optimize this code to make it faster in Python? With 100 repetitions it took me 25 minutes, and I want to run 10000.

from numpy import log, array, random, append
from scipy.stats import expon, kurtosis, skew
from pygosolnp import solve
from tabulate import tabulate
from time import time

inicial_time = time()


def simulation(n, re, alpha):
    def exp2(o):
        return -sum(log(expon.pdf(v, scale=o)))

    mean = array(["Mean"])
    variance = array(["Variance"])
    bias = array(["Bias"])
    eqm = array(["EQM"])
    skewness = array(["Skewness"])
    kurtose = array(["Kurtose"])
    for i in n:
        param = array([])
        for j in range(re):
            v = random.exponential(alpha, size=i)
            param = append(param, array(solve(exp2, [0], [10]).best_solution.parameters))

        med = param.mean()
        varia = param.var()
        b = alpha - med  # use the function argument rather than the global Alpha
        eqma = b ** 2 + varia
        skewn = skew(param)
        kur = kurtosis(param)

        mean = append(mean, med)
        variance = append(variance, varia)
        bias = append(bias, b)
        eqm = append(eqm, eqma)
        skewness = append(skewness, skewn)
        kurtose = append(kurtose, kur)

    data = [mean, variance, bias, eqm, skewness, kurtose]

    print(tabulate(data, headers=["Statistics", "n = 30", "n = 50", "n = 100", "n = 200", "n = 300"]))
    

N = [30, 50, 100, 200, 300]
RE = 100
Alpha = 1/5

simulation(N, RE, Alpha)

print(f'{time()-inicial_time} seconds')

Solution

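  • In addition to the helpful comments you got, I think evaluating the likelihood inside exp2() is also something that can take the computer a while, since the solver calls it many times. You can use Numba to make that computation faster. It is a just-in-time (JIT) compiler for Python that translates a subset of Python and NumPy code into fast machine code. One caveat: Numba's nopython mode does not support scipy.stats, so expon.pdf cannot be called inside an @njit function. The log-density has to be written out with NumPy, and the sample v has to be passed in as an argument. Assuming the solver passes the parameters as a sequence, as in your code, it could look like this:

    from numba import njit
    import numpy as np

    @njit
    def neg_loglik(scale, v):
        # Same value as -sum(log(expon.pdf(v, scale=scale))), using NumPy only
        return v.size * np.log(scale) + v.sum() / scale

    def exp2(o):
        return neg_loglik(o[0], v)

    This can save some time on each call to exp2(). The first call is slower because it includes the compilation.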
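    If the model really is a plain exponential, as in the question, the quantity that exp2() minimizes can be written out by hand, and its minimum is the sample mean. The sketch below uses only the standard library to check that. The names neg_loglik and mle_scale are made up for illustration, and the seed and sample size are arbitrary. It is meant as a sanity check on the formula, not as a replacement for your solver setup.

```python
import math
import random


def neg_loglik(scale, sample):
    """Negative log-likelihood of an Exponential(scale) model for `sample`."""
    # Each observation contributes log(scale) + x / scale.
    return len(sample) * math.log(scale) + sum(sample) / scale


def mle_scale(sample):
    """Closed-form minimiser of neg_loglik: the sample mean."""
    return sum(sample) / len(sample)


rng = random.Random(0)  # fixed seed, illustrative only
alpha = 1 / 5           # true scale, as in the question
# random.expovariate takes the rate, which is 1 / scale.
sample = [rng.expovariate(1 / alpha) for _ in range(300)]

est = mle_scale(sample)
# The closed-form estimate should do at least as well as nearby scales.
print(est)
print(neg_loglik(est, sample) <= neg_loglik(0.9 * est, sample))
print(neg_loglik(est, sample) <= neg_loglik(1.1 * est, sample))
```

    If that holds for your model, the inner call to the solver could in principle be replaced by a mean over each simulated sample, which is likely where most of the 25 minutes goes.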