Generating 5 millions points r[i]
recursively with:
import numpy as np
n, a, b, c = 5000000, 0.0000002, 0.5, 0.4
eps = np.random.normal(0, 1, n)
sigma = np.ones(n) * np.sqrt(a)
r = np.zeros(n)
for i in range(1,n):
sigma[i] = np.sqrt(a + b * r[i-1] ** 2 + c * sigma[i-1] ** 2)
r[i] = sigma[i] * eps[i]
uses approximatively 17 seconds on my standard i5 laptop computer.
I have used Cython quite often in the past and I know that using it here would probably optimize by a factor 10 < k < 100.
But before having to use Cython in such cases, I was wondering: would there be a plain Numpy/Python method that I don't know that would optimize this much?
Simply changing it to math.sqrt
instead of np.sqrt
gives you about 40% speedup here.
Since I'm quite a numba fanatic I tried the numba version versus your one (initial
) and the math-one (normal
)
import numpy as np
import math
import numba as nb
n, a, b, c = 500000, 0.0000002, 0.5, 0.4
eps = np.random.normal(0, 1, n)
sigma = np.ones(n) * np.sqrt(a)
r = np.zeros(n)
def initial(n, a, b, c, eps, sigma, r):
for i in range(1,n):
sigma[i] = np.sqrt(a + b * r[i-1] ** 2 + c * sigma[i-1] ** 2)
r[i] = sigma[i] * eps[i]
def normal(n, a, b, c, eps, sigma, r):
for i in range(1,n):
sigma[i] = math.sqrt(a + b * r[i-1] ** 2 + c * sigma[i-1] ** 2)
r[i] = sigma[i] * eps[i]
@nb.njit
def function(n, a, b, c, eps, sigma, r):
for i in range(1,n):
sigma[i] = math.sqrt(a + b * r[i-1] ** 2 + c * sigma[i-1] ** 2)
r[i] = sigma[i] * eps[i]
Then just to verify the results are the same:
sigma1 = sigma.copy()
sigma2 = sigma.copy()
sigma3 = sigma.copy()
r1 = r.copy()
r2 = r.copy()
r3 = r.copy()
initial(n, a, b, c, eps, sigma1, r1)
normal(n, a, b, c, eps, sigma2, r2)
function(n, a, b, c, eps, sigma3, r3)
np.testing.assert_array_almost_equal(sigma1, sigma2)
np.testing.assert_array_almost_equal(sigma1, sigma3)
np.testing.assert_array_almost_equal(r1, r2)
np.testing.assert_array_almost_equal(r1, r3)
Well what about speed (I used n=500000 to have some faster timeit results):
%timeit initial(n, a, b, c, eps, sigma1, r1)
1 loop, best of 3: 7.27 s per loop
%timeit normal(n, a, b, c, eps, sigma2, r2)
1 loop, best of 3: 4.49 s per loop
%timeit function(n, a, b, c, eps, sigma3, r3)
100 loops, best of 3: 17.7 ms per loop
I know you didn't want cython so numba is probably also out of the question but the speedup is amazing (410 times faster!)