Say I have the following numpy array
n = 50
a = np.array(range(1, 1000)) / 1000.
I would like to execute this line of code
%timeit v = [a ** k for k in range(0, n)]
1000 loops, best of 3: 2.01 ms per loop
However, this line of code will ultimately be executed in a loop, therefore I have performance issues.
Is there a way to optimize the loop? For example, the result of a specific calculation i
in the list comprehension is simply the result of the previous calculation result in the loop, multiplied by a
again.
I don't mind storing the results in a 2d-array instead of arrays in a list. That would probably be cleaner. By the way, I also tried the following, but it yields similar performance results:
k = np.array(range(0, n))
ones = np.ones(n)
temp = np.outer(a, ones)
And then performed the following calculation
%timeit temp ** k
1000 loops, best of 3: 1.96 ms per loop
or
%timeit np.power(temp, k)
1000 loops, best of 3: 1.92 ms per loop
But both yields similar results to the list comprehension above. By the way, n
will always be an integer
in my case.
In quick tests cumprod
seems to be faster.
In [225]: timeit v = np.array([a ** k for k in range(0, n)])
2.76 ms ± 1.62 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [228]: %%timeit
...: A=np.broadcast_to(a[:,None],(len(a),50))
...: v1=np.cumprod(A,axis=1)
...:
208 µs ± 42.3 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
To compare values I have to tweak ranges, since v
includes a 0 power, while v1
starts with a 1 power:
In [224]: np.allclose(np.array(v)[1:], v1.T[:-1])
Out[224]: True
But the timings suggest that cumprod
is worth refining.
The proposed duplicate was Efficient way to compute the Vandermonde matrix. That still has good ideas.