I'm getting two widely different execution times for the code below:
import numpy as np
import time
array = np.arange(0, 750000)
param = 20000
t1 = time.time()
for _ in range(param):
    array <= 120
print(round(time.time() - t1), _)
# 9 19999
t2 = time.time()
for _ in range(param):
    array - 120 <= 0
print(round(time.time() - t2), _)
# 19 19999
I expected the execution times of the two approaches to be similar.
What's the reason for this difference? Is NumPy internally casting 120 to an array in the second approach?
What other similar bottlenecks should I be aware of when optimising code? Happy to read docs on that. Thanks!
NumPy can't perform array - 120 <= 0 as a single fused operation, or rewrite the expression as array <= 120. It needs to perform the operation as the two steps written:
array - 120
and
result <= 0
and each of these operations builds a new 750000-element array: one of subtraction results, and one of comparison results.
That's much slower than comparing each element to 120 and building an array of comparison results directly, as array <= 120 does.
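To make the two steps concrete, here is a minimal sketch that spells out the temporary array and times both forms with timeit. The names diff, two_step, t_direct and t_two_step are only illustrative, and the timings will vary by machine, so treat the numbers as a rough comparison rather than a benchmark.

```python
import timeit
import numpy as np

array = np.arange(0, 750000)

# Direct comparison: one pass over the data, one new boolean array.
direct = array <= 120

# What `array - 120 <= 0` does, written out as two separate steps.
diff = array - 120      # temporary integer array, same length as `array`
two_step = diff <= 0    # second pass, builds the boolean array

# Both forms produce the same mask for these values.
same = np.array_equal(direct, two_step)

# Rough timings; absolute numbers depend on the machine.
t_direct = timeit.timeit(lambda: array <= 120, number=200)
t_two_step = timeit.timeit(lambda: array - 120 <= 0, number=200)
print(f"direct:   {t_direct:.3f} s")
print(f"two-step: {t_two_step:.3f} s")
```

The extra cost in the second form comes from allocating and filling the intermediate diff array before the comparison even starts.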