Search code examples
pythonperformancefor-loopfor-in-loop

python massive performance difference array iteration vs "if in"


Both the code snippets below check if an element exists in the array but first approach takes < 100ms while the second approach takes ~6 seconds .

Does anyone know why ?

import numpy as np
import time

xs = np.random.randint(90000000, size=8000000)

start = time.monotonic()
is_present = -4 in xs

end = time.monotonic()

print( 'exec time:', round(end-start, 3) , 'sec ') // 100 milliseconds

start = time.monotonic()
for x in xs:
  if (x == -4):
    break

end = time.monotonic()

print( 'exec time:', round(end-start, 3) , 'sec ') // 6000 milliseconds ```

repl link


Solution

  • numpy is specifically built to accelerate this kind of code, it is written in c with almost all of the python overhead removed, comparatively your second attempt is pure python so it takes much longer to loop through all the elements