Search code examples
pythonperformancerandom

Performance of choice vs randint


I want to pick a random integer between a and b, inclusive.

I know 3 ways of doing it. However, their performance seems very counter-intuitive:

import timeit

t1 = timeit.timeit("n=random.randint(0, 2)", setup="import random", number=100000)
t2 = timeit.timeit("n=random.choice([0, 1, 2])", setup="import random", number=100000)
t3 = timeit.timeit("n=random.choice(ar)", setup="import random; ar = [0, 1, 2]", number=100000)

[print(t) for t in [t1, t2, t3]]

On my machine, this gives:

0.29744589625620965
0.19716156798482648
0.17500512311108346

Using an online interpreter, this gives:

0.23830216699570883
0.16536146598809864
0.15081614299560897

Note how the most direct version (#1) that uses the dedicated function for doing what I'm doing is 50% worse that the strangest version (#3) which pre-defines an array and then chooses randomly from it.

What's going on?


Solution

  • It's just implementation details. randint delegates to randrange, so it has another layer of function call overhead, and randrange goes through a lot of argument checking and step handling. In contrast, choice is much shorter.

    Here's the code path randint goes through for this call, with comments and unexecuted code stripped out:

    def randint(self, a, b):
        return self.randrange(a, b+1)
    
    def randrange(self, start, stop=None, step=_ONE):
        try:
            istart = _index(start)
        except TypeError:
            # Not executed
        if stop is None:
            # Not executed
    
        try:
            istop = _index(stop)
        except TypeError:
            # Not executed
        width = istop - istart
        try:
            istep = _index(step)
        except TypeError:
            # Not executed
        if istep == 1:
            if width > 0:
                return istart + self._randbelow(width)
    

    And here's the code path choice goes through:

    def choice(self, seq):
        if not len(seq):
            # Not executed
        return seq[self._randbelow(len(seq))]