Tags: python, performance, python-3.x, if-statement, readability

Would it be better to use "if x in (y, z)" over "if x == y or x == z"?


Given this simple condition:

if x == y or x == z:
    print("Hello World!")

I understand that Python would first check whether x is equal to y and, only if it is not, then check whether x is equal to z, printing Hello World! if at least one of the conditions is True.

If I were to do this instead:

if x in (y, z):
    print("Hello World!")

To my understanding, Python would iterate through the (y, z) tuple and print Hello World! if x is equal to one of its elements.

Which method would be faster / more efficient to use?
Would Python not bother to check if x was equal to z if x was equal to y?
Would Python still execute the code in the if statement if x was equal to y but not z?

Thank you in advance.


Solution

  • Let's test it out ourselves.

    Here is a class that overloads the equality operator to let us see what Python is doing:

    class Foo:
      def __init__(self, name):
        self.name = name
    
      def __eq__(self, other):
        # Report every equality check so we can see which comparisons run
        print(self.name, "==", other.name, "?")
        return self.name == other.name
    

    Let's test out short circuiting:

    # x and a are passed the same string because we want x == a to be True
    x = Foo("a")
    a, b = Foo("a"), Foo("b")
    if x in (a, b):
      print("Hello World!")
    

    For me, this outputs:

    a == a ?
    Hello World!
    

    a == b ? was not printed, so the tuple form short-circuits: once x matches a, Python does not go on to compare x with b. The if block is also executed, as expected.
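
    To put the two forms side by side, here is a minimal sketch that counts __eq__ calls instead of printing them. CountingFoo and count_eq are hypothetical names made up for this illustration; they are not part of the class above. It only counts comparisons, because which operand ends up on the left of == inside the tuple check can differ between Python versions.

```python
class CountingFoo:
    """Hypothetical variant of Foo that counts __eq__ calls instead of printing."""

    eq_calls = 0  # class-wide counter, reset before each experiment

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        CountingFoo.eq_calls += 1
        return self.name == other.name


def count_eq(check):
    """Run check() and return (result as bool, number of __eq__ calls made)."""
    CountingFoo.eq_calls = 0
    result = bool(check())
    return result, CountingFoo.eq_calls


x = CountingFoo("a")
a, b, c = CountingFoo("a"), CountingFoo("b"), CountingFoo("c")

first_match_or = count_eq(lambda: x == a or x == b)
first_match_in = count_eq(lambda: x in (a, b))
second_match_or = count_eq(lambda: x == b or x == a)
second_match_in = count_eq(lambda: x in (b, a))
no_match_or = count_eq(lambda: x == b or x == c)
no_match_in = count_eq(lambda: x in (b, c))
# The tuple check tests identity before equality, so the same object
# is found without calling __eq__ at all.
same_object_in = count_eq(lambda: x in (x, b))

print("first match: ", first_match_or, first_match_in)
print("second match:", second_match_or, second_match_in)
print("no match:    ", no_match_or, no_match_in)
print("same object: ", same_object_in)
```

    Both forms stop at the first match and make the same number of comparisons. The one difference the sketch shows is the last case: a membership test on a tuple accepts an element that is the very same object without calling __eq__, whereas x == x would still call it.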

    Now for speed. If we modify the above __eq__ method to remove the print statement (to avoid I/O in our benchmark) and use IPython's %timeit magic command, we can test it this way:

    c = Foo("c") # for comparison when x is not equal to either case
    %timeit x in (a, b) # equal to first case
    %timeit (x == a or x == b)
    %timeit x in (b, a) # equal to second case
    %timeit (x == b or x == a)
    %timeit x in (b, c) # not equal to either case
    %timeit (x == b or x == c)
    

    These are the average times per iteration (from 1 million iterations):

    Code               Time (ns)
    x in (a, b)        437
    x == a or x == b   397
    x in (b, a)        796
    x == b or x == a   819
    x in (b, c)        779
    x == b or x == c   787
    

    So the results are comparable: the two forms differ by at most about 10% in either direction (437 vs. 397 ns, 796 vs. 819 ns, 779 vs. 787 ns), which isn't big enough to worry about. Just use whichever is more readable, on a case-by-case basis.
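
    If you don't have IPython, the same comparison can be approximated with the standard-library timeit module. This is only a sketch under a few assumptions: the loop count NUMBER and the use of the best of five runs are choices made here, not what %timeit does internally, and the numbers you get will depend on your machine and Python version.

```python
import timeit


class Foo:
    """Same idea as the class above, with the print call removed from __eq__."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return self.name == other.name


x = Foo("a")
a, b, c = Foo("a"), Foo("b"), Foo("c")
names = {"x": x, "a": a, "b": b, "c": c}  # namespace the timed statements run in

statements = [
    "x in (a, b)", "x == a or x == b",  # equal to first case
    "x in (b, a)", "x == b or x == a",  # equal to second case
    "x in (b, c)", "x == b or x == c",  # not equal to either case
]

NUMBER = 100_000  # assumed loop count, kept small so the script finishes quickly

results = {}
for stmt in statements:
    # repeat() returns the total seconds for each run; take the fastest run
    best = min(timeit.repeat(stmt, globals=names, number=NUMBER, repeat=5))
    results[stmt] = best / NUMBER * 1e9  # nanoseconds per evaluation

for stmt, ns in results.items():
    print(f"{stmt:<18} {ns:8.1f} ns")
```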