When to use == and when to use is?

Curiously:

>>> a = 123
>>> b = 123
>>> a is b
True
>>> a = 123.
>>> b = 123.
>>> a is b
False

It seems that a is b is more or less defined as id(a) == id(b). That makes it easy to introduce bugs:

import os

basename, ext = os.path.splitext(fname)
if ext is '.mp3':  # bug: compares object identity, not value
    pass  # do something
else:
    pass  # do something else

Some fnames unexpectedly ended up in the else block. The fix is simple: use ext == '.mp3' instead. Nonetheless, on the surface, if ext is '.mp3' seems like a nice, Pythonic way to write this, and it reads better than the "correct" way.

Since strings are immutable, what are the technical details of why it's wrong? When is an identity check better, and when is an equality check better?

Solution

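is checks object identity: whether both operands are the very same object. Because there is no compulsory "string interning", two strings that just happen to contain the same characters in the same order are typically not the same string object. The integer example in the question returns True only because CPython happens to cache small integers; that is an implementation detail, not something the language guarantees.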

When you extract a substring from a string (or, really, any subsequence from a sequence), you will end up with two different objects, containing the same value(s).
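A minimal sketch of that point, assuming a hypothetical file name "song.mp3" and a small helper called compare (neither comes from the question). Slicing a list always builds a new list, so the identity result there is certain. For strings, the identity result depends on the interpreter, so the sketch deliberately does not claim an output for it.

```python
import os


def compare(a, b):
    """Return a pair: (values are equal, objects are identical)."""
    return a == b, a is b


# Lists: a slice is always a new list object holding the same values.
original = [1, 2, 3]
sliced = original[:]
print(compare(original, sliced))  # (True, False)

# Strings: splitext returns the extension as a separate piece of fname.
fname = "song.mp3"  # hypothetical file name
basename, ext = os.path.splitext(fname)
expected = ".mp3"
equal, identical = compare(ext, expected)
print(equal)      # True: same characters
print(identical)  # interpreter-dependent; do not rely on it
```

The equality result is the only one the language promises for the string case, which is why the else block in the question was reached.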

So, use is when and only when you are comparing object identities. Use == when comparing values.
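As a rough illustration of where an identity check is the right tool: comparing against singletons such as None, or against a private sentinel object. The names _MISSING, describe, and AlwaysEqual below are made up for this sketch.

```python
_MISSING = object()  # hypothetical sentinel: a unique object nothing else can be


def describe(value=_MISSING):
    """Distinguish 'no argument given' from an explicit None."""
    if value is _MISSING:
        return "no argument"
    if value is None:
        return "explicit None"
    return f"value {value!r}"


class AlwaysEqual:
    """A class whose == claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(describe())      # no argument
print(describe(None))  # explicit None
print(obj == None)     # True: == is answered by the overridden __eq__
print(obj is None)     # False: identity cannot be overridden
```

Identity cannot be customized by a class, so is gives a trustworthy answer for None and sentinels, while == asks the objects themselves and is the right choice for values such as strings and numbers.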