
Python's interpretation of tabs and spaces to indent


I decided to learn a bit of Python. The first introduction says that it uses indentation to group statements. While the best habit is clearly to use only tabs or only spaces, what happens if I interchange them? How many spaces will be considered equal to one tab? Or will it fail to work at all if tabs and spaces are mixed?


Please use "I'm getting an IndentationError (or a TabError). How do I fix it?" to close questions where the OP has a TabError resulting from mixing tabs and spaces for indentation. This question is specifically about how such indentation is interpreted, not why it causes a problem or how to fix it.


Solution

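  • A tab is not interchangeable with an arbitrary number of spaces. A line indented with a tab is at a different indentation from a line indented with 1, 2 or 4 spaces; whether it matches a line indented with 8 spaces depends on the Python version, as explained further down.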

    Proof by counter-example (erroneous, or, at best, limited - tab != 4 spaces):

    x = 1
    if x == 1:
    ^Iprint "fff\n"
        print "yyy\n"
    

    The '^I' shows a TAB. When run through Python 2.5, I get the error:

      File "xx.py", line 4
        print "yyy\n"
                    ^
    IndentationError: unindent does not match any outer indentation level
    

    Thus showing that in Python 2.5, tabs are not equal to spaces (and in particular not equal to 4 spaces).


    Oops - embarrassing; my proof by counter-example shows that tabs are not equivalent to 4 spaces. As Alex Martelli points out in a comment, in Python 2, tabs are equivalent to 8 spaces, and adapting the example with a tab and 8 spaces shows that this is indeed the case.

    x = 1
    if x != 1:
    ^Iprint "x is not 1\n"
            print "y is unset\n"
    

    In Python 2, this code works, printing nothing.


    In Python 3, the rules are slightly different (as noted by Antti Haapala). Compare:

    Python 2 says:

    First, tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.

    Python 3 says:

    Tabs are replaced (from left to right) by one to eight spaces such that the total number of characters up to and including the replacement is a multiple of eight (this is intended to be the same rule as used by Unix). The total number of spaces preceding the first non-blank character then determines the line’s indentation. Indentation cannot be split over multiple physical lines using backslashes; the whitespace up to the first backslash determines the indentation.

    (Apart from the opening word "First," these are identical.)

    Python 3 adds an extra paragraph:

    Indentation is rejected as inconsistent if a source file mixes tabs and spaces in a way that makes the meaning dependent on the worth of a tab in spaces; a TabError is raised in that case.

    This means that the TAB vs 8-space example that worked in Python 2 would generate a TabError in Python 3. It is best (and, in Python 3, necessary) to ensure that the sequence of characters making up the indentation on each line in a block is identical. PEP 8 says 'use 4 spaces per indentation level'. (Google's coding standards say 'use 2 spaces'.)
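
The rules quoted above can be checked with a small Python 3 sketch. The helper names `indent_width` and `compile_outcome` are made up for illustration and are not part of Python. `indent_width` is only a rough model of the documented eight-column rule (it ignores form feeds), not the tokenizer's actual code.

```python
def indent_width(line, tabsize=8):
    """Column of the first non-blank character under the quoted rule:
    each tab advances to the next multiple of `tabsize`."""
    col = 0
    for ch in line:
        if ch == "\t":
            col = (col // tabsize + 1) * tabsize
        elif ch == " ":
            col += 1
        else:
            break
    return col


def compile_outcome(source):
    """Return "ok" if `source` compiles, else the exception class name."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:  # TabError and IndentationError are subclasses
        return type(exc).__name__
    return "ok"


tab_line = "\tx = 1"             # indented with one tab
space_line = " " * 8 + "y = 2"   # indented with eight spaces

# Both lines have the same width under the eight-column rule...
print(indent_width(tab_line), indent_width(space_line))

# ...but Python 3 still rejects a block that mixes the two styles.
mixed = "if True:\n" + tab_line + "\n" + space_line + "\n"
tabs_only = "if True:\n\tx = 1\n\ty = 2\n"
print(compile_outcome(mixed))
print(compile_outcome(tabs_only))
```

On CPython 3 this should print `8 8`, then `TabError`, then `ok`. The two indentations are equal in width, yet the mixed block is rejected because its meaning would change under a different tab size.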