Tags: python, indexing, slice

When using negative numbers to slice a string, why is 0 disabled?


Let's say I have a string:

>>> a = 'akwkwas'
>>> a[-3:]
'was'
>>> a[-3:None]
'was'
>>> a[-3:0]
''

Why can't I use 0 as the end of the slice?

This is from the docs:

One way to remember how slices work is to think of the indices as pointing between characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of n characters has index n, for example:

 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...6 in the string; the second row gives the corresponding negative indices. The slice from i to j consists of all characters between the edges labeled i and j, respectively.
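
As a small illustration of the diagram, assuming the same example string 'Python', each slice below takes the characters between the two labeled edges:

```python
word = 'Python'

# Edges 0 and 2 enclose the first two characters.
assert word[0:2] == 'Py'

# Edge -2 is the same position as edge 4; an omitted stop means the right edge (6).
assert word[-2:] == word[4:6] == 'on'

# Edges -6 and -3 are the same positions as edges 0 and 3.
assert word[-6:-3] == word[0:3] == 'Pyt'
```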

So when we use negative indices in a loop, we have to check the value of the end index, because there is no negative index that means the end of the string. For example, when formatting a string of digits as a money-like string:

>>> a = '12349878334'
>>> print(','.join([a[-i-3:-i if i else None] for i in range(0, len(a), 3)][::-1]))
12,349,878,334
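
The same guard can be spelled out as a small function. This is only a sketch: the name group_thousands is made up for illustration, and it assumes the input is a plain string of digits.

```python
def group_thousands(digits):
    """Insert a comma between every group of three digits, counting from the right."""
    groups = []
    for i in range(0, len(digits), 3):
        # When i is 0 the stop must be None: -0 is just 0, which is the
        # start of the string, so digits[-3:-0] would be empty.
        stop = -i if i else None
        groups.append(digits[-i - 3:stop])
    # Groups were collected right to left, so reverse before joining.
    return ','.join(reversed(groups))


print(group_thousands('12349878334'))
```

For the string above this prints 12,349,878,334, the same result as the one-liner.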

Solution

  • 0 is the start of the sequence. Always, unambiguously. Changing its meaning to sometimes be the end would lead to a lot of confusion, especially when using variables for those two values.

    Using negative indices is also not a different mode; negative indices are converted to positive indices relative to the length. Changing what element 0 refers to because the other slice input (start or stop) was a negative number makes no sense.

    Because 0 always means the first element of the sequence, and there is no spelling for a negative zero, you cannot use 0 to mean the end of the sequence.

    You can use None as the stop element to mean this instead, if you need to parameterise your indices:

    start = -3
    stop = None
    result = a[start:stop]
    

    You can also create a slice() object; the same rules apply for how indices are interpreted:

    indices = slice(-3, None)
    result = a[indices]
    

    In fact, the interpreter translates slice notation into a slice() object and passes it to the object's __getitem__ method, which is how slicing is distinguished from plain indexing with a single integer: the a[start:stop] notation translates to type(a).__getitem__(a, slice(start, stop)), whereas a[42] becomes type(a).__getitem__(a, 42).

    So a single variable can hold either a slice() object or an integer, which lets you record either slicing or single-element indexing.
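
Both points can be checked directly. The sketch below uses the built-in slice.indices() method to show how negative values are resolved against the length, and a throwaway class (Recorder, a name invented here) to show what __getitem__ receives:

```python
a = 'akwkwas'

# slice.indices(length) resolves start and stop against a concrete length.
# -3 becomes 7 - 3 = 4, while 0 stays 0, so the range from 4 to 0 is empty.
assert slice(-3, 0).indices(len(a)) == (4, 0, 1)
# A stop of None resolves to the length itself.
assert slice(-3, None).indices(len(a)) == (4, 7, 1)


class Recorder:
    """Hypothetical helper that returns whatever key it is indexed with."""

    def __getitem__(self, key):
        return key


r = Recorder()
assert r[-3:] == slice(-3, None, None)  # slice notation arrives as a slice object
assert r[42] == 42                      # plain indexing arrives as the integer

# One variable can therefore hold either kind of index.
for index in (slice(-3, None), -1):
    print(repr(a[index]))
```

The loop prints 'was' and then 's': the first index is a slice object, the second a plain integer.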