
Why is my Python code going out of bounds?


I've been trying to encode a string (e.g. aabbbacc) as something like a2b3a1c2. This is the code I've tried:

string_value = "aabbbacc"
temp_string = ""
for i in range(0, len(string_value)):
    if i != len(string_value) or i > len(string_value):
        temp_count = 1
        while string_value[i] == string_value[i+1]:
            temp_count += 1
            i += 1
        temp_string += string_value[i] + str(temp_count)
print(temp_string)

The problem is that even though I've added an if condition to stop the out-of-bounds access from happening, I still get this error:

Traceback (most recent call last):
  File "C:run_length_encoding.py", line 6, in <module>
    while string_value[i] == string_value[i+1]:
IndexError: string index out of range

I've also tried:

string_value = "aabbbacc"
temp_string = ""
for i in range(0, len(string_value)):
    count = 1
    while string_value[i] == string_value[i+1]:
        count += 1
        i += 1
        if i == len(string_value):
            break
    temp_string += string_value[i]+ str(count)
print(temp_string)

Now, I know there might be a better way to solve this, but I'm trying to understand why I'm getting the out-of-bounds exception even though I have an if condition to prevent it. At what part of the logic am I going wrong? Please explain.


Solution

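  • Neither check can trigger before the bad read. range(0, len(string_value)) stops at len(string_value) - 1, so i != len(string_value) is always true in the first attempt. In the second attempt, string_value[i+1] already fails when i is len(string_value) - 1, so the test i == len(string_value) comes one step too late. The problem is here: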

    for i in range(0, len(string_value)): # if i is the last index of the string
        count = 1
        while string_value[i] == string_value[i+1]: # i+1 is now out of bounds
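
If you would rather keep indexing, here is a minimal sketch of one way to do it (the name encode_indexed is made up for this example, not something from the question). It swaps the for loop for a while loop, because a for loop over range reassigns i at the start of every pass and throws away any i += 1 done in the body. It also tests i + 1 < len(s) before reading s[i + 1]; since `and` short-circuits, the out-of-range read never happens.

```python
# A for loop over range() ignores changes made to i inside the body:
seen = []
for i in range(3):
    seen.append(i)
    i += 10  # overwritten by the loop on the next pass
print(seen)  # [0, 1, 2]


def encode_indexed(s):
    """Index-based run-length encoding (hypothetical helper for illustration)."""
    result = ""
    i = 0
    while i < len(s):  # we control i ourselves, so skipping ahead sticks
        count = 1
        # Check the bound first; `and` stops before s[i + 1] is evaluated
        # when i is already the last valid index.
        while i + 1 < len(s) and s[i] == s[i + 1]:
            count += 1
            i += 1
        result += s[i] + str(count)
        i += 1  # move past the run that was just written
    return result


print(encode_indexed("aabbbacc"))  # a2b3a1c2
```

With the for loop kept as in the question, each run would also be re-encoded from every position inside it, which is why the loop variable has to be managed by hand here.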
    

    The easiest way to avoid going out of bounds is to not index the string at all:

    def encode(s):
        if s == '':   # handle empty string
            return s
        current = s[0]  # start with first character (won't fail since we checked for empty)
        count = 1
        temp = ''
        for c in s[1:]:  # iterate through remaining characters (string slicing won't fail)
            if current == c:
                count += 1
            else: # character changed, output count and reset current character and count
                temp += f'{current}{count}'
                current = c
                count = 1
        temp += f'{current}{count}'  # output last count accumulated
        return temp
    
    print(encode('aabbbacc'))
    print(encode(''))
    print(encode('a'))
    print(encode('abc'))
    print(encode('abb'))
    

    Output:

    a2b3a1c2
    
    a1
    a1b1c1
    a1b2
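
Since the question mentions there might be a better way: the standard library's itertools.groupby groups consecutive equal items, so the same encoding can be sketched without any index or manual counter. This is only an alternative illustration (encode_groupby is a name chosen here), not a replacement for understanding the loop above.

```python
from itertools import groupby


def encode_groupby(s):
    """Run-length encode s using itertools.groupby (illustrative name)."""
    # groupby yields (character, iterator over that run of characters);
    # counting the items in each run gives the run length.
    return "".join(f"{char}{sum(1 for _ in run)}" for char, run in groupby(s))


print(encode_groupby("aabbbacc"))  # a2b3a1c2
print(encode_groupby("abb"))       # a1b2
```

An empty string produces no groups, so the function returns an empty string without a special case.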