Tags: python, list, dictionary, index-error

List index out of range error when adding a new dictionary key with iterating index numbers on said list


Not gonna go into too many specifics on the background. The myquery thing runs a script that pulls from my work database. Everything gets put into a list. From there it's supposed to dump everything into a Google Sheet, one line at a time. Worked great when it was just two variables I had to worry about and I could use something binary like a dictionary. Well, now I got more; 4 categories, each need to be on their own line.

I just need to split up the list (rawskudata) into a bunch of smaller lists (componant) assigned to a dictionary (skuandimages). The problem is on this line:

skuandimages[c_list] = [rawskudata[int(c_sku)], rawskudata[int(c_img_url)], rawskudata[int(c_name)], rawskudata[int(c_quantity)]]

and I get IndexError: list index out of range.

based on me staring at it for literally two straight hours and making every google search result for "indexerror" purple, this SHOULD work. the list it's pulling from DOES have the index number. i checked with all sorts of print statements. why. why is it not. i want to die

mycursor = mydb.cursor()

skuandimages = {
    
}

myquery2 = #insert top secret query here

mycursor.execute(myquery2)

rawskudata = []

c_tag = 0

c_sku = 0
c_img_url = 1
c_name = 2
c_quantity = 3

print(mycursor)

for xy in mycursor:
    for yx in range(2,6):
        rawskudata.append(str(xy[yx]))

print(rawskudata)

for z in range(0,len(rawskudata)):
  #skuandimages[str(x[2]) + "-" + str(x[3])] = x[4]
  c_list = "componant" + str(c_tag)
  skuandimages[c_list] = [rawskudata[int(c_sku)], rawskudata[int(c_img_url)], rawskudata[int(c_name)], rawskudata[int(c_quantity)]]
  #skuandimages[c_list] = [x for x]
  c_sku = c_quantity + 1
  c_img_url = c_quantity + 2
  c_name = c_quantity + 3
  c_quantity = c_quantity + 4
  c_tag += 1

The print(rawskudata) returns this (data altered for privacy stuff):

['222001-1', 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/64/Garden_strawberry_%28Fragaria_%C3%97_ananassa%29_single.jpg/440px-Garden_strawberry_%28Fragaria_%C3%97_ananassa%29_single.jpg', 'Strawberry', '1', '222014-1', 'https://upload.wikimedia.org/wikipedia/commons/thumb/7/78/Ripe%2C_ripening%2C_and_green_blackberries.jpg/440px-Ripe%2C_ripening%2C_and_green_blackberries.jpg', 'Blackberry', '1', '222053-1', 'https://upload.wikimedia.org/wikipedia/commons/thumb/e/e3/Oranges_-_whole-halved-segment.jpg/440px-Oranges_-_whole-halved-segment.jpg', 'Oranges', '1', '222123-1', 'https://upload.wikimedia.org/wikipedia/commons/thumb/9/9e/Autumn_Red_peaches.jpg/440px-Autumn_Red_peaches.jpg', 'Peaches', '1', '222203-1', 'https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/Pears.jpg/440px-Pears.jpg', 'Pears', '1']

Solution

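  • You're grabbing data from rawskudata 4 items at a time, but the loop runs once for every item in the list. After a quarter of the passes the indexes have moved past the end of the list, which raises the IndexError. Loop over 1/4 the number of items in rawskudata instead:

    for z in range(0, len(rawskudata) // 4):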
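
    To see where the original loop breaks, here is a small sketch that reproduces it. The eight-item list is made-up stand-in data (2 products with 4 fields each), and failed_at is a name introduced only for this illustration.

```python
# Stand-in for rawskudata: 2 products x 4 fields (made-up values).
rawskudata = ["sku-a", "url-a", "Apple", "1",
              "sku-b", "url-b", "Banana", "2"]

c_quantity = 3      # index of the last field of the current product
failed_at = None    # loop pass on which the lookup fails, if any

for z in range(0, len(rawskudata)):   # 8 passes, but only 2 products exist
    try:
        rawskudata[c_quantity]
    except IndexError:
        failed_at = z
        break
    c_quantity = c_quantity + 4       # jump to the next product

print(failed_at)                      # 2: the first pass after the last product
```

    The lookup works for passes 0 and 1 and fails on pass 2, which is exactly len(rawskudata) // 4.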
    

    But there is an easier way to go about this. You can replace everything that happens after the line mycursor.execute(myquery2) with:

    c_tag = 0
    for xy in mycursor:
        skuandimages["componant" + str(c_tag)] = [xy[2], xy[3], xy[4], xy[5]]
        c_tag = c_tag + 1
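
    If you want to try this without a database connection, a rough sketch follows. fake_cursor is a hypothetical list of tuples standing in for the rows that mycursor yields, and the example.com URLs and the values in columns 0 and 1 are placeholders, not data from the question.

```python
# Hypothetical rows standing in for what iterating over mycursor yields.
# Columns 0 and 1 are placeholders; columns 2-5 are sku, image URL, name, quantity.
fake_cursor = [
    (1, "x", "222001-1", "https://example.com/strawberry.jpg", "Strawberry", 1),
    (2, "x", "222014-1", "https://example.com/blackberry.jpg", "Blackberry", 1),
]

skuandimages = {}
c_tag = 0
for xy in fake_cursor:
    # One dictionary entry per row, built straight from the row's columns.
    skuandimages["componant" + str(c_tag)] = [xy[2], xy[3], xy[4], xy[5]]
    c_tag = c_tag + 1

print(skuandimages["componant1"])
```

    Each row becomes one entry, so there is no flat list to index into and nothing to go out of range.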
    

    Feedback

    Below I've written some feedback on the code you wrote that should hopefully help you out as you learn Python.

    When you grab the data

    for xy in mycursor:
        for yx in range(2,6):
            rawskudata.append(str(xy[yx]))
    

    It would be better to use more descriptive variable names, for example:

    for row in mycursor:
        for sku_item in range(2,6):
            rawskudata.append(str(row[sku_item]))
    

    While naming might not matter that much in smaller applications, it becomes one of the most important things about writing code in bigger applications, and it also makes things easier if you come back to your code in the future and you're trying to figure out what it does.

    The line rawskudata.append(str(xy[yx])) converts the data to a string. It's normally best to leave the data as is until you actually need it as a string. That way if you wanted to do something else with it such as comparisons with the product quantities you would have the ability to do so.
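
    As a quick illustration of why the early str() call can bite later, compare a quantity as a number and as a string. The value 10 is made up for the example.

```python
quantity_as_int = 10
quantity_as_str = str(quantity_as_int)

numeric_check = quantity_as_int > 9    # compares numbers
text_check = quantity_as_str > "9"     # compares characters: "1" sorts before "9"

print(numeric_check, text_check)       # True False
```

    The string comparison goes character by character, so "10" counts as smaller than "9".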

    On the line for z in range(0,len(rawskudata)): the standard convention is to name the variable i (short for index) rather than z when you loop over the indexes of a list like this. Some people use the name _ for a loop variable that is never actually used in the code, which is the case here. That said, the pattern for i in range(0, len(some_list)): is usually a red flag that something has been coded in a more cumbersome way than for some_value in some_list:.
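
    A minimal side-by-side of the two loop patterns, using a made-up list of names:

```python
names = ["Strawberry", "Blackberry", "Oranges"]

# Index-based loop: works, but the index is only used to fetch the item.
by_index = []
for i in range(0, len(names)):
    by_index.append(names[i].upper())

# Direct iteration: the loop variable is the item itself.
by_value = []
for name in names:
    by_value.append(name.upper())

print(by_index == by_value)   # True
```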

    In dictionaries like skuandimages you have keys and values. The variable c_list could be better named c_key since it is a dictionary key and not a list.

    The line

    skuandimages[c_list] = [rawskudata[int(c_sku)], rawskudata[int(c_img_url)], rawskudata[int(c_name)], rawskudata[int(c_quantity)]]
    

    doesn't need to convert the indexes to integers, as they are already integers. Perhaps this was just something you added while you were trying to figure out the IndexError, but it's unnecessary here. In this situation we would want an error to happen naturally if one of the variables such as c_sku wasn't an integer.
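
    A small sketch of what "letting the error happen" looks like. Here c_sku is deliberately set to a string to simulate a mistake elsewhere in the code; outcome and masked are names introduced only for this illustration.

```python
rawskudata = ["222001-1", "Strawberry"]
c_sku = "0"   # simulated mistake: the index is a string, not an integer

try:
    rawskudata[c_sku]              # a list index must be an int or a slice
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

masked = rawskudata[int(c_sku)]    # int() quietly hides the mistake

print(outcome, masked)             # TypeError 222001-1
```

    Without int() the mistake surfaces at the line that caused it; with int() the code carries on and the problem goes unnoticed.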

    The lines

    c_sku = c_quantity + 1
    c_img_url = c_quantity + 2
    c_name = c_quantity + 3
    c_quantity = c_quantity + 4
    

    seem odd, as you are basing everything off the quantity. I would prefer creating a new variable that holds the base index for that iteration, and then adding +0, +1, +2, +3 to it. It is also more common to get the product fields before you add them to skuandimages. Something like:

    product_start_index = 0
    for _ in range(0, len(rawskudata) // 4):
    
        sku = rawskudata[product_start_index + 0]
        img_url = rawskudata[product_start_index + 1]
        name = rawskudata[product_start_index + 2]
        quantity = rawskudata[product_start_index + 3]
    
        # str() is required: "componant" + <int> raises a TypeError.
        # Dividing by 4 numbers the keys 0, 1, 2, ... like c_tag did.
        key = "componant" + str(product_start_index // 4)
        skuandimages[key] = [sku, img_url, name, quantity]
    
        product_start_index += 4
    

    Or, going back to those 4 lines, another alternative would have been

    c_sku += 4
    c_img_url += 4
    c_name += 4
    c_quantity += 4
    

    This adds 4 to each of those variables every time you go through the loop (c_sku += 4 is shorthand for c_sku = c_sku + 4), so there is no need to base the numbers off c_quantity.

    One final improvement. There is a lesser-known feature of range: a third parameter, the step, that allows us to count by 4s instead of counting by 1s. Knowing this, we can really simplify things:

    for i in range(0, len(rawskudata), 4):
    
        sku      = rawskudata[i + 0]
        img_url  = rawskudata[i + 1]
        name     = rawskudata[i + 2]
        quantity = rawskudata[i + 3]
    
        # str() is required: "componant" + <int> raises a TypeError.
        skuandimages["componant" + str(i // 4)] = [sku, img_url, name, quantity]
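
    To make the step parameter concrete, here is a short sketch on a made-up 12-item list. The slicing on the last line is a variation I'm adding for illustration, not something from the code above; starts and chunks are hypothetical names.

```python
# Made-up stand-in for rawskudata: 3 products x 4 fields.
rawskudata = ["sku-a", "url-a", "Apple", "1",
              "sku-b", "url-b", "Banana", "2",
              "sku-c", "url-c", "Cherry", "3"]

# range(start, stop, step) yields the first index of each product.
starts = list(range(0, len(rawskudata), 4))
print(starts)   # [0, 4, 8]

# Variation: slice out the four fields of each product in one go.
chunks = [rawskudata[i:i + 4] for i in starts]
print(chunks[1])   # ['sku-b', 'url-b', 'Banana', '2']
```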
    

    But like I mentioned before, the best solution would be to create skuandimages from the beginning instead of rawskudata:

    for xy in mycursor:
        skuandimages["componant" + str(c_tag)] = [xy[2], xy[3], xy[4], xy[5]]
        c_tag = c_tag + 1
    

    And if you really want to make things compact, this can be rewritten as

    for i, product_data in enumerate(mycursor):
        skuandimages["componant" + str(i)] = list(product_data[2:6])
    

    enumerate gives us both a count (0, 1, 2, etc.) as i and the actual product data each time we go through the loop. product_data[2:6] is a slice: a shorthand way of getting the items at indexes 2 through 5, that is, the third through sixth columns of the row. Database cursors typically return each row as a tuple, so list() converts the slice to a list to match the earlier versions.
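
    A quick check of how enumerate and slicing behave, using hypothetical rows in place of mycursor. It assumes each row is a six-column tuple, which the question does not confirm.

```python
# Hypothetical rows standing in for mycursor (assumed to be 6-column tuples).
rows = [
    (1, "x", "222001-1", "url-1", "Strawberry", 1),
    (2, "x", "222014-1", "url-2", "Blackberry", 1),
]

counts = [i for i, _ in enumerate(rows)]
print(counts)             # [0, 1]: enumerate counts from 0 by default

first_row = rows[0]
print(first_row[2:])      # ('222001-1', 'url-1', 'Strawberry', 1)

# With exactly six columns, [2:] and [2:6] select the same items.
same_slice = first_row[2:] == first_row[2:6]
```

    Slicing a tuple gives a tuple back, so wrap it in list() if the rest of the code expects a list.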