Search code examples
pythonexcelopenpyxldefaultdict

Using defaultdict to append a list from an .xlsx file


I'm trying to take an excel file with two fields, ID and xy coordinates and create a dictionary so that each ID is a key to all of the xy coordinate values.

for example, the excel file looks like this: [1] [1]: https://i.sstatic.net/P2agc.png

but there are more than 900 oID values

I want the final format to be something like, [('0',[-121.129247,37.037939,-121.129247,37.037939,-121.056516,36.997779]), ('1',[all, the, coordinates,with,oID,of,1]),('2'[all,the,coordinate,with,oID,of,2]etc.)]

I am trying to use a for statement to iterate through the excel sheet to populate a list with the first 200 rows, and then putting that into a default dict.

Here is what I have done so far:

wb=openpyxl.load_workbook('simpleCoordinate.xlsx')
sheet=wb['Sheet1']

from collections import defaultdict

CoordDict = defaultdict(list)
for i in range (1,201,1):
    coordinate_list=[(sheet.cell(row=i,column=1).value, sheet.cell(row=i,     column=2).value)]

for oID, xy in coordinate_list:
   CoordDict[oID].append(xy)

print(list(CoordDict.items()))

which returns:

[(11, ['-121.177487,35.49885'])]

Only the 200th line of the excel sheet, rather than the whole thing.. I'm not sure what I'm doing wrong, is it something with the for statement? Am I thinking about this in the wrong way? I'm a total newbie to python any advice would be helpful!


Solution

  • You are overwriting coordinate_list 200 times. Instead, create it, then append to it with the += operator.

    wb=openpyxl.load_workbook('simpleCoordinate.xlsx')
    sheet=wb.get_sheet_by_name('Sheet1')
    
    from collections import defaultdict
    
    coordinate_list = list()
    for i in range (1,201,1):
        coordinate_list += [(sheet.cell(row=i,column=1).value, sheet.cell(row=i,     column=2).value)]
    
    coord_dict = defaultdict(list)
    for oid, xy in coordinate_list:
        coord_dict[oid] = xy
    
    print(list(coord_dict.items()))