I have an excel spreadsheet with 2 columns that I am attempting to convert to a 2d array with the following code:
#!/usr/bin/python3
import openpyxl
import sys
import os
book = openpyxl.load_workbook('contact2019.xlsx')
sheet = book.active
first_sheet = book.get_sheet_names()[0]
worksheet = book.get_sheet_by_name(first_sheet)
excel_data = [[0 for x in range(2)] for y in range(1)]
print(len(excel_data))
first = 0
cell_num = 0
for row in range(2,worksheet.max_row+1):
for column in "AB": #Here you can add or reduce the columns
cell_name = "{}{}".format(column, row)
excel_data.append(worksheet[cell_name].value)
print(excel_data)
My issue is that the data is simply being added to a 1d array sequentially, so my output of this is as follows:
['Sam Adams', '*******@gmail.com']
This is not a 2d array which I have initialized.
Input data is as follows:
Sam Adams **********@gmail.com
Sammy Adams **********@gmail.com
Samuel Adams **********@gmail.com
Samantha Adams **********@gmail.com
Sam Adams **********@gmail.com
Why is this assigning data sequentially instead of two per row?
Welcome to SO!
Your code below iterates through and adds each item as a separate item, hence you get everything in sequential order.
for row in range(2,worksheet.max_row+1):
for column in "AB": #Here you can add or reduce the columns
cell_name = "{}{}".format(column, row)
excel_data.append(worksheet[cell_name].value)
Rather than looping through first the rows and then all the columns, you could replace the code to only loop through the rows.
for row in range(2,worksheet.max_row+1):
excel_data.append([worksheet["A{}".format(row)].value,worksheet["B{}".format(row)].value])
By doing this you are now creating a list of 2 element lists (or a 2d array).
If you aren't set on doing it this way, you can also consider the python library pandas
which abstracts out a lot of the work and lets you work with spreadsheet like objects called DataFrames
.