In this example, I'm showing 2 ways how I read the excel file and print data.
Instead of using both ways, I want to use pandas module to output data as I did with xlrd module.
I want to loop through the column/rows and append it to the arrays (Col_B3, Col_C3, Col_D3)
How can I do this?
import xlrd
import pandas as pd
import io
path = r'C:\Temp Files\My_Excel_File.xlsx'
''' USING XLRD '''
#open workbook
inputWorkbook = xlrd.open_workbook(path)
#open first sheet
Sheet = inputWorkbook.sheet_by_index(0)
Col_B3 = []
Col_C3 = []
Col_D3 = []
for row in range(2 ,Sheet.nrows):
Col_B3.append(Sheet.cell_value(row, 1))
Col_C3.append(Sheet.cell_value(row, 2))
Col_D3.append(Sheet.cell_value(row, 3))
print(Col_B3)
print(Col_C3)
print(Col_D3)
''' USING PANDAS '''
df = pd.read_excel(path)
print(df)
XLRD OUTPUT
['Col_B3', 1.0, 2.0, 3.0, 4.0]
['Col_C3', 'Jack', 'Jill', 'Peter', 'Jade']
['Col_D3', 1200.0, 875.0, 120.0, 4230.0]
PANDAS OUTPUT
Unnamed: 0 Unnamed: 1 Unnamed: 2 Unnamed: 3
0 NaN NaN NaN NaN
1 NaN Col_B3 Col_C3 Col_D3
2 NaN 1 Jack 1200
3 NaN 2 Jill 875
4 NaN 3 Peter 120
5 NaN 4 Jade 4230
With XLRD, you had more control on how to wrangle the data as you read it in. Pandas reads your data as is; your first column is empty, same as first row. Your data is also in column format, so Pandas read it in column wise.
You can reshape it into list form with Pandas:
res = (df.dropna(how='all') #remove completely empty rows
.dropna(how='all',axis=1) #remove completely empty columns
.T #flip columns into row position
#convert to list
.to_numpy()
.tolist()
)
print(res)
[['Col_B3', '1', '2', '3', '4'],
['Col_C3', 'Jack', 'Jill', 'Peter', 'Jade'],
['Col_D3', '1200', '875', '120', '4230']]