Search code examples
pythonexcelpandasxlrd

Read Excel Rows Using Pandas like Xlrd module


In this example, I'm showing 2 ways how I read the excel file and print data. Instead of using both ways, I want to use pandas module to output data as I did with xlrd module. I want to loop through the column/rows and append it to the arrays (Col_B3, Col_C3, Col_D3) How can I do this?

import xlrd
import pandas as pd
import io

path = r'C:\Temp Files\My_Excel_File.xlsx'

''' USING XLRD '''
#open workbook
inputWorkbook = xlrd.open_workbook(path)
#open first sheet
Sheet = inputWorkbook.sheet_by_index(0)

Col_B3 = []
Col_C3 = []
Col_D3 = []

for row in range(2 ,Sheet.nrows):
    Col_B3.append(Sheet.cell_value(row, 1))
    Col_C3.append(Sheet.cell_value(row, 2))
    Col_D3.append(Sheet.cell_value(row, 3))

print(Col_B3)    
print(Col_C3)
print(Col_D3)


''' USING PANDAS '''
df = pd.read_excel(path)

print(df)

XLRD OUTPUT

['Col_B3', 1.0, 2.0, 3.0, 4.0]
['Col_C3', 'Jack', 'Jill', 'Peter', 'Jade']
['Col_D3', 1200.0, 875.0, 120.0, 4230.0]

PANDAS OUTPUT

   Unnamed: 0 Unnamed: 1 Unnamed: 2 Unnamed: 3
0         NaN        NaN        NaN        NaN
1         NaN     Col_B3     Col_C3     Col_D3
2         NaN          1       Jack       1200
3         NaN          2       Jill        875
4         NaN          3      Peter        120
5         NaN          4       Jade       4230

The Excel File


Solution

  • With XLRD, you had more control on how to wrangle the data as you read it in. Pandas reads your data as is; your first column is empty, same as first row. Your data is also in column format, so Pandas read it in column wise.

    You can reshape it into list form with Pandas:

    res = (df.dropna(how='all') #remove completely empty rows
          .dropna(how='all',axis=1) #remove completely empty columns
          .T #flip columns into row position
          #convert to list    
          .to_numpy()
          .tolist()
          )
    
    print(res)
    
    [['Col_B3', '1', '2', '3', '4'],
     ['Col_C3', 'Jack', 'Jill', 'Peter', 'Jade'],
     ['Col_D3', '1200', '875', '120', '4230']]