I have xls type files that have 9 columns and varying number of rows. I would like to use xlrd or other modules so that I can extract all the values from the 1st and 2nd columns out of the nine columns, and then use the extracted values individually. So far my code looks like:
import xlrd
import openpyxl
workbook = xlrd.open_workbook('C09.xls')
sheet_names = workbook.sheet_names()
sheet = workbook.sheet_by_name(sheet_names[0])
num_rows = sheet.nrows
num_cols = sheet.ncols
plist = [[0 for x in range(3)] for x in range(num_rows)]
for i in range(num_rows):
for j in range(3):
plist[i][j] = sheet.cell(i,j).value
and then use the values in [i] (such as doing multiplications and whatnot) and then pulling the corresponding values from [j].
The above code gives an output like:
[['Col header 1', 'Col header 2', 'Col header 3'], [1.0, 1000, 2000], [2.0, 1001, 2001], ..... so on]
Is there a simple way of doing so? I am quite new to Python, so I would appreciate it if you could be a bit more specific. Thank you so much!
Some feedback/improvements:
In your snippet, the initialization part is redundant. This avoids the double iteration:
plist = [[sheet.cell(i,j).value for j in range(3)] for i in xrange(num_rows)]
should you run into None
values, you can normalize them with:
plist = [[sheet.cell(i,j).value or 0 for j in range(3)] for i in xrange(num_rows)]
Finally, here is a more Pythonic way of doing 0-initializations:
plist = [x[:] for x in [[0] * 3] * sheet.nrows