python excel xlrd

How to extract values from an Excel sheet in Python where some rows are formatted as numbers and others as text


I have an Excel sheet where some rows are formatted as numbers and others are formatted as text. I have a Python script that reads each row, converts it to JSON, and processes it. The Excel sheet looks like this:

[Image: screenshot of the Excel sheet]

In the image above, the values in column 1 of rows 98, 99, 100, 101 and 102 are formatted as text, while the values in column 1 of rows 103 and 104 are formatted as numbers. I use the Python code below to read them:

import xlrd
wb = xlrd.open_workbook("file.xlsx")

sheet = wb.sheet_by_index(4)

for i in range(4, 185):  # 0-based indices 4 to 184, i.e. the 5th to the 185th row
    data = dict()
    print(str(sheet.cell_value(i, 0)))
    data['ProductNumber'] = str(sheet.cell_value(i, 0))

I have to store the values as strings, which is why I use str. But I get the results below for rows 98 to 104:

ROW     Value Extracted
98          907
99          088
100         019
101         274
102         905
103         290.0
104         971.0

These results look fine for rows 98 to 102, but they are wrong for rows 103 and 104: the values should be just 290 and 971. To fix this, I wrapped the value in int before str, like this:

str(int(sheet.cell_value(i, 0)))

With this change, the results look like this:

ROW     Value Extracted
98          907
99          88
100         19
101         274
102         905
103         290
104         971

This is wrong again, because the values for rows 99 and 100 should be 088 and 019, not 88 and 19. How can I resolve this? Is there a check I can use to decide how to convert each cell? I am using the xlrd Python module.


Solution

  • For your example data this would work, provided every value is meant to be exactly three digits wide:

    str(int(sheet.cell_value(i,0))).zfill(3)
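If the values are not all the same width, padding with zfill cannot restore the original text. A possible alternative, sketched here as an assumption rather than as part of the original answer, is to branch on the Python type of the value: xlrd returns a float for number-formatted cells and a str for text-formatted cells. The helper name cell_to_str is hypothetical.

```python
def cell_to_str(value):
    """Convert a cell value to a string without losing leading zeros.

    Hypothetical helper: assumes `value` comes from sheet.cell_value(),
    which yields float for number cells and str for text cells.
    """
    # Number cells arrive as floats such as 290.0; drop the ".0"
    # only when the float holds a whole number.
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    # Text cells (e.g. "088") and non-integral numbers are kept as they are.
    return str(value)


# Stand-ins for the values read from rows 99, 100 and 103 in the question.
for sample in ["088", "019", 290.0]:
    print(cell_to_str(sample))
```

In the loop from the question this might be used as data['ProductNumber'] = cell_to_str(sheet.cell_value(i, 0)). The same decision could also be made by comparing sheet.cell_type(i, 0) against xlrd.XL_CELL_NUMBER instead of checking the Python type.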