Search code examples
pythonexcelloopsxlrd

Trouble comparing 2 excel sheets in Python using nested for loop


Using python 3.4 and I have a problem with a small part of a larger program I am working on. For this part I need to compare column A of two excel sheets 'Bookings.xlsx' and 'Forced.xlsx'.

Column A contains booking numbers in both sheets, bookings.xlsx contains the data I need (in the same row) for every booking number in forced.xlsx

Here is the part I'm having trouble with.

reloc_sheet = reloc_book.sheet_by_index(0)
forced_sheet = forced_book.sheet_by_index(0)
bookings_sheet = bookings_book.sheet_by_index(0)

forced_rentals = []
for force_row in range(4,forced_sheet.nrows): #row 0:3 are headers
        Fnum = forced_sheet.cell(force_row, 0)
        for book_row in range(1,bookings_sheet.nrows): #row 0 is a header
                Bnum = bookings_sheet.cell(book_row,0)
                if Fnum == Bnum:
                        booNum = str(bookings_sheet.cell(book_row,0))
                        renCODate = bookings_sheet.cell(book_row,2)
                        renCOLoc = str(bookings_sheet.cell(book_row,4))
                        renUnit = str(bookings_sheet.cell(book_row,13))
                        renAgent = str(bookings_sheet.cell(book_row,12))
                        forced_rentals += [[booNum,renCODate,renCOLoc,renUnit,renAgent]]

So as far as I understand, this should look at the each booking number in the 'forced' sheet (variable Fnum) and compare it against the 'bookings' sheet (variable Bnum) and when it finds a match it will add the appropriate data from that row to the list 'forced_rentals'.

The problem is that after this loop has finished, the list is empty but it should have found 632 matches and therefore contain 632 nested lists. I'm sure it's a really simple solution but I can't figure it out.


Solution

  • I have solved my problem. First of all here is the snippet of code which now works:

    forced_rentals = []
    for force_row in range(4,forced_sheet.nrows):
            Fnum = forced_sheet.cell_value(force_row, 0)
            Fnum_type = type(Fnum)
            if type(Fnum) is float:
                    Fnum = str(Fnum)
                    Fnum = Fnum.replace('.0','')
            if Fnum[-2:] == '/1':
                    Fnum = Fnum.replace('/1','')
            for book_row in range(1,bookings_sheet.nrows):
                    Bnum = bookings_sheet.cell_value(book_row,0)
                    Bnum_type = type(Bnum)
                    if type(Bnum) is float:
                            Bnum = str(Bnum)
                            Bnum = Bnum.replace('.0','') 
                    if Bnum[-2:] == '/1':
                            Bnum = Bnum.replace('/1','')                
                    if Fnum == Bnum:
                            booNum = str(bookings_sheet.cell_value(book_row,0))
                            renCODate = bookings_sheet.cell_value(book_row,2)
                            renCOLoc = str(bookings_sheet.cell_value(book_row,4))
                            renUnit = str(bookings_sheet.cell_value(book_row,13))
                            renAgent = str(bookings_sheet.cell_value(book_row,12))
                            forced_rentals += [[booNum,renCODate,renCOLoc,renUnit,renAgent]]
                            break
    

    1) An all number Bnum or Fnum variable would either be a string eg '7123456' or a float 7123456.0 which wasn't being recognised as the same value. Converting to string by would simply make the float '7123456.0' again not the same. I solved this by:

    if type(Fnum) is float:
                    Fnum = str(Fnum)
                    Fnum = Fnum.replace('.0','')
    

    This converts a float to a string and removes the '.0'

    2) Next issue came when I realised that not all booking numbers (Bnum and Fnum variables) will include a /1. Rental 7123456 and rental 7123456/1 are the same but our report server (which generates the excel sheets) will use the two interchangeably meaning the forced sheet may have 7123456 and the booking sheet have 7123456/1. In order to compensate for this I added this:

    if Fnum[-2:] == '/1':
                    Fnum = Fnum.replace('/1','')
    

    This will look for any booking number ending with '/1' and remove it.