We have 2 excel files one having 7.5k records and other having 7k records. We need to compare the data by keeping one specific column as fixed from one sheet to compare with another sheet.
For Example sheet1:
**Emp_ID| Name| Phone| Address**
-------------------------------------
1 | A | 123 | ABC
-------------------------------------
2 | B | 456 | CBD
-------------------------------------
3 | C | 789 | S
For Example sheet2:
**Emp_ID| Name| Phone| Address**
-------------------------------------
1 | A | 123 | ABC
-------------------------------------
3 | C | 789 | S
Python Comparison should on the basis of Emp_ID and Emp_ID=2 should be output as missing when passing the argument as Emp_ID while executing the python script. I am trying the same by using XLRD module, but its comparing only cell by cell instead of freezing one column and then comparing the row with other excel file.
def compareexcel(oldSheet, newSheet):
rowb2 = xlrd.open_workbook(oldSheet)
rowb1 = xlrd.open_workbook(newSheet)
sheet1 = rowb1.sheet_by_index(0)
sheet2 = rowb2.sheet_by_index(0)
for rownum in range(max(sheet1.nrows, sheet2.nrows)):
if rownum < sheet1.nrows:
row_rb1 = sheet1.row_values(rownum)
row_rb2 = sheet2.row_values(rownum)
for colnum, (c1, c2) in enumerate(izip_longest(row_rb1, row_rb2)):
if c1 != c2:
print "Row {} Col {} - {} != {}".format(rownum+1, colnum+1, c1, c2)
I have written one function to search for a column value into another sheet and on the basis of that comparison would take place in compare function
def search(sheet2 , s):
for row in range(sheet2.nrows):`enter code here`
if s == sheet2.cell(row,0).value:
return (row,0)
return (9,9)
def compare(oldPerPaxSheet,newPerPaxSheet):
rb1 = xlrd.open_workbook(oldPerPaxSheet)
rb2 = xlrd.open_workbook(newPerPaxSheet)
sheet1 = rb1.sheet_by_index(0)
sheet2 = rb2.sheet_by_index(0)
for rownum in range(max(self.sheet1.nrows, self.sheet2.nrows)):
if rownum < sheet1.nrows:
row_rb1 = sheet1.row_values(rownum)
print ("row_rb1 : "), row_rb1
search_str = sheet1.cell(rownum,0).value
r,c = search(sheet2,search_str)
if (c != 9):
row_rb2 = sheet2.row_values(r)
for colnum, (c1, c2) in enumerate(izip_longest(row_rb1, row_rb2)):
if c1 != c2:
print "Row {} Col {} - {} != {}".format(rownum+1, colnum+1, c1, c2)
else:
print ("ROw does not exists in the other sheet")
pass
else:
print ("Row {} missing").format(rownum+1)