Search code examples
pythonexcelpandasopenpyxlsubtraction

python excel subtract with 2 worksheet


Is it possible to create a python script to automatic which is subtract cell value with 2 worksheet in one excel file? I have checked some documents, and seem that use the method of pandas or openpyxl to do so. But I can't to do that. Do you have any suggestion to me? Many thanks.

Script:

from datetime import datetime
import pandas as pd
import openpyxl as xl;

currDateTime = datetime.now()

Sheet1 ="C:\\Users\\peter\\Downloads\\" + currDateTime.strftime('%Y%m%d') + "\\5250A" + "\\5250A.xlsx"
wb3 = xl.load_workbook(Sheet1) 
ws3 = wb3.worksheets[0] 
wb4 = xl.load_workbook(Sheet1) 
ws4 = wb4.worksheets[1] 
wb5 = xl.load_workbook(Sheet1) 
ws5 = wb5.create_sheet("Done")

wb4.subtract(wb3)

wb5.save(str(Sheet1)) 

Expected Result:

Figure1 Figure2 Figure3


Solution

  • Do so in excel coule be way easier I think. There could be a smarter way to write this code.

    [NOTE] I just do the subsctraction cell by cell, so if there's any mismatch like same row but different dept.id or same col but different item will make errors. If you may meet this situation, you'll have a change some in the following code.

    import openpyxl as xl
    
    def get_row_values(worksheet):
        """
        return data structure:
        [
            [A1, B1, C1, ...],
            [A2, B2, C2, ...],
            ...
        ]
        """
        result = []
        for i in worksheet.rows:
            row_data = []
            for j in i:
                row_data.append(j.value)
            result.append(row_data)
        return result
    
    
    if __name__ == '__main__':
        # load excel file
        wb = xl.load_workbook('test1.xlsx')
        ws1 = wb.worksheets[0]
        ws2 = wb.worksheets[1]
    
        # get data from the first 2 worksheets
        ws1_rows = get_row_values(ws1)
        ws2_rows = get_row_values(ws2)
    
        # calculate and make a new sheet
        ws_new = wb.create_sheet('Done')
        # insert header
        ws_new.append(ws1_rows[0])
        for row in range(1, len(ws1_rows)):
            # do the substract cell by cell
            row_data = []
            for column, value in enumerate(ws1_rows[row]):
                if column == 0:
                    # insert first column
                    row_data.append(value)
                else:
                    if ws1_rows[row][0] == ws2_rows[row][0]:
                        # process only when first column match
                        row_data.append(value - ws2_rows[row][column])
            ws_new.append(row_data)
        wb.save('test2.xlsx')
    

    here's my sample excel file

    first sheet:

    enter image description here

    second sheet:

    enter image description here

    generated sheet:

    enter image description here