Search code examples
pythonexcelpandasopenpyxlxlrd

How to subtract cell values from one column with cell values from another column in xlsx files using python


I want to subtract the cell values from one column with cell values from another column and write the sum to a new column in an excel file. Then I want the sum, if not equal to 0, to be added to a list for later use. The data in my excel file are structured like this:

Name | Number | Name1 | Number1
Name2 | Number2 | Name3 | Number3
....
Namex | Numberx | Namey |Numbery

I want to subtract the numbers from each other and then add the sum to a new column like this:

Name| Number | Name1 | Number1 | Sum of (Number - Number1)

I have tried to use openpyxl to do this, but I am really confused because the docs are so different from earlier versions of Python to newer. I am working in Python 3.4. I am happy to get suggestions on which module you would recommend me using. The code I have so far is giving me errors because I am calling the excelfile as a generator, not a subscriptable. I am not sure how to search and read an excelfile and at the same time make it subscriptable so that it is possible to write to it. Could anyone please help me?

Here is my code:

from openpyxl import Workbook, load_workbook

def analyzexlsx(filepath):
    numbers = []
    excel_input = load_workbook(filepath)
    filepath = [pth for pth in Path.cwd().iterdir()
                  if pth.suffix == '.xlsx'] #Want to iterate through several excel files in a folder.
    ws = excel_input.active
    cols = tuple(ws.columns)
    col_b = cols[1] 
    col_e = cols[4] 
    for j, k in zip(col_e, col_b): 
        if None:
            print('None')
        equally = (int(j.value) - int(k.value)) #line 13, error. Trying to subtract column cell values.
        if equally != 0: #If the columns sum is not equal to 0, it is to be added to the numbers list.
            numbers.append(j.row)

        else:
            pass

    col1 = []
    col2 = []
    col4 = []
    col5 = []
    col7 = []
    col8 = []

    mainlist = []
    try:
        for row in numbers:
            col1.append(str(ws.cell(row=row, column=1).value))
            col2.append(str(ws.cell(row=row, column=2).value))
            col4.append(ws.cell(row=row, column=4).value)
            col5.append(ws.cell(row=row, column=5).value)
            col7.append(ws.cell(row=row, column=7).value)
            col8.append(ws.cell(row=row, column=8).value)
    finally:
        for i, j, k, l, m, n in zip(col1, col2, col4, col5, col7, col8):
            mainlist.append(i + ", " + j + ", " + k + ", " + l + ", " + m + ", " + n)
    return mainlist

Traceback (most recent call last):
    Line 13, in analyzexlsx
        equally = (int(j.value) - int(k.value))
    TypeError: int() argument must be a string or a number, not 'NoneType

I would be really happy for answers as I have worked on this for quite a while and now I am stuck. I am fairly new to Python.


Solution

  • First create DataFrame from excel by read_excel.

    Then need substract 2. with 4 columns:

    df = pd.read_excel('file.xlsx')
    
    #select by column name
    df['E'] = df['B'] - df['D']
    

    #select by positions, but python count from 0 so for 2. column need 1
    df['E'] = df.iloc[:, 1] - df.iloc[:, 3]
    

    Maybe also help check documentation.