Search code examples
excelexcel-formulaopenpyxl

using Openpyxl and Excel365 have array formula applied to single cell (no spillover) using ws.cell


I have a workbook that has a tab with raw data in it, and another tab that is doing calculations.
Sheet1 = Calculations.
Sheet2 = Raw data.

To help understand the data a little better, it is stock data with tickers listed vertically, and yearly financial metrics horizontally. EPS, Earnings, Dividends etc.
There are 10 entries per year and I am tracking per year for example 1990-2024.
The year 1990 starts on row L and ends on MM.

I am trying to calculate things like Avg Growth Rate for several of the values, I have the formula but I see that when using Openpyxl I am getting the dreaded @ sign of death.
If I look in the raw XML, I don't see that info interestingly. It does state that it is an Array formula, and it gives a single cell for the Ref value.

For simplicity sake, I am going to provide an example of a formula that does not need CSE simply because those are tremendously long like 500 chars or so, but the layout of my code is the same.
If someone really wants it, I can post the formula.

import openpyxl
import pandas as pd
import os
import time

def main():
# storing xl path, xl filenames, and tickers as globals for easy handling from def to def.
  xlsx_path   = 'D:\\Stock_Stuff\\Valuation\\tmp\\data_files\\xlsx_files\\'
  valuation_path = 'D:\\Stock_Stuff\\Valuation\\tmp\\data_files\\'
  XlsxFiles   = find_xlsx_files(xlsx_path)
  Tickers     = get_stock_ticker(XlsxFiles)

  fill_sheet1(Tickers,valuation_path)

def fill_sheet1(tkrs,valuationPath):
  # generate path plus files for workbook.
  files = 'Valuation_template.xlsx'
  myfile = valuationPath + files
  wb =openpyxl.load_workbook(myfile)
  ws = wb['Sheet1']
  start_row = 10
  df = pd.DataFrame(tkrs,columns=["Symbols"])

  valfields = { "Symbol" : 1, "Company": 2, "Sector" : 3, "Div Streak" : 4, "P/E" : 5, "Div Yield" : 6, "Annual Div" : 7, "Co Size" : 8
               , "Price" : 9, "Pcnt of 52wk Hi-Low" : 10, "Earnings Date" : 11
              }

  for myRows in ws.iter_rows(min_row=10):
     myindex = myRows[0].row
     ws.cell(row=myindex,column=valfields['Earnings Date'],value=f"""=INDEX('Sheet2'!I$7:I$400,MATCH($A{myindex},'Sheet2'!$A$7:$A$400,0))""")

I have seen examples of how to do this when it is just a single cell such as ws["A1"]=ArrayFormula() but I am not sure how I can do that when I am assigning to 1000 cells.
Therefore in order to do this, I really need to be able to do this using the Row,Column method. Or maybe, I am just misunderstanding how to use this as I could theoretically drag the formula from cell 10 to cell cell 150 or whatever.
I have seen the spillover method in Excel, however, I am not super fond of that as you don't get to see the formula in all of the cells so that if you want to check your math, it is really hard to do so. Finding/fixing mistakes is just harder.
When Excel adds all of the @ signs the formula breaks, however if I do a search and replace of @ for "" then the formulas work again.

The result I get on the formulas is either #Div/0 or #N/A depending on the source data. I am suspecting this is based on if there are any blank cells. Div/0 for blank cells and NA otherwise.

calculated data:
enter image description here

raw data:
the numbers on top are offsets from 1990:
enter image description here


Solution

  • OK so what you may be needing to do is the following code example.

    import openpyxl
    from openpyxl.worksheet.formula import ArrayFormula
    
    wb = openpyxl.load_workbook('foo.xlsx')
    ws = wb['Calculations']
    
    start_row = 2
    
    # Loop through the cells in column E
    for myRows in ws['E']:
        myindex = myRows.row
        # Skip header row
        if myindex < start_row:
            continue
    
        # Formula created for each row in the column
        formula = f"=(INDEX(Sheet2!L{myindex}:Sheet2!MM{myindex},IF(MOD(MATCH(2,1/(Sheet2!L{myindex}:Sheet2!MM{myindex}<>\"\")),10)=0,MATCH(2,1/(Sheet2!L{myindex}:Sheet2!MM{myindex}<>\"\"))-9,\"need to do something else\"))/INDEX(Sheet2!L{myindex}:Sheet2!MM{myindex},IF(MOD(MATCH(2,1/(Sheet2!L{myindex}:Sheet2!MM{myindex}<>\"\")),10)=0, MATCH(2,1/(Sheet2!L{myindex}:Sheet2!MM{myindex}<>\"\"))-9-(3*10),\"need to do something else\" )))^(1/3)-1"
    
        # Add to sheet as an Array Formula
        ws[myRows.coordinate].value = ArrayFormula(ref=myRows.coordinate, text=formula)
    
    wb.save('foo_out.xlsx')
    

    Example sheet using Column E
    I removed the '=' from the first two cells E2 and E3 so you can see (part of) the formulas and whole formula in E4 is shown. Example Output