Search code examples
pythonexcelpython-3.xxlwingsholtwinters

Looping through rows from a spreadsheet, running a function through each row and outputting the result back to the end of the same row


This is a tough one, but I've been stuck for 2 weeks and I would appreciate if someone could help me on this. Basically, I've got a spreadsheet where the first row is like this (I was not able to paste the spreadsheet here and keep it formatted in an understandable way): A1=Material, B1=Jan/15, C1=Feb/15, ..., AW=Dec/18. The material list (A column) goes all the way from A2 to A6442 and each line has a part number. From B2:B6442 each line represents a quantity for each part. So, row B2:AW2 would be the consumption for the part on B1 from jan/15 to dec/18.

Considering the above, what I want to do is loop through every single row, apply a def (triple_exponential_smoothing) and return the last 6 numbers from the series back to Excel, on cells AR to AW (ex. for the 2nd row, AR2:AW2). I would use the first 3.5 years (B2:AQ2) as base for calculation for the remaining 6 months of the year (AR2:AW2). When I run it with a defined range (as per below), it works:

series = xw.Range((2,2),(2, 37)).value 

When I run a loop instead I cannot even get the output from the function, let alone write it back to Excel. My code so far is the below:

import os
import xlwings as xw

#Defining folder
os.chdir('G:\...\Reports')

#importing data
wb = xw.Book('sheet.xlsx')
sht = wb.sheets['sheet']
series = [sht.range((i,2),(i, 37)).value for i in range(2, 6443)]

# Holt Winters formula

def initial_trend(series, slen):
     sum = 0.0
     for i in range(slen):
          sum += float(series[i+slen] - series[i]) / slen
    return sum / slen

def initial_seasonal_components(series, slen):
     seasonals = {}
     season_averages = []
    n_seasons = int(len(series)/slen)
    # compute season averages
    for j in range(n_seasons):
         season_averages.append(sum(series[slen*j:slen*j+slen])/float(slen))
# compute initial values
for i in range(slen):
    sum_of_vals_over_avg = 0.0
    for j in range(n_seasons):
        sum_of_vals_over_avg += series[slen*j+i]-season_averages[j]
    seasonals[i] = sum_of_vals_over_avg/n_seasons
return seasonals

def triple_exponential_smoothing(series, slen, alpha, beta, gamma, n_preds):
    result = []
    seasonals = initial_seasonal_components(series, slen)
    for i in range(len(series)+n_preds):
        if i == 0: # initial values
             smooth = series[0]
             trend = initial_trend(series, slen)
             result.append(series[0])
             continue
        if i >= len(series): # we are forecasting
             m = i - len(series) + 1
             result.append((smooth + m*trend) + seasonals[i%slen])
        else:
            val = series[i]
            last_smooth, smooth = smooth, alpha*(val-seasonals[i%slen]) + (1-alpha)*(smooth+trend)
            trend = beta * (smooth-last_smooth) + (1-beta)*trend
            seasonals[i%slen] = gamma*(val-smooth) + (1-gamma)*seasonals[i%slen]
            result.append(smooth+trend+seasonals[i%slen])
    return result

#printing results for the function looped through all rows    

print(triple_exponential_smoothing(series, 12, 0.96970912, 0.07133329, 0, 12))

Am I missing something? I am open to other ways of doing it, as long as I can do all the rows at once.

Thank you all in advance.


Solution

  • The simplest way to do this would be to create a user defined function (UDF) that worked on one row, you could then copy that down as far as required.

    For better performance you could read the whole data range into Python, loop through each row, writing the results to a list of lists or a Numpy array, then write all the results back to an Excel range in a single operation. That could also conveniently be written as a UDF.