Tags: python, pandas, numpy, loops, matrix-multiplication

How to add another iterator to a nested loop in Python without an additional loop?


I am trying to add a date to my nested loop without creating another loop. End is my list of dates, and len(End) is equal to the number of years.

Alternatively, I could add the date to the dataframe (data1). Would that be a better solution?

Data-Sample

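import numpy as np
import pandas as pd
from numpy.linalg import multi_dot  # used in the calculation below

state_list = ['A','B','C','D','E'] #possible states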

data1 = pd.DataFrame({"cust_id": ['x111','x112'], #customer data
                    "state": ['B','E'],
                    "amount": [1000,500],
                    "year":[3,2],
                    "group":[10,10],
                    "loan_rate":[0.12,0.13]})

data1['state'] = pd.Categorical(data1['state'], 
                                        categories=state_list, 
                                        ordered=True).codes


lookup1 = pd.DataFrame({'year': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                    'lim %': [0.1, 0.1, 0.1, 0.1, 0.1,0.1, 0.1, 0.1, 0.1, 0.1]}).set_index(['year'])

matrix_data = np.arange(250).reshape(10,5,5) #3d matrix by state(A-E) and year(1-10)

end = pd.Timestamp(year=2021, month=9, day=1)    # creating a list of dates
df = pd.DataFrame({"End": pd.date_range(end, periods=10, freq="M")})
df['End']=df['End'].dt.day
End=df.values

Calculation

results={}
for cust_id, state, amount, start, group, loan_rate in data1.itertuples(name=None, index=False):
    res = [amount * matrix_data[start-1, state, :]]
    for year in range(start+1, len(matrix_data)+1,):
        res.append(lookup1.loc[year].iat[0] * np.array(res[-1]))
        res.append(res[-1] * (loan_rate)) # *(End/365) # I want to iterate here
        res.append(res[-1]+ 100)
        res.append(multi_dot([res[-1],matrix_data[year-1]]))
        results[cust_id] = res

Example of expected output:

{'x111': [array([55000, 56000, 57000, 58000, 59000]),
  array([5500., 5600., 5700., 5800., 5900.]),
  array([56.055, 57.074, 58.093, 59.112, 60.132]),

Line 3, calculation example: (5500 * 0.12) * (31/365) ≈ 56.055

The entire previous line, array([5500., 5600., 5700., 5800., 5900.]), is multiplied by loan_rate and by (day/365), where day is the End value for that year (31 in this example).


Solution

  • If I understand correctly, the question boils down to this:

    There is

    • an array of values, e.g.
    vals = np.array([5500., 5600., 5700., 5800., 5900.])  # == res[-1] from the example
    
    • a constant loan rate, e.g.
    loan_rate = 0.12  # ... to continue with the example above
    
    • and the goal is to perform the calculation
    value * loan_rate * (end_date/365)
    

    for each value, where end_date comes from the array End (array([[30], [31], ..., [30]])).

    • So for example if year == 1:
      • 5500 * 0.12 * 30/365
      • 5600 * 0.12 * 30/365
      • 5700 * 0.12 * 30/365
      • ...
    • if year == 2, the end_date is 31, and so on
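
The per-year calculation listed above can be sketched in isolation with NumPy broadcasting. This is a minimal sketch: the helper name interest_for_year is hypothetical (it is not part of the question's code), and end_dates is hard-coded to the day-of-month values the End array contains.

```python
import numpy as np

vals = np.array([5500., 5600., 5700., 5800., 5900.])  # res[-1] from the example
loan_rate = 0.12
# Assumption: the day-of-month values held in End, flattened to 1-D
end_dates = np.array([30, 31, 30, 31, 31, 28, 31, 30, 31, 30])

def interest_for_year(vals, loan_rate, year):
    """Hypothetical helper: scale every value by loan_rate and end_date/365.

    Years are counted from 1, so the end date sits at index year - 1.
    """
    return vals * loan_rate * end_dates[year - 1] / 365

year1 = interest_for_year(vals, loan_rate, 1)  # uses end date 30
year2 = interest_for_year(vals, loan_rate, 2)  # uses end date 31
```

With an end date of 31, the result rounds to 56.055, 57.074, 58.093, 59.112, 60.132, which matches the third array in the question's expected output.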

    So you can simply look up the end date by year inside the existing loop. Note the use of year - 1: end_dates is 0-indexed, whereas the years appear to be counted from 1.

    end_dates = End.reshape(-1)  # array([30, 31, 30, 31, 31, 28, 31, 30, 31, 30]); just to simplify access to the end date values
    results={}
    for cust_id, state, amount, start, group, loan_rate in data1.itertuples(name=None, index=False):
        res = [amount * matrix_data[start-1, state, :]]
        for year in range(start+1, len(matrix_data)+1,):
            res.append(lookup1.loc[year].iat[0] * np.array(res[-1]))
            res.append(res[-1] * loan_rate * end_dates[year-1]/365) # year - 1 here
            res.append(res[-1]+ 100)
            res.append(np.linalg.multi_dot([res[-1],matrix_data[year-1]]))
        results[cust_id] = res  # no need to store the results of a customer multiple times; see comment below question
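
For convenience, here is a self-contained sketch that combines the question's sample data with the loop above so it can be run end to end. One substitution is an assumption on my part: end_dates is built with days_in_month on month-start dates rather than the question's month-end date_range(...).dt.day. Both give 30, 31, 30, ... for the ten months starting September 2021.

```python
import numpy as np
import pandas as pd

state_list = ['A', 'B', 'C', 'D', 'E']  # possible states
data1 = pd.DataFrame({"cust_id": ['x111', 'x112'],
                      "state": ['B', 'E'],
                      "amount": [1000, 500],
                      "year": [3, 2],
                      "group": [10, 10],
                      "loan_rate": [0.12, 0.13]})
data1['state'] = pd.Categorical(data1['state'], categories=state_list,
                                ordered=True).codes

lookup1 = pd.DataFrame({'year': range(1, 11),
                        'lim %': [0.1] * 10}).set_index('year')
matrix_data = np.arange(250).reshape(10, 5, 5)  # year x state x state

# Assumption: days_in_month replaces the question's month-end .dt.day lookup
end_dates = pd.date_range("2021-09-01", periods=10,
                          freq="MS").days_in_month.to_numpy()

results = {}
for cust_id, state, amount, start, group, loan_rate in data1.itertuples(name=None, index=False):
    res = [amount * matrix_data[start - 1, state, :]]
    for year in range(start + 1, len(matrix_data) + 1):
        res.append(lookup1.loc[year].iat[0] * res[-1])
        res.append(res[-1] * loan_rate * end_dates[year - 1] / 365)  # year - 1: 0-indexed
        res.append(res[-1] + 100)
        res.append(np.linalg.multi_dot([res[-1], matrix_data[year - 1]]))
    results[cust_id] = res  # stored once per customer

first_three = results['x111'][:3]  # the three arrays shown in the expected output
```

For customer x111 the loop starts at year 4, so the end date used in the third array is end_dates[3], which is 31.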