Search code examples
sqlexcelamazon-redshiftrecursive-queryanalytical

Translate excel formulas into Redshift SQL query


I am looking for help to translate excel formulas into Redshift SQL . With given fixed input values for rows 1 (Month ascending) and 2 (SU - some number) I have to calculate through particular cells to get value for D8.

enter image description here

Complexity is, that formulas reference values in recurring levels/previous column/earlier months

Let me provide calculations for one column, as an example let's take column D

D1 - Month
D2 - SU 
D3 - C7 
D4 - C4 + B7
D5 - D3 * 1.54 + D4
D6 - D2 - D5
D7 - D6 / 2.38
D8 - D3 + D4 + D7

Below queries to build and populate exemplary table and populate it with data.


  CREATE TABLE TESTOSS
   (    PERIOD_MONTH VARCHAR2(20 CHAR), 
    SU NUMBER
   );
   
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-09',503);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-10',1109);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-11',1537);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-12',2327);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-01',3000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-02',4000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-03',5000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-04',3292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-05',3721);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-06',4149);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-07',4578);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-08',5006);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-09',5435);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-10',5863);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-11',6292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-12',6720);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-02',7577);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-01',7149);

This one was once solved in: Translate excel formulas into SQL query using an ORACLE function MODEL, which is not present in Redshift. So far I didn't came up with the idea how to translate it properly.


Solution

  • This is a recursive problem so a recursive CTE is what you need.

    SET UP:

    CREATE TABLE testoss(    
       period_month VARCHAR(20), 
       su INT
    );
       
    INSERT INTO testoss (period_month, su) VALUES 
    ('2020-09',503),
    ('2020-10',1109),
    ('2020-11',1537),
    ('2020-12',2327),
    ('2021-01',3000),
    ('2021-02',4000),
    ('2021-03',5000),
    ('2021-04',3292),
    ('2021-05',3721),
    ('2021-06',4149),
    ('2021-07',4578),
    ('2021-08',5006),
    ('2021-09',5435),
    ('2021-10',5863),
    ('2021-11',6292),
    ('2021-12',6720),
    ('2022-02',7577),
    ('2022-01',7149);
    

    QUERY:

    WITH RECURSIVE build(period_month, su, t1, t2, rp_su, n_su, np, tpe, cur_rn, max_rn) AS (
       SELECT period_month, su, t1, t2, rp_su, n_su, np, tpe, cur_rn, max_rn
       from (SELECT period_month, su, 0::float AS t1, 0::float AS t2, 0::float AS rp_su, 
          su::float AS n_su, n_su / 2.38 AS np, np AS tpe, 
          ROW_NUMBER() OVER (ORDER BY PERIOD_MONTH) AS rn, 2::int as cur_rn, count(1) over() as max_rn
       FROM testoss ) where rn = 1
       UNION ALL 
       SELECT t.period_month, t.su, b.np as t1, b.t2 + b.t1 AS t2,
          b.np * 1.54 + b.t2 + b.t1 AS rp_su, t.su - (b.np * 1.54 + b.t2 + b.t1) AS n_su, 
          (t.su - (b.np * 1.54 + b.t2 + b.t1)) / 2.38 AS np, b.np + b.t2 + b.t1 + (t.su - (b.np * 1.54 + b.t2 + b.t1)) / 2.38 AS tpe, 
          b.cur_rn + 1 AS cur_rn, b.max_rn
       FROM build b 
       JOIN (SELECT period_month, su, lag(period_month) over(order by period_month) prev_period, 
          ROW_NUMBER() OVER (ORDER BY PERIOD_MONTH) AS rn from testoss) t 
       ON t.prev_period = b.period_month
       WHERE t.rn = b.cur_rn AND b.cur_rn <= b.max_rn) 
    SELECT period_month, su, t1, t2, rp_su, n_su, np, tpe FROM build order by period_month, cur_rn;
    

    RESULT (CSV)

    period_month,su,t1,t2,rp_su,n_su,np,tpe
    2020-09,503,0,0,0,503,211.34453781512607,211.34453781512607
    2020-10,1109,211.34453781512607,0,325.47058823529414,783.5294117647059,329.2140385565991,540.5585763717252
    2020-11,1537,329.2140385565991,211.34453781512607,718.3341571922888,818.6658428077112,343.9772448771896,884.5358212489148
    2020-12,2327,343.9772448771896,540.5585763717252,1070.2835334825972,1256.7164665174028,528.0321287888247,1412.5679500377396
    2021-01,3000,528.0321287888247,884.5358212489148,1697.7052995837048,1302.2947004162952,547.1826472337375,1959.750597271477
    2021-02,4000,547.1826472337375,1412.5679500377396,2255.229226777695,1744.7707732223048,733.0969635387836,2692.847560810261
    2021-03,5000,733.0969635387836,1959.750597271477,3088.719921121204,1911.2800788787958,803.058856671763,3495.9064174820237
    2021-04,3292,803.058856671763,2692.847560810261,3929.5582000847758,-637.5582000847758,-267.88159667427556,3228.0248208077483
    2021-05,3721,-267.88159667427556,3495.9064174820237,3083.3687586036394,637.6312413963606,267.91228630099187,3495.9371071087403
    2021-06,4149,267.91228630099187,3228.0248208077483,3640.6097417112755,508.39025828872445,213.60935222215315,3709.5464593308934
    2021-07,4578,213.60935222215315,3495.9371071087403,3824.895509530856,753.1044904691439,316.4304581803126,4025.976917511206
    2021-08,5006,316.4304581803126,3709.5464593308934,4196.849364928575,809.1506350714253,339.97925843337197,4365.956175944578
    2021-09,5435,339.97925843337197,4025.976917511206,4549.544975498599,885.4550245014007,372.0399262610927,4737.9961022056705
    2021-10,5863,372.0399262610927,4365.956175944578,4938.89766238666,924.1023376133398,388.2782931148487,5126.274395320519
    2021-11,6292,388.2782931148487,4737.9961022056705,5335.9446736025375,956.0553263974625,401.70391865439603,5527.978313974915
    2021-12,6720,401.70391865439603,5126.274395320519,5744.898430048289,975.1015699517111,409.7065419965173,5937.684855971433
    2022-01,7149,409.7065419965173,5527.978313974915,6158.926388649552,990.0736113504481,415.9973156934656,6353.682171664898
    2022-02,7577,415.9973156934656,5937.684855971433,6578.3207221393695,998.6792778606305,419.6131419582481,6773.295313623146
    

    EXPLANATION

    1. This recursive CTE plucks the first row off the test table and calculates the few values that don't need previous rows' data.
    2. Then this joined with the next row from the input table using the cur_rn column and row numbers of the source table. The join is done by using a lag() to get the previous period_month and joining this with the previously created results.
    3. Calculations only need t1 and t2 from the previous row which has been joined.
    4. Calculate values of current row and iterate.

    If you have questions just ask in a comment.