Search code examples
sqloraclerecursive-queryconnect-by

Hierarchical query with reference to previous row, not parent row


I have a table detailing a list of objects with the current state and the list of operations needed to do before being finished.

ProductId    CurrentWeight    RemainingRoute
001          50               M1-M7-M5
002          48               M3-M2-M9

I would like to convert this into a operation list table as follows

ProductId    CurrentWeight    Machine
001/1        50               M1
001/2        50               M7
001/3        50               M5
002/1        48               M3
002/2        48               M2
002/3        48               M9

That’s easy with this query:

SELECT ProductId || ‘/‘ || level, REGEXP_SUBSTR(RemainingRoute, ‘[^-]+’,1, LEVEL), CurrentWeight
FROM TABLE
CONNECT BY LEVEL <= REGEXP_COUNT(RemainingRoute, ‘-’) + 1
AND PRIOR ProductId = ProductId
AND PRIOR SYS_GUID() is not null;

Here’s the problem. I need to calculate the final weight of the product as it drops after each operation - there’s a yield factor which is different for each machine.

This means I need a hierarcical query where I should look at the weight of the previous row (not parent row) and multiply with the yield of the corresponding machine.

My current query is:

SELECT REGEXP_SUBSTR(RemainingRoute, ‘[^-]+’,1, LEVEL), CASE LEVEL WHEN 1 THEN CurrentWeight ELSE ( CASE REGEXP_SUBSTR(RemainingRoute, ‘[^-]+’,1, LEVEL) WHEN ‘M1’ THEN 0.95 * PRIOR WeightAfter WHEN ‘M7’  THEN 0.9 * PRIOR WeightAfter END ) END AS WeightAfter
FROM TABLE
CONNECT BY LEVEL <= REGEXP_COUNT(RemainingRoute, ‘-’) + 1
AND PRIOR ProductId = ProductId
AND PRIOR SYS_GUID() is not null;

There’s two problems really with this query (it doesn’t work) 1. I want to access the previous level value of a field rather than the parent level value -> PRIOR is not the right command 2. I’m using the field WeightAfter recursively as I’m referencing to it inside the WeightAfter variable definition - I know this is wrong but don’t know how to circumvent it.

Any advice is highly appreciated


Solution

  • You are probably looking for something like I show below. Note that there is no analytic (and aggregate) product function, I had to mock it up by taking logs, analytic sum, and exponential function.

    I don't see why you must combine the productid and the level (I called it "step") in a single string; of course you can do that if you wish, but I show the output the way I believe it should be. Also, in the output it is not clear what weight you must show (or, rather, WHY) - the weight before the product arrives at the machine, or after it is processed by that machine? I show both (as well as the original weight before any processing began) on every row; decide what you actually need for your report.

    I mocked up your inputs in the WITH clause, but of course you should use your actual tables (with their table and column names). I hope you have a table like the MACHINES view in my query. I used a left outer join just in case a machine is not actually shown in the MACHINES table, although that shouldn't be allowed. (Not sure how you can enforce that though, given your data model which is in direct violation of First Normal Form.) If a machine is not present in the MACHINES table, its yield factor is treated as 1.00 in the query. That happens with machine 'M9' in this example.

    with
      sample_inputs (productid, currentweight, remainingroute) as (
        select '001', 50, 'M1-M7-M5' from dual union all
        select '002', 48, 'M3-M2-M9' from dual
      )
    , machines (machine, yield_factor) as (
        select 'M1', 0.95 from dual union all
        select 'M7', 0.90 from dual union all
        select 'M3', 0.80 from dual union all
        select 'M4', 1.00 from dual union all
        select 'M6', 0.92 from dual union all
        select 'M2', 0.90 from dual union all
        select 'M5', 0.86 from dual
      )
    , routes (productid, step, currentweight, machine) as (
        select productid, level, currentweight,
               regexp_substr(remainingroute, '[^-]+', 1, level)
        from   sample_inputs
        connect by  level <= regexp_count(remainingroute, '[^-]+')
                and prior productid = productid
                and prior sys_guid() is not null
      )
    , weights (productid, step, original_weight, machine, weight_out) as (
        select r.productid, r.step, r.currentweight, r.machine,
               round(r.currentweight *
                       exp(sum(ln(m.yield_factor)) 
                         over (partition by r.productid order by r.step)), 2)
        from   routes r left outer join machines m on r.machine = m.machine
      )
    select productid, step, original_weight, machine,
           lag(weight_out, 1, original_weight)
             over (partition by productid order by step) as weight_in, weight_out
    from   weights
    order  by productid, step;
    

    Output:

    PRODUCTID  STEP ORIGINAL_WEIGHT MACHINE       WEIGHT_IN      WEIGHT_OUT
    ---------- ---- --------------- ------- --------------- ---------------
    001           1           50.00 M1                50.00           47.50
    001           2           50.00 M7                47.50           42.75
    001           3           50.00 M5                42.75           36.76
    002           1           48.00 M3                48.00           38.40
    002           2           48.00 M2                38.40           34.56
    002           3           48.00 M9                34.56           34.56