Search code examples
sqloracleautomationoracle12ctrending

Oracle 12C Adding Rows for Trending


Assume I have historical data that lists items which show defects each day. Items drop in and out of this list. I.E.-

TABLE NAME: ITEMS_WITH_DEFECTS

DAY        | ITEMID
-------------------
01-JAN-16  | A
01-JAN-16  | D
02-JAN-16  | B
02-JAN-16  | D
03-JAN-16  | A
03-JAN-16  | C
04-JAN-17  | A
04-JAN-17  | D

I'd like to build a data source that shows for each day relative to the previous day, how many items are new, have dropped, and have carried over. It'd like to do this without losing any information from my history data source, so my desired output is:

TABLE NAME: ITEM_DEFECT_TRENDS

DAY       | ITEMID | DEFECT | TREND
------------------------------------
01-JAN-16 | A      | y      | New
01-JAN-16 | B      | n      | (null)
01-JAN-16 | C      | n      | (null)
01-JAN-16 | D      | y      | New
02-JAN-16 | A      | n      | Dropped
02-JAN-16 | B      | y      | New
02-JAN-16 | C      | n      | (null)
02-JAN-16 | D      | y      | Carryover
03-JAN-16 | A      | y      | New
03-JAN-16 | B      | n      | Dropped
03-JAN-16 | C      | y      | New
03-JAN-16 | D      | n      | Dropped
04-JAN-16 | A      | y      | Carryover
04-JAN-16 | B      | n      | (null)
04-JAN-16 | C      | n      | Dropped
04-JAN-16 | D      | y      | New

I know how to produce the TREND column, but I don't know how to add the rows or DEFECT column to the output.

Is this something I can do with SQL or PL SQL?

The reason I'd like to model the data like this is because my database contains a large list of about 12,000 possible items, but only 500 or so items will show up with defects at a time. It's more memory efficient to scope this with only the items I have to consider versus all of the possible items.


Solution

  • Yes, I think you can do this with SQL:

    select d.day, i.itemid, 
           (case when id.itemid is not null then 'y' else 'n' end) as defect,
           (case when id.itemid is null and
                      lag(id.itemid) over (partition by i.itemid order by d.day) is null
                 then 'New'
                 when id.itemid is not null and
                      lag(id.itemid) over (partition by i.itemid order by d.day) is not null
                 then 'CarryOver'
                 when lag(id.itemid) over (partition by i.itemid order by d.day) is not null
                 then 'Dropped'
            end) as trend
    from (select distinct day from items_with_defects) d cross join
         (select distinct itemid from items_with_defects) i left join
         items_with_defects id
         on id.day = d.day and id.itemid = i.itemid;
    

    The idea is to generate all the rows using a cross join -- if you have other ways of getting the population of days and items that you want, then use them.

    Then left join the original data to check for matches. The defect column is easy. I think the trend column logic is correct, but the question does not fully explain it.