Search code examples
sqloracledatetimeanalytics

Finding runs of a particular value


I have a table in Oracle 10 that is defined like this:

LOCATION   HOUR              STATUS
--------------------------------------
10         12/10/09 5:00PM      1
10         12/10/09 6:00PM      1
10         12/10/09 7:00PM      2
10         12/10/09 8:00PM      1
10         12/10/09 9:00PM      3
10         12/10/09 10:00PM     3
10         12/10/09 11:00PM     3

This table continues for various locations and for a small number of status values. Each row covers one hour for one location. Data is collected from a particular location over the course of that hour, and processed in chunks. Sometimes the data is available, sometimes it isn't, and that information is encoded in the status. I am trying to find runs of a particular status, so that I could convert the above table into something like:

LOCATION   STATUS     START               END
-----------------------------------------------------------
10         1          12/10/09 5:00PM     12/10/09 7:00PM
10         2          12/10/09 7:00PM     12/10/09 8:00PM  
10         1          12/10/09 8:00PM     12/10/09 9:00PM
10         3          12/10/09 9:00PM     12/11/09 12:00AM 

Basically condensing the table into rows that define each stretch of a particular status. I have tried various tricks, like using lead/lag to figure out where starts and ends are and such, but all of them have met with failure. The only trick that works so far is going one by one through the values programatically, which is slow. Any ideas for doing it directly in Oracle? Thanks!


Solution

  • Here's an ANSI SQL solution:

    select      t1.location
    ,           t1.status
    ,           min(t1.hour)                                      AS "start" -- first of stretch of same status
    ,           coalesce(t2.hour, max(t1.hour) + INTERVAL '1' HOUR) AS "end"
    from        t_intervals t1             -- base table, this is what we are condensing
    left join   t_intervals t2             -- finding the first datetime after a stretch of t1
    on          t1.location = t2.location  -- demand same location
    and         t1.hour     < t2.hour      -- demand t1 before t2
    and         t1.status  != t2.status    -- demand different status 
    left join   t_intervals t3             -- finding rows not like t1, with hour between t1 and t2
    on          t1.location = t3.location
    and         t1.status  != t3.status
    and         t1.hour     < t3.hour
    and         t2.hour     > t3.hour
    where       t3.status is null          -- demand that t3 does not exist, in other words, t2 marks a status transition
    group by    t1.location                -- condense on location, status.
    ,           t1.status
    ,           t2.hour                    -- this pins the status transition
    order by    t1.location
    ,           t1.status
    ,           min(t1.hour)