Search code examples
sqloracle-databasegroup-bycountsql-timestamp

Oracle SQL: Count between irregular timestamps


I'm sorry if this question has been asked here before, but I can't seem to find it. I keep finding how to sum per hour, but my question is regarding SUM and COUNT between timestamps that are defined in another column.

I have one table called incoming_orders: It shows the intended destination, and the timestamp of the incoming order.

I have a second table called scheduled_output: It shows each scheduled output moment for each destination.

I have third table called outgoing_orders: It shows how the actual destination, and the timestamp of the outgoing order.

So, the data could be:

--Incoming_orders:
Destination  Timestamp
ROUTE B      14/03/2018 7:48:00 
ROUTE A      14/03/2018 7:58:00
ROUTE A      14/03/2018 12:48:00
ROUTE C      14/03/2018 13:28:00

--Scheduled_Output
ROUTE A      14/03/2018 8:00:00
ROUTE A      14/03/2018 11:00:00
ROUTE A      14/03/2018 12:00:00
ROUTE A      14/03/2018 17:00:00    
ROUTE B      14/03/2018 8:00:00
ROUTE B      14/03/2018 10:00:00
ROUTE B      14/03/2018 12:00:00
ROUTE C      14/03/2018 07:00:00 
ROUTE C      14/03/2018 14:00:00 
ROUTE C      14/03/2018 17:00:00 

--Which would lead to the following outgoing_orders:
ROUTE A      14/03/2018 8:00:00
ROUTE B      14/03/2018 8:00:00 
ROUTE C      14/03/2018 14:00:00
ROUTE A      14/03/2018 17:00:00

Now, I want to check that the incoming order of 07:58 to route A actually made it into the output cycle of 08:00 for route A. I was thinking of creating a table like this to show it:

Destination output moment   expected_output actual_output   diff
Route A     8:00            1               1               0
Route A     11:00           0               0               0
Route A     12:00           0               0               0
Route A     17:00           1               1               0

But the question is: How do I calculate that expected_output column? How do I group the incoming order to Route A of 12:48 to the 12:00-17:00 group? It should count all the orders between scheduled output moments, but I'm not sure how to accomplish that.

Could I perhaps CEIL, FLOOR or ROUND to the closest scheduled_output value? Or can I somehow do a BETWEEN row n and n+1 with a rowcount? Or is there another, easier way?


Solution

  • I think it's easiest to determine the previous time of scheduled output, obtaining time intervals, more or less in this way:

    SELECT destination,
           time_stamp,
           ( SELECT max( time_stamp ) 
             FROM SCHEDULED_OUTPUT t1
             WHERE t.destination = t1.destination
               AND t1.time_stamp < t.time_stamp
            ) as previous_time_stamp
    FROM SCHEDULED_OUTPUT t
    order by 1,2
    

    or in a more compact form using analytic funtion:

    SELECT destination,
           time_stamp,
           lag( time_stamp ) over (partition by destination order by time_stamp )
           as previous_time_stamp
    FROM SCHEDULED_OUTPUT t
    order by 1,2
    

    demo: http://sqlfiddle.com/#!4/c7bc9/1

    | DESTINATION |            TIME_STAMP |   PREVIOUS_TIME_STAMP |
    |-------------|-----------------------|-----------------------|
    |     ROUTE A | 2018-03-14 08:00:00.0 |                (null) |
    |     ROUTE A | 2018-03-14 11:00:00.0 | 2018-03-14 08:00:00.0 |
    |     ROUTE A | 2018-03-14 12:00:00.0 | 2018-03-14 11:00:00.0 |
    |     ROUTE A | 2018-03-14 17:00:00.0 | 2018-03-14 12:00:00.0 |
    |     ROUTE B | 2018-03-14 08:00:00.0 |                (null) |
    |     ROUTE B | 2018-03-14 10:00:00.0 | 2018-03-14 08:00:00.0 |
    |     ROUTE B | 2018-03-14 12:00:00.0 | 2018-03-14 10:00:00.0 |
    |     ROUTE C | 2018-03-14 07:00:00.0 |                (null) |
    |     ROUTE C | 2018-03-14 14:00:00.0 | 2018-03-14 07:00:00.0 |
    |     ROUTE C | 2018-03-14 17:00:00.0 | 2018-03-14 14:00:00.0 |
    

    and next the above resultset can be joined to INCOMING_ORDERS in order to calculate counts:

    SELECT x.destination, x.time_stamp as output_moment,
           count( y.DESTINATION ) as expected_output 
    FROM (
       SELECT destination,
              time_stamp,
              lag( time_stamp ) over (partition by destination order by time_stamp )
              as previous_time_stamp
       FROM SCHEDULED_OUTPUT t
    ) x
    LEFT JOIN INCOMING_ORDERS y
    ON x.DESTINATION =  y.DESTINATION
    AND y.TIME_STAMP <= x.TIME_STAMP
    AND ( y.TIME_STAMP > x.previous_time_stamp OR x.previous_time_stamp IS NULL )
    GROUP BY x.destination, x.time_stamp
    ORDER BY 1,2
    

    Demo: http://sqlfiddle.com/#!4/c3958/2

    | DESTINATION |         OUTPUT_MOMENT | EXPECTED_OUTPUT |
    |-------------|-----------------------|-----------------|
    |     ROUTE A | 2018-03-14 08:00:00.0 |               1 |
    |     ROUTE A | 2018-03-14 11:00:00.0 |               0 |
    |     ROUTE A | 2018-03-14 12:00:00.0 |               0 |
    |     ROUTE A | 2018-03-14 17:00:00.0 |               1 |
    |     ROUTE B | 2018-03-14 08:00:00.0 |               1 |
    |     ROUTE B | 2018-03-14 10:00:00.0 |               0 |
    |     ROUTE B | 2018-03-14 12:00:00.0 |               0 |
    |     ROUTE C | 2018-03-14 07:00:00.0 |               0 |
    |     ROUTE C | 2018-03-14 14:00:00.0 |               1 |
    |     ROUTE C | 2018-03-14 17:00:00.0 |               0 |
    

    This condition:

    AND y.TIME_STAMP <= x.TIME_STAMP
    AND ( y.TIME_STAMP > x.previous_time_stamp OR x.previous_time_stamp IS NULL )
    

    tells that if an order is placed at, say 8:00:00 and the route starts at the same time 8:00:00, then this order is still assigned to this "starting" route. If this is not possible (that is - the order must be assigned to the next route when it is placed at the exact time when the route starts), then change the condition to:

    AND y.TIME_STAMP < x.TIME_STAMP
    AND ( y.TIME_STAMP >= x.previous_time_stamp OR x.previous_time_stamp IS NULL )