I have a series of transactions. I need to summarize the total time spent working on these transactions and a count of transactions within groups that break when you move from one group to another. This means that the group can repeat if there is a set of transactions for that group for multiple time slots. I am using an oracle database and SQL to do this summarization. I'm looking for some help writing the SQL to summarize this data. I've tried many different queries using various analytical and grouping functions.
Source Data:
User ID | Area | Start Time | End Time | Transaction ID
jsmith 24 07-JUL-14 09.06.21.000000000 AM 07-JUL-14 09.14.51.000000000 AM 63665020
jsmith 24 07-JUL-14 11.06.01.000000000 AM 07-JUL-14 11.07.10.000000000 AM 63679145
jsmith 24 07-JUL-14 11.16.36.000000000 AM 07-JUL-14 11.16.51.000000000 AM 63680441
jsmith 21 07-JUL-14 11.33.05.000000000 AM 07-JUL-14 11.33.18.000000000 AM 63683133
jsmith 21 07-JUL-14 11.48.55.000000000 AM 07-JUL-14 11.51.03.000000000 AM 63685906
jsmith 21 07-JUL-14 11.55.48.000000000 AM 07-JUL-14 11.56.01.000000000 AM 63686651
jsmith 24 07-JUL-14 12.32.51.000000000 PM 07-JUL-14 12.32.55.000000000 PM 63690943
jsmith 24 07-JUL-14 12.32.55.000000000 PM 07-JUL-14 12.33.02.000000000 PM 63690956
jsmith 21 07-JUL-14 12.42.05.000000000 PM 07-JUL-14 12.42.11.000000000 PM 63692212
jsmith 21 07-JUL-14 12.44.38.000000000 PM 07-JUL-14 12.44.43.000000000 PM 63692600
jsmith 25 07-JUL-14 01.01.18.000000000 PM 07-JUL-14 01.01.52.000000000 PM 63694808
jsmith 25 07-JUL-14 01.03.43.000000000 PM 07-JUL-14 01.04.44.000000000 PM 63695353
jsmith 25 07-JUL-14 01.12.30.000000000 PM 07-JUL-14 01.14.06.000000000 PM 63697002
jsmith 24 07-JUL-14 01.14.11.000000000 PM 07-JUL-14 01.14.23.000000000 PM 63697045
jsmith 24 07-JUL-14 01.21.41.000000000 PM 07-JUL-14 01.23.46.000000000 PM 63699080
Summary Data Desired: (mock up)
User ID | Area | Start Time | End Time | Total Minutes | Total Transactions
jsmith 24 9:06am 11:16am 110 3
jsmith 21 11:33am 11:56am 23 3
jsmith 24 .....
jsmith 21 .....
jsmith 25 .....
jsmith 24 .....
To resolve such issues, you need to generate some group number for each consecutive rows. Here I first use LAG
to generate a tick mark each time we start a new group. An outer query using SUM
will count the number of tick marks from the first from to the current one to generate a group number:
SELECT "Area",
MIN("Start Time") as "Start Time",
MAX("End Time") as "End Time",
SUM("End Time" - "Start Time")*24*60 as "Total Minutes",
COUNT("Transaction ID") as "#Transaction ID"
FROM (
SELECT SUM(clk)
OVER (ORDER BY "Start Time"
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) grp,
-- ^^^^^^^^^^^^^
-- Generate a group number by summing the tick marks
V1.*
FROM (
SELECT CASE
WHEN LAG("Area", 1, NULL) OVER (ORDER BY "Start Time") = "Area"
THEN 0
ELSE 1
END clk,
-- ^^^^^^^^
-- Set a tick mark ("clock") to 1 each time we change group
T.*
FROM T
) V1
) V2
GROUP BY GRP, "Area"
ORDER BY "Start Time"
See http://sqlfiddle.com/#!4/93f05/2
A little bit more difficult to grasp, but this works too:
SELECT "Area",
MIN("Start Time") as "Start Time",
MAX("End Time") as "End Time",
SUM("End Time" - "Start Time")*60 as "Total Minutes",
COUNT("Transaction ID") as "#Transaction ID"
FROM (
SELECT ROWNUM-ROW_NUMBER()
OVER (PARTITION BY "Area" ORDER BY "Start Time") grp,
T.*
FROM T
ORDER BY "Start Time"
) V
GROUP BY GRP, "Area"
ORDER BY "Start Time"
See http://sqlfiddle.com/#!4/07d43/6
In that case, beware that you can have duplicate grp
-- but only for different group. Unless the rows are contiguous. Please refer to the comments bellow for a discussion about that.