Here's rather nebulous question.
I have a list of start/stop times from script executions, which may include nested script calls.
| script | start | stop | duration | time executing |
| ------ | ----- | ---- | -------------- | ----------------------------------- |
| A | 1 | 8 | 7 i.e. (8-1) | 3 i.e. ((8-1) - (6-2) - (5-4)) |
| ->B | 2 | 6 | 4 i.e. (6-2) | 3 i.e. ((6-2) - (5-4)) |
| ->->C | 4 | 5 | 1 i.e. (5-4) | 1 |
| D | 9 | 10 | 1 i.e. (10-9) | 1 |
| E | 11 | 12 | 1 i.e. (12-11) | 1 |
| F | 9 | 16 | 7 i.e. (16-9) | 5 i.e. ((16-9) - (14-13) - (16-15)) |
| ->G | 13 | 14 | 4 i.e. (14-13) | 1 i.e. (14-13) |
| ->H | 15 | 16 | 1 i.e. (15-14) | 1 i.e. (16-15) |
Duration is the total time spent in a script. Time executing is the time spent in the script, but not in subscript.
So A calls B and B calls C. C takes 1 tick, B takes 4 but time executing is just 3, and A takes 7 ticks, but time executing is 3. F calls G and then H, so takes 7 ticks but time executing is only 5.
What I'm trying to wrap my ('flu-ridden) head around is a pseudo-code algorithm for step-wise or recursing through the list of times in order to generate the time executing value for each row.
Any help for this problem (or cure for common cold) gratefully received. :-)
If all time points are distinct, then script execution timespans are related to each other by an ordered tree: Given any pair of script execution timespans, either one strictly contains the other, or they don't overlap at all. This enables an easy recovery of parent-child relationships, if you wanted to do that.
But if you just care about execution times, we don't even need that! :) There's a pretty simple algorithm that just sorts the starting and ending times and walks through the resulting array of "events", maintaining a stack of open "frames":
(time, scriptID)
pairs, and insert the start time and end time of each script into it (i.e., insert two pairs per script into the same array).(0, 0, 0)
entry on it. (This is just a dummy entry to simplify later code.) Also create an array seen[]
with a boolean flag per script ID, all initially set to false
.(time, scriptID)
pairs:
(time, scriptID)
pair for a script ID that you have not seen before, that script is starting.
seen[scriptID] = true
.(time, scriptID, 0)
onto the stack. The final component, initially 0, will be used to accumulate the total duration spent in this script's "descendant" scripts.seen[scriptID] == true
), that script is ending.
(time, scriptID, descendantDuration)
triple from the stack (note that the scriptID in this triple should match the scriptID in the pair at the current index of the array; if not, then somehow you have "intersecting" script timespans that could not correspond to any sequence of nested script runs).time - startTime[scriptID]
.duration - descendantDuration
.duration
to the new top-of-stack's descendantDuration
(i.e., third) field.That's all! For n script executions this will take O(n log n) time, because the sorting step takes that long (iterating over the array and performing the stack operations take just O(n)). Space usage is O(n).