
Adding artificial breakpoints


I have the following time series:

Lines <- "Hour,PF
0,14/01/2015 00:00,0.305
1,14/01/2015 01:00,0.306
2,14/01/2015 02:00,0.307
3,14/01/2015 03:00,0.3081
4,14/01/2015 04:00,0.3091
5,14/01/2015 05:00,0.3101
6,14/01/2015 06:00,0.3111
7,14/01/2015 07:00,0.3122
8,14/01/2015 08:00,0.455
9,14/01/2015 09:00,0.7103
10,14/01/2015 10:00,0.9656
11,14/01/2015 11:00,1
12,14/01/2015 12:00,0.9738
13,14/01/2015 13:00,0.9476
14,14/01/2015 14:00,0.9213
15,14/01/2015 15:00,0.8951
16,14/01/2015 16:00,0.8689
17,14/01/2015 17:00,0.8427
18,14/01/2015 18:00,0.6956
19,14/01/2015 19:00,0.6006
20,14/01/2015 20:00,0.5056
21,14/01/2015 21:00,0.4106
22,14/01/2015 22:00,0.3157
23,14/01/2015 23:00,0.3157"

library(zoo)
library(strucchange)

z <- read.zoo(text = Lines, tz = "", format = "%d/%m/%Y %H:%M", sep = ",")

bp <- breakpoints(z ~ 1, h = 2)

plot(z)
abline(v = time(z)[bp$breakpoints])

The breakpoints are at observation numbers:
8 10 18 21 

Now suppose I have the same time series, but 24 hours are missing after observation number 11. I would like to define a gap (24 hours in this case) and find the same breakpoints as in the previous example, plus 2 additional breakpoints at the beginning and at the end of the gap. For the following time series the breakpoints would be:

8 10 12 13 18 21

Here is the time series with the gap, which falls between the rows labelled 11 and 12 (14/01/2015 11:00 to 15/01/2015 12:00):

Lines <- "Hour,PF
    0,14/01/2015 00:00,0.305
    1,14/01/2015 01:00,0.306
    2,14/01/2015 02:00,0.307
    3,14/01/2015 03:00,0.3081
    4,14/01/2015 04:00,0.3091
    5,14/01/2015 05:00,0.3101
    6,14/01/2015 06:00,0.3111
    7,14/01/2015 07:00,0.3122
    8,14/01/2015 08:00,0.455
    9,14/01/2015 09:00,0.7103
    10,14/01/2015 10:00,0.9656
    11,14/01/2015 11:00,1
    12,15/01/2015 12:00,0.9738
    13,15/01/2015 13:00,0.9476
    14,15/01/2015 14:00,0.9213
    15,15/01/2015 15:00,0.8951
    16,15/01/2015 16:00,0.8689
    17,15/01/2015 17:00,0.8427
    18,15/01/2015 18:00,0.6956
    19,15/01/2015 19:00,0.6006
    20,15/01/2015 20:00,0.5056
    21,15/01/2015 21:00,0.4106
    22,15/01/2015 22:00,0.3157
    23,15/01/2015 23:00,0.3157"

Solution

  • Your gap is the largest jump between consecutive timestamps (from 14/01/2015 11:00 to 15/01/2015 12:00 in the data as posted).

    c(0,1)+which.max(diff(time(z))) # 12,13
    

    This returns the indices of the observations on either side of the largest gap, which you can append to your breakpoints.

    Note: the start and end of the gap are 12 and 13, not 11 and 12, because the row labels in the data start at 0 while indexing in R starts at 1.
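
    If the series may contain more than one gap, or you want to set the gap size yourself rather than take the largest jump, a threshold on the time differences is one option. The following is only a sketch in base R under a few assumptions: `times` stands in for `time(z)` and is rebuilt here so the snippet runs on its own, `gap_hours` is a hypothetical name for the gap size you define, and `bp_obs` simply hard-codes the breakpoints reported in the question instead of calling `breakpoints()`.

    ```r
    # Timestamps of the gapped series: 12 hourly points on 14/01/2015,
    # then 12 hourly points starting at 15/01/2015 12:00.
    # (Stand-in for time(z); rebuilt here so the sketch is self-contained.)
    times <- c(
      seq(as.POSIXct("2015-01-14 00:00", tz = "UTC"), by = "hour", length.out = 12),
      seq(as.POSIXct("2015-01-15 12:00", tz = "UTC"), by = "hour", length.out = 12)
    )

    # Hypothetical parameter: treat any jump of at least this many hours as a gap.
    gap_hours <- 24

    # Time difference between consecutive observations, in hours.
    step <- as.numeric(diff(times), units = "hours")

    # Index of the last observation before each gap.
    gap_start <- which(step >= gap_hours)

    # Observation before and observation after each gap, interleaved.
    gap_idx <- as.vector(rbind(gap_start, gap_start + 1L))

    # Breakpoints reported in the question (assumed, not recomputed here).
    bp_obs <- c(8, 10, 18, 21)

    # Merge both sets of breakpoints into one sorted vector.
    all_bp <- sort(unique(c(bp_obs, gap_idx)))

    print(gap_idx)  # 12 13
    print(all_bp)   # 8 10 12 13 18 21
    ```

    With the zoo object from the question you would presumably replace `times` with `time(z)` and `bp_obs` with `bp$breakpoints`. Using `>=` rather than `which.max` means every jump of at least `gap_hours` is picked up, and nothing is added when the series has no gap.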