
Finding Anchors and slope for segmented time series


I have the following time series:

Lines <- "Hour,PF
0,14/01/2015 00:00,0.305
1,14/01/2015 01:00,0.306
2,14/01/2015 02:00,0.307
3,14/01/2015 03:00,0.3081
4,14/01/2015 04:00,0.3091
5,14/01/2015 05:00,0.3101
6,14/01/2015 06:00,0.3111
7,14/01/2015 07:00,0.3122
8,14/01/2015 08:00,0.455
9,14/01/2015 09:00,0.7103
10,14/01/2015 10:00,0.9656
11,14/01/2015 11:00,1
12,14/01/2015 12:00,0.9738
13,14/01/2015 13:00,0.9476
14,14/01/2015 14:00,0.9213
15,14/01/2015 15:00,0.8951
16,14/01/2015 16:00,0.8689
17,14/01/2015 17:00,0.8427
18,14/01/2015 18:00,0.6956
19,14/01/2015 19:00,0.6006
20,14/01/2015 20:00,0.5056
21,14/01/2015 21:00,0.4106
22,14/01/2015 22:00,0.3157
23,14/01/2015 23:00,0.3157"

library(zoo)
library(strucchange)

z <- read.zoo(text = Lines, tz = "", format = "%d/%m/%Y %H:%M", sep = ",")

bp <- breakpoints(z ~ 1, h = 2)

plot(z)
abline(v = time(z)[bp$breakpoints])
fit <- zoo(fitted(bp), time(z))
lines(fit, col = "blue", lty = 2, lwd = 2)
levs <- fit[bp$breakpoints + 0:1]
a <- diff(levs) / diff(as.numeric(time(levs)) / 3600)
DF <- fortify.zoo(a)

I get the following DF:

> DF
                Index             a
1 2015-01-14 10:00:00  2.061000e-01
2 2015-01-14 17:00:00 -9.516197e-17
3 2015-01-14 21:00:00 -1.448854e-01

I tried to change the formula in breakpoints to get a linear model with slope and intercept:

bp <- breakpoints(z ~ Lines$PF, h = 2)

with no success. The final result I would like to have, for each segment, is the beginning and the end of the segment, the slope (a, as computed now), the intercept, the left point of the segment (anchor beginning) and the right point of the segment (anchor end). Something like the following (an illustration only, the numbers are made up):

> DF
    Start Segment       End Segment             Slope          Intercept   Anchor Beginning Anchor End
1 2015-01-14 10:00:00  2015-01-14 08:00:00     2.061000e-01    8.123            0.50        0.30
2 2015-01-14 08:00:00  2015-01-14 17:00:00    -9.516197e-17    9.456            0.70        0.40
3 2015-01-14 17:00:00  2015-01-14 23:00:00    -1.448854e-01    2.9009           0.60        0.90

Solution

  • You already have what you need in your breakpoints, e.g.

    (breaks <- data.frame(
      start = index(z[c(1, bp$breakpoints+1)]),
      end = c(index(z[bp$breakpoints]), index(z[length(z)]))
    ))
    #                 start                 end
    # 1 2015-01-14 00:00:00 2015-01-14 07:00:00
    # 2 2015-01-14 08:00:00 2015-01-14 09:00:00
    # 3 2015-01-14 10:00:00 2015-01-14 17:00:00
    # 4 2015-01-14 18:00:00 2015-01-14 20:00:00
    # 5 2015-01-14 21:00:00 2015-01-14 23:00:00
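    # Fit a separate linear model to each segment. The regressor is the POSIXct
    # index, so slopes are per second and intercepts refer to the Unix epoch.
    fits <- lapply(seq_len(nrow(breaks)), function(x) {
      idx <- index(z) >= breaks[x, 1] & index(z) <= breaks[x, 2]
      lm(z[idx] ~ index(z[idx]))
    })
    sapply(fits, coefficients)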
    #                        [,1]          [,2]          [,3]          [,4]          [,5]
    # (Intercept)   -4.048094e+02 -1.007876e+05  8.358223e+03  3.750603e+04  1.873346e+04
    # index(z[idx])  2.850529e-07  7.091667e-05 -5.880291e-06 -2.638889e-05 -1.318056e-05
    

    The last step is to merge all the data you need into the format you want.
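
One possible way to do that last step, as a minimal sketch in base R only. Several things here are assumptions rather than part of the answer above: the breakpoint positions are hard-coded from the segment table printed earlier (with strucchange loaded they would come from `bp$breakpoints`), time is measured in hours since the first observation so that slopes are per hour, the "anchors" are taken to be the fitted values at the first and last observation of each segment, and the names `pf`, `tt`, `hrs`, `brk` and `segments` are made up for illustration.

```r
# PF values copied from the question, one per hour from 2015-01-14 00:00
pf <- c(0.305, 0.306, 0.307, 0.3081, 0.3091, 0.3101, 0.3111, 0.3122,
        0.455, 0.7103, 0.9656, 1, 0.9738, 0.9476, 0.9213, 0.8951,
        0.8689, 0.8427, 0.6956, 0.6006, 0.5056, 0.4106, 0.3157, 0.3157)
tt <- seq(as.POSIXct("2015-01-14 00:00", tz = "UTC"), by = "hour",
          length.out = length(pf))
# Regressor: hours elapsed since the first observation
hrs <- as.numeric(difftime(tt, tt[1], units = "hours"))

# Assumption: last observation of each segment, read off the table above
# (stands in for bp$breakpoints)
brk <- c(8, 10, 18, 21)
first <- c(1, brk + 1)
last <- c(brk, length(pf))

rows <- lapply(seq_along(first), function(i) {
  idx <- first[i]:last[i]
  d <- data.frame(h = hrs[idx], y = pf[idx])
  fit <- lm(y ~ h, data = d)
  # Fitted values at the two ends of the segment (the assumed "anchors")
  ends <- predict(fit, newdata = data.frame(h = hrs[c(first[i], last[i])]))
  data.frame(start = tt[first[i]],
             end = tt[last[i]],
             slope = unname(coef(fit)["h"]),
             intercept = unname(coef(fit)["(Intercept)"]),
             anchor_start = unname(ends[1]),
             anchor_end = unname(ends[2]))
})
segments <- do.call(rbind, rows)
print(segments)
```

Because the regressor is in hours, each slope here should equal the corresponding per-second slope above multiplied by 3600, and the intercept is the fitted line's value at 2015-01-14 00:00 rather than at the Unix epoch.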