Tags: r, regression, nls

changepoint detection in R


Does anybody have an idea how to solve this R issue? The goal is to find a changepoint in a relationship, such as the one at x = 5 in the data below.

fitDat <- data.frame(x=1:10, y=c(rnorm(5, sd=0.2),(1:5)+rnorm(5, sd=0.2)))
plot(fitDat$x, fitDat$y)

SAS handles this well:

/*simulated example*/
data test;
INPUT id $ x y;
cards;
1   1 -0.22711769
2   2 -0.08712909
3   3  0.06922072
4   4 -0.12940913
5   5 -0.43152927
6   6  1.17685016
7   7  1.83410448
8   8  2.88528795
9   9  4.30078012
10 10  4.84517101
;
run;

proc nlmixed data=test;
    parms B0 = 0 B1=1 t0=5 sigma_r=1;
    mu1 = (x <= t0)*B0;
    mu2 = (x > t0)*(B0 + B1*(x - t0));
    mu=mu1+mu2;
    model y~normal(mu,sigma_r);
run;

R fails:

changePointOptim <- function(x, int, slope, delta){ # code adapted from https://stats.stackexchange.com/questions/7527/change-point-analysis-using-rs-nls
       int + (pmax(delta-x, 0) * slope)
}

# nls
nls(y ~ changePointOptim(x, b0, b1, delta), 
              data = fitDat, 
              start = c(b0 = 0, b1 = 1, delta = 5))

# optimization
sqerror <- function (par, x, y)
       sum((y - changePointOptim(x, par[1], par[2], par[3]))^2)
minObj <- optim(par = c(0, 1, 3), fn = sqerror,
              x = fitDat$x, y = fitDat$y, method = "BFGS")
minObj$par

# nlme
library(nlme)
nlmeChgpt <- nlme(y ~ changePointOptim(b0, b1, delta,x), 
              data = fitDat, 
              fixed = b0 + b1 + delta, 
              start = c(b0=0, b1=1, delta=5)) 
summary(nlmeChgpt)

Since this is a simple case with a clear signal, R should be able to fit it, so I wonder what I'm doing wrong (I adapted some code from https://stats.stackexchange.com/questions/7527/change-point-analysis-using-rs-nls). Does anybody have a suggestion or solution?

Thanks!

Willem


Solution

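  • You have a sign wrong in changePointOptim: pmax(delta-x, 0) should be pmax(x-delta, 0), so that the slope applies after the changepoint rather than before it. With that fix: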

    set.seed(0)
    fitDat <- data.frame(x=1:10, y=c(rnorm(5, sd=0.2),(1:5)+rnorm(5, sd=0.2)))
    plot(fitDat$x, fitDat$y)
    
    
    changePointOptim <- function(x, int, slope, delta){ # code adapted from http://stats.stackexchange.com/questions/7527/change-point-analysis-using-rs-nls
           int + (pmax(x-delta, 0) * slope)  # Changed sign here
    }
    
    # optimization
    sqerror <- function (par, x, y)
           sum((y - changePointOptim(x, par[1], par[2], par[3]))^2)
    minObj <- optim(par = c(0, 1, 3), fn = sqerror,
                  x = fitDat$x, y = fitDat$y, method = "BFGS")
    

    gives:

    > minObj$par
    [1] 0.1581436 1.1762401 5.5963392
    

    which is close to the values used to simulate the data (intercept 0, slope 1, changepoint at 5).
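
    To see what the sign change does, it may help to evaluate both versions on the same grid. This is only a small sketch; the names hingeOriginal and hingeFixed are introduced here for illustration and are not part of the question's code.

    ```r
    # Version from the question: the slope acts BEFORE delta, flat afterwards
    hingeOriginal <- function(x, int, slope, delta) int + pmax(delta - x, 0) * slope
    # Corrected version: flat before delta, the slope acts AFTER delta
    hingeFixed <- function(x, int, slope, delta) int + pmax(x - delta, 0) * slope

    x <- 1:10
    origVals  <- hingeOriginal(x, int = 0, slope = 1, delta = 5)
    fixedVals <- hingeFixed(x, int = 0, slope = 1, delta = 5)

    # Side-by-side comparison of the two curves
    print(rbind(x, origVals, fixedVals))
    ```

    The version from the question is constant for every x beyond delta, so it cannot reproduce the flat-then-rising shape of the simulated data, whatever starting values are used.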

    # nls
    nls(y ~ changePointOptim(x, b0, b1, delta),
                  data = fitDat,
                  start = list(b0 = 0, b1 = 1, delta = 5)) # Note: start is given as a named list
    

    gives the same:

    0.1581 1.1762 5.5963 
    

    I'm not sure about the nlme call at the moment.
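
    As a possible cross-check that does not rely on nlme, one could profile over the changepoint: for a fixed delta the model is linear in the intercept and slope, so lm() can fit it, and delta is then chosen by a grid search on the residual sum of squares. This is a different technique from the nls and optim calls above, not something taken from the question. The sketch below uses the fixed data from the SAS example so that it is reproducible; the grid range and step size are arbitrary assumptions.

    ```r
    # Data copied from the SAS example in the question
    dat <- data.frame(
      x = 1:10,
      y = c(-0.22711769, -0.08712909, 0.06922072, -0.12940913, -0.43152927,
             1.17685016,  1.83410448, 2.88528795,  4.30078012,  4.84517101)
    )

    # Candidate changepoints strictly inside the range of x (assumed grid)
    deltaGrid <- seq(2, 9, by = 0.01)

    # For each candidate delta, fit the linear part and record the residual SS
    rss <- sapply(deltaGrid, function(d) {
      fit <- lm(y ~ pmax(x - d, 0), data = dat)
      sum(resid(fit)^2)
    })

    # Keep the delta with the smallest residual SS and refit at that value
    bestDelta <- deltaGrid[which.min(rss)]
    bestFit   <- lm(y ~ pmax(x - bestDelta, 0), data = dat)

    bestDelta
    coef(bestFit)
    ```

    Because the objective is not smooth in delta, a grid search like this avoids the dependence on starting values that gradient-based optimizers can show, at the cost of a resolution limited by the grid step.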