Search code examples
rregressionlinear-regressionnls

R nls2 function - Error in result[[which.min(ss)]] : attempt to select less than one element in get1index


I'm for 2 days struggling to do a multiple regression. Here's what I have at this moment:

y<- c(-0.3902,  0.5277,  0.4357, -0.1888, -6.7422,  0.3797, -0.5141,      NA, -1.2423,  5.6756, -0.5352,
-0.2379,      NA,  0.4270,      NA,      NA,      NA,      NA,      NA,      NA, -1.1185,  0.0594,
 0.8280,      NA,  1.8387, -3.1469,-1.6212, -0.8400,      NA ,     NA,      NA, -0.7291,  2.0888)
x<- c( 0.07712829, 0.07038519, 0.08875312, 0.08235028, 0.10874493, 0.09713412, 0.11821937, 0.12796526,
0.12159038, 0.08520884, 0.07046089, 0.07417249, 0.07507544, 0.11416440, 0.09955467, 0.06688244,
0.06871298, 0.06187514, 0.12293434, 0.07864503, 0.12417404, 0.08600490, 0.10745128, 0.12277381,
0.12952106, 0.09144677, 0.09034708, 0.08039892, 0.07856194, 0.07864304, 0.10883127, 0.10690687,
0.11617899)

f1<- y ~ ((a*b)/(a+b)+x)

st1 <- expand.grid(a = seq(0, 1000, len = 10),
                   b = seq(0, 800, len = 10))

o<-nls2(f1,
        start = st1,
        algorithm = "brute-force")

And the result is

Error in numericDeriv(form[[3L]], names(ind), env) : Missing value or an infinity produced when evaluating the model Error in result[[which.min(ss)]] : attempt to select less than one element in get1index

I've tried with

st2<- data.frame(a = c(0,1000), b = c(0, 800))
o<-nls2(f1,
        start = st2,
        algorithm = "brute-force")```

getting

Error in result[[which.min(ss)]] : attempt to select less than one element in get1index

I have no idea what the value for a can be, which also makes it difficult to give a starting value. I have to apply the equation to multiple datasets (here I just give the smaller one for example).

Any tips on how to get this done? Perhaps using a different package for multiple regression instead of nls2?

Any help will be much appreciated!


Solution

  • The model is not identifiable. The outer parentheses make no difference so we can write your model equivalently as:

    y ~ (a*b)/(a+b) + x
    

    and the first term is just some constant so cannot be a function of two parameters.

    Also note that the right hand side is NaN if a=b=0.

    Perhaps you intended f2 shown below. Have also changed 0 to 1 in the lower limits to avoid the NaN situation.

    st2 <- data.frame(a = c(1, 1000), b = c(1, 800))
    f2 <- y ~ (a*b) / (a+b+x)
    nls2(f2, start = st2, algorithm = "brute-force")
    

    Added

    From the comments y ~ (a*b)/(a+b) + x was actually your intention so in that case we can make it identifiable and so amendable to modeling using:

     f3 <- y ~ A + x
     nls(f3, start = list(A = 1))
    

    This model is so simple that we can alternately calculate the optimum value of A directly as mean(y - x, na.rm = TRUE). Now choose a (or b) arbitrarily and solve A = a * b / (a + b) for b (or a) noting that if we take reciporocals of both sides we have 1/A = 1/a + 1/b, i.e. A is half the harmonic mean of a and b.

    If you do want to use nls (even though it's easy to do without it) so that you don't have to solve for a or b then try this:

    o <- order(x)
    dd <- na.omit(data.frame(x, y)[o, ])
    b <- 800
    f1 <- y ~ (a*b)/(a+b) + x
    fm <- nls(f1, dd, start = list(a = 1))
    plot(y ~ x, dd)
    lines(fitted(fm) ~ x, dd, col = "red")