I'm for 2 days struggling to do a multiple regression. Here's what I have at this moment:
y<- c(-0.3902, 0.5277, 0.4357, -0.1888, -6.7422, 0.3797, -0.5141, NA, -1.2423, 5.6756, -0.5352,
-0.2379, NA, 0.4270, NA, NA, NA, NA, NA, NA, -1.1185, 0.0594,
0.8280, NA, 1.8387, -3.1469,-1.6212, -0.8400, NA , NA, NA, -0.7291, 2.0888)
x<- c( 0.07712829, 0.07038519, 0.08875312, 0.08235028, 0.10874493, 0.09713412, 0.11821937, 0.12796526,
0.12159038, 0.08520884, 0.07046089, 0.07417249, 0.07507544, 0.11416440, 0.09955467, 0.06688244,
0.06871298, 0.06187514, 0.12293434, 0.07864503, 0.12417404, 0.08600490, 0.10745128, 0.12277381,
0.12952106, 0.09144677, 0.09034708, 0.08039892, 0.07856194, 0.07864304, 0.10883127, 0.10690687,
0.11617899)
f1<- y ~ ((a*b)/(a+b)+x)
st1 <- expand.grid(a = seq(0, 1000, len = 10),
b = seq(0, 800, len = 10))
o<-nls2(f1,
start = st1,
algorithm = "brute-force")
And the result is
Error in numericDeriv(form[[3L]], names(ind), env) : Missing value or an infinity produced when evaluating the model Error in result[[which.min(ss)]] : attempt to select less than one element in get1index
I've tried with
st2<- data.frame(a = c(0,1000), b = c(0, 800))
o<-nls2(f1,
start = st2,
algorithm = "brute-force")```
getting
Error in result[[which.min(ss)]] : attempt to select less than one element in get1index
I have no idea what the value for a can be, which also makes it difficult to give a starting value. I have to apply the equation to multiple datasets (here I just give the smaller one for example).
Any tips on how to get this done? Perhaps using a different package for multiple regression instead of nls2?
Any help will be much appreciated!
The model is not identifiable. The outer parentheses make no difference so we can write your model equivalently as:
y ~ (a*b)/(a+b) + x
and the first term is just some constant so cannot be a function of two parameters.
Also note that the right hand side is NaN if a=b=0.
Perhaps you intended f2
shown below. Have also changed 0 to 1 in the lower limits to avoid the NaN situation.
st2 <- data.frame(a = c(1, 1000), b = c(1, 800))
f2 <- y ~ (a*b) / (a+b+x)
nls2(f2, start = st2, algorithm = "brute-force")
From the comments y ~ (a*b)/(a+b) + x
was actually your intention so in that case we can make it identifiable and so amendable to modeling using:
f3 <- y ~ A + x
nls(f3, start = list(A = 1))
This model is so simple that we can alternately calculate the optimum value of A directly as mean(y - x, na.rm = TRUE)
. Now choose a
(or b
) arbitrarily and solve A = a * b / (a + b)
for b
(or a
) noting that if we take reciporocals of both sides we have 1/A = 1/a + 1/b
, i.e. A
is half the harmonic mean of a
and b
.
If you do want to use nls (even though it's easy to do without it) so that you don't have to solve for a or b then try this:
o <- order(x)
dd <- na.omit(data.frame(x, y)[o, ])
b <- 800
f1 <- y ~ (a*b)/(a+b) + x
fm <- nls(f1, dd, start = list(a = 1))
plot(y ~ x, dd)
lines(fitted(fm) ~ x, dd, col = "red")