Search code examples
rnon-linear-regression

Find initial conditions for nonlinear models using the nlsLM function


I am using the nlsLM function from the minpack.lm package to find the values of parameters a, e, and c that give the best fit to the data out. Here is my code:

n <- seq(0, 70000, by = 1)
TR <- 0.946
b <- 2000
k <- 50000
nr <- 25
na <- 4000
nd <- 3200
d <- 0.05775

y <- d + ((TR*b)/k)*(nr/(na + nd + nr))*n
## summary(y)
out <- data.frame(n = n, y = y)
plot(out$n, out$y)

## Estimate the parameters of a nonlinear model
library(minpack.lm)
k1 <- 50000
k2 <- 5000

fit_r <- nlsLM(y ~ a*(e*n + k1*k2 + c), data=out,
               start=list(a = 2e-10,
                          e = 6e+05, 
                          c = 1e+07), lower = c(0, 0, 0), algorithm="port")
print(fit_r)
## summary(fit_r)

df_fit <- data.frame(n = seq(0, 70000, by = 1))
df_fit$y <- predict(fit_r, newdata = df_fit)
plot(out$n, out$y, type = "l", col = "red", ylim = c(0,10))
lines(df_fit$n, df_fit$y, col="green")
legend(0,ceiling(max(out$y)),legend=c("observed","predicted"), col=c("red","green"), lty=c(1,1), ncol=1)

The fitting to the data seems to be very sensitive to initial conditions. For example:

  • with list(a = 2e-10, e = 6e+05, c = 1e+07), this gives a good fit:
Nonlinear regression model
  model: y ~ a * (e * n + k1 * k2 + c)
   data: out
        a         e         c 
2.221e-10 5.895e+05 9.996e+06 
 residual sum-of-squares: 3.225e-26

Algorithm "port", convergence message: Relative error between `par' and the solution is at most `ptol'.

enter image description here

  • with list(a = 2e-01, e = 100, c = 2), this gives a bad fit:
Nonlinear regression model
  model: y ~ a * (e * n + k1 * k2 + c)
   data: out
        a         e         c 
1.839e-08 1.000e+02 0.000e+00 
 residual sum-of-squares: 476410

Algorithm "port", convergence message: Relative error in the sum of squares is at most `ftol'.

enter image description here

So, Is there an efficient way to find initial conditions that give a good fit to the data ?

EDIT:

I added the following code to explain better the problem. The code is used to find the values of a, e, and c that give the best fit to data from several data sets. Each line in Y corresponds with one data set. By running the code, there is an error message for the 3rd data set (or 3rd line in Y): singular gradient matrix at initial parameter estimates. Here is the code:

TR <- 0.946
b <- 2000
k <- 50000
nr <- 25
na <- 4000
nd <- 3200
d <- 0.05775

Y <- data.frame(k1 = c(114000, 72000, 2000, 100000), k2 = c(47356, 30697, 214, 3568), n = c(114000, 72000, 2000, 100000), 
           na = c(3936, 9245, 6834, 2967), nd = c(191, 2409, 2668, 2776), nr = c(57, 36, 1, 50), a = NA, e = NA, c = NA)

## Create a function to round values
roundDown <- function(x) {
  k <- floor(log10(x))
  out <- floor(x*10^(-k))*10^k
  return(out)
}

ID_line_NA <- which(is.na(Y[,c("a")]), arr.ind=TRUE)
## print(ID_line_NA)

for(i in ID_line_NA){

  print(i)

  ## Define the variable y
  seq_n <- seq(0, Y[i, c("n")], by = 1)
  y <- d + (((TR*b)/(Y[i, c("k1")]))*(Y[i, c("nr")]/(Y[i, c("na")] + Y[i, c("nd")] + Y[i, c("nr")])))*seq_n
  ## summary(y)
  out <- data.frame(n = seq_n, y = y)
  ## plot(out$n, out$y)

  ## Build the linear model to find the values of parameters that give the best fit 
  mod <- lm(y ~ n, data = out)
  ## print(mod)

  ## Define initial conditions
  test_a <- roundDown(as.vector(coefficients(mod)[1])/(Y[i, c("k1")]*Y[i, c("k2")]))
  test_e <- as.vector(coefficients(mod)[2])/test_a
  test_c <- (as.vector(coefficients(mod)[1])/test_a) - (Y[i, c("k1")]*Y[i, c("k2")])

  ## Build the nonlinear model
  fit <- tryCatch( nlsLM(y ~ a*(e*n + Y[i, c("k1")]*Y[i, c("k2")] + c), data=out,
                                   start=list(a = test_a,
                                              e = test_e,
                                              c = test_c), lower = c(0, 0, 0)),
                             warning = function(w) return(1), error = function(e) return(2))
  ## print(fit)

  if(is(fit,"nls")){

    ## Plot
    tiff(paste("F:/Sources/Test_", i, ".tiff", sep=""), width = 10, height = 8, units = 'in', res = 300)
    par(mfrow=c(1,2),oma = c(0, 0, 2, 0))
    df_fit <- data.frame(n = seq_n)
    df_fit$y <- predict(fit, newdata = df_fit)
    plot(out$n, out$y, type = "l", col = "red", ylim = c(0, ceiling(max(out$y))))
    lines(df_fit$n, df_fit$y, col="green")
    dev.off()

    ## Add the parameters a, e and c in the data frame
    Y[i, c("a")] <- as.vector(coef(fit)[c("a")])
    Y[i, c("e")] <- as.vector(coef(fit)[c("e")])
    Y[i, c("c")] <- as.vector(coef(fit)[c("c")])

  } else{

    print("Error in the NLM")

  }
}

So, using the constraints a > 0, e > 0, and c > 0, is there an efficient way to find initial conditions for the nlsLM function that give a good fit to the data and to avoid error messages ?

I added some conditions to define initial conditions for the parameters a, e, and c:

Using the result of the linear model lm(y ~ n):

c = intercept/a - k1*k2 > 0 and 
e = slope/a > 0
0 < a < intercept/(k1*k2)

, where intercept and slope is the intercept and slope of lm(y ~ n), respectively.


Solution

  • The problem is not how to find the initial values of the parameters. The problem is that this is a re-parameterized linear model with constraints. The parameters for the linear model are slope a*e and intercept a*(k1 * k2 + c) so there can only be 2 parameters such as slope and intercept but the formula in the quesiton attempts to define three: a, c and e.

    We will need to fix one of the variables or, in general, add an additional constraint. Now if co is a vector whose first element is the Intercept and second element is the slope from the linear model

    fm <- lm(y ~ n)
    co <- coef(fm)
    

    then we have the equations:

    co[[1]] = a*e
    co[[2]] = a*(k1*k2+c)
    

    co, k1 and k2 are known and if we consider c as fixed then we can solve for a and e to give:

    a = co[[2]] / (k1*k2 + c)
    e = (k1 * k2 + c) * co[[1]] / co[[2]]
    

    Since both co[[1]] and co[[2]] are positive and c must be too a and e are necessarily positive as well giving us a solution once we arbitrarily fix c. This gives an infinite number of a, e pairs which minimize the model, one for each non-negative value of c. Note that we do not need to invoke nlsLM for this.

    For example, for c = 1e-10 we have:

    fm <- lm(y ~ n)
    co <- coef(fm)
    
    c <- 1e-10
    a <- co[[2]] / (k1*k2 + c)
    e <- (k1 * k2 + c) * co[[1]] / co[[2]]
    a; e
    ## [1] 5.23737e-13
    ## [1] 110265261628
    

    Note that numerical problems may exist due to the large difference in magnitude among the coefficients; however, if we increase c that will increase e and decrease a making the scaling even worse so the parameterization of this problem given in the question seems to have inheritently bad numerical scaling.

    Note that none of this requires running nlsLM to get the optimum coefficients; however, due to the bad scaling it might still be possible to improve the answer somewhat.

    co <- coef(lm(y ~ n))
    c <- 1e-10
    a <- co[[2]] / (k1*k2 + c)
    e <- (k1 * k2 + c) * co[[1]] / co[[2]]
    nlsLM(y ~ a * (e * n + k1 * k2 + c), start = list(a = a, e = e), lower = c(0, 0))
    

    which gives:

    Nonlinear regression model
      model: y ~ a * (e * n + k1 * k2 + c)
       data: parent.frame()
            a         e 
    2.310e-10 5.668e+05 
     residual sum-of-squares: 1.673e-26
    
    Number of iterations to convergence: 12 
    Achieved convergence tolerance: 1.49e-08