Search code examples
rregressionnls

R nls singular gradient


I've tried searching the other threads on this topic but none of the fixes are working for me. I have the results of a natural experiment and I want to show the number of consecutive occurrences of an event fit an exponential distribution. My R shell is pasted below

f <- function(x,a,b) {a * exp(b * x)}
> x
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
[26] 26 27
> y
 [1] 1880  813  376  161  100   61   31    9    8    2    7    4    3    2    0
[16]    1    0    0    0    0    0    1    0    0    0    0    1
> dat2
    x    y
1   1 1880
2   2  813
3   3  376
4   4  161
5   5  100
6   6   61
7   7   31
8   8    9
9   9    8
10 10    2
11 11    7
12 12    4
13 13    3
14 14    2
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=1, b=1)) 
Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7, b=-.5)) 
Error in nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5)) : 
  singular gradient
> fm <- nls(y ~ f(x,a,b), data = dat2, start = c(a=7,b=-.5),control=nls.control(maxiter=1000,warnOnly=TRUE,minFactor=1e-5,tol=1e-10),trace=TRUE) 
4355798 :   7.0 -0.5
Warning message:
In nls(y ~ f(x, a, b), data = dat2, start = c(a = 7, b = -0.5),  :
  singular gradient

Please forgive the bad formatting, first post here. x contains bins of a histogram, y contains the number of occurrences of each bin in that histograms. dat2 cuts off at 14 since the 0 count bins would throw off the exponential regression, and I really only need to fit those first 14. Those bins which have counts beyond 14 I have biological reason to believe they are special. The issue I initially got was infinity, which I don't get since none of the values are 0. After giving decent starting values as suggested by a different post here I get the singular gradient error. The only other posts I saw with that had more variables, I tried increasing the number of iterations but that did not succeed. Any help is appreciated. A


Solution

  • 1) linearize to get starting values You need better starting values:

    # starting values
    fm0 <- nls(log(y) ~ log(f(x, a, b)), dat2, start = c(a = 1, b = 1))
    
    nls(y ~ f(x, a, b), dat2, start = coef(fm0))
    

    giving:

    Nonlinear regression model
      model: y ~ f(x, a, b)
       data: x
            a         b 
    4214.4228   -0.8106 
     residual sum-of-squares: 2388
    
    Number of iterations to convergence: 6 
    Achieved convergence tolerance: 3.363e-06
    

    1a) Similarly we could use lm to get the initial value by writing

    y ~ a * exp(b * x)
    

    as

    y ~ exp(log(a) + b * x)
    

    and taking logs of both to get a model linear in log(a) and b:

    log(y) ~ log(a) + b * x
    

    which can be solved using lm:

    fm_lm <- lm(log(y) ~ x, dat2)
    st <- list(a = exp(coef(fm_lm)[1]), b = coef(fm_lm)[2])
    nls(y ~ f(x, a, b), dat2, start = st)
    

    giving:

    Nonlinear regression model
      model: y ~ f(x, a, b)
       data: dat2
           a        b 
    4214.423   -0.811 
     residual sum-of-squares: 2388
    
    Number of iterations to convergence: 6 
    Achieved convergence tolerance: 3.36e-06
    

    1b) We can also get it to work by reparameterizing. In that case a = 1 and b = 1 will work provided we transform the initial values in line with the parameter transformation.

    nls(y ~ exp(loga + b * x), dat2, start = list(loga = log(1), b = 1))
    

    giving:

    Nonlinear regression model
      model: y ~ exp(loga + b * x)
       data: dat2
      loga      b 
     8.346 -0.811 
     residual sum-of-squares: 2388
    
    Number of iterations to convergence: 20 
    Achieved convergence tolerance: 3.82e-07
    

    so b is as shown and a = exp(loga) = exp(8.346) = 4213.3

    2) plinear Another possibility that is even easier is to use alg="plinear" in which case starting values are not needed for the parameters entering linearly. In that case the starting value of b=1 in the question seems sufficient.

    nls(y ~ exp(b * x), dat2, start = c(b = 1), alg = "plinear")
    

    giving:

    Nonlinear regression model
      model: y ~ exp(b * x)
       data: dat2
            b      .lin 
      -0.8106 4214.4234 
     residual sum-of-squares: 2388
    
    Number of iterations to convergence: 11 
    Achieved convergence tolerance: 2.153e-06