Search code examples
rexponentialnlsmodel-fitting

R: Error in function "nls" for my datset


I have data with climatological pressure values and the heights of the stations where the pressure was measured. I would like to fit an exponential model to them. The data look like this:

x
 [1]  539  575 1320  438 1840  496  316  552  325  386 1599 1073  556 1029 1661
[16] 2472 1594 1197  910 1035  596  646  420  516 1980 1045 2287  440  419 1611
[31]  577 3580  484 1018 1669  745 1974  366  273  454  203  588 1427  405 1403
[46]  485  490 2106  990 3305 1078  455  300 1638 1708  438 1303  482  775 2502
[61]  457 2690  422 1638  555  426

y
[1] 954.1 951.4 867.2 964.0 813.3 958.8 979.7 950.8 978.4 971.3 835.1 894.1
[13] 952.0 904.4 833.3 751.5 839.0 882.5 912.0 899.4 947.1 942.3 968.5 961.9
[25] 801.3 893.6 769.8 965.6 965.1 836.9 949.2 653.6 959.8 900.2 830.6 928.6
[37] 800.3 971.1 983.5 963.4 992.6 947.5 848.3 969.4 858.2 959.9 959.3 787.2
[49] 900.4 677.6 893.2 962.7 981.5 834.9 827.0 966.0 870.1 961.1 925.2 749.3
[61] 962.8 734.0 968.0 836.3 950.4 966.5

I have first tried just to take the logarithms of the data and fit an lm them:

log.p=log(y)
log.height=log(x)
lmlog=lm(log.p~log.height)

But as this delivered a model that does not fit at all, I've decided to use the nls function having taken various tips from other posts (e.g. "start"):

f <- function(x,a,b) {a* exp(b * x)}
dat <- as.list(x, y)
start <- coef(nls(log(y) ~ log(f(x, a, b)), dat, start = c(a = 1, b = -1)))
nls=nls(y~ f(x,a,b), data=dat, start=start)

Unfortunately, even for "start", the following errors appear and I really don't know what to do anymore...

Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

Can anyone help? Thanks in advance!!!


Solution

  • Transforming both variables doesn't linearize the relationship:

    plot(log(y)~log(x))
    

    enter image description here

    Instead you should only transform y:

    plot(log(y)~x)
    modlm <- lm(log(y)~x)
    abline(modlm)
    

    enter image description here

    summary(modlm)
    
    Call:
    lm(formula = log(y) ~ x)
    
    Residuals:
           Min         1Q     Median         3Q        Max 
    -0.0081825 -0.0009194  0.0000952  0.0008455  0.0070058 
    
    Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
    (Intercept)  6.927e+00  4.567e-04 15166.4   <2e-16 ***
    x           -1.227e-04  3.516e-07  -349.1   <2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    Residual standard error: 0.002185 on 64 degrees of freedom
    Multiple R-squared:  0.9995,    Adjusted R-squared:  0.9995 
    F-statistic: 1.219e+05 on 1 and 64 DF,  p-value: < 2.2e-16
    

    Of course you can use nls too:

    modnls <- nls(y~exp(a+b*x), start=list(a=coef(modlm)[[1]], b=coef(modlm)[[2]]))
    plot(y~x)
    xnew <- seq(min(x), max(x), by=0.5)
    lines(xnew, exp(coef(modnls)[[1]]+xnew*coef(modnls)[[2]]))
    

    enter image description here

    summary(modnls)
    
    Formula: y ~ exp(a + b * x)
    
    Parameters:
        Estimate Std. Error t value Pr(>|t|)    
    a  6.926e+00  4.384e-04 15797.5   <2e-16 ***
    b -1.225e-04  3.831e-07  -319.7   <2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
    Residual standard error: 1.904 on 64 degrees of freedom
    
    Number of iterations to convergence: 2 
    Achieved convergence tolerance: 6.25e-08
    

    Note how the parameter estimates of the linear and the nonlinear fit are very similar. Usually, you should choose whether to use lm with transformed data or nls with untransformed data depending on the error distribution. However, since the relationship between y and x is not very nonlinear, it doesn't matter so much here.