I'd like to fit the logarithmic curve through my data using nls
.
library(dplyr)
library(ggplot2)
a <- 3
b <- 2
Y <- data_frame(x = c(0.2, 0.5, 1, 2, 5, 10),
y = a + b*log(x))
Y %>%
ggplot(aes(x = x, y = y)) +
geom_point(shape = 19, size = 2) +
geom_smooth(method = "nls",
formula = y ~ p1 + p2*log(x),
start = list(a = a, b = b),
se = FALSE,
control = list(maxiter = 100))
This gives me an error:
Error in method(formula, data = data, weights = weight, ...) : number of iterations exceeded maximum of 100
What is going wrong?
Here's some text I copied and pasted after doing ?nls
:
Warning
Do not use nls on artificial "zero-residual" data.
The nls function uses a relative-offset convergence criterion that compares the numerical imprecision at the current parameter estimates to the residual sum-of-squares. This performs well on data of the form
y = f(x, θ) + eps
(with var(eps) > 0). It fails to indicate convergence on data of the form
y = f(x, θ)
because the criterion amounts to comparing two components of the round-off error. If you wish to test nls on artificial data please add a noise component, as shown in the example below.
That inspired me to try this:
> library(dplyr)
> library(ggplot2)
> a <- 3
> b <- 2
> Y <- data_frame(x = c(0.2, 0.5, 1, 2, 5, 10),
+ y = a + b*log(x)*(1 + rnorm(length(x), sd=0.001)))
> Y %>%
+ ggplot(aes(x = x, y = y)) +
+ geom_point(shape = 19, size = 2) +
+ geom_smooth(method = "nls",
+ formula = y ~ p1 + p2*log(x),
+ start = list(p1 = a, p2 = b),
+ se = FALSE,
+ control = list(maxiter = 100))
Note: your code had start = list(a=a, b=b)
which is a typo because a
and b
aren't defined in your formula. Aside from that, adding the *(1 + rnorm(length(x), sd=0.001))
is the only thing I did.
The resulting graph made it seem like everything worked fine.
I'd generally recommend doing the fit separately, however, and then plotting it with predict
. That way you can always check the quality of the fit to see if it worked before plotting.
> fit <- nls(data=Y, formula = y ~ p1 + p2*log(x), start = list(p1 = a, p2 = b))
> summary(fit)
Formula: y ~ p1 + p2 * log(x)
Parameters:
Estimate Std. Error t value Pr(>|t|)
p1 3.001926 0.001538 1952 4.14e-13 ***
p2 1.999604 0.001114 1795 5.78e-13 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.003619 on 4 degrees of freedom
Number of iterations to convergence: 1
Achieved convergence tolerance: 1.623e-08
> new_x = data.frame(x=seq(from=0.2, to=10, length.out=100))
> ggplot(data=Y, aes(x=x, y=y)) +
geom_point() +
geom_line(data=new_x,
aes(x=new_x, y=predict(fit, newdata=new_x)),
color='blue')