I'm a bit out of my depth here... I have this data:
d <- data.frame(matrix(data = c(1,1.5,6,2,11,2.5,16,3,26,4,46,5,66,6,86,7,126,8,176,9,276,10,426,11,626,12,876,13,1176,14,1551,15,2026,16,2676,17,3451,18,4351,19,5451,20,6801,21,8501,22,10701,23),
byrow = TRUE,
ncol = 2
)
)
names(d) <- c('x','y')
Looks like this:
plot(x = d$x,
y = d$y,
pch = 19,
col = 'grey50',
bty = 'n'
)
Now I want to describe the relationship between x and y as a formula, so I tried the nls function, like this:
fit <- nls(y ~ a * x ^ b,
start = list(a = 1,
b = 1),
data = d
)
lines(d$x,
predict(fit),
col = 'red',
lty = 2
)
As you can see, the line almost fits! And this is where I'm stuck. Something tells me that there is a perfect fit, but I don't know where to go from here. Alternative starting values do not seem to change anything. I was advised to use the coefficients from lm(log(y) ~ log(x), data = d)
as starting parameters. But no love:
fit <- nls(y ~ a * x ^ b,
start = list(a = exp(0.3120),
b = 0.3883),
data = d
)
lines(d$x,
predict(fit),
col = 'blue',
lty = 2
)
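For reference, the starting values above can be pulled straight from the log-log fit instead of being typed in by hand. This is only a minimal sketch: the names `loglog`, `start_vals`, `fit_default` and `fit_loglog` are made up for illustration, and the data frame is rebuilt so the snippet runs on its own.

```r
# Data from the question, rebuilt so the sketch is self-contained
d <- data.frame(matrix(c(1,1.5,6,2,11,2.5,16,3,26,4,46,5,66,6,86,7,126,8,
                         176,9,276,10,426,11,626,12,876,13,1176,14,1551,15,
                         2026,16,2676,17,3451,18,4351,19,5451,20,6801,21,
                         8501,22,10701,23),
                       byrow = TRUE, ncol = 2))
names(d) <- c("x", "y")

# Log-log regression: log(y) = log(a) + b * log(x)
loglog <- lm(log(y) ~ log(x), data = d)

# Back-transform the intercept to get a; the slope is b
start_vals <- list(a = exp(coef(loglog)[[1]]),
                   b = coef(loglog)[[2]])

# Fit the same power-law model from both sets of starting values
fit_default <- nls(y ~ a * x^b, start = list(a = 1, b = 1), data = d)
fit_loglog  <- nls(y ~ a * x^b, start = start_vals, data = d)

# Side-by-side comparison of the estimated coefficients
rbind(default = coef(fit_default), loglog = coef(fit_loglog))
```

If the two rows agree, nls is reaching the same least-squares solution from either start, which would suggest the remaining misfit comes from the form of the model and not from the starting values.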
I've tried a few other formulas, but I'm really just shooting in the dark here:
nls(y ~ a * x / (b + x), data = d)
nls(y ~ a + ((x * b) / (x + c)), start = c(a = 1, b = 10, c = 1), data = d)
So, any suggestions on how to move forward?
I am not entirely sure, but it seems to me that your data are just noisy. That makes it look as though a better fit exists, when there probably isn't one.
This noise is easier to see once the data are linearized:
# Linearize by eye
new_x = d$x^0.18
# Plot of data linearized
plot(x = new_x,
y = d$y,
pch = 19,
col = 'grey50',
bty = 'n'
)
# Linear regression
lin_reg = lm(d$y ~ new_x)
# Fitted
abline(a=lin_reg$coef[1], b=lin_reg$coef[2])
Notice that the points scatter both above and below the fitted line. So your first formula is probably the right one.
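One way to look at that scatter more directly is to plot the residuals of the linear fit against the transformed x. The snippet below is only a sketch: it reuses the eyeballed exponent 0.18 from above, rebuilds the data so it runs on its own, and `res` is a name introduced here for illustration.

```r
# Data from the question, rebuilt so the sketch is self-contained
d <- data.frame(matrix(c(1,1.5,6,2,11,2.5,16,3,26,4,46,5,66,6,86,7,126,8,
                         176,9,276,10,426,11,626,12,876,13,1176,14,1551,15,
                         2026,16,2676,17,3451,18,4351,19,5451,20,6801,21,
                         8501,22,10701,23),
                       byrow = TRUE, ncol = 2))
names(d) <- c("x", "y")

# Same eyeballed linearization and linear regression as above
new_x   <- d$x^0.18
lin_reg <- lm(d$y ~ new_x)

# Residuals: observed y minus the value on the fitted line
res <- residuals(lin_reg)

# Residuals against the transformed x, with a zero reference line
plot(x = new_x,
     y = res,
     pch = 19,
     col = 'grey50',
     bty = 'n',
     xlab = 'x^0.18',
     ylab = 'residual'
)
abline(h = 0, lty = 2)
```

If the residuals jump around zero without a clear pattern, that is consistent with noise. A smooth curve or long runs on one side of zero would instead hint that the functional form is missing something.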