r ggplot2 regression non-linear-regression

Plotting nonlinear regression in R

I am new to R and have been trying to fit a nonlinear model to some data, unsuccessfully though. In the end I added a polynomial trendline in Excel and tried to plot the function I got - for some reason the function doesn't fit my data in R. I tried the simple geom_smooth, but achieved a "bulky" line, I want a smooth one. I have 6 samples in one plot, here is the data for one of them, including the function obtained by Excel and my attempt to plot it. I am sure there is a better way - I also need to get the function of the fit in an output.

datax <- c(0, 21.3, 30, 46.3, 72)
datay <- c(0, 0.008723333, 0.016253333, 0.039896667, 0.079893333)
data <- data.frame(datax, datay)
x <- seq(0, 0.01, length.out = 72)
poly.fit <- function(x) 1E-5*x^2+0.0002*x
ggplot(data, aes(x=datax, y=datay)) +
  geom_point() +
  stat_function(fun=poly.fit)

Solution

Well, that function does not perfectly fit the data. After running the code and poly.fit(46.3) it returns 0.0306969 which is not .03989

The problem lies in the equation itself. If you did want to create a function in R that did match the data perfectly, there is a principle called polynomial interpolation that pretty much states that you need as many points as terms in the model if you want to fit it perfectly. So, if you would like to match the points, you can use:

 m <- lm(datay ~ poly(datax,4))   # poly() fits the polynomial with 4+1 terms
 summary(m)                       # displays coefficients

Once you get the coefficients, you can recreate the function as you did before, and that should fit the the line to match your points perfectly (as long as you fit enough polynomial terms!).

EDIT: here is an example of reproducible code that displays what you want

library(ggplot2)
datax <- c(0, 21.3, 30, 46.3, 72)
datay <- c(0, 0.008723333, 0.016253333, 0.039896667, 0.079893333)
data <- data.frame(datax, datay)

# This is another approach to the fitting using I()
m <- lm(datay ~ datax + I(datax^2) + I(datax^3) + I(datax^4))

x <- seq(0, 72, by = .1)
poly.fit = function(x){       
    as.numeric(m$coefficients[1]) +
    as.numeric(m$coefficients[2])*x +
    as.numeric(m$coefficients[3])*x^2 + 
    as.numeric(m$coefficients[4])*x^3 + 
    as.numeric(m$coefficients[5])*x^4
}  #This way you dont have to copy and paste coefficients

ggplot(data, aes(x=datax, y=datay)) +
  geom_point() +
  stat_function(fun=poly.fit)