Search code examples
rcurve

Curve fitting to empirical data in R


I have the following empirical data set:

x<-c(0.5,1,1.5,1.731,1.75,2,2.5,3,3.5,4)

y<-c(10000,10000,10000,10000,5700,2700,1700,1350,950,625)

I have been trying to fit the data to a theoretically model. So far I have achieved to fit the data to a power law but only for discrete values (x = 1,2,3,4) using the poweRlaw package:

By looking at the graph, I would say that the tail is reasonably well fitted with the following estimated parameters power_law_fit <- function(x) 5743.492/(x^1.6):

KS
0.234651

xmin
625

pars
1.6

Nonetheless, I wanted a model that would mimic the behavior from the x point 1.731 onwards. Looking at log log graph there is no line that would fit those points therefore, I think the power law distribution should be ruled out.

I have been experimenting with a weibull regression but with no success.

Can anyone please shed some light on the best model and how to do it in R?


Solution

  • My inclination would be to fit a model to only the data for x > 1.7, since that's the range of interest. In fact, looking at a simple plot of your data shows that y is constant at 10000 until x=1.73, when it starts dropping off rapidly. This makes me wonder if y(x < 1.7) is even meaningful. Did whatever you used to measure y clip for x < 1.7??

    plot(x,y)
    

    Using the nls(...) function for non-linear modeling, and after a fair amount of experimentation with functional forms, the model:

    y ~ a/(1 + b*(x*(x-1.73))^n)
    

    gives a good fit:

    df     <- data.frame(x,y)
    df.sub <- df[4:10,]
    
    fit    <- nls(y~a/(1+b*(x*(x-1.731))^n),data=df.sub,start=c(a=1000,b=1,n=.5))
    summary(fit)
    # Formula: y ~ a/(1 + b * (x * (x - 1.731))^n)  
    # Parameters:
    #    Estimate Std. Error t value Pr(>|t|)    
    # a 9.992e+03  1.396e+02   71.58 2.28e-07 ***
    # b 3.744e+00  1.769e-01   21.17 2.94e-05 ***
    # n 4.762e-01  1.859e-02   25.61 1.38e-05 ***
    # ---
    # Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    #
    # Residual standard error: 139.8 on 4 degrees of freedom
    # ...
    library(ggplot2)
    ggp <- ggplot(df.sub, aes(x=x,y=predict(fit)))
    ggp <- ggp + geom_line(color="blue", linetype=2)
    ggp <- ggp + geom_point(color="blue", shape=1, size=3)
    ggp <- ggp + geom_point(data=df, aes(y=y), size=3)
    ggp