Search code examples
rggplot2non-linear-regressionnls

plot separate curves for non-linear (nls) model with categorical variable and separate parameter values


I have a dataset of age and length, plus some categorical variables including sex and location (2 level factor). I have fit a Gompertz model to this, using nls():

gompertz <- nls(Length~a*exp(-b*exp(-c*age)), 
                     data=df, 
                     start=list(a=155,b=0.4, c=0.1)) 

but have been struggling to include a categorical variable as well - i.e. I want to compare the growth rates between 2 locations. A solution has been suggested here:

https://stats.stackexchange.com/questions/63226/non-linear-modelling-with-several-variables-including-a-categorical-variable?newreg=b3e4387021fe4108ad74075bee2e370a

and this model works - I get separate a, b and c parameter estimates for both levels of my location factor. However I do not know how to go about plotting this - I want to plot my raw data (length by age) which 2 separate prediction curves, 1 for each value of location with the separate parameter estimates.

my model is as follows:

gompertz.locs <- nls(formula = Length ~ 
                 as.numeric(location==1)*a1*exp(-b1*exp(-c1*age)) 
               + as.numeric(location==2)*a2*exp(-b2*exp(-c2*age)),
               data = df, 
               start = list(a1=150,b1=0.5, c1=0.5,
                            a2=150,b2=0.5, c2=0.5))

sample data can be generated using:

ages<- runif(100, 0, 22) #ages 0-22

#parameters for model
a1<-153
b1<-0.51
c1<-0.53

a2<-147
b2<-0.45
c2<-0.43

#generate length with error normally distributed 
length1 <- (a1*exp(-b1*exp(-c1*ages))) +rnorm(100, mean=0, sd=5)
length2 <- (a2*exp(-b2*exp(-c2*ages))) +rnorm(100, mean=0, sd=5)

df<-data.frame(Length=c(length1, length2), age=rep(ages, 2), location=c(rep(1,100), rep(2,100)))

TIA!


Solution

  • How about this:

    library(ggplot2)
    ages<- runif(100, 0, 22) #ages 0-22
    
    #parameters for model
    a1<-153
    b1<-0.51
    c1<-0.53
    
    a2<-147
    b2<-0.45
    c2<-0.43
    
    #generate length with error normally distributed 
    length1 <- (a1*exp(-b1*exp(-c1*ages))) +rnorm(100, mean=0, sd=5)
    length2 <- (a2*exp(-b2*exp(-c2*ages))) +rnorm(100, mean=0, sd=5)
    
    df<-data.frame(Length=c(length1, length2), age=rep(ages, 2), location=c(rep(1,100), rep(2,100)))
    
    
    gompertz.locs <- nls(formula = Length ~ 
                           as.numeric(location==1)*a1*exp(-b1*exp(-c1*age)) 
                         + as.numeric(location==2)*a2*exp(-b2*exp(-c2*age)),
                         data = df, 
                         start = list(a1=150,b1=0.5, c1=0.5,
                                      a2=150,b2=0.5, c2=0.5))
    
    a <- coef(gompertz.locs)[c(1,4)]
    b <- coef(gompertz.locs)[c(2,5)]
    c <- coef(gompertz.locs)[c(3,6)]
    df$fit <- a[df$location]*exp(-b[df$location]*exp(-c[df$location]*df$age))
    
    ggplot(df, aes(x=age, y=Length, colour=as.factor(location))) + 
      geom_point() + 
      geom_line(aes(y=fit)) + 
      theme_classic() + 
      labs(colour="Location")
    

    Created on 2022-07-06 by the reprex package (v2.0.1)