Search code examples
rnls

Fitting multiple logistic growth curves in a dataset


I have population data for multiple counties and would like to minimize repetition fitting logistic growth curves for each county.

county      year    pop
lake        1970    69305
lake        1980    104870
lake        1990    152104
lake        2000    210528
lake        2010    297052
marion      1970    69030
marion      1980    122488
marion      1990    194833
marion      2000    258916
marion      2010    331298
seminole    1970    83692
seminole    1980    179752
seminole    1990    287529
seminole    2000    365196
seminole    2010    422718

Currently I'm subsetting each county:

lake<-countypop[1:5,2:3]
colnames(lake)<-c("year", "pop")
marion<-countypop[6:10,2:3]
colnames(marion)<-c("year", "pop")
seminole<-countypop[11:15,2:3]

and then using SSlogis and plotting the curve for each county, e.g.:

lake.model <- nls(pop ~ SSlogis(year, phi1, phi2, phi3, data = lake)))
alpha <- coef(lake.model)
plot(pop ~ year, data = lake, main = "Logistic Growth Model of Lake County", 
xlab = "Year", ylab = "Population", xlim = c(1920, 2030),ylim=c(0,1000000))  
curve(alpha[1]/(1 + exp(-(x - alpha[2])/alpha[3])), add = T, col = "blue") 

I have about 60 counties and I know there has to be a cleaner way to do this. How can I use an apply function, loop, or something else to eliminate the repetition in my code?


Solution

  • Try this:

    pdf("countypop.pdf")
    models <- by(countypop, countypop$county, function(x) {
      fm <- nls(pop ~ SSlogis(year, phi1, phi2, phi3), data = x)
      plot(pop ~ year, x, main = county[1])
      lines(fitted(fm) ~ year, x)
      fm
    })
    dev.off()
    

    Note: We used this as input:

    countypop <- 
    structure(list(county = c("lake", "lake", "lake", "lake", "lake", 
    "marion", "marion", "marion", "marion", "marion", "seminole", 
    "seminole", "seminole", "seminole", "seminole"), year = c(1970L, 
    1980L, 1990L, 2000L, 2010L, 1970L, 1980L, 1990L, 2000L, 2010L, 
    1970L, 1980L, 1990L, 2000L, 2010L), pop = c(69305L, 104870L, 
    152104L, 210528L, 297052L, 69030L, 122488L, 194833L, 258916L, 
    331298L, 83692L, 179752L, 287529L, 365196L, 422718L)), .Names = c("county", 
    "year", "pop"), class = "data.frame", row.names = c(NA, -15L))