r, loops, lm, factors

Fitting a linear model for every factor in a variable


I have a data frame containing a long series of forest composition measurements. I would like to fit a separate linear model for each unique tree species recorded in one of its variables, but have been struggling to do so. My current attempt is below.

for (species in TreeData$SPECIES){
  CurrSpecies <- lm(GROWTH ~ DBH + TOTAL_HGT + BASTAND + BAHW +
                      BA_UPPER + BA_MAX + GDD + PCP + TREE_STATU,
                    data = TreeData, subset = SPECIES_CO == species)
  path <- paste(".../SUMMARY_", species, ".csv")
  write(capture.output(summary(CurrSpecies)), file = path)
}

Solution

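  • Here is one way to extract the coefficient tables while keeping your loop-based strategy, with the built-in iris dataset as an example. Note that the loop runs over unique(iris$Species), so each species is fitted once rather than once per row: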

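    dir.create("output_folder", showWarnings = FALSE)  # write.csv fails if the folder is missing
    for (species in unique(iris$Species)){
      # keep only the rows for the current species
      data <- iris[iris$Species == species, ]
      outname <- paste0("output_folder/", species, ".csv")
      # passing data = keeps readable coefficient names such as "Sepal.Width"
      fit <- lm(Sepal.Length ~ Sepal.Width, data = data)
      # coefficient table: estimate, standard error, t value, p-value
      fit_coefs <- as.data.frame(summary(fit)$coefficients)
      write.csv(fit_coefs, outname)
    }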
    

    Also check out the broom package, which makes this kind of task quicker and more consistent: its tidy() function returns a model's coefficient table as a data frame.

    If you need other parts of the summary, such as R-squared (summary(fit)$r.squared) or the residual standard error (summary(fit)$sigma), you can extract them from the summary object summary(fit) and reorganize them in the same way.

    One more tip: include a sample of your TreeData data frame in the original post, so that others can reproduce the problem.
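
To illustrate pulling other parts out of the summary object, here is a minimal base-R sketch that fits the same iris model per species, then stacks the coefficient tables and a few model-level statistics into two data frames. The names fits, coef_table and model_stats are made up for this example, and the choice of statistics is only an assumption about what might be useful. If broom is installed, its tidy() and glance() functions return comparable tables.

```r
# Fit one model per species and keep the fitted objects in a named list.
species_levels <- unique(as.character(iris$Species))

fits <- lapply(species_levels, function(sp) {
  lm(Sepal.Length ~ Sepal.Width, data = iris[iris$Species == sp, ])
})
names(fits) <- species_levels

# Coefficient tables, stacked into one data frame with a species column.
coef_table <- do.call(rbind, lapply(species_levels, function(sp) {
  coefs <- as.data.frame(summary(fits[[sp]])$coefficients)
  data.frame(species = sp, term = rownames(coefs), coefs,
             row.names = NULL, check.names = FALSE)
}))

# Model-level statistics taken from other parts of the summary object.
model_stats <- do.call(rbind, lapply(species_levels, function(sp) {
  s <- summary(fits[[sp]])
  data.frame(species = sp,
             r_squared = s$r.squared,
             adj_r_squared = s$adj.r.squared,
             sigma = s$sigma,          # residual standard error
             df_residual = s$df[2])    # residual degrees of freedom
}))
```

Each of the two data frames can then be written with a single write.csv() call instead of one file per species, which tends to be easier to compare across species.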