
How to calculate linear models individually in melted Dataframe



I have a data frame like the one called "test" below. I would like to fit a separate linear model for each Site against Time and Group: one model for A, one for B, and one for C.
For example, Site A is present in 2 Groups, G1 and G2, and was measured at 5 time points in each. So within each Group I have 5 values that should be modelled as dependent on Time (value ~ Time), and because the measurements were taken under 2 conditions (Group), that should be included as well: (value ~ Time*Group).

How can I do this most efficiently, and then extract the information from each summary and store it in a vector or list?

Thank you for your time, I really appreciate it.

test <- data.frame(Site  = rep(c(rep("A", 5),
                                 rep("B", 5),
                                 rep("C", 5)), 2),
                   # 30 random draws with mean 1; the original rnorm(1, n=15) means rnorm(n = 15, mean = 1)
                   value = rnorm(30, mean = 1),
                   Time  = rep(rep(1:5, 3), 2),
                   Group = c(rep("G1", 15), rep("G2", 15))
                   )

# This fits one model to all sites pooled together. Loop over Site instead?
linReg <- lm(value ~ Time * Group, data = test)

Solution

  • Split the data by Site with group_split(), then fit lm() to each piece with map():

    library(tidyverse)
    
    test %>%
      group_split(Site) %>%
      map(~lm(value ~ Time * Group, data = .))
    

    Output:

    [[1]]
    
    Call:
    lm(formula = value ~ Time * Group, data = .)
    
    Coefficients:
     (Intercept)          Time       GroupG2  Time:GroupG2  
         -0.6393        0.5201        3.6533       -1.2188  
    
    
    [[2]]
    
    Call:
    lm(formula = value ~ Time * Group, data = .)
    
    Coefficients:
     (Intercept)          Time       GroupG2  Time:GroupG2  
        -0.38982       0.24745       0.58777      -0.08554  
    
    
    [[3]]
    
    Call:
    lm(formula = value ~ Time * Group, data = .)
    
    Coefficients:
     (Intercept)          Time       GroupG2  Time:GroupG2  
         0.17921       0.02528       2.13208      -0.34299  
    

    To post-process each fit, add %>% summary() (or any other step you need) inside the call to map():

    map(~lm(value ~ Time * Group, data = .) %>% summary())
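The question also asks how to store the information from each summary in a vector or list. A minimal base-R sketch of one way to do that follows. It uses split() and lapply() in place of group_split() and map(), so the list elements keep their Site names. The set.seed() call and the object names models, coef_table, fit_stats and p_values are assumptions made for this illustration and are not part of the original answer.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Same structure as the example data in the question
test <- data.frame(
  Site  = rep(rep(c("A", "B", "C"), each = 5), times = 2),
  value = rnorm(30, mean = 1),
  Time  = rep(1:5, times = 6),
  Group = rep(c("G1", "G2"), each = 15)
)

# One model per Site; split() returns a list named by Site (A, B, C)
models <- lapply(split(test, test$Site),
                 function(d) lm(value ~ Time * Group, data = d))

# Coefficient estimates: one row per Site, one column per model term
coef_table <- t(sapply(models, coef))

# Selected values from summary(): one row per Site
fit_stats <- t(sapply(models, function(m) {
  s <- summary(m)
  c(r_squared = s$r.squared, adj_r_squared = s$adj.r.squared, sigma = s$sigma)
}))

# p-values for each term, as a named list of named vectors
p_values <- lapply(models, function(m) summary(m)$coefficients[, "Pr(>|t|)"])
```

With these objects, coef_table["A", "Time"] gives the Time slope for Site A and p_values$B gives the p-values for Site B. If the tidyverse pipeline above is preferred, the same extraction functions should work inside map(), although the list returned by group_split() is unnamed.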