Tags: r, model, regression, linear-regression

Group-wise linear models with the nest_by function


I have a data frame, test_1, with four columns: Dataset, X, Y, Group.

The task is to fit a linear model to each of the five groups in the Group column (a, b, c, d, e) and then compare the slopes with the slope from a second data frame, test_2. I have already fitted a model for test_2, since it has no group separation. For test_1, it was suggested that we use the function nest_by to compute group-wise linear models, so I tried to fit the models that way.

Input:

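library(dplyr)

model <- test_1 %>%
  nest_by(Group) %>%
  # in a rowwise nest_by() tibble, `data` is the current group's rows;
  # data = test_1 would refit the same full-data model for every group
  mutate(model = list(lm(y ~ x, data = data)))
model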

Output:

# A tibble: 5 x 3
# Rowwise:  Group
Group               data model 
<fct> <list<tibble[,3]>> <list>
1 a               [58 x 3] <lm>  
2 b               [35 x 3] <lm>  
3 c               [47 x 3] <lm>  
4 d               [44 x 3] <lm>  
5 e               [38 x 3] <lm> 

I do not know how to proceed from here. I thought I could ungroup the result and call summary(), but that would be much the same as using filter() to fit five separate models.


Solution

  • Yes, you can continue from there: apply tidy() from the broom package to each model, which is a better option than summary() because it returns the coefficients as a tibble, and then unnest() the result.

    For example, with mtcars, we can fit one model per cyl group as follows:

    library(tidyr)
    library(dplyr)
    library(purrr)
    library(broom)
    
    mtcars_model <- mtcars %>% 
      nest(data = -cyl) %>% 
      mutate(
        model = map(data,  ~ lm(mpg ~ wt, data = .))
      )
     
    # now simply for each cyl, tidy the model output and unnest it
     
    mtcars_model %>% 
      mutate(
        tidy_summary = map(model, tidy)
      ) %>% 
      unnest(tidy_summary)
    
    #> # A tibble: 6 × 8
    #>     cyl data               model  term      estimate std.error statistic p.value
    #>   <dbl> <list>             <list> <chr>        <dbl>     <dbl>     <dbl>   <dbl>
    #> 1     6 <tibble [7 × 10]>  <lm>   (Interce…    28.4      4.18       6.79 1.05e-3
    #> 2     6 <tibble [7 × 10]>  <lm>   wt           -2.78     1.33      -2.08 9.18e-2
    #> 3     4 <tibble [11 × 10]> <lm>   (Interce…    39.6      4.35       9.10 7.77e-6
    #> 4     4 <tibble [11 × 10]> <lm>   wt           -5.65     1.85      -3.05 1.37e-2
    #> 5     8 <tibble [14 × 10]> <lm>   (Interce…    23.9      3.01       7.94 4.05e-6
    #> 6     8 <tibble [14 × 10]> <lm>   wt           -2.19     0.739     -2.97 1.18e-2
    

    Created on 2022-07-09 by the reprex package (v2.0.1)
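The same idea can also be written with nest_by(), as in the question. The following is a minimal sketch, again using mtcars as a stand-in for test_1 (an assumption, since the original data is not shown). The key point is that inside a rowwise nest_by() tibble, `data` refers to the rows of the current group, so the model must be fitted with `data = data` rather than with the full data frame.

```r
library(dplyr)
library(tidyr)
library(broom)

slopes <- mtcars %>%
  nest_by(cyl) %>%
  # `data` is the nested tibble of the current row, so each lm() sees one group only
  mutate(model = list(lm(mpg ~ wt, data = data))) %>%
  # tidy() turns each fitted model into a tibble of coefficients
  mutate(tidied = list(tidy(model))) %>%
  ungroup() %>%
  select(cyl, tidied) %>%
  unnest(tidied) %>%
  # keep only the slope rows, dropping the intercepts
  filter(term == "wt")

slopes
```

This leaves one row per group holding the slope estimate, its standard error, and the p-value, which should match the `wt` rows in the output above.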

    For additional information with examples, check here
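To compare the group slopes with the slope of a model fitted on ungrouped data (test_2 in the question), one possible approach is to put both into the same table. The sketch below is illustrative only: it assumes mtcars stands in for both data frames, and the column names `reference_slope`, `difference` and `reference_in_ci` are made up for this example. Checking whether the reference slope falls inside each group's confidence interval is an informal comparison, not a formal test.

```r
library(dplyr)
library(tidyr)
library(broom)

# Reference model on ungrouped data (hypothetical stand-in for the test_2 fit)
reference <- tidy(lm(mpg ~ wt, data = mtcars), conf.int = TRUE) %>%
  filter(term == "wt")

group_slopes <- mtcars %>%
  nest_by(cyl) %>%
  # fit one model per group and tidy it, including 95% confidence limits
  mutate(tidied = list(tidy(lm(mpg ~ wt, data = data), conf.int = TRUE))) %>%
  ungroup() %>%
  select(cyl, tidied) %>%
  unnest(tidied) %>%
  filter(term == "wt") %>%
  mutate(
    reference_slope = reference$estimate,
    difference      = estimate - reference_slope,
    # TRUE when the reference slope lies inside the group's confidence interval
    reference_in_ci = conf.low <= reference_slope & reference_slope <= conf.high
  )

group_slopes
```

With the real data, `mtcars`, `cyl`, `mpg` and `wt` would be replaced by test_1 (or test_2 for the reference model), Group, y and x.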