How to run multiple lm models in R and generate a new df?


I have the following df, and for each player I need to fit this regression model:

$$\ln(\text{score}_t) = \beta_1 + \beta_2 \, \text{time\_playing}_t$$

My code and an example of the df look like this:

```
library(tidyverse)
library(broom)

# The file is semicolon-separated, so sep = ";" is required with read.csv
df_players <- read.csv("https://github.com/rhozon/datasets/raw/master/data_test_players.csv", header = TRUE, sep = ";") %>% 
  glimpse()

#> Rows: 105
#> Columns: 3
#> $ player       <chr> "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a", "a"…
#> $ time_playing <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 1,…
#> $ score        <int> 7, 5, 2, 3, 10, 8, 7, 10, 10, 3, 8, 5, 2, 5, 6, 9, 9, 8, 9, 4, 6, 4, 9, 8, 8, 5, 2, 10, 9, 5, 7, 4, 5, 8, 10, 2, 3, 8, 8, 5, 7, 6, 10…
```

The desired dataframe is something like:

```
df
  player       beta_2
1      a  0.005958000
2      b -0.004110000
3      c  0.000390777
```

How can I use the lm function to estimate the beta_2 coefficient for each player and collect the results in a dataframe like the one shown above?


Solution

  • Assuming the input shown in the Note at the end, use lmList to run the regressions by player and then extract the coefficients. Omit the last line if it is OK to have player as the row names instead of a column.

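    library(nlme)      # lmList
    library(tibble)    # rownames_to_column
    library(magrittr)  # %>% (already attached if tidyverse is loaded)
    
    # One regression of log(score) on time_playing per level of player
    fo <- log(score) ~ time_playing | player
    df_players %>%
      lmList(fo, .) %>% 
      coef %>%
      rownames_to_column(var = "player")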
    

    giving:

      player (Intercept)  time_playing
    1      a    1.678156  0.0059581851
    2      b    1.732095 -0.0041131361
    3      c    1.642926  0.0003907772
    

    The same formula can be used to plot the data and the fitted regression line for each of the three players.

    library(lattice)
    xyplot(fo, df_players, type = c("p", "r"), as.table = TRUE)
    

    Note

    u <- "https://github.com/rhozon/datasets/raw/master/data_test_players.csv"
    df_players <- read.csv2(u)
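
Since the question asks about calling lm directly, here is a minimal base R sketch of that route. It is an alternative to the lmList approach above, not part of it. It assumes a data frame with the same three columns as df_players; the `toy` data below is made up purely so the snippet runs without downloading anything, and `slope_for` is a hypothetical helper name.

```r
# Hypothetical toy data with the same columns as df_players (not the real data)
set.seed(1)
toy <- data.frame(
  player       = rep(c("a", "b", "c"), each = 35),
  time_playing = rep(1:35, times = 3),
  score        = sample(2:10, 105, replace = TRUE)
)

# Fit log(score) ~ time_playing on one player's rows and return the slope
slope_for <- function(d) {
  coef(lm(log(score) ~ time_playing, data = d))[["time_playing"]]
}

# split() gives one data frame per player; vapply() fits one lm for each
slopes <- vapply(split(toy, toy$player), slope_for, numeric(1))

# Collect the slopes in the layout requested in the question
df <- data.frame(player = names(slopes), beta_2 = unname(slopes))
df
```

With the real data, replacing `toy` by `df_players` should give the time_playing column of the lmList output, since both fit the same per-player regressions.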