Run a large amount of regressions and storing coefficients

I am trying to run a large amount of regressions in my panel data. In this simple example its an univariate regression, but in reality i'm trying to run a multivariate regression (mentioning this just case).

My data looks like this:

DATE	year	month	ASSET_CODE	RET	MKTRET
2000-01-01	2000	1	1	-0.5	1
2000-01-01	2000	1	2	1	1
2000-01-02	2000	1	1	1	2
2000-01-02	2000	1	2	2	2
2000-01-03	2000	1	1	0.3	3
2000-01-03	2000	1	2	-0.2	3
2000-02-01	2000	2	1	-3	4
2000-02-01	2000	2	2	-1	4
2000-02-02	2000	2	1	2	5
2000-02-02	2000	2	2	1	5
2000-02-03	2000	2	1	2	6
2000-02-03	2000	2	2	2	6

I want to group by month and asset code and run a regression RET ~ MKTRET (in each month for each asset code) and store the coefficients for the slope and intercept, and nothing else. I want the end result to be formatted like this:

year	month	ASSET_CODE
2000	1	1
2000	1	2
2000	2	1
2000	2	2

I tried the following approach, which should have given me a dataframe with all the necessary information (and a lot more, including other statistical info) which I could have used to reshape to get the end result:

regressions <- df %>%
  group_by(month, year, ASSET_CODE) %>%
  do(regressions_fit = lm(RET ~ MKTRET, data = ., na.rm = TRUE))
regressions_coef <- tidy(regressions, regressions_fit)

However, this approach did not work for me because I ran out of memory on my computer. I suspect it has to do with the fact that it stored ALL the information related to the regressions and not just the SLOPE and INTERCEPT which i am interested in. Is there a less resource-intensive way to do this?

Solution

flm in the collapse package is like lm but returns the coefficients only and is faster. An alternative would be to use coef(lm(...)) in place of flm.

library(collapse)
library(dplyr)

df %>%
  group_by(year, month, ASSET_CODE) %>%
  summarize(as.data.frame(t(flm(RET ~ MKTRET, cur_group()))), .groups = "drop")
## # A tibble: 4 × 5
##    year month ASSET_CODE `(Intercept)` MKTRET
##   <int> <int>      <int>         <dbl>  <dbl>
## 1  2000     1          1        -0.533  0.4  
## 2  2000     1          2         2.13  -0.600
## 3  2000     2          1       -12.2    2.5  
## 4  2000     2          2        -6.83   1.50

Note

The data in reproducible form

df <- structure(list(DATE = c("2000-01-01", "2000-01-01", "2000-01-02", 
"2000-01-02", "2000-01-03", "2000-01-03", "2000-02-01", "2000-02-01", 
"2000-02-02", "2000-02-02", "2000-02-03", "2000-02-03"), year = c(2000L, 
2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 2000L, 
2000L, 2000L), month = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L), ASSET_CODE = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 
2L, 1L, 2L), RET = c(-0.5, 1, 1, 2, 0.3, -0.2, -3, -1, 2, 1, 
2, 2), MKTRET = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 
6L)), class = "data.frame", row.names = c(NA, -12L))