Fama-MacBeth regression in R with pmg


For the past few days I have been trying to work out how to run Fama-MacBeth regressions in R. The usual advice is to use pmg from the plm package, but every attempt fails with an error saying I have an insufficient number of time periods.

My dataset consists of 2828419 observations and 13 columns of variables, on which I want to run several cross-sectional regressions. Firms are identified by seriesid, there is a date variable, and I want to run the following Fama-MacBeth regressions:

totret ~ size
totret ~ momentum
totret ~ reversal
totret ~ volatility
totret ~ value + size
totret ~ value + size + momentum
totret ~ value + size + momentum + reversal + volatility

I have been using this command: fpmg <- pmg(totret ~ momentum, Data, index = c("date", "seriesid"))

Which returns: Error in pmg(totret ~ mom, Dataset, index = c("seriesid", "datem")) : Insufficient number of time periods

I tried it with my dataset as a data.table, a data.frame and a pdata.frame. Switching the order of the index does not work either.

My data contains NAs as well.

Can anyone fix this, or suggest a different way to run Fama-MacBeth regressions?


Solution

  • I eventually got the Fama-MacBeth regression working in R. Starting from a data table that holds all of the characteristics for each observation, the following runs one cross-sectional regression per date. The regression can be weighted or equally weighted: remove ", weights = marketcap" for equal weights. totret is a total return variable, logmarket is the logarithm of market capitalization.

    library(dplyr)  # provides %>%, group_by() and summarise()
    logmarket <- df %>%
      group_by(date) %>%  # one weighted cross-sectional regression per date
      summarise(constant = summary(lm(totret ~ logmarket, weights = marketcap))$coefficients[1],
                rsquared = summary(lm(totret ~ logmarket, weights = marketcap))$r.squared,
                beta     = summary(lm(totret ~ logmarket, weights = marketcap))$coefficients[2])
    

    The result is a data frame with one row per date, holding the monthly alphas (constant), betas (beta) and R-squared values (rsquared).

    To retrieve coefficients with t-statistics in a dataframe:

    library(lmtest)  # provides coeftest()
    Summarystatistics <- as.data.frame(matrix(data = NA, nrow = 6, ncol = 1))
    names(Summarystatistics) <- "logmarket"
    row.names(Summarystatistics) <- c("constant", "t-stat constant", "beta", "t-stat beta", "R^2", "observations")
    # Time-series means of the per-date estimates; each t-stat comes from an intercept-only regression
    Summarystatistics[1,1] <- mean(logmarket$constant)
    Summarystatistics[2,1] <- coeftest(lm(logmarket$constant~1))[1,3]
    Summarystatistics[3,1] <- mean(logmarket$beta)
    Summarystatistics[4,1] <- coeftest(lm(logmarket$beta~1))[1,3]
    Summarystatistics[5,1] <- mean(logmarket$rsquared)
    # Number of rows in df with a non-missing logmarket value
    Summarystatistics[6,1] <- nrow(subset(df, !is.na(logmarket)))
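
The same two-step procedure extends to several regressors. Below is a minimal sketch in base R only, run on simulated data: the data frame `panel`, the helper `fit_one_date` and the simulated coefficients are hypothetical and are not taken from the dataset in the question. It fits each date's regression once rather than three times, then computes the t-statistic directly as the mean divided by its standard error, which gives the same value as the intercept-only regression used above.

```r
# Simulated panel: 24 dates x 50 firms (hypothetical data for illustration).
set.seed(1)
n_dates <- 24
n_firms <- 50
panel <- data.frame(
  date     = rep(seq_len(n_dates), each = n_firms),
  seriesid = rep(seq_len(n_firms), times = n_dates),
  size     = rnorm(n_dates * n_firms),
  momentum = rnorm(n_dates * n_firms)
)
panel$totret <- 0.5 + 0.2 * panel$size - 0.1 * panel$momentum +
  rnorm(nrow(panel))

# Step 1: one cross-sectional regression per date, fitted once per date.
# lm() drops rows with NAs by default, so missing values are skipped.
fit_one_date <- function(d) {
  fit <- lm(totret ~ size + momentum, data = d)
  c(coef(fit), rsquared = summary(fit)$r.squared)
}
by_date <- do.call(rbind, lapply(split(panel, panel$date), fit_one_date))

# Step 2: time-series mean of each coefficient and its t-statistic,
# t = mean / (sd / sqrt(number of dates)).
coef_cols <- c("(Intercept)", "size", "momentum")
fm_mean  <- colMeans(by_date[, coef_cols])
fm_se    <- apply(by_date[, coef_cols], 2, sd) / sqrt(nrow(by_date))
fm_tstat <- fm_mean / fm_se

fm_table <- rbind(estimate = fm_mean, tstat = fm_tstat)
print(round(fm_table, 3))
```

To weight the cross-sectional regressions, pass a `weights` argument to `lm()` inside `fit_one_date`, as in the dplyr version above.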