Tags: r, dataframe, function, linear-regression, confidence-interval

Function in R that computes heteroskedasticity-robust confidence intervals for a linear regression


Good afternoon, I have a question regarding the function below. The task is to write a function in R that computes heteroskedasticity-robust confidence intervals for the coefficients (betas) of a linear regression.

My attempt does not return any output: the console simply shows nothing when I call the function. I really wonder why, especially since computing the intervals manually with the last two lines of my code works fine. Even though you don't have the necessary data frames, perhaps you can take a look at my code and tell me what is wrong with it, or propose an alternative way to solve my problem :)

For clarity: the original numerical values (using all 200 data points each) of the coefficients are c(463.2121, 139.5762) and the robust standard errors (stderrorHC) are c(74.705054, 5.548689). The coefficients are given by the lm model, and for the HC-robust standard errors I use the package sandwich.

my_CI <- function (mod, level = 0.95)
{
  `%>%` <- magrittr::`%>%`
  standard_deviation <- stderrorHC
  Margin_Error <- abs(qnorm((1-0.95)/2))*standard_deviation 
  df_out <- data.frame(stderrorHC, mod,Margin_Error=Margin_Error,
                       'CI lower limit'=(mod - Margin_Error),
                       'CI Upper limit'=(mod + Margin_Error)) %>%
    return(df_out)
}

my_CI(mod, level = 0.95) #retrieving does not return any results for me

Definitions:
library(sandwich) # provides vcovHC()
women <- read.table("women.txt")
men <- read.table("men.txt")
converged <- merge(women, men, all = TRUE)
level <- c(0.95, 0.975)
modell <- lm(formula = loan ~ education, data = converged)
mod <- modell$coefficients
vcov <- vcovHC(modell, type = "HC1")
stderrorHC <- sqrt(diag(vcov))

mod - abs(qnorm((1-level[1])/2))*stderrorHC 
mod + abs(qnorm((1-level[1])/2))*stderrorHC

Addition: Here is some data from the original dataset. I included just ten data points each, so in this case the confidence interval would need to be based on the t-distribution.

dataMenEducation <- c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16)
dataMenLoan <- c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
                   2213.730, 4025.136, 2605.191, 2760.186)
dataWomenEducation <- c(12, 14, 16, 19 , 12, 19, 20, 17, 16, 10)
dataWomenLoan <- c(1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991, 
                   3336.683, 2923.040, 2628.351, 1918.218)

Solution

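  • I believe that the following provides you with the desired output. Compared with your version, the margin of error now uses the level argument instead of a hard-coded 0.95, and the pipe in front of return() is removed so that the data frame is assigned first and then returned.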

    # install.packages('sandwich')
    library(sandwich) # contains vcovHC()
    
    # data
    df <- data.frame(education = c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16,
                                  12, 14, 16, 19 , 12, 19, 20, 17, 16, 10),
                    loan = c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
                             2213.730, 4025.136, 2605.191, 2760.186,
                             1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991, 
                             3336.683, 2923.040, 2628.351, 1918.218))
    df$sex <- factor(gl(2, nrow(df)/2, labels = c('males', 'females')))
    
    # linear model
    fit <- lm(loan ~ education + sex, data = df)
    coefs <- fit$coefficients
    vcov <- vcovHC(fit, type = "HC1")
    stderrorHC <- sqrt(diag(vcov))
    
    # function to compute robust confidence intervals
    # (note: it reads stderrorHC from the global environment)
    my_CIs <- function (coefs, level = 0.95) {
      standard_deviation <- stderrorHC
      Margin_Error <- abs( qnorm( (1-level)/ 2) ) * standard_deviation 
      df_out <- data.frame(stderrorHC, coefs, Margin_Error = Margin_Error,
                           'CI lower limit' = (coefs - Margin_Error),
                           'CI Upper limit' = (coefs + Margin_Error))
      return(df_out)
    }
    

    Output

    > my_CIs(coefs = coefs)
                stderrorHC     coefs Margin_Error CI.lower.limit CI.Upper.limit
    (Intercept)  295.86900  160.3716    579.89259      -419.5210      740.26416
    education     23.64313  176.0111     46.33968       129.6714      222.35073
    sexfemales   132.07169 -313.2632    258.85576      -572.1189      -54.40743
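
The function above still depends on the global stderrorHC, and the question notes that with only a few data points the interval should be based on the t-distribution. A minimal sketch of a self-contained variant follows. The names robust_ci and hc1_vcov are hypothetical helpers introduced here for illustration, not part of any package. The HC1 matrix is computed in base R as the HC0 sandwich estimator scaled by n / (n - k), which is assumed to correspond to vcovHC(fit, type = "HC1"); passing the sandwich result through vcov_mat should work just as well.

```r
# Sketch only: robust_ci() and hc1_vcov() are hypothetical helper names.

# HC1 covariance in base R:
# (n / (n - k)) * (X'X)^-1 X' diag(e^2) X (X'X)^-1
hc1_vcov <- function(fit) {
  X <- model.matrix(fit)        # n x k design matrix
  e <- residuals(fit)           # OLS residuals
  n <- nrow(X)
  k <- ncol(X)
  bread <- solve(crossprod(X))  # (X'X)^-1
  meat <- crossprod(X * e)      # X' diag(e^2) X (row i of X scaled by e[i])
  (n / (n - k)) * bread %*% meat %*% bread
}

# Confidence intervals from a fitted lm and a covariance matrix.
# use_t = TRUE uses t quantiles with the residual degrees of freedom,
# use_t = FALSE uses normal quantiles as in the answer above.
robust_ci <- function(fit, vcov_mat = hc1_vcov(fit), level = 0.95,
                      use_t = TRUE) {
  est <- coef(fit)
  se <- sqrt(diag(vcov_mat))
  alpha <- 1 - level
  crit <- if (use_t) {
    qt(1 - alpha / 2, df = df.residual(fit))
  } else {
    qnorm(1 - alpha / 2)
  }
  data.frame(estimate = est,
             robust_se = se,
             margin_error = crit * se,
             ci_lower = est - crit * se,
             ci_upper = est + crit * se,
             row.names = names(est))
}

# Example with the twenty observations from the question
dat <- data.frame(
  education = c(12, 17, 16, 11, 20, 20, 11, 19, 15, 16,
                12, 14, 16, 19, 12, 19, 20, 17, 16, 10),
  loan = c(2404.72, 3075.313, 2769.543, 2009.295, 3105.121, 4269.216,
           2213.730, 4025.136, 2605.191, 2760.186,
           1920.667, 2278.255, 2296.804, 2977.048, 1915.740, 3557.991,
           3336.683, 2923.040, 2628.351, 1918.218))
dat$sex <- gl(2, nrow(dat) / 2, labels = c("males", "females"))
fit <- lm(loan ~ education + sex, data = dat)

print(robust_ci(fit, use_t = FALSE))  # normal quantiles
print(robust_ci(fit))                 # t quantiles, slightly wider intervals
```

Passing the model itself rather than precomputed vectors keeps the coefficients, standard errors and degrees of freedom in sync. If the lmtest package is available, lmtest::coefci(fit, vcov. = vcovHC(fit, type = "HC1")) is probably a ready-made alternative worth checking against.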