
Calculating difference of two means and its confidence interval


I have data on a protein "F" in patients who did or did not use a drug "X". I have three confounders (age, BMI, sex) that I need to adjust for.

I successfully calculated the mean protein amount in the drug group and the no-drug group with the code below:

> F <- data$f
> by(F, drugnodrug, summary)
> summ(F, by=drugnodrug)

Now I want to calculate the absolute difference between the two means, together with its 95% confidence interval.

How can I do this in R (dplyr package)? Can I use multiple linear regression for this calculation, and if so, how?


Solution

  • This can be done in base R, because t.test also reports a confidence interval for the (unadjusted) difference of the two means. The example uses the built-in iris data:

    > res <- t.test(iris$Sepal.Length[iris$Species=="setosa"],
    +               iris$Sepal.Length[iris$Species=="virginica"])
    > res$conf.int
    [1] -1.78676 -1.37724
    attr(,"conf.level")
    [1] 0.95
    

    The two group means are stored in the element estimate, so you can compute the difference of the means with

    > res$estimate[1] - res$estimate[2]
    mean of x 
       -1.582
    

    or equivalently

    > sum(res$estimate * c(1,-1))
    [1] -1.582
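
The t.test above gives the unadjusted difference. For the adjustment for age, BMI and sex that the question asks about, one possible approach is multiple linear regression: with the group as a two-level factor, its coefficient is the adjusted mean difference and confint gives the interval. The sketch below is only an illustration on simulated data; the data frame dat, the column names f, drug, age, bmi, sex, and the effect sizes are all assumptions, not the asker's real data.

```r
# Simulated stand-in for the asker's data; every column name here is hypothetical.
set.seed(1)
n <- 200
dat <- data.frame(
  drug = factor(rep(c("no", "yes"), each = n / 2)),
  age  = rnorm(n, mean = 55, sd = 10),
  bmi  = rnorm(n, mean = 27, sd = 4),
  sex  = factor(sample(c("female", "male"), n, replace = TRUE))
)
# Protein level with a made-up drug effect of 2 units, for illustration only
dat$f <- 10 + 2 * (dat$drug == "yes") + 0.05 * dat$age + 0.1 * dat$bmi + rnorm(n)

# Multiple linear regression: the coefficient "drugyes" is the mean difference
# (yes minus no) adjusted for age, BMI and sex.
fit <- lm(f ~ drug + age + bmi + sex, data = dat)
adj_diff <- coef(fit)["drugyes"]
adj_ci   <- confint(fit, "drugyes", level = 0.95)

adj_diff   # adjusted mean difference
adj_ci     # its 95% confidence interval
```

Note the sign convention: t.test(x, y) reports mean(x) minus mean(y), while the lm coefficient is the non-reference level minus the reference level (here "yes" minus "no"). Without the covariates, lm(f ~ drug) gives the same difference and interval as t.test with var.equal = TRUE.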