r, statistics, t-test

R Function to get Confidence Interval of Difference Between Means


I am trying to find a function that allows me to easily get the confidence interval of the difference between two means.

I am pretty sure t.test has this functionality, but I haven't been able to make it work. Below is a screenshot of what I have tried so far:

[Screenshot not available]

This is the dataset I am using:

   Indoor Outdoor
1    0.07    0.29
2    0.08    0.68
3    0.09    0.47
4    0.12    0.54
5    0.12    0.97
6    0.12    0.35
7    0.13    0.49
8    0.14    0.84
9    0.15    0.86
10   0.15    0.28
11   0.17    0.32
12   0.17    0.32
13   0.18    1.55
14   0.18    0.66
15   0.18    0.29
16   0.18    0.21
17   0.19    1.02
18   0.20    1.59
19   0.22    0.90
20   0.22    0.52
21   0.23    0.12
22   0.23    0.54
23   0.25    0.88
24   0.26    0.49
25   0.28    1.24
26   0.28    0.48
27   0.29    0.27
28   0.34    0.37
29   0.39    1.26
30   0.40    0.70
31   0.45    0.76
32   0.54    0.99
33   0.62    0.36

and I have been trying to use the t.test function, which I installed with

install.packages("ggpubr")

I am pretty new to R, so sorry if there is a simple answer to this question. I have searched around quite a bit and haven't been able to find anything that I am looking for.

Note: The output I am looking for is between -1.224 and 0.376

Edit:

The CI of the difference between means I am looking for is the one you would get if a random 34th data point were added to the table by picking a random value from the Indoor column and a random value from the Outdoor column and duplicating them. Running t.test outputs the correct CI for the difference of means for the given sample size of 33.

How can I go about doing this pretending the sample size is 34?


Solution

  • There's probably something more convenient in the standard library, but it's pretty easy to calculate. Given your df variable, we can just do:

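    # difference of the two sample means
    d_mu <- mean(df$Indoor) - mean(df$Outdoor)
    # SD of the difference between one Indoor and one Outdoor value
    # (the variances add, assuming the two columns are independent)
    d_sd <- sqrt(var(df$Indoor) + var(df$Outdoor))
    
    # 95% interval for that difference, using t quantiles with 2 * n degrees of freedom
    d_mu + d_sd * qt(c(0.025, 0.975), nrow(df)*2)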
    

    giving me: -1.2246 0.3767

    Mostly for @AkselA: I often find it helpful to check my work by sampling from simpler distributions. In this case I'd do something like:

    a <- mean(df$Indoor) + sd(df$Indoor) * rt(1000000, nrow(df)-1)
    b <- mean(df$Outdoor) + sd(df$Outdoor) * rt(1000000, nrow(df)-1)
    quantile(a - b, c(0.025, 0.975))
    

    which gives me answers much closer to the CI I gave in the comment
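
For comparison, here is a minimal sketch of the more conventional route the question mentions. t.test lives in the stats package that ships with base R, so no extra package such as ggpubr should be needed for it. The sketch assumes the data frame is called df, as in the answer above, and treats the two columns as independent samples (Welch's test, which is the t.test default). Note that this is the interval for the difference of the two means at n = 33, so it is expected to be narrower than the hand-calculated interval above, which describes the difference between a single Indoor and a single Outdoor value.

```r
# Rebuild the data from the question (the name df is an assumption, as in the answer)
df <- data.frame(
  Indoor = c(0.07, 0.08, 0.09, 0.12, 0.12, 0.12, 0.13, 0.14, 0.15, 0.15, 0.17,
             0.17, 0.18, 0.18, 0.18, 0.18, 0.19, 0.20, 0.22, 0.22, 0.23, 0.23,
             0.25, 0.26, 0.28, 0.28, 0.29, 0.34, 0.39, 0.40, 0.45, 0.54, 0.62),
  Outdoor = c(0.29, 0.68, 0.47, 0.54, 0.97, 0.35, 0.49, 0.84, 0.86, 0.28, 0.32,
              0.32, 1.55, 0.66, 0.29, 0.21, 1.02, 1.59, 0.90, 0.52, 0.12, 0.54,
              0.88, 0.49, 1.24, 0.48, 0.27, 0.37, 1.26, 0.70, 0.76, 0.99, 0.36)
)

# Welch two-sample t-test; conf.int is the CI for mean(Indoor) - mean(Outdoor)
res <- t.test(df$Indoor, df$Outdoor, conf.level = 0.95)
ci <- res$conf.int
print(ci)
```

If the Indoor and Outdoor values on each row were measured as matched pairs, passing paired = TRUE to t.test would be the more appropriate variant; the question does not say, so that is left as an assumption to check.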