Need help to perform an automated paired t-test over different columns from a CSV file in R


I would like to perform an automated paired t-test between columns 2 and 3, 4 and 5, 6 and 7, and so on. With the code below I can run a t-test, but only an unpaired (Welch) one, not a paired one.

data:

 patient weight_1 weight_2    BMI_1    BMI_2 chol_1 chol_2    gly_1    gly_2
1       A     86.0     97.0 34.44961 30.61482   86.0   97.0 34.44961 30.61482
2       B    111.0     55.5 33.51045 22.80572  111.0   55.5 33.51045 22.80572
3       C     92.4     70.0 28.51852 25.71166   92.4   70.0 28.51852 25.71166

code:

names <- colnames(dataframe)
 for(i in seq(from = 2, to = 8, by = 2)){
 print(names[i])
 print(names[i+1]) 
 print(t.test(dataframe[i], dataframe[i+1]))
 }

output:

[1] "weight_1"
[1] "weight_2"

        Welch Two Sample t-test

data:  dataframe[i] and dataframe[i + 1]
t = 1.3183, df = 75.892, p-value = 0.1914
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.459965 12.090735
sample estimates:
mean of x mean of y 
 91.50256  86.68718 
[1] "BMI_1"
[1] "BMI_2"

        Welch Two Sample t-test

data:  dataframe[i] and dataframe[i + 1]
t = 1.5851, df = 75.866, p-value = 0.1171
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.3817027  3.3571650
sample estimates:
mean of x mean of y 
 30.45167  28.96394 

And so on. When I add paired=TRUE to the code:

 names <- colnames(dataframe)
 for(i in seq(from = 2, to = 8, by = 2)){
 print(names[i])
 print(names[i+1]) 
 print(t.test(dataframe[i], dataframe[i+1]), paired=TRUE)
 }

The results are exactly the same, as if the paired argument were ignored. Could someone help me with this? Many thanks in advance.


Solution

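  • Two things need to change. First, paired=TRUE has to be an argument of t.test() itself; in your loop it sits after the closing parenthesis of t.test(), so it is passed to print(), which silently ignores it. Second, you have to change the indexing in the t.test call so that you pass the columns as vectors: dataframe[i] returns a one-column data frame, whereas dataframe[,i] returns the column itself.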

    e.g.:

    df <- data.frame(a = runif(10), b = runif(10), c = runif(10))

    t1 <- t.test(df[1], df[2])
    t1$p.value

    t2 <- t.test(df[1], df[2], paired = TRUE)
    t2$p.value
    Error in `[.data.frame`(y, yok) : undefined columns selected
    

    but

    t2 <- t.test(df[,1], df[,2], paired = TRUE)
    t2$p.value
    

    works. So in your code it should be

    print(t.test(dataframe[,i], dataframe[,i+1], paired=TRUE)) 
    

    for the paired t.test.

    I would suggest using this form of indexing for the unpaired t-test as well, although there it does not throw any error.
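
Putting the pieces together, here is a minimal sketch of the whole loop. It assumes the layout from the question (an ID column followed by before/after pairs) and rebuilds only the three rows shown there, so the numbers will not match the output quoted above. The names first_cols, results and p_values are illustrative, and [[i]] is used as an equivalent way of extracting a column as a vector.

```r
# Hypothetical reconstruction of the three rows shown in the question.
dataframe <- data.frame(
  patient  = c("A", "B", "C"),
  weight_1 = c(86.0, 111.0, 92.4),
  weight_2 = c(97.0, 55.5, 70.0),
  BMI_1    = c(34.44961, 33.51045, 28.51852),
  BMI_2    = c(30.61482, 22.80572, 25.71166),
  chol_1   = c(86.0, 111.0, 92.4),
  chol_2   = c(97.0, 55.5, 70.0),
  gly_1    = c(34.44961, 33.51045, 28.51852),
  gly_2    = c(30.61482, 22.80572, 25.71166)
)

# Index of the first column of each pair: 2, 4, 6, 8.
first_cols <- seq(from = 2, to = ncol(dataframe) - 1, by = 2)

# paired = TRUE is an argument of t.test(), not of print().
# [[i]] extracts the column as a numeric vector.
results <- lapply(first_cols, function(i) {
  t.test(dataframe[[i]], dataframe[[i + 1]], paired = TRUE)
})
names(results) <- paste(names(dataframe)[first_cols],
                        names(dataframe)[first_cols + 1],
                        sep = " vs ")

# Collect the p-values in one named vector.
p_values <- sapply(results, function(r) r$p.value)
print(p_values)
```

Storing the test objects in a list instead of printing them inside the loop keeps every statistic (p-value, confidence interval, estimate) available afterwards; print(results[[1]]) shows the full report for the first pair, and its header should read "Paired t-test" rather than "Welch Two Sample t-test".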