r subset t-test

How to run a t-test on a subset of rows in R


Below is a part of my data (pairht_protein)


I am trying to run a t-test on all the variables (columns) between two groups, which are:

Resistant_group <- c(PAIR-01, PAIR-12, PAIR-09)
Sensitive_group <- c(PAIR-07, PAIR-02, PAIR-05)

Before writing a function, I picked one of the variables and tried:

t.test(m_pHSL660 ~ Subject, data = subset(pairht_protein, Subject %in% c("Resistant_group", "Sensitive_group")))

But it gave me the error: 'grouping factor must have exactly 2 levels'

Is there a way to run a t-test between these groups, and possibly wrap it in a function?


Solution

  • First, correct how you define the groups. The IDs must be quoted strings; unquoted, PAIR-01 is parsed as the subtraction PAIR - 01:

    Resistant_group <- c('PAIR-01', 'PAIR-12', 'PAIR-09')
    Sensitive_group <- c('PAIR-07','PAIR-02','PAIR-05')
    

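    Your original call failed because Subject %in% c("Resistant_group", "Sensitive_group") compares against those two literal strings rather than the IDs stored in the vectors, so no rows match. Instead, use the dplyr package to create a new factor variable with only two levels: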

    library(dplyr)
    
    # assuming pairht_protein is your dataset name
    
    # R is case-sensitive: the column is Subject, as in the question's t.test call
    pairht_protein <- pairht_protein %>%
      mutate(sub = case_when(Subject %in% Resistant_group ~ 1,
                             Subject %in% Sensitive_group ~ 2),
             sub = as.factor(sub))
    
    

    Because the new variable is NA for rows outside your two groups, and the formula interface of t.test drops rows with NA by default, you don't need to subset the data:

    t.test(m_pHSL660 ~ sub, data = pairht_protein)
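
The question also asks about wrapping this in a function. Here is a minimal sketch of one way to do that in base R, assuming the ID column is called Subject and every other numeric column should be tested. The function name compare_groups, its id_col argument, the toy data frame, and the other_protein column are all made up for illustration; only m_pHSL660 and the group vectors come from the question.

```r
# Hypothetical helper: Welch t-test of every numeric column between two sets of IDs.
compare_groups <- function(data, group1, group2, id_col = "Subject") {
  # Logical masks marking the rows that belong to each group
  in_g1 <- data[[id_col]] %in% group1
  in_g2 <- data[[id_col]] %in% group2

  # Names of the numeric columns to test
  num_cols <- names(data)[vapply(data, is.numeric, logical(1))]

  # One t-test per numeric column; keep only the p-value
  p <- vapply(num_cols, function(v) {
    t.test(data[[v]][in_g1], data[[v]][in_g2])$p.value
  }, numeric(1))

  data.frame(variable = num_cols, p_value = unname(p))
}

# Toy data standing in for pairht_protein (values are random, not real)
set.seed(1)
toy <- data.frame(
  Subject = sprintf("PAIR-%02d", 1:12),
  m_pHSL660 = rnorm(12),
  other_protein = rnorm(12)  # hypothetical second variable
)

Resistant_group <- c("PAIR-01", "PAIR-12", "PAIR-09")
Sensitive_group <- c("PAIR-07", "PAIR-02", "PAIR-05")

res <- compare_groups(toy, Resistant_group, Sensitive_group)
print(res)
```

This returns one row per numeric column. When many columns are tested at once, it may be worth adjusting the p-values for multiple comparisons, for example with p.adjust(res$p_value, method = "BH").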