r data-analysis statistical-test

Count number of rows with a p-value equal to, or lower than 0.05


I am new to R programming and am having trouble solving this:

Dataset:

set.seed(897)
ME <- matrix(rnorm(24000),nrow=1000)
colnames(ME) <- c(paste("A",1:12,sep=""),paste("B",1:12,sep=""))

Use apply() to calculate a statistical test for every row in ME. You want to ask whether the groups A and B are from the same population or from populations with different means. You can assume data to be normally distributed. Count the number of rows with a p-value equal to, or lower than 0.05.

I tried the following, but the result is incorrect:

P <- apply(ME, 1, function(ME){ t.test(ME[1:1000])$p.value })
length(which(P <= 0.05))


Solution

  • If the column names are not in a particular order, we can use grep to find the indices of the column names that start with A and of those that start with B.

     ind1 <- grep('^A', colnames(ME))
     ind2 <- grep('^B', colnames(ME))
    

    Then we run t.test on each row using apply with MARGIN=1, comparing the A columns against the B columns

     pval <- apply(ME, 1, FUN=function(x) t.test(x[ind1], x[ind2])$p.value)
     head(pval)
     #[1] 0.4987050 0.0303736 0.7143174 0.2955703 0.5082427 0.2109010
    

    We get a logical index by comparing with 0.05

     v1 <- pval <= 0.05
    

    Take the sum of the TRUE values to count the rows that have a p-value less than or equal to 0.05

    sum(v1)
    #[1] 55
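
For reference, here is a minimal end-to-end sketch that puts the steps above together, using the dataset from the question. It also shows a likely reason the original attempt went wrong: inside apply each row is a vector of only 24 values, so indexing 1:1000 pads it with NA, and as far as I can tell t.test with a single argument then runs a one-sample test against a mean of 0 instead of comparing A with B. The names row1 and n_sig are only for illustration. Note that t.test defaults to Welch's test (var.equal = FALSE); whether equal variances should be assumed is not stated in the question.

```r
# Recreate the data from the question.
set.seed(897)
ME <- matrix(rnorm(24000), nrow = 1000)
colnames(ME) <- c(paste("A", 1:12, sep = ""), paste("B", 1:12, sep = ""))

# Inside apply(ME, 1, ...), each row arrives as a vector of length 24,
# so indexing 1:1000 pads it with NA rather than selecting a group.
row1 <- ME[1, ]               # illustrative name, not from the original answer
length(row1)                  # 24
sum(is.na(row1[1:1000]))      # 976

# Select the two groups by column name, so column order does not matter.
ind1 <- grep("^A", colnames(ME))
ind2 <- grep("^B", colnames(ME))

# Two-sample t-test per row (Welch's test by default).
pval <- apply(ME, 1, function(x) t.test(x[ind1], x[ind2])$p.value)

# Number of rows with a p-value less than or equal to 0.05.
n_sig <- sum(pval <= 0.05)    # illustrative name
n_sig
```

If equal variances are a reasonable assumption for the exercise, passing var.equal = TRUE to t.test gives the classic Student's t-test instead; the count may differ slightly.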