I am new to R programming. I am having trouble solving this exercise:
Dataset:
set.seed(897)
ME <- matrix(rnorm(24000),nrow=1000)
colnames(ME) <- c(paste("A",1:12,sep=""),paste("B",1:12,sep=""))
Use apply() to calculate a statistical test for every row in ME. You want to ask whether the groups A and B are from the same population or from populations with different means. You can assume data to be normally distributed. Count the number of rows with a p-value equal to, or lower than 0.05.
I tried
>P<- apply(ME , 1 , function(ME){ t.test(ME[1:1000])$p.value })
> length(which(P <= 0.05))
but this is incorrect.
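A minimal sketch of why the attempt above does not compare the two groups, assuming the same `ME` as in the question. Inside `apply(ME, 1, ...)` the function receives one row at a time, i.e. a vector of 24 values, so indexing `1:1000` mostly produces `NA`, and calling `t.test()` with a single vector runs a one-sample test against a mean of 0 rather than an A-versus-B comparison. The names `x`, `p_attempt` and `p_onesample` are only illustrative.

```r
set.seed(897)
ME <- matrix(rnorm(24000), nrow = 1000)
colnames(ME) <- c(paste("A", 1:12, sep = ""), paste("B", 1:12, sep = ""))

x <- ME[1, ]               # what apply() passes to the function for row 1
length(x)                  # 24 values: A1..A12 followed by B1..B12
sum(is.na(x[1:1000]))      # 976: positions 25..1000 do not exist, so they are NA

# With a single vector, t.test() performs a one-sample test of mean == 0
# and drops the NA values, so both calls test the same 24 numbers.
p_attempt   <- t.test(x[1:1000])$p.value
p_onesample <- t.test(x)$p.value
all.equal(p_attempt, p_onesample)
```

So the attempt tests whether each whole row has mean 0; it never separates the A columns from the B columns.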
If the column names are not in a particular order, we can use grep to find the indices of the column names that start with A and of those that start with B.
ind1 <- grep('^A', colnames(ME))
ind2 <- grep('^B', colnames(ME))
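As a small illustration of what the two grep calls return, here is a sketch on a made-up, shuffled set of column names (the vector `cn` is hypothetical and not part of the question's data). The pattern `^A` matches names that begin with A, and grep returns their positions.

```r
cn <- c("B2", "A1", "B1", "A2")   # hypothetical shuffled column names

a_idx <- grep("^A", cn)   # positions of names starting with "A": 2 4
b_idx <- grep("^B", cn)   # positions of names starting with "B": 1 3

cn[a_idx]                 # "A1" "A2"
cn[b_idx]                 # "B2" "B1"
```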
Then we run t.test on each row using apply with MARGIN=1, passing the A values and the B values as the two samples.
pval <- apply(ME, 1, FUN=function(x) t.test(x[ind1], x[ind2])$p.value)
head(pval)
#[1] 0.4987050 0.0303736 0.7143174 0.2955703 0.5082427 0.2109010
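One thing worth knowing: by default `t.test()` uses the Welch variant, which does not assume equal variances (`var.equal = FALSE`). If the exercise intends the classical Student's t-test with pooled variance, that is an assumption you would have to state, and it can be requested with `var.equal = TRUE`. A sketch comparing the two, where `pval_welch` and `pval_pooled` are illustrative names:

```r
set.seed(897)
ME <- matrix(rnorm(24000), nrow = 1000)
colnames(ME) <- c(paste("A", 1:12, sep = ""), paste("B", 1:12, sep = ""))
ind1 <- grep("^A", colnames(ME))
ind2 <- grep("^B", colnames(ME))

# Welch two-sample test (the t.test() default, var.equal = FALSE)
pval_welch <- apply(ME, 1, function(x) t.test(x[ind1], x[ind2])$p.value)

# Student's two-sample test with pooled variance
pval_pooled <- apply(ME, 1, function(x) {
  t.test(x[ind1], x[ind2], var.equal = TRUE)$p.value
})

# Number of rows each variant flags at the 0.05 level
c(welch = sum(pval_welch <= 0.05), pooled = sum(pval_pooled <= 0.05))
```

With equal group sizes, as here, the two variants give the same t statistic and differ only in the degrees of freedom, so the counts should be close.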
We get a logical index by comparing with 0.05
v1 <- pval <= 0.05
Get the sum of the TRUE values to find the number of rows that have a p-value less than or equal to 0.05
sum(v1)
#[1] 55
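As a sanity check on that count: every value in ME comes from the same rnorm call, so A and B really do share the same mean and each flagged row is a false positive. At a 0.05 threshold we would expect roughly 5% of the 1000 rows, i.e. about 50, to be flagged by chance, which is in line with the 55 reported above. A sketch of the check, where `prop` and `expected` are illustrative names:

```r
set.seed(897)
ME <- matrix(rnorm(24000), nrow = 1000)
colnames(ME) <- c(paste("A", 1:12, sep = ""), paste("B", 1:12, sep = ""))
ind1 <- grep("^A", colnames(ME))
ind2 <- grep("^B", colnames(ME))
pval <- apply(ME, 1, function(x) t.test(x[ind1], x[ind2])$p.value)

prop     <- mean(pval <= 0.05)   # observed proportion of rows flagged
expected <- 0.05 * nrow(ME)      # 50 rows expected by chance under the null

prop
expected
```

`mean()` on a logical vector gives the proportion of TRUE values, so `prop * nrow(ME)` is the same count as `sum(pval <= 0.05)`.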