I have three Samples (replicates) per Group. I want to use a t-test to compare values (MappedReadsCPM) between groups. However, I have 4000 peaks (identified by PeakNumber), and each one needs its own test. The following line is close, but it pools all peaks into a single test instead of comparing only peak_1, then only peak_2, and so on.
t.test(MappedReadsCPM~Group, data=subset(data2, Group %in% c("1", "2")))$p.value
I don’t want to print the 4000 p-values - ideally I can add them to a dataframe.
pvalues <- t.test(MappedReadsCPM~Group, data=subset(data2, Group %in% c("1", "2")))$p.value
data2
PeakNumber Sample Group MappedReadsCPM
peak_1 A 1 43.53819
peak_2 A 1 49.20722
peak_3 A 1 38.54943
peak_4 A 1 99.09472
peak_1 B 2 105.21728
peak_2 B 2 42.63114
peak_3 B 2 78.00591
peak_4 B 2 74.37773
peak_1 C 2 509.30606
peak_2 C 2 101.36234
peak_3 C 2 25.17051
peak_4 C 2 32.8804
peak_1 D 1 35.37478
peak_2 D 1 89.11722
peak_3 D 1 112.24688
peak_4 D 1 386.40139
peak_1 E 3 631.07692
peak_2 E 3 162.58791
peak_3 E 3 46.93961
peak_4 E 3 56.69035
peak_1 F 2 38.7762
peak_2 F 2 261.45587
peak_3 F 2 43.99171
peak_4 F 2 72.11012
peak_1 G 1 118.5962
peak_2 G 1 250.1178
peak_3 G 1 84.35
peak_4 G 1 386.40139
You can use sapply to loop over all the unique peaks in your data, subsetting the data to one specific peak at a time:
pvalues <- sapply(unique(data2$PeakNumber), function(peak) {
  # Run the t-test on groups 1 and 2 for this peak only
  t.test(MappedReadsCPM ~ Group,
         data = subset(data2, Group %in% c("1", "2") & PeakNumber == peak))$p.value
})
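Since you asked for the p-values in a data frame rather than printed, here is a minimal sketch of one way to do that. It rebuilds the four example peaks from your question so it runs on its own. The names sub, peaks and results are just illustrative, and it assumes Group is stored as character (or factor) values "1", "2", "3" as in your subset call.

```r
# Rebuild the example data from the question (4 peaks, 7 samples)
data2 <- data.frame(
  PeakNumber = rep(paste0("peak_", 1:4), times = 7),
  Sample     = rep(c("A", "B", "C", "D", "E", "F", "G"), each = 4),
  Group      = rep(c("1", "2", "2", "1", "3", "2", "1"), each = 4),
  MappedReadsCPM = c(
     43.53819,  49.20722,  38.54943,  99.09472,  # A
    105.21728,  42.63114,  78.00591,  74.37773,  # B
    509.30606, 101.36234,  25.17051,  32.8804,   # C
     35.37478,  89.11722, 112.24688, 386.40139,  # D
    631.07692, 162.58791,  46.93961,  56.69035,  # E
     38.7762,  261.45587,  43.99171,  72.11012,  # F
    118.5962,  250.1178,   84.35,    386.40139   # G
  )
)

# Keep only the two groups being compared (done once, outside the loop)
sub   <- subset(data2, Group %in% c("1", "2"))
peaks <- unique(as.character(sub$PeakNumber))

# One t-test per peak; vapply checks that each result is a single number
pvalues <- vapply(peaks, function(peak) {
  t.test(MappedReadsCPM ~ Group,
         data = sub[sub$PeakNumber == peak, ])$p.value
}, numeric(1))

# Collect the results in a data frame, one row per peak
results <- data.frame(PeakNumber = peaks, p.value = unname(pvalues))
results
```

If you want the p-values next to the original rows, merge(data2, results, by = "PeakNumber") should attach them, repeating each peak's p-value across its samples.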