We are given a matrix with 2 columns (samples, experiment conditions) and n rows (genes for example), and we aim to identify the genes that have significantly changed (at a specific FDR) between the two samples.
How to perform this using R?
Below is an example from the fdrtool package manual that shows how to compute FDR from a vector of p-values:
library("fdrtool")
data(pvalues)
fdr = fdrtool(pvalues, statistic="pvalue")
fdr$qval # estimated Fdr values
fdr$lfdr # estimated local fdr
But the problem is that we have just two vectors of observations here, not the p-values. Any ideas?
Here is some sample data that can be used: foo <- matrix(runif(1000), ncol=2)
I assume we have no replicate information, p-values, etc. But surely the genes whose values differ most between the two samples carry stronger evidence of change. Is there any way to assign an FDR under these conditions?
If you have only one sample for each condition, there is no way to get a p-value, because a p-value is the probability of observing a difference at least as large as yours when both samples are drawn from the same population. With no replicates, and therefore no mean or variance for each gene, we can't estimate the sampling error, as I understand it, so there is no way to distinguish the value you see from random variation using a conventional small-sample test such as the t-test. These links may help:
http://en.wikipedia.org/wiki/P-value
http://www-stat.stanford.edu/~tibs/SAM/
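For contrast, here is a minimal sketch of what becomes possible once replicates exist. Everything in it is an assumption for illustration: the data are simulated, three replicates per condition is an arbitrary choice, and the names cond1, cond2, pvals and qvals are made up. It runs a t-test per gene and then adjusts the p-values with the Benjamini-Hochberg method from base R.

```r
set.seed(42)
n_genes <- 200

# Hypothetical data: 3 replicates per condition (NOT available in the question)
cond1 <- matrix(rnorm(n_genes * 3), ncol = 3)
cond2 <- matrix(rnorm(n_genes * 3), ncol = 3)
cond2[1:20, ] <- cond2[1:20, ] + 3  # simulate 20 genes with a real shift

# One Welch t-test per gene (row); replicates make the variance estimable
pvals <- sapply(seq_len(n_genes), function(i) {
  t.test(cond1[i, ], cond2[i, ])$p.value
})

# Benjamini-Hochberg adjusted p-values, then select at FDR 0.05
qvals <- p.adjust(pvals, method = "BH")
significant <- which(qvals < 0.05)
```

The resulting pvals vector could also be passed to fdrtool(pvals, statistic="pvalue") as in the question. None of this works with a single column per condition, because t.test needs at least two observations per group.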
What you can do is make an MA plot:
http://en.wikipedia.org/wiki/MA_plot
and look at the distribution of your data to see which differences are large, then select those. But this is not within the statistical framework of a false discovery rate analysis: it may help as an exploratory analysis, but there is no real statistic in it. In the microarray literature you will probably find alternatives that make a set of assumptions in order to get a hypothesis test, but I don't know of one to recommend; maybe the affy package has one...
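A rough sketch of the MA plot idea in base R, using the sample matrix from the question. The 1% cutoff and the variable names (M, A, cutoff, candidates) are arbitrary choices of mine, and the selected genes are only exploratory candidates with no FDR attached.

```r
set.seed(1)
foo <- matrix(runif(1000), ncol = 2)  # sample data from the question

# M = log ratio between the two samples, A = average log intensity
M <- log2(foo[, 1]) - log2(foo[, 2])
A <- (log2(foo[, 1]) + log2(foo[, 2])) / 2

plot(A, M, pch = 20, main = "MA plot")
abline(h = 0, lty = 2)

# Arbitrary exploratory rule: flag the 1% of genes with the largest |M|
cutoff <- quantile(abs(M), 0.99)
candidates <- which(abs(M) > cutoff)
points(A[candidates], M[candidates], col = "red", pch = 20)
```

With real expression data you would usually look at whether the spread of M changes with A before picking a cutoff, since low-intensity genes tend to show large log ratios by chance.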