Long story short, I need to perform several hundred million t-tests. I have two lists of samples, `ys` and `ns`, and I want to compare the samples pairwise: the first sample in `ys` is compared to the first sample in `ns`, and so on. The result will be a list of p-values, one per comparison. What is the fastest way to do this? Currently I am using `map`:

p_values = [result[1] for result in map(ttest_ind, ys, ns)]
but it is still slow. `numpy.vectorize` looks like it might be faster, but I can't figure out how to use it with a function that takes two lists as input. Would it be faster if I hard-coded the t-test math instead of using `scipy.stats.ttest_ind`?
The whole idea is to avoid running the loop in Python and push the work down into C/C++. For that you have two choices:
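Either rely on the fact that `scipy.stats.ttest_ind` is already vectorized (it accepts N-dimensional arrays and an `axis` argument, so one call can run every test in compiled code), or hard-code the equal-variance t-test math yourself with NumPy array operations. A sketch of both, assuming your samples are equal-length and can be stacked as rows of 2-D arrays (the shapes and data below are made up for illustration):

```python
import numpy as np
from scipy.stats import ttest_ind, t as t_dist

rng = np.random.default_rng(42)
# Hypothetical data: 1000 tests, 50 observations per sample.
ys = rng.normal(0.0, 1.0, size=(1000, 50))
ns = rng.normal(0.1, 1.0, size=(1000, 50))

# Choice 1: let scipy vectorize for you -- pass 2-D arrays and an axis,
# and the per-row loop runs in compiled code instead of Python.
p_scipy = ttest_ind(ys, ns, axis=1).pvalue

# Choice 2: hard-code the equal-variance two-sample t-test in NumPy.
n1, n2 = ys.shape[1], ns.shape[1]
v1 = ys.var(axis=1, ddof=1)          # per-row sample variances
v2 = ns.var(axis=1, ddof=1)
df = n1 + n2 - 2
pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
t_stat = (ys.mean(axis=1) - ns.mean(axis=1)) / np.sqrt(pooled * (1 / n1 + 1 / n2))
p_manual = 2 * t_dist.sf(np.abs(t_stat), df)  # two-sided p-value
```

Both produce one p-value per row, and the hand-rolled version matches the scipy result; for several hundred million tests you would process the rows in batches sized to fit memory.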