r for-loop histogram p-value

Finding p value of histogram data in R


Based on the proportion of slopes from the randomisation that are greater or less than the slope from the observed data, I would like to calculate the expected probability of getting the observed slope. The observed slope is -0.2717.

Any help would be greatly appreciated, I am a newbie.

histdata <- numeric(10000)
for (i in 1:10000) {
  # refit with tcons shuffled; [[4]][[2]] picks the slope estimate from the coefficient table
  histdata[i] <- summary.lm(lm(sample(tcons) ~ tleave))[[4]][[2]]
}
hist(histdata)
abline(v=-0.2717, lwd=3, lty=2)
box()

data3 <- -0.2717 > histdata

This ^^ gives me 9954 that are not greater than the original and 46 that are greater.


Solution

  • If you have the results of a randomization procedure in rand_vals and an observed value in obs_val, then the one-tailed p-value (quantifying support for the null hypothesis vs. the alternative hypothesis that the observed value is greater than the null value) is

    mean(rand_vals>=obs_val)
    
    • Note that this is NOT ☢☣ (can't find a skull & crossbones emoji) the "probability of getting the observed slope". It is the probability of observing a value greater than or equal to the observed slope, if the null hypothesis is true.
    • In some cases it may be appropriate to include the observed value in the "randomization" set as well, i.e. mean(c(rand_vals,obs_val)>=obs_val); this won't make much difference if your randomization set is large.
    • A two-tailed p-value would be something like mean(abs(rand_vals)>=abs(obs_val)).
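
Putting the pieces above together, here is a minimal sketch of the whole procedure. The data are simulated stand-ins (the real tcons and tleave are not shown in the question), and the names n_perm, p_upper, p_lower, p_two and p_lower_incl are made up for illustration. Since the observed slope in the question is negative, the sketch also computes the lower-tail version, which just reverses the inequality.

```r
# Simulated stand-ins for the asker's data (assumption: the real vectors are not shown)
set.seed(1)
tleave <- rnorm(50)
tcons  <- -0.3 * tleave + rnorm(50)

# Observed slope from the unshuffled data
obs_val <- coef(lm(tcons ~ tleave))[["tleave"]]

# Randomization distribution: shuffle the response, refit, keep the slope
n_perm <- 2000
rand_vals <- replicate(n_perm, coef(lm(sample(tcons) ~ tleave))[["tleave"]])

# One-tailed p-values: proportion of permuted slopes at least as extreme
p_upper <- mean(rand_vals >= obs_val)   # alternative: slope greater than under the null
p_lower <- mean(rand_vals <= obs_val)   # alternative: slope smaller than under the null

# Two-tailed p-value, comparing absolute values
p_two <- mean(abs(rand_vals) >= abs(obs_val))

# Variant that counts the observed value as one member of the randomization set
p_lower_incl <- mean(c(rand_vals, obs_val) <= obs_val)

print(c(upper = p_upper, lower = p_lower, two = p_two, lower_incl = p_lower_incl))
```

In the question's code, data3 is TRUE wherever a permuted slope is below -0.2717, so mean(data3) would give the lower-tail proportion directly. If that is the 46 out of 10000 mentioned there, it comes to 46/10000 = 0.0046.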