I was watching the StatQuest video about independent filtering and I wanted to recreate his example.
https://www.youtube.com/watch?v=Gi0JdrxRq5s&t=208s
So basically he ran two tests:
In one, he performed 1,000 t-tests comparing triplicates drawn from 2 different normal distributions. Most of the p-values were < 0.05 but, due to the 5% error, he showed that around 50 of them were > 0.05, which are "false negatives" (I sketch this first test below).
In the other one, he performed 1,000 t-tests on triplicates drawn from the same normal distribution. Obviously there were not many p-values < 0.05, since he obtained a roughly uniform distribution of p-values ranging from 0 to 1.
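Something like this should reproduce that first test (I'm only guessing the means, e.g. N(0, 1) vs N(3, 1), so the exact number of misses won't match the video):

# 1,000 t-tests on triplicates drawn from two clearly different normal distributions
# (mean 0 vs. mean 3, sd 1 -- assumed values, not necessarily the ones from the video)
pvals_diff <- replicate(1000, t.test(rnorm(3, mean = 0), rnorm(3, mean = 3))$p.value)
# Tests that miss the real difference (p > 0.05) are the "false negatives"
sum(pvals_diff > 0.05)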
So I used the following code to create something similar for the second "test" and obtained this plot (density of the p-values).
Here's the (not perfect, it's a quick and dirty check) code:
# 10,000 two-sample t-tests on triplicates drawn from the same N(0, 1) distribution
pvals <- replicate(10000, t.test(rnorm(3), rnorm(3), alternative = "two.sided", conf.level = 0.95)$p.value)
# Histogram of the p-values, colouring the bars below 0.05 in green and the rest in red
h <- hist(pvals, breaks = 25, plot = FALSE)
cuts <- cut(h$breaks, c(-Inf, 0.0499999999, Inf))
plot(h, col = c("green", "red")[cuts])
# Count the p-values below 0.05
length(which(pvals < 0.05))
Clearly, the number of p-values < 0.05 is lower than in the other test. Statistically, I would expect around 500 p-values < 0.05 (5% of 10,000 tests), but that's not what I get. I think my code is wrong but I don't see where...
Thanks in advance for your help!
The issue comes from the small sample size, not from your code: with only 3 values per group, the Welch (unequal-variance) t-test that t.test runs by default is conservative, so under the null fewer than 5% of the p-values fall below 0.05. It is easy to show by simulation:
# n = 3 per group: the rejection rate under the null is clearly below 5%
replicate(10, sum(replicate(10000, t.test(rnorm(3), rnorm(3), alternative = "two.sided", conf.level = 0.95)$p.value) < .05))
# 332 370 397 368 398 339 362 334 325 350
# n = 30 per group: the rejection rate is close to the expected 5%
replicate(10, sum(replicate(10000, t.test(rnorm(30), rnorm(30), alternative = "two.sided", conf.level = 0.95)$p.value) < .05))
# 504 534 511 475 502 490 497 537 506 527
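To see the trend directly, you can run the same check over a few sample sizes (the grid below is arbitrary):

# Empirical rejection rate under the null for several sample sizes per group
sapply(c(3, 5, 10, 30), function(n)
  mean(replicate(10000, t.test(rnorm(n), rnorm(n))$p.value) < 0.05))
# The rate climbs towards the nominal 0.05 as n grows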