Tags: r, dataframe, random, sampling

Randomly sample a vector multiple times to make groups and conduct ANOVA


I have 50 randomly generated numbers drawn with parameters I set. I want to randomly split these 50 numbers into 10 groups of 5 (sampling without replacement).

I want to store the 10 groups as a matrix or data frame and run an ANOVA on the groups, then repeat the whole process 1000 times, storing the F, critical F, and P values from each iteration.

I have the following:

samp <- rnorm(50,3.47,0.0189) # 50 samples, mean of 3.47 and SD of 0.0189

for (i in 1:10){
  x <- sample(samp, 5, replace = F)
}

x <- #all my random samples

This is the ANOVA code I usually use when the data is in long format, with a second column identifying the group:

Samp_lm <- lm(Samp_lm ~ factor(group), data = x) 
AnovaResults <- anova(Samp_lm)

criticalValues <- cbind(AnovaResults, 'F Critical Value' = qf(1 - 0.05, test.Aov[1, 1], test.Aov[2, 1]))
AnovaStats <- cbind(criticalValues[1,4],criticalValues[1,5],criticalValues[1,6])

Not sure where to go from here.


Solution

  • Since you are repeating the random sampling, you should start by making a function that does what you want:

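    SimAnova <- function() {
        # Fixed labels: 10 groups (A-J) with 5 observations each
        Groups <- rep(LETTERS[1:10], each = 5)
        Values <- rnorm(50, 3.47, 0.0189)
        AnovaResults <- anova(lm(Values ~ Groups))
        Fval <- AnovaResults[1, 4]   # F statistic (named Fval because F aliases FALSE)
        df <- AnovaResults[, 1]      # degrees of freedom: groups, residuals
        Crit <- qf(1 - 0.05, df[1], df[2])
        P <- AnovaResults[1, 5]
        c("F-Value" = Fval, "Critical F-Value" = Crit, "P-Value" = P)
    }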
    SimAnova()
    #          F-Value Critical F-Value          P-Value 
    #        1.7350592        2.1240293        0.1126789 
    SimAnova()
    #          F-Value Critical F-Value          P-Value 
    #       2.04024282       2.12402926       0.05965209 
    SimAnova()
    #          F-Value Critical F-Value          P-Value 
    #        1.635386         2.124029         0.138158 
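
    If you would rather regroup one existing vector than draw fresh numbers on every call, a minimal sketch along the same lines could look like the following. The name `SimAnovaFromVector` and its arguments `n_groups` and `alpha` are made up for illustration. It relies on the fact that shuffling the values against fixed group labels is the same as sampling them into groups without replacement.

    ```r
    # Hypothetical variant: regroups an existing vector instead of drawing new values
    SimAnovaFromVector <- function(samp, n_groups = 10, alpha = 0.05) {
        # The vector must split evenly into the requested number of groups
        stopifnot(length(samp) %% n_groups == 0)
        group_size <- length(samp) / n_groups
        Groups <- factor(rep(seq_len(n_groups), each = group_size))
        # sample() with no size argument returns a random permutation,
        # i.e. sampling without replacement
        Values <- sample(samp)
        AnovaResults <- anova(lm(Values ~ Groups))
        df <- AnovaResults[, 1]
        c("F-Value" = AnovaResults[1, 4],
          "Critical F-Value" = qf(1 - alpha, df[1], df[2]),
          "P-Value" = AnovaResults[1, 5])
    }

    set.seed(1)  # arbitrary seed, only so the run is repeatable
    samp <- rnorm(50, 3.47, 0.0189)
    SimAnovaFromVector(samp)
    ```

    Each call reshuffles the same 50 numbers, so the results vary only because of the grouping, not because of new random draws.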
    

    Now just call SimAnova() 1000 times with replicate():

    result <- t(replicate(1000, SimAnova()))
    head(result)
    #        F-Value Critical F-Value   P-Value
    # [1,] 0.5659946         2.124029 0.8164247
    # [2,] 0.7717596         2.124029 0.6427732
    # [3,] 0.8377358         2.124029 0.5862101
    # [4,] 1.6284143         2.124029 0.1401280
    # [5,] 0.2191311         2.124029 0.9899751
    # [6,] 0.2744286         2.124029 0.9780476
    

    Notice that you don't really need to save the Critical F-Value: it depends only on the degrees of freedom (9 and 40 here), so it is the same for every sample.
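
    As a rough sketch of what you might do with the stored results, the code below computes the critical value once and counts how often the test rejects at the 5% level. The helper `SimAnovaFP` is a hypothetical trimmed-down version of the function above that returns only F and P, and the seed is arbitrary. Because all ten groups come from the same distribution, the rejection rate should land near 0.05.

    ```r
    # Hypothetical compact version that skips the constant critical value
    SimAnovaFP <- function() {
        Groups <- rep(LETTERS[1:10], each = 5)
        Values <- rnorm(50, 3.47, 0.0189)
        AnovaResults <- anova(lm(Values ~ Groups))
        c("F-Value" = AnovaResults[1, 4], "P-Value" = AnovaResults[1, 5])
    }

    set.seed(42)  # arbitrary seed, only so the run is repeatable
    result <- t(replicate(1000, SimAnovaFP()))

    # Degrees of freedom: 10 - 1 = 9 between groups, 50 - 10 = 40 residual
    Crit <- qf(1 - 0.05, 9, 40)

    # Rejecting when F exceeds the critical value is the same decision
    # as rejecting when P falls below 0.05
    reject_by_F <- result[, "F-Value"] > Crit
    reject_by_P <- result[, "P-Value"] < 0.05

    mean(reject_by_F)            # share of the 1000 runs that reject
    all(reject_by_F == reject_by_P)
    ```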