
How to do ANOVA and Tukey's HSD on histogram dataframe in R


I have several histograms that I expanded with tidyr::uncount() and combined into one data frame using the following code:

#read the CSV files into a list
#(list.files() takes a regular expression, not a glob, so anchor on the extension)
temp = list.files(pattern="\\.csv$")
myfiles = lapply(temp, read.csv)

#uncount data. (original c1 ->c3 = frequency, value, variable)
#              (now c1, c2 = value, variable)
new_files <- lapply(myfiles, function(x) {
  names(x) <- c("EVI", "Frequency", "Transition")
  tidyr::uncount(x, Frequency)
})

list(temp)


#take the list of histogram and concatenate them
data_c <- do.call("rbind", new_files)
head(data_c)

After uncounting, column 1 ("EVI") holds each value repeated once per count, and column 2 ("Transition") identifies which of the 9 histograms the row came from.
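
For illustration, here is a minimal sketch of what that structure looks like on a toy histogram. The data frame `toy` and its values are made up for this example and are not taken from the real CSV files; only the column names match the code above.

```r
library(tidyr)

# Hypothetical toy histogram: one row per (value, transition) bin
toy <- data.frame(
  EVI = c(0.2, 0.3, 0.2),
  Frequency = c(2, 1, 3),
  Transition = c("A", "A", "B")
)

# uncount() repeats each row Frequency times and drops the Frequency column
toy_long <- uncount(toy, Frequency)
toy_long
```

The result has one row per counted observation (6 rows here) and only the EVI and Transition columns, which is the shape that aov() expects.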


The tutorials I have found don't cover running an ANOVA and Tukey's HSD on histogram data, so I would appreciate your help!


Solution

  • Starting after the "uncount data" code block:

    #take the list of histogram and concatenate them
    data_c <- do.call("rbind", new_files)
    head(data_c)
    #dplyr::sample_n(data_c, 10)
    
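    #Conduct the ANOVA and Tukey's HSD
    data_c$EVI<-as.numeric(data_c$EVI)
    #TukeyHSD() needs a factor, so convert the grouping column explicitly
    data_c$Transition<-as.factor(data_c$Transition)
    aov_m1<-aov(EVI~Transition, data=data_c)
    summary(aov_m1)
    (THSD_m1<-TukeyHSD(aov_m1, "Transition"))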
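
To try the same steps without the original CSV files, the following sketch runs them on simulated data. The data frame `sim`, its three groups, and their means are hypothetical stand-ins for `data_c`; only aov() and TukeyHSD() from base R's stats package are used.

```r
set.seed(42)  # make the simulated data reproducible

# Hypothetical stand-in for data_c: three transitions, 50 EVI values each
sim <- data.frame(
  EVI = c(rnorm(50, mean = 0.30, sd = 0.05),
          rnorm(50, mean = 0.35, sd = 0.05),
          rnorm(50, mean = 0.50, sd = 0.05)),
  Transition = factor(rep(c("A", "B", "C"), each = 50))
)

# One-way ANOVA: does mean EVI differ between transitions?
fit <- aov(EVI ~ Transition, data = sim)
summary(fit)

# Tukey's HSD: which pairs of transitions differ?
tukey <- TukeyHSD(fit, "Transition")
tukey$Transition  # matrix with columns diff, lwr, upr, p adj
```

Each row of `tukey$Transition` is one pairwise comparison (for example "B-A"), with the estimated difference, its confidence bounds, and the adjusted p-value. Keep in mind that the ANOVA treats every uncounted row as an independent observation, so very large histogram counts can make even small differences come out as significant.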