Consider this data frame dat1
:
dat1 <- data.frame(Region = rep(c("r1","r2"), each = 100),
State = rep(c("NY","MA","FL","GA"), each = 10),
Loc = rep(c("a","b","c","d","e","f","g","h"),each = 5),
ID = rep(c(1:10), each=2),
var1 = rnorm(200),
var2 = rnorm(200),
var3 = rnorm(200),
var4 = rnorm(200),
var5 = rnorm(200))
I have data frames that are similar to dat1
created above. Region
,State
,and Loc
are grouping variables for each observation ID
, and 5 measurements are taken upon each observation var1:var5
. For each grouping variable, I am conducting univariate anovas on each var
. When significant differences are found, I am using the TukeyHSD()
function and the multcompLetters()
function from the multcompView
package to generate CLD's on the groups. Since I want to do this for each grouping variable I am trying to write a function to keep myself from being repetitive and making typos. Below shows where I am with this:
library(tidyverse)
library(multcomp)
library(multcompView)
Tuk <- function(dat,groupvar,var){
TUK <- TukeyHSD(aov(lm(get(var) ~ get(groupvar), data=dat)))
names(TUK)[[1]] <- paste0(groupvar)
lets<-multcompLetters(extract_p(TUK$groupvar))
lets
}
#assuming all 5 vars were significant in the anovas, I would then run this for each grouping variable as follows:
vars <- paste0(names(dat1[,5:9]))
#by Region
lapply(vars, FUN=Tuk, dat=dat1, groupvar="Region")
#by State
lapply(vars, FUN=Tuk, dat=dat1, groupvar="State")
#by Loc
lapply(vars, FUN=Tuk, dat=dat1, groupvar="Loc")
The code works outside of the function. The function will create the model, but I cant figure out how to format it so that it recognizes what groupvar
is for the multcompLetters(extract_p())
part? How can I fix this, and how can I get it the function to output a tidy table that shows each group and the letters for each variable I give it at once. For example it would look something like this for State
using all 5 variables
NY MA FL GA
var1 a ab c a
var2 a ab b c
var3 a c ab bc
var4 ab c ab ab
var5 a b c b
Also, is there a reasonable way to make this function produce boxplots (for each variable) of the groups that show the CLD letters?
Assuming the plot is truly what you're looking for does this get you pretty close for a sing plot of var1 ~ State, Indrajeet has done a great job of building this package and I hate reinventing wheels.
dat1 <- data.frame(Region = rep(c("r1","r2"), each = 100),
State = rep(c("NY","MA","FL","GA"), each = 10),
Loc = rep(c("a","b","c","d","e","f","g","h"),each = 5),
ID = rep(c(1:10), each=2),
var1 = rnorm(200),
var2 = rnorm(200),
var3 = rnorm(200),
var4 = rnorm(200),
var5 = rnorm(200))
library(ggstatsplot)
ggbetweenstats(dat1, State, var1,
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "everything")
#> Note: Shapiro-Wilk Normality Test for var1: p-value = 0.183
#>
#> Note: Bartlett's test for homogeneity of variances for factor State: p-value = 0.373
#>