I want to write a loop that stores the output of t-tests for several variables in a data frame. When I store the variable names in a character vector, the loop passes each name to t.test as a string rather than as the variable itself. For example, in the first iteration R sees "variable_1", which produces an error because the t-test needs the unquoted variable, e.g. t.test(variable_1 ~ Gender). Does anyone know how to turn the quoted names in the vector into the variables they refer to?
variable <- c("variable_1", "variable_2", "variable_3")
df <- data.frame(t_value=as.numeric(),
df=as.numeric(),
p_value= as.numeric(),
mean_f= as.numeric(),
mean_m= as.numeric())
attach(data)
for(v in variable){
output <- t.test(v ~ Gender)
values <- output[c(1,2,3,5)]
row <- round(unlist(values, use.names = FALSE),3)
df <- rbind(df, row)
}
Here are some changes that will make it work via get. As others have pointed out, attach is a terrible idea in this context, so I've used mtcars as an example and left attach out.
I've also made several other changes to tidy things up. That said, you'd be much better served by searching Stack Overflow for the many existing answers on running a t-test on multiple variables, or by just using @starja's or @r2evans's answer.
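variable <- c("mpg", "hp")
# Empty data frame with one typed column per value we want to keep
df <- data.frame(t_value = numeric(),
                 df      = numeric(),
                 p_value = numeric(),
                 mean_f  = numeric(),
                 mean_m  = numeric())
for (v in variable) {
  # get(v) looks up the column named by the string v inside mtcars
  output <- t.test(get(v) ~ am, data = mtcars)
  # Elements 1, 2, 3, 5: t statistic, degrees of freedom, p-value, group means
  values <- output[c(1, 2, 3, 5)]
  row <- round(unlist(values, use.names = FALSE), 3)
  # Column names are kept from the question; here the two groups are am == 0 and am == 1
  df_row <- data.frame(t_value = row[[1]],
                       df      = row[[2]],
                       p_value = row[[3]],
                       mean_f  = row[[4]],
                       mean_m  = row[[5]])
  df <- rbind(df, df_row)
}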
df
#>   t_value     df p_value  mean_f  mean_m
#> 1  -3.767 18.332   0.001  17.147  24.392
#> 2   1.266 18.715   0.221 160.263 126.846
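If you'd rather avoid get altogether, one common alternative is to build the formula from the variable name with reformulate and collect the rows with lapply instead of growing a data frame inside the loop. This is only a minimal sketch, and not necessarily what the other answers do. The helper name run_test, its arguments, and the column names mean_0 and mean_1 are my own choices for illustration, and mtcars with am as the grouping variable again stands in for your data.

```r
# Sketch: build each formula from the variable name instead of using get().
variables <- c("mpg", "hp")

# Hypothetical helper: run one t-test and return its key numbers as a one-row data frame.
run_test <- function(v, data, group) {
  # reformulate("am", response = "mpg") builds the formula mpg ~ am
  f <- reformulate(group, response = v)
  out <- t.test(f, data = data)
  data.frame(
    variable = v,
    t_value  = unname(out$statistic),
    df       = unname(out$parameter),
    p_value  = out$p.value,
    mean_0   = unname(out$estimate[1]),  # mean of v in the first group (am == 0)
    mean_1   = unname(out$estimate[2])   # mean of v in the second group (am == 1)
  )
}

# One row per variable, bound together once at the end
results <- do.call(rbind, lapply(variables, run_test, data = mtcars, group = "am"))
results
```

Keeping the variable name as its own column makes the result easier to read than row numbers alone, and rounding can be left until you print or report the table.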