I have a potentially very stupid question, but I can't seem to find a solution easily. I'm also pretty new to R, so please forgive my ignorance.
I'm looking for a way to loop through all variables in my data frame. For instance, I'd like to make two-way tables of every variable against one specific variable (say, Sex or Educational level). I used to work with Stata, but since R is free, I am now supposed to work with R (I heard there is a plethora of other benefits to working with R as well, so I am very willing to learn :)).
Say, I have 20 variables, of which 15 are answers from a survey and 5 are demographic variables. I would like to see how different answers compare to differences in demographics.
Normally I would tackle the problem above in Stata with something as simple as:
forvalues i = 1/5 {
    forvalues j = 1/3 {
        tab Sex Var`i'_`j', chi2
    }
}
making 15 tables, for the variables Var1_1 to Var5_3 vs Sex, and giving a Pearson chi2 statistic.
So, I tried what I thought was the same for R:
for (i in 1:5) {
for (j in 1:3){
print(table(chisq.test(paste(df$Sex, "df$Var",i,"_",j,sep=""))))
}
}
but this doesn't work.
Can anyone please point me in the right direction as to how to solve this? Any help is highly appreciated!
Let's pretend that df is your data and that the first 15 columns are the answers.
In this case you can use this:
lapply(df[,1:15], function(x) {chisq.test(x, df$Sex)})
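The loop in the question fails because paste() only builds a character string such as "df$Var1_1"; R does not evaluate that string as a column reference, so chisq.test() receives text rather than data. If you'd rather keep the nested-loop shape from Stata, a sketch along these lines should work, assuming the columns really are named Var1_1 to Var5_3. The toy data frame below is invented purely so the example runs on its own; swap in your own df.

```r
# Toy data standing in for the survey; all values here are made up.
set.seed(1)
n <- 200
df <- data.frame(Sex = sample(c("F", "M"), n, replace = TRUE))
for (i in 1:5) {
  for (j in 1:3) {
    df[[paste0("Var", i, "_", j)]] <- sample(1:4, n, replace = TRUE)
  }
}

# Build each column name as a string, then look the column up with [[ ]].
results <- list()
for (i in 1:5) {
  for (j in 1:3) {
    name <- paste0("Var", i, "_", j)    # e.g. "Var1_1"
    tab <- table(df$Sex, df[[name]])    # two-way table, like Stata's tab
    print(tab)
    results[[name]] <- chisq.test(tab)  # Pearson chi-squared test on that table
  }
}

# Collect the p-values of all 15 tests in one named vector.
p_values <- sapply(results, function(r) r$p.value)
print(p_values)
```

The key difference from the original attempt is df[[name]], which accepts a column name stored in a variable, whereas df$... needs the name typed out literally. Keeping the test objects in a named list also makes it easy to pull out statistics afterwards.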