I am using the R programming language. I created the following data set for this example:
var_1 <- rnorm(1000,10,10)
var_2 <- rnorm(1000, 5, 5)
var_3 <- rnorm(1000, 6,18)
favorite_food <- c("pizza","ice cream", "sushi", "carrots", "onions", "broccoli", "spinach", "artichoke", "lima beans", "asparagus", "eggplant", "lettuce", "cucumbers")
favorite_food <- sample(favorite_food, 1000, replace=TRUE, prob=c(0.5, 0.45, 0.04, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001))
response <- c("a","b")
response <- sample(response, 1000, replace=TRUE, prob=c(0.3, 0.7))
data = data.frame( var_1, var_2, var_3, favorite_food, response)
data$favorite_food = as.factor(data$favorite_food)
data$response = as.factor(data$response)
From here, I want to make histograms for the two categorical variables in this data set and put them on the same page:
#make histograms and put them on the same page (note: I don't know why the "par(mfrow = c(1,2))" statement is not working)
library(lattice) #assuming histogram() is the one from the lattice package
par(mfrow = c(1,2))
histogram(data$response, main = "response")
histogram(data$favorite_food, main = "favorite food")
My question: Is it possible to automatically produce histograms for all categorical variables in a given data set (without manually writing the "histogram()" statement for each variable) and print them on the same page? Is it better to use the "ggplot2" library for this problem instead?
I can manually write the "histogram()" statement for each individual categorical variable in the data set, but I was looking for a quicker way to do this. Is it possible to do this with a "for loop"?
Thanks
Here's a base R alternative using barplot in a for loop:
cols <- names(data)[sapply(data, is.factor)]
#This would need some manual adjustment if the number of columns increases
par(mfrow = c(1,length(cols)))
for(i in cols) {
  barplot(table(data[[i]]), main = i)
}
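If the number of factor columns varies, the layout can be computed instead of hard-coded. The sketch below is one possible way to do that, not the only one: it wraps the loop above in a helper and lets n2mfrow() from grDevices pick the grid. The name plot_factor_counts and the small stand-in data frame are made up for illustration; the data frame from the question can be passed in the same way. As a side note, lattice's histogram() draws with grid graphics, which is why par(mfrow = ...) has no effect on it, while base barplot() respects it.

```r
# Small stand-in data frame (hypothetical values) so the sketch runs on its own
set.seed(1)
data <- data.frame(
  var_1 = rnorm(100),
  favorite_food = factor(sample(c("pizza", "ice cream", "sushi"), 100, replace = TRUE)),
  response = factor(sample(c("a", "b"), 100, replace = TRUE))
)

# Hypothetical helper: one bar plot per factor column, all on one page
plot_factor_counts <- function(df) {
  cols <- names(df)[sapply(df, is.factor)]
  if (length(cols) == 0) return(invisible(cols))
  # n2mfrow() suggests a rows x columns grid for the given number of plots
  old_par <- par(mfrow = n2mfrow(length(cols)))
  on.exit(par(old_par))  # restore the previous layout when done
  for (col in cols) {
    # las = 2 rotates the axis labels so long level names stay readable
    barplot(table(df[[col]]), main = col, las = 2)
  }
  invisible(cols)  # names of the columns that were plotted
}

plotted <- plot_factor_counts(data)
```

The helper returns the names of the plotted columns invisibly, which makes it easy to check which variables were treated as categorical.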