I am trying to create a barplot that shows the average hourly wage of union and nonunion workers, grouped by marital status (single or married) and by education (college graduate or not). While I've managed to construct a passable barplot with two grouping factors, I cannot figure out how to do so with three. The three-factor examples I have seen only plot frequency counts, so I'm not sure how to plot the mean of another variable across all the factor combinations.

What I am looking to create is something that looks like this (created in Stata):

[Figure: Average Hourly Wage by Union Status, Marital Status, and College Graduation]

My code looks like this:
levelbar = tapply(wage, list(as.factor(union), as.factor(married),
as.factor(collgrad)), mean)
par(mfrow = c(1, 2))
barplot(levelbar, beside = TRUE)
barplot(t(levelbar), beside = TRUE)
When I run this, however, I receive the error:
Error in barplot.default(levelbar, beside = TRUE) :
'height' must be a vector or a matrix
Any help on this would be appreciated. I'm sure ggplot might be useful here, but I do not have a great deal of experience using that package.
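On the error itself: with three grouping factors, tapply() returns a three-dimensional array rather than a matrix, and barplot() only accepts a vector or a matrix as 'height'. One base R workaround is to draw one panel per level of the third factor, since each slice of the array is an ordinary matrix. The sketch below assumes simulated stand-in data; the data frame `dat` and its values are made up, and only the column names follow the question.

```r
# Simulated stand-in data (hypothetical values; 25 observations per cell).
set.seed(1)
cells <- expand.grid(
  union    = c("nonunion", "union"),
  married  = c("single", "married"),
  collgrad = c("not college grad", "college grad"),
  stringsAsFactors = FALSE
)
dat <- cells[rep(seq_len(nrow(cells)), each = 25), ]
dat$wage <- runif(nrow(dat), min = 5, max = 30)

# Three grouping factors give a 2 x 2 x 2 array, which barplot() rejects.
levelbar <- tapply(dat$wage, list(dat$union, dat$married, dat$collgrad), mean)
dim(levelbar)

# One panel per college-graduation level: each slice levelbar[, , k] is a matrix.
par(mfrow = c(1, dim(levelbar)[3]))
for (k in dimnames(levelbar)[[3]]) {
  barplot(levelbar[, , k], beside = TRUE, main = k,
          legend.text = TRUE, ylab = "Mean hourly wage")
}
```

Within each panel the bar clusters are marital status and the bars inside a cluster are union status, so the three factors map to panel, cluster, and bar.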
Here's a reproducible example using ggplot2 (with dplyr for the summary step) and the built-in dataset Titanic. Note that we calculate the means first and then use stat = "identity" so that the bar heights are the precomputed means rather than row counts.
library(dplyr)    # %>%, as_tibble(), group_by(), summarise()
library(ggplot2)

# Convert the built-in Titanic table to a tibble (one row per cell, counts in column n)
Titanic_df <- Titanic %>% as_tibble()

# Make Class, Sex, Age, and Survived factors
for (col in c("Class", "Sex", "Age", "Survived")) {
  Titanic_df[[col]] <- factor(Titanic_df[[col]])
}

# Get by-group means (the counts are averaged over Age)
means <- Titanic_df %>%
  group_by(Class, Sex, Survived) %>%
  summarise(
    mean_n = mean(n)
  )

# Plot: facets are the Classes, bar colors are the two Sexes, and the groupings in each facet are Survived vs. Not Survived
ggplot(data = means) +
  geom_bar(aes(x = Survived, y = mean_n, fill = Sex), stat = "identity", position = "dodge") +
  facet_wrap(~ Class)
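The same pattern should carry over to the wage data from the question. Below is a rough sketch, assuming a data frame with columns wage, union, married, and collgrad; the data frame `dat` here is simulated, so the values are made up. It swaps in base R's aggregate() for the group means and geom_col(), which is ggplot2's shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# Simulated stand-in data (hypothetical values; 25 observations per cell).
set.seed(1)
cells <- expand.grid(
  union    = c("nonunion", "union"),
  married  = c("single", "married"),
  collgrad = c("not college grad", "college grad")
)
dat <- cells[rep(seq_len(nrow(cells)), each = 25), ]
dat$wage <- runif(nrow(dat), min = 5, max = 30)

# Mean wage for every union x married x collgrad combination (8 rows).
means <- aggregate(wage ~ union + married + collgrad, data = dat, FUN = mean)

# Facets: college graduation; x axis: marital status; bar color: union status.
p <- ggplot(means, aes(x = married, y = wage, fill = union)) +
  geom_col(position = "dodge") +
  facet_wrap(~ collgrad) +
  labs(x = NULL, y = "Mean hourly wage", fill = NULL)
print(p)
```

Which factor goes on the x axis, the fill, or the facets is a presentation choice; swapping them in aes() and facet_wrap() rearranges the plot without changing the means.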