Plot the means of multiple columns

I want to draw a separate bar plot for each combination of year and gender, showing the mean values of the variables Q1 to Q5, so that each plot resembles a density.

I have data that looks like this:

data <- data.frame(userid = c(1,1,1,2,2,2,3,3,3),
                  year = c(2013,2014,2015,2013,2014,2015,2013,2014,2015),
                  gender = c(1,1,1,0,0,0,0,0,0),
                  Q1 = c(3,2,3,1,0,1,2,1,0),
                  Q2 = c(4,3,4,2,0,2,1,4,3),
                  Q3 = c(1,2,1,3,5,4,5,4,5),
                  Q4 = c(1,2,1,2,4,3,2,2,1),
                  Q5 = c(1,1,1,2,1,0,0,0,1))

My solution was to filter() for year and gender first, then use summarise_all() to get a vector of the means and pass it to barplot():

library(dplyr)  # provides %>%, filter(), select() and summarise_all()

data %>% filter(gender == 1, year == 2013) %>% select(-userid, -gender, -year) %>%
  summarise_all(mean) %>%
  as.numeric() %>%
  barplot()

Instead of doing this for every combination of year and gender, is there a more elegant way, using ggplot and facet_wrap()?

Solution

I may have misunderstood how you want the plot arranged, but if you want to show the mean score for each question per year and gender group, you could reshape the data to long format, summarise, and facet like this:

library(tidyverse)

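data %>%
  # long format: one row per user, year and question (columns `name` and `value`)
  pivot_longer(starts_with("Q")) %>%
  group_by(year, gender, name) %>%
  # mean score per year/gender/question; drop the grouping afterwards
  summarize(value = mean(value), .groups = "drop") %>%
  ggplot(aes(name, value)) +
  geom_col(fill = 'deepskyblue4') +
  # rows = year, columns = gender
  facet_grid(year ~ gender) +
  labs(x = 'Question', y = 'Average score') +
  theme_minimal(base_size = 16)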
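
Since the question asks specifically about facet_wrap(), here is a minimal sketch of that variant. It stores the summarised means in an intermediate table first, so the numbers can be checked before plotting. The names means, question, mean_score and p are only illustrative choices, not anything from the original code, and the data frame is simply copied from the question so that the snippet runs on its own.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data copied from the question
data <- data.frame(userid = c(1,1,1,2,2,2,3,3,3),
                   year = c(2013,2014,2015,2013,2014,2015,2013,2014,2015),
                   gender = c(1,1,1,0,0,0,0,0,0),
                   Q1 = c(3,2,3,1,0,1,2,1,0),
                   Q2 = c(4,3,4,2,0,2,1,4,3),
                   Q3 = c(1,2,1,3,5,4,5,4,5),
                   Q4 = c(1,2,1,2,4,3,2,2,1),
                   Q5 = c(1,1,1,2,1,0,0,0,1))

# Mean of each question per year/gender group, in long format
# (column names `question` and `mean_score` are illustrative)
means <- data %>%
  pivot_longer(starts_with("Q"), names_to = "question", values_to = "score") %>%
  group_by(year, gender, question) %>%
  summarise(mean_score = mean(score), .groups = "drop")

# One panel per year/gender combination, wrapped into two columns;
# label_both prints "year: 2013" / "gender: 0" in the panel strips
p <- ggplot(means, aes(question, mean_score)) +
  geom_col(fill = "deepskyblue4") +
  facet_wrap(vars(year, gender), ncol = 2, labeller = label_both) +
  labs(x = "Question", y = "Average score") +
  theme_minimal(base_size = 16)

print(p)
```

With this data, facet_grid(year ~ gender) and facet_wrap(vars(year, gender), ncol = 2) should give a similar layout. facet_wrap() mainly helps when some year/gender combinations are missing or when you want to control the number of columns yourself.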