r, ggplot2, summary, facet-wrap

Plot the means of multiple columns


I want to draw a separate barplot for each combination of year and gender, showing the mean values of the variables Q1 to Q5, so that each plot looks roughly like a density.

I have data that looks like this:

data <- data.frame(userid = c(1,1,1,2,2,2,3,3,3),
                  year = c(2013,2014,2015,2013,2014,2015,2013,2014,2015),
                  gender = c(1,1,1,0,0,0,0,0,0),
                  Q1 = c(3,2,3,1,0,1,2,1,0),
                  Q2 = c(4,3,4,2,0,2,1,4,3),
                  Q3 = c(1,2,1,3,5,4,5,4,5),
                  Q4 = c(1,2,1,2,4,3,2,2,1),
                  Q5 = c(1,1,1,2,1,0,0,0,1))

My solution was to filter() for year and gender first and then use summarise() to get a vector of the means, which I pass to the barplot() function:

library(dplyr)

data %>% filter(gender == 1, year == 2013) %>% select(-userid, -gender, -year) %>%
  summarise_all(mean) %>%
  as.numeric() %>%
  barplot()

Instead of doing this for every combination of year and gender, is there a more elegant way, using ggplot and facet_wrap()?


Solution

  • I may have misunderstood how you want the plot arranged, but if you want to show the mean score per question for each year and gender group, you could use facets like this:

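    library(tidyverse)
    
    data %>%
      # reshape Q1..Q5 to long format: `name` holds the question, `value` the score
      pivot_longer(starts_with("Q")) %>%
      group_by(year, gender, name) %>%
      # mean score per year, gender and question
      summarize(value = mean(value), .groups = "drop") %>%
      ggplot(aes(name, value)) +
      geom_col(fill = 'deepskyblue4') +
      # one row of panels per year, one column per gender (0/1)
      facet_grid(year ~ gender) +
      labs(x = 'Question', y = 'Average score') +
      theme_minimal(base_size = 16)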
    

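Since the question asks specifically about facet_wrap(), here is a minimal sketch of the same idea with facet_wrap() instead of facet_grid(). It is only one possible arrangement, not necessarily the layout the asker had in mind. The object names `means` and `p`, the column names `question`, `score` and `mean_score`, and the choice of `ncol = 2` are assumptions made for illustration; the data frame is copied from the question so that the block runs on its own.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Sample data copied from the question
data <- data.frame(userid = c(1,1,1,2,2,2,3,3,3),
                   year = c(2013,2014,2015,2013,2014,2015,2013,2014,2015),
                   gender = c(1,1,1,0,0,0,0,0,0),
                   Q1 = c(3,2,3,1,0,1,2,1,0),
                   Q2 = c(4,3,4,2,0,2,1,4,3),
                   Q3 = c(1,2,1,3,5,4,5,4,5),
                   Q4 = c(1,2,1,2,4,3,2,2,1),
                   Q5 = c(1,1,1,2,1,0,0,0,1))

# One row per year, gender and question, holding the mean score
# (column names here are illustrative, not from the original answer)
means <- data %>%
  pivot_longer(starts_with("Q"), names_to = "question", values_to = "score") %>%
  group_by(year, gender, question) %>%
  summarise(mean_score = mean(score), .groups = "drop")

# One panel per year/gender combination;
# label_both adds the variable name to each strip label
p <- ggplot(means, aes(question, mean_score)) +
  geom_col() +
  facet_wrap(~ year + gender, ncol = 2, labeller = label_both) +
  labs(x = "Question", y = "Average score")

print(p)
```

Keeping the summarised table in `means` makes it easy to compare individual panels with the filter()/summarise_all() approach from the question: the rows for year 2013 and gender 1 should hold the same five values that the original barplot() call draws.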