I want to draw a separate barplot for each combination of year and gender, showing the mean values of the variables Q1 to Q5, so that each plot looks like a density.
I have data that looks like this:
data <- data.frame(userid = c(1,1,1,2,2,2,3,3,3),
                   year = c(2013,2014,2015,2013,2014,2015,2013,2014,2015),
                   gender = c(1,1,1,0,0,0,0,0,0),
                   Q1 = c(3,2,3,1,0,1,2,1,0),
                   Q2 = c(4,3,4,2,0,2,1,4,3),
                   Q3 = c(1,2,1,3,5,4,5,4,5),
                   Q4 = c(1,2,1,2,4,3,2,2,1),
                   Q5 = c(1,1,1,2,1,0,0,0,1))
My solution was to filter() for year and gender first and then use summarise() to get a vector of the means and pass it to the barplot() function:
data %>%
  filter(gender == 1, year == 2013) %>%
  select(-userid, -gender, -year) %>%
  summarise_all(mean) %>%
  as.numeric() %>%
  barplot()
Instead of doing this for every combination of year and gender, is there a more elegant way, using ggplot and facet_wrap()?
I may have misunderstood how you want the plot arranged, but if you want to show the mean answer score for each year and gender group, you could use facets like this:
library(tidyverse)

data %>%
  pivot_longer(starts_with("Q")) %>%
  group_by(year, gender, name) %>%
  # .groups = "drop" returns an ungrouped result and silences the regrouping message
  summarize(value = mean(value), .groups = "drop") %>%
  ggplot(aes(name, value)) +
  geom_col(fill = 'deepskyblue4') +
  facet_grid(year ~ gender) +
  labs(x = 'Question', y = 'Average score') +
  theme_minimal(base_size = 16)
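Since the question asked about facet_wrap() specifically, here is a rough sketch of the same idea with facet_wrap() instead of facet_grid(). It is only one possible arrangement. The names `means`, `question`, `score`, `mean_score` and `p` are my own choices for illustration, and I am assuming a two-column layout is what you want. Because the data does not say what the gender codes 0 and 1 stand for, I use label_both so each panel strip shows the variable name next to its value.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Same example data as in the question
data <- data.frame(
  userid = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  year   = c(2013, 2014, 2015, 2013, 2014, 2015, 2013, 2014, 2015),
  gender = c(1, 1, 1, 0, 0, 0, 0, 0, 0),
  Q1 = c(3, 2, 3, 1, 0, 1, 2, 1, 0),
  Q2 = c(4, 3, 4, 2, 0, 2, 1, 4, 3),
  Q3 = c(1, 2, 1, 3, 5, 4, 5, 4, 5),
  Q4 = c(1, 2, 1, 2, 4, 3, 2, 2, 1),
  Q5 = c(1, 1, 1, 2, 1, 0, 0, 0, 1)
)

# One row per year, gender and question, holding the mean score
# (column names here are illustrative, not from the original answer)
means <- data %>%
  pivot_longer(starts_with("Q"), names_to = "question", values_to = "score") %>%
  group_by(year, gender, question) %>%
  summarise(mean_score = mean(score), .groups = "drop")

# One panel per year/gender combination, wrapped into two columns;
# label_both prints strips such as "year: 2013" and "gender: 0"
p <- ggplot(means, aes(question, mean_score)) +
  geom_col(fill = "deepskyblue4") +
  facet_wrap(~ year + gender, ncol = 2, labeller = label_both) +
  labs(x = "Question", y = "Average score") +
  theme_minimal()

print(p)
```

Keeping the summary in its own data frame also makes it easy to check the numbers against the filter() approach from the question before plotting. Whether facet_wrap() or facet_grid() reads better is mostly a matter of taste here: the grid puts years in rows and genders in columns, while the wrap just fills panels in order.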