This is my first time asking a question here, so please go easy! As you'll see from my sample code, I'm new to R as well (3 months), so I'm a bit embarrassed to be showing it.

I've got quite a specific requirement, but there might be a better way of visualising it. We have people in regions who have answered questions several times, and we want to compare their first and most recent responses. The answers are basically on a 1-5 scale, but I've left them as the wordy answers (from Completely Insufficient to Completely Sufficient).

I want to display the "worse" answers as negative and the good answers as positive, and split the middle answer ("Quite Insufficient") in half, so that the halfway point is centred on the plot (am I making sense?! There's a sample plot linked below). I also want to group by region and by whether this is the First or Last response.

If I plot the 2 separate dataframes, the chart looks good but I can't order the legend. If I union the dataframes, the legend looks good but the chart goes wrong! Please help!
# Input load
Reviews.Sums <- readr::read_csv("FirstLast,AnswerCount,Answer,Region
First,10,Completely Insufficient,North
First,3,Completely Insufficient,South
Last,5,Completely Insufficient,North
Last,1,Completely Insufficient,South
First,8,Mostly Insufficient,North
First,2,Mostly Insufficient,South
Last,9,Mostly Insufficient,North
Last,2,Mostly Insufficient,South
First,14,Quite Insufficient,North
First,3,Quite Insufficient,South
Last,19,Quite Insufficient,North
Last,7,Quite Insufficient,South
First,26,Mostly Sufficient,North
First,9,Mostly Sufficient,South
Last,44,Mostly Sufficient,North
Last,17,Mostly Sufficient,South
First,8,Completely Sufficient,North
First,3,Completely Sufficient,South
Last,16,Completely Sufficient,North
Last,3,Completely Sufficient,South")
library(dplyr)
library(ggplot2)
library(tidyr)
library(stringr)
library(formattable)
# split mid answer for First reviews
Reviews.First.four <- filter(Reviews.Sums, FirstLast == "First", Answer=="Quite Insufficient") %>% mutate(AnswerCount=as.numeric(AnswerCount/2))
Reviews.First.rest <- filter(Reviews.Sums, FirstLast == "First", Answer != "Quite Insufficient")
Reviews.First <- full_join(Reviews.First.four, Reviews.First.rest) %>% arrange(Answer)
Reviews.First <- mutate(Reviews.First, RegRev = paste(Region, FirstLast))
# split mid answer for Last reviews
Reviews.Last.four <- filter(Reviews.Sums, FirstLast == "Last", Answer=="Quite Insufficient") %>% mutate(AnswerCount=as.numeric(AnswerCount/2))
Reviews.Last.rest <- filter(Reviews.Sums, FirstLast == "Last", Answer !="Quite Insufficient")
Reviews.Last <- full_join(Reviews.Last.four, Reviews.Last.rest) %>% arrange(Answer)
Reviews.Last <- mutate(Reviews.Last, RegRev = paste(Region,FirstLast))
# Split data into negative and positive scores
Reviews.First.Neg <- Reviews.First %>%
  filter(Answer %in% c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient")) %>%
  mutate(AnswerCount = AnswerCount * -1)
Reviews.First.Pos <- Reviews.First %>%
  filter(Answer %in% c("Quite Insufficient", "Mostly Sufficient", "Completely Sufficient"))
Reviews.Last.Neg <- Reviews.Last %>%
  filter(Answer %in% c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient")) %>%
  mutate(AnswerCount = AnswerCount * -1)
Reviews.Last.Pos <- Reviews.Last %>%
  filter(Answer %in% c("Quite Insufficient", "Mostly Sufficient", "Completely Sufficient"))
# Reorder factors (or try to anyway!)
Reviews.First.Neg$Answer <- factor(Reviews.First.Neg$Answer, levels=c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient"))
Reviews.First.Pos$Answer <- factor(Reviews.First.Pos$Answer, levels=rev(c("Quite Insufficient", "Mostly Sufficient", "Completely Sufficient")))
Reviews.Last.Neg$Answer <- factor(Reviews.Last.Neg$Answer, levels=c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient"))
Reviews.Last.Pos$Answer <- factor(Reviews.Last.Pos$Answer, levels=rev(c("Quite Insufficient", "Mostly Sufficient", "Completely Sufficient")))
# Other thing I tried was to order both factors same before union-ing them - plot Reviews.all instead of the separate First.Pos and First.Neg and still no joy - sad smiley
#Reviews.First.Neg$Answer <- factor(Reviews.First.Neg$Answer, levels=c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient", "Mostly Sufficient", "Completely Sufficient"))
#Reviews.First.Pos$Answer <- factor(Reviews.First.Pos$Answer, levels=c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient", "Mostly Sufficient", "Completely Sufficient"))
#Reviews.all <- union(Reviews.First.Neg, Reviews.First.Pos)
#Reviews.all$Answer = factor(Reviews.all$Answer, levels=c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient", "Mostly Sufficient", "Completely Sufficient"))
# and plot!
ggplot() +
  # geom_bar(data=Reviews.all, aes(x=RegRev, y=AnswerCount, fill=Answer), stat="identity", position = "stack") +
  geom_bar(data = Reviews.First.Neg, aes(x = RegRev, y = AnswerCount, fill = Answer), stat = "identity", position = "stack") +
  geom_bar(data = Reviews.First.Pos, aes(x = RegRev, y = AnswerCount, fill = Answer), stat = "identity", position = "stack") +
  geom_bar(data = Reviews.Last.Neg, aes(x = RegRev, y = AnswerCount, fill = Answer), stat = "identity", position = "stack") +
  geom_bar(data = Reviews.Last.Pos, aes(x = RegRev, y = AnswerCount, fill = Answer), stat = "identity", position = "stack") +
  coord_flip() +
  theme_minimal() +
  scale_fill_manual(values = c("#d7191c", "#fdae61", "#ffffbf", "#abdda4", "#2b83ba")) +
  theme(legend.position = "top") +
  guides(fill = guide_legend(nrow = 2, byrow = TRUE))
TLDR - I'm terrible at R. Any help much appreciated.
You wrote: "If I plot the 2 separate dataframes then the chart looks good but I can't order the legend."
The code below binds the four data frames into one and sets the legend order explicitly. Let me know if you'd like other adjustments too.
Reviews.comb <- bind_rows(Reviews.Last.Pos, Reviews.Last.Neg, Reviews.First.Pos, Reviews.First.Neg)
cols <- c("#d7191c","#fdae61","#ffffbf","#abdda4","#2b83ba")
ord <- c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient", "Mostly Sufficient", "Completely Sufficient")
ggplot() +
  geom_bar(data = Reviews.comb, aes(x = RegRev, y = AnswerCount, fill = Answer), stat = "identity", position = "stack") +
  coord_flip() +
  theme_minimal() +
  scale_fill_manual(breaks = ord, values = cols) +
  theme(legend.position = "top") +
  guides(fill = guide_legend(nrow = 2, byrow = TRUE)) +
  # labs() names the aesthetics, so x is RegRev even though coord_flip() draws it on the vertical axis
  labs(x = "Reg Rev", y = "Answer Count") +
  expand_limits(y = c(-50, 50))
UPDATE: I added expand_limits() to try to center your bar graphs around 0.
I also compared your union() call with the bind_rows() equivalent, bind_rows(Reviews.First.Pos, Reviews.First.Neg); only the order of the rows was different. That's probably what changed the order of the bars. The breaks argument in scale_fill_manual() should put the legend in the right order for you.
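If the stacking order still misbehaves once everything is in one data frame, here is a rough sketch of an alternative, not a definitive fix. It keeps a single Answer factor with all five levels, draws the negative and positive halves as two layers, and reverses the stacking on the positive layer so the half of "Quite Insufficient" sits next to zero on both sides. The names Reviews.long, Reviews.neg, Reviews.pos, mid and lim are made up for this example, and the data frame is typed in from the question's sample so the block runs on its own. Naming the colours by answer and passing limits is my assumption about the most robust way to pin both the colours and the legend order.

```r
library(dplyr)
library(ggplot2)

ord  <- c("Completely Insufficient", "Mostly Insufficient", "Quite Insufficient",
          "Mostly Sufficient", "Completely Sufficient")
# Named colours: each answer keeps its colour whatever the factor order is.
cols <- setNames(c("#d7191c", "#fdae61", "#ffffbf", "#abdda4", "#2b83ba"), ord)

# Sample counts from the question, rebuilt by hand (illustration only).
Reviews.Sums <- data.frame(
  FirstLast   = rep(rep(c("First", "Last"), each = 2), times = 5),
  Region      = rep(c("North", "South"), times = 10),
  Answer      = rep(ord, each = 4),
  AnswerCount = c(10, 3, 5, 1,  8, 2, 9, 2,  14, 3, 19, 7,  26, 9, 44, 17,  8, 3, 16, 3)
)

mid <- "Quite Insufficient"  # middle category, split half negative and half positive

Reviews.long <- Reviews.Sums %>%
  mutate(Answer = factor(Answer, levels = ord),
         RegRev = paste(Region, FirstLast))

# Negative half: the two worst answers plus half of the middle one.
Reviews.neg <- Reviews.long %>%
  filter(Answer %in% ord[1:3]) %>%
  mutate(AnswerCount = -ifelse(Answer == mid, AnswerCount / 2, AnswerCount))

# Positive half: the other half of the middle answer plus the two best answers.
Reviews.pos <- Reviews.long %>%
  filter(Answer %in% ord[3:5]) %>%
  mutate(AnswerCount = ifelse(Answer == mid, AnswerCount / 2, AnswerCount))

# Largest stack on either side, used to make the axis symmetric around 0.
lim <- max(
  tapply(abs(Reviews.neg$AnswerCount), Reviews.neg$RegRev, sum),
  tapply(Reviews.pos$AnswerCount, Reviews.pos$RegRev, sum)
)

p <- ggplot(mapping = aes(x = RegRev, y = AnswerCount, fill = Answer)) +
  # Default stacking puts the last factor level present next to zero,
  # which is the middle answer on the negative side.
  geom_col(data = Reviews.neg) +
  # Reversed stacking puts the first level present (the middle answer) next to zero.
  geom_col(data = Reviews.pos, position = position_stack(reverse = TRUE)) +
  coord_flip() +
  scale_fill_manual(values = cols, limits = ord) +
  expand_limits(y = c(-lim, lim)) +
  theme_minimal() +
  theme(legend.position = "top") +
  guides(fill = guide_legend(nrow = 2, byrow = TRUE)) +
  labs(x = "Region and review", y = "Answer count")
p
```

With the sample data the largest positive stack (North Last) comes to 69.5, so a fixed expand_limits(y = c(-50, 50)) would not quite centre the plot; computing lim from the data avoids hard-coding that number.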