I would like to create a barplot like this:
library(ggplot2)
# Dodged bar charts
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position="dodge")
However, instead of counts, I want the percentage of observations falling into each 'clarity' category within each cut category ('Fair', 'Good', 'Very Good', ...).
With this ...
# Dodged bar charts
ggplot(diamonds, aes(clarity, fill=cut)) +
geom_bar(aes(y = (..count..)/sum(..count..)), position="dodge")
I get percentages on the y-axis, but these percentages ignore the cut factor. I want all the red bars to sum to 1, all the yellow bars to sum to 1, and so on.
Is there an easy way to make that work without having to prepare the data manually?
Thanks!
P.S.: This is a follow-up to this Stack Overflow question
You could use sjp.xtab from the sjPlot package for that:
sjp.xtab(diamonds$clarity,
         diamonds$cut,
         showValueLabels = FALSE,
         tableIndex = "row",
         barPosition = "stack")
To prepare the data yourself for stacked percentages that sum to 100% within each clarity level, use:
data.frame(prop.table(table(diamonds$clarity, diamonds$cut),1))
Thus, you could write:
mydf <- data.frame(prop.table(table(diamonds$clarity, diamonds$cut),1))
ggplot(mydf, aes(Var1, Freq, fill = Var2)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_y_continuous(labels = scales::percent)
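The second argument of prop.table chooses which margin the proportions are computed over: 1 normalises each row of the table (here, each clarity level) and 2 normalises each column (each cut). A quick check, as a sketch that assumes ggplot2 is installed for the diamonds data (the names tab, by_clarity and by_cut are just illustrative):

```r
library(ggplot2)  # only needed here for the diamonds data set

# Rows are clarity levels, columns are cut levels
tab <- table(diamonds$clarity, diamonds$cut)

by_clarity <- prop.table(tab, 1)  # proportions within each row (clarity)
by_cut     <- prop.table(tab, 2)  # proportions within each column (cut)

rowSums(by_clarity)  # every clarity level sums to 1
colSums(by_cut)      # every cut sums to 1
```

Which margin you want depends on what each bar should be a share of: use 1 when the bars within one clarity level should add up to 100%, and 2 when the bars of one cut should.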
Edit: This one makes each cut category (Fair, Good, ...) add up to 100%, using 2 as the margin in prop.table and position = "dodge":
mydf <- data.frame(prop.table(table(diamonds$clarity, diamonds$cut),2))
ggplot(mydf, aes(Var1, Freq, fill = Var2)) +
  geom_bar(position = "dodge", stat = "identity") +
  scale_y_continuous(labels = scales::percent)
or
sjp.xtab(diamonds$clarity,
         diamonds$cut,
         showValueLabels = FALSE,
         tableIndex = "col")
Verifying the last example with dplyr, summing up percentages within each group:
library(dplyr)
mydf %>% group_by(Var2) %>% summarise(percsum = sum(Freq))
> Var2 percsum
> 1 Fair 1
> 2 Good 1
> 3 Very Good 1
> 4 Premium 1
> 5 Ideal 1
(see this page for further plot options and examples for sjp.xtab ...)
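If you would rather not prepare the data at all, one possible alternative (not part of the answer above) is to let ggplot2 compute the group-wise proportions itself. The sketch below assumes ggplot2 3.3.0 or later, where after_stat() is available, and relies on the prop variable computed by stat_count, which is the proportion within each group; setting group = cut therefore makes the bars of each cut sum to 1:

```r
library(ggplot2)

# group = cut makes stat_count compute `prop` within each cut,
# so all bars of one fill colour add up to 100%
p <- ggplot(diamonds, aes(clarity, fill = cut, group = cut)) +
  geom_bar(aes(y = after_stat(prop)), position = "dodge") +
  scale_y_continuous(labels = scales::percent)

p
```

Without the group aesthetic, prop would be computed separately for every clarity and cut combination and each bar would have height 1, so the explicit grouping is the important part.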