I'm going to use the diamonds dataset that ships with the ggplot2 package to illustrate what I'm looking for.
I want to build a graph like this:
library(ggplot2)
ggplot(diamonds, aes(clarity, fill=cut)) + geom_bar(position="dodge")
However, instead of a count, I would like the bar height to be the mean of a continuous variable: for each combination of cut and color, I want the mean carat. If I run this code:
ggplot(diamonds, aes(carat, fill=cut)) + geom_bar(position="dodge")
the output is a count of diamonds at each carat value, split by cut, not a mean.
Does anyone know how to do this?
You can build a new data frame with mean(carat) grouped by cut and color, and then plot it:
library(plyr)
data <- ddply(diamonds, .(cut, color), summarise, mean_carat = mean(carat))
ggplot(data, aes(color, mean_carat, fill = cut)) + geom_bar(stat = "identity", position = "dodge")
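For readers without plyr installed, the same summary can be sketched in base R. This is an alternative rather than part of the original answer. `aggregate()` names the result column `carat`, so the sketch renames it to `mean_carat` to match the plotting code above.

```r
library(ggplot2)

# Mean carat for each cut/color combination, using base R only
data <- aggregate(carat ~ cut + color, data = diamonds, FUN = mean)

# aggregate() keeps the original column name; rename it for the plot
names(data)[names(data) == "carat"] <- "mean_carat"

ggplot(data, aes(color, mean_carat, fill = cut)) +
  geom_bar(stat = "identity", position = "dodge")
```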
If you want a faster solution, you can use either dplyr or data.table.
With dplyr:
library(dplyr)
data <- diamonds %>% group_by(cut, color) %>% summarise(mean_carat = mean(carat))
With data.table:
library(data.table)
data <- data.table(diamonds)[,list(mean_carat=mean(carat)), by=c('cut', 'color')]
The code for the plot is the same for both.
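As an alternative sketch, ggplot2 can compute the means itself, so no intermediate data frame is needed. This is not taken from the answer above. It assumes ggplot2 3.3.0 or later, where `stat_summary()` takes a `fun` argument (older versions used `fun.y`). The name `p` is only illustrative.

```r
library(ggplot2)

# Let ggplot2 compute mean(carat) for each color/cut group when drawing,
# instead of summarising the data beforehand
p <- ggplot(diamonds, aes(color, carat, fill = cut)) +
  stat_summary(fun = "mean", geom = "bar", position = "dodge")
p
```

This keeps the code short, but a pre-aggregated data frame is still useful when the means are needed elsewhere, for example in a table.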