
Color-coded PMF with legend in ggplot2


My goal is to produce two overlapping PMFs of binomial distributions using ggplot2, color-coded according to colors that I specify, with a legend at the bottom.

So far, I think I have set up the data frame right.

successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep('  A  ',11),rep('  B  ',11))
df1 <- data.frame(cbind(successes,freq,class))

However, the following plotting code gives the wrong result.

library(ggplot2)
g <- ggplot(df1, aes(successes),y=freq)
g + geom_bar(aes(fill = class))

I feel like I'm following an example yet getting a totally different result. The example below does almost what I want; it would be exact if it showed relative frequencies instead of counts.

g <- ggplot(mpg, aes(class))
g + geom_bar(aes(fill = drv))

A couple of questions:

1) Where am I going wrong in my block of code?

2) Is there a better way to show two PMFs in one graph? I'm not determined to use a histogram or bar chart.

3) How can I set this up to give me the ability to choose the colors?

4) How do I order the values on the x-axis? They aren't categories. They are the numbers 0-10 and have a natural order that I want to preserve.

Thanks!

UPDATE

The following two blocks worked.

successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep('  A  ',11),rep('  B  ',11))
df1 <- data.frame(successes,freq,class)
ggplot(df1, aes(successes ,y=freq, fill = class)) +
geom_bar(stat = "identity") +
scale_x_continuous(breaks = seq(0,10,1)) +
scale_fill_manual(values = c("blue", "green")) + theme_bw()

AND

successes <- c(seq(0,10,1),seq(0,10,1))
freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
class <- c(rep('  A  ',11),rep('  B  ',11))
df1 <- data.frame(successes,freq,class)
ggplot(df1, aes(x = successes, y = freq)) +
geom_col(aes(fill = class)) +
scale_x_continuous(breaks = seq(0,10,1)) +
scale_fill_manual(values = c("blue", "green")) + theme_bw()
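One caveat on the two blocks above: geom_bar(stat = "identity") and geom_col() both default to position = "stack", so the A and B bars are stacked on top of each other rather than overlapping. A minimal sketch of a truly overlapping version with the legend at the bottom follows. It assumes semi-transparent bars are acceptable; the alpha value and the unpadded labels "A" and "B" are illustrative choices, not part of the original code.

```r
library(ggplot2)

# Same data as above, built with rep() and 0:10 for brevity
df1 <- data.frame(
  successes = rep(0:10, times = 2),
  freq      = c(dbinom(0:10, 10, 0.2), dbinom(0:10, 10, 0.8)),
  class     = rep(c("A", "B"), each = 11)
)

p <- ggplot(df1, aes(x = successes, y = freq, fill = class)) +
  # position = "identity" draws both PMFs from the baseline instead of stacking;
  # alpha (an assumed value) keeps the bar behind visible where they overlap
  geom_col(position = "identity", alpha = 0.5) +
  scale_x_continuous(breaks = 0:10) +
  # a named vector ties each colour to a class explicitly
  scale_fill_manual(values = c(A = "blue", B = "green")) +
  theme_bw() +
  # must come after theme_bw(), which would otherwise reset the legend position
  theme(legend.position = "bottom")

print(p)
```

Replacing position = "identity" with position = "dodge" would place the two bars side by side at each value instead, which may be easier to read where the two distributions overlap in the middle.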

Solution

  • Is this what you're looking for?

    library(ggplot2)
    g <- ggplot(df1, aes(x = successes, y = freq, fill = class))
    g + geom_bar(stat = "identity") +
      scale_fill_manual(values = c("blue", "green"))
    

    Keep in mind that this requires changing your data frame creation to:

    successes <- c(seq(0,10,1),seq(0,10,1))
    freq <- c(dbinom(seq(0,10,1),10,0.2),dbinom(seq(0,10,1),10,0.8))
    class <- c(rep('  A  ',11),rep('  B  ',11))
    df1 <- data.frame(successes,freq,class)
    

    as suggested in the comments.
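
Why dropping cbind() matters: cbind() builds a matrix, and a matrix can hold only one type, so mixing the numeric vectors with the character vector class coerces everything to character. The resulting data frame then has non-numeric successes and freq columns (character in R 4.0 and later, factors in earlier versions), which is why the x-axis was treated as categories. A short check, using the same vectors as in the question, can be sketched as:

```r
successes <- c(seq(0, 10, 1), seq(0, 10, 1))
freq <- c(dbinom(seq(0, 10, 1), 10, 0.2), dbinom(seq(0, 10, 1), 10, 0.8))
class <- c(rep('  A  ', 11), rep('  B  ', 11))

# cbind() returns a character matrix, so every column loses its numeric type
bad <- data.frame(cbind(successes, freq, class))

# Passing the vectors directly keeps each column's own type
good <- data.frame(successes, freq, class)

str(bad)   # successes and freq are not numeric here
str(good)  # successes and freq are numeric; class is character (or factor)
```

With the numeric columns preserved, scale_x_continuous() can be used and the values 0 to 10 keep their natural order.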