r, ggplot2, gganimate, bernoulli-probability

Making an animation of the Bernoulli distribution using gganimate and getting an unexpected jump at p = 0.5


I'm practicing using gganimate and want to create something similar to this Shiny app but animated. I've had some success, but am getting a really weird error that I honestly cannot explain and was hoping someone could help me out. The rest of this post contains a fully reproducible example.

Here is what I have so far:

library(ggplot2)
library(gganimate)

# Columns: outcome (1 = success, 0 = failure), P(outcome), and the Bernoulli parameter p
bernoulli <- cbind.data.frame(c(rep(1,101), rep(0,101)), c(seq(1,0,by=-0.01), seq(0,1,by=0.01)), c(seq(1,0,by=-0.01), seq(1,0,by=-0.01)))
names(bernoulli) <- c("success", "probability", "p")
bernoulli <- bernoulli[order(bernoulli$p),]
row.names(bernoulli) <- seq_len(nrow(bernoulli))

This creates the data frame I am working with. It has three variables, so it is very straightforward. The logic of my animation is to show how the probabilities of the two dichotomous outcomes vary with the p parameter of the Bernoulli distribution. To give an example of a static graph, if I do something like this:

ggplot(subset(bernoulli, p == 0.70), aes(x=success, y=probability)) +
  geom_bar(stat="identity")

and vary the value of p that I subset on, I get the desired outcome. (For some reason, there are certain values, seemingly at random, on which I cannot subset the data, even though I can see them clearly in the data frame. I have checked that the variables are numeric, so that's not the issue. This is a side problem, and I have no idea why it occurs.)

For example, switch between p == 0.12 and p == 0.70 and compare them to the Shiny app mentioned earlier and you'll see it matches up, though the axes vary, of course.
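A likely cause of the side problem above is floating-point representation, though that is an assumption and not something confirmed in this post. Values built with seq(1, 0, by = -0.01) can differ from a typed literal such as 0.12 in the last binary digits, so `==` may fail even though both print identically. The minimal sketch below compares exact and tolerant matching; the names `p_grid`, `targets`, `exact_hit`, and `near_hit`, and the tolerance 1e-8, are illustrative choices only.

```r
# Same grid of p values as in the data frame above.
p_grid <- seq(1, 0, by = -0.01)

# Values to look up (taken from the examples in the post).
targets <- c(0.12, 0.70)

# Exact comparison: can miss a value that prints identically,
# because the stored double may differ in its last bits.
exact_hit <- sapply(targets, function(v) any(p_grid == v))

# Tolerant comparison: treat values within 1e-8 as equal (assumed tolerance).
near_hit <- sapply(targets, function(v) any(abs(p_grid - v) < 1e-8))

print(data.frame(target = targets, exact_hit = exact_hit, near_hit = near_hit))
```

If this is indeed the cause, subsetting with a tolerance, for example subset(bernoulli, abs(p - 0.70) < 1e-8) or dplyr::near(p, 0.70) inside a filter, should find every row regardless of how the value happens to be stored.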

When I try to implement the animation as follows:

ggplot(bernoulli, aes(x=success, y=probability)) +
  geom_bar(stat="identity") +
  transition_time(p)

Everything looks great! The axes are matched up nicely, the animation flows smoothly... but as soon as the bars hit p == 0.50, they both jump up to 1, something which is not reflected in the data at all. Is this just an issue with my computer/R/ggplot/gganimate?
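To confirm that the jump is not in the data, the rows at p = 0.5 can be inspected directly. The sketch below rebuilds the same data frame so it runs on its own and selects rows with a tolerance instead of `==`. The names `mid` and `heights_ok` and the tolerance 1e-8 are assumptions made for illustration.

```r
# Rebuild the data frame from the question so this check is self-contained.
bernoulli <- data.frame(
  success     = c(rep(1, 101), rep(0, 101)),
  probability = c(seq(1, 0, by = -0.01), seq(0, 1, by = 0.01)),
  p           = c(seq(1, 0, by = -0.01), seq(1, 0, by = -0.01))
)

# Select the two rows at p = 0.5 using a tolerance rather than `==`.
mid <- bernoulli[abs(bernoulli$p - 0.5) < 1e-8, ]
print(mid)

# Both bar heights should be approximately 0.5 at this value of p.
heights_ok <- all(abs(mid$probability - 0.5) < 1e-8)
print(heights_ok)
```

If both heights are about 0.5 here, the jump to 1 comes from how the frames are drawn, not from the underlying values.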


Solution

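  • You need to fix the y axis using scale_y_continuous():

    library(dplyr)
    library(ggplot2)
    library(gganimate)

    bernoulli %>%
        # drop the endpoint rows where probability is 0 or 1
        dplyr::filter(probability > 0 & probability < 1) %>%
        ggplot(aes(x=success, y=probability)) +
            geom_col() +
            # keep the y axis fixed at [0, 1] in every frame
            scale_y_continuous(limits = c(0,1)) +
            transition_time(p)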
    

    If you don't fix the y axis, the plot will look wrong when p == 0.5 (image not shown).


    Strangely, if I don't apply the filter above, one of the bars still jumps to 1 when p == 0.5.