
Why does gganimate fail to order lines correctly by date with transition_reveal?


I'm trying to reproduce Ed Hawkins's animated climate change figure, the "climate spiral", in R with gganimate. A static ggplot figure draws the lines in the correct order by year (most recent data on top), but the animated plot made with transition_reveal() draws the lines in the wrong order.

Here is a reproducible example with synthetic data:

library(tidyverse)
library(lubridate)
library(gganimate)
library(RColorBrewer) 

# Create monthly data from 1950 to 2020 (and a component for rising values with time)
df <- tibble(year = rep(1950:2020, each = 12), 
             month = rep(month.abb, 2020-1950+1)) %>%
  mutate(date = dmy(paste("01",month,year)),
         value = rnorm(n(), 0, 2) + row_number()*0.005) %>%
  with_groups(year, mutate, value_yr = mean(value))


temp <- df %>%
  ggplot(aes(x = month(date, label = TRUE), y = value, color = value_yr)) +
  geom_line(size = 0.6, aes(group = year)) +
  geom_hline(yintercept = 0, color = "white") +
  geom_hline(yintercept = c(-4,4), color = c("skyblue3","red1"), size = 0.2) +
  geom_vline(xintercept = 1:12, color = "white", size = 0.2) +
  annotate("label", x = 12.5, y = c(-4,0,4), label = c("-4°C","0°C","+4°C"), 
           color = c("skyblue3","white","red1"), size = 2.5, fill = "#464950", 
           label.size = NA, label.padding = unit(0.1, "lines")) +
  geom_point(x = 1, y = -11, size = 15, color = "#464950") + 
  geom_label(aes(x = 1, y = -11, label = year), 
             color = "white", size = 4, 
             fill = "#464950", label.size = NA) +
  coord_polar(start = 0) +
  scale_color_gradientn(colors = rev(brewer.pal(n=11, name = "RdBu")),
                        limits = range(df$value_yr)) +
  labs(x = "", y = "") + 
  theme_bw() +
  theme(panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        plot.background=element_rect(fill="#464950", color="#464950"),
        axis.text.x = element_text(margin = margin(t = -20, unit = "pt"), 
                                   color = "white"),
        axis.text.y = element_blank(), 
        axis.ticks = element_blank(),
        legend.position = "none") 

Now, we can either save the plot as PNG or animate and save as GIF:

ggsave(temp, filename = "test.png", width = 5, height = 5, dpi = 320)

# Animate by date:
anim <- temp +
  transition_reveal(date) +
  ease_aes('linear')
  
output <- animate(anim, nframes = 100, end_pause = 30,
                  height = 5, width = 5, units = "in", res = 300)

anim_save("test.gif", output)

Let's see the results!

Static PNG: [image]

Animated GIF: [image]

At first glance the two results look the same, but a closer look shows differences (for instance, the marked blue line).

[image: detail view with the blue line marked]

With the synthetic data in this example, the differences are minor. With real data, however, the figures look quite different, because many red lines (recent data points with high temperatures) disappear into the background. So how can I keep the lines ordered by date with transition_reveal()? Any help appreciated, thanks a lot!


Solution

  • This isn't the answer per se; it's the why. Given this information, you'll have to tell me which behavior you prefer before I can give you a solution.

    I tried a few things, each of which I was sure would work, but none did. So I wanted to see what was happening in ggplot. My hunch proved correct: in the PNG, your data is ordered by value_yr, not by year.

    I repeat this question at the end:

    Either you can put the animation in order of value_yr or you can put the color in ggplot in order by year. Which would you prefer?

    How do I know? I extracted the assigned colors in the object.

    tellMe <- ggplot_build(temp)$data[[1]]
    head(tellMe)
    #    colour x           y group PANEL flipped_aes size linetype alpha
    # 1 #1E60A4 1 -1.75990067     1     1       FALSE  0.6        1    NA
    # 2 #1E60A4 2 -0.08968196     1     1       FALSE  0.6        1    NA
    # 3 #1E60A4 3 -0.69657130     1     1       FALSE  0.6        1    NA
    # 4 #1E60A4 4 -0.10777727     1     1       FALSE  0.6        1    NA
    # 5 #1E60A4 5  1.57710505     1     1       FALSE  0.6        1    NA
    # 6 #1E60A4 6  1.63277369     1     1       FALSE  0.6        1    NA 
    
    gimme <- tellMe %>% group_by(group) %>% 
      summarise(color = unique(colour)) %>% 
      print(n = 100) # there are fewer than 100 groups; I just want them all printed
    
    head(gimme)
    # # A tibble: 6 × 2
    #   group color  
    #   <int> <chr>  
    # 1     1 #1E60A4
    # 2     2 #114781
    # 3     3 #175290
    # 4     4 #053061
    # 5     5 #1C5C9E
    # 6     6 #3E8BBF 
    

    To me, this indicated that the colors weren't in group order, so I wanted to see the colors to visualize the order.

    I used this function. I know it came from a demo, but I don't remember which one. I looked just so I could include that here, but I didn't find it.

    # this is from a demo (not sure which one anymore!)
    # Draws each color name in its own color on an m x m grid, filled column by column.
    showCols <- function(cl = colors(), bg = "lightgrey",
                         cex = .75, rot = 20) {
      m <- ceiling(sqrt(n <- length(cl)))       # grid side length
      length(cl) <- m * m; cm <- matrix(cl, m)  # pad with NA, then fill column-wise
      require("grid")
      grid.newpage(); vp <- viewport(w = .92, h = .92)
      grid.rect(gp = gpar(fill = bg))
      grid.text(cm, x = col(cm)/m, y = rev(row(cm))/m, rot = rot,
                vp = vp, gp = gpar(cex = cex, col = cm))
    }
    
    showCols(gimme$color)
    

    The top left color is the oldest year, the value below it is the following year, and so on. The most recent year is the bottom value in the right-most column.

    [image: grid of the extracted colors]
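
    The layout described above follows from how showCols builds its matrix: the vector is padded with NA up to a square, and matrix() fills it column by column. A minimal base R sketch with seven placeholder labels (year_1 to year_7 are hypothetical, not taken from the data) shows the same fill order:

    ```r
    # Seven placeholder labels standing in for the per-year colors (hypothetical)
    cl <- paste0("year_", 1:7)

    # Same steps as in showCols: side length, pad to a square, fill column-wise
    m <- ceiling(sqrt(length(cl)))
    length(cl) <- m * m      # extends the vector with NA
    cm <- matrix(cl, m)

    cm
    ```

    Here cm[1, 1] holds the first label, cm[2, 1] the second, and the unused trailing cells are NA.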

    df %>% group_by(year) %>% summarise(value_yr = unique(value_yr))
    # they are in 'value_yr' order in ggplot, not year
    # # A tibble: 71 × 2
    #     year value_yr
    #    <int>    <dbl>
    #  1  1950  0.0380 
    #  2  1951 -0.215  
    #  3  1952 -0.101  
    #  4  1953 -0.459  
    #  5  1954 -0.00130
    #  6  1955  0.559  
    #  7  1956 -0.457  
    #  8  1957 -0.251  
    #  9  1958  1.10   
    # 10  1959  0.282  
    # # … with 61 more rows 
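
    To see how group ids, years, and colors relate without scrolling through 71 rows, the same inspection can be run on a toy plot. This is a minimal sketch on hypothetical data (the toy data frame and its values are made up for illustration). It assumes that ggplot2 numbers groups by the sorted values of the grouping variable, and it uses layer_data(), which returns the same table as ggplot_build(p)$data[[1]]:

    ```r
    library(ggplot2)

    # Toy data (hypothetical): three years whose yearly means are NOT in year order
    toy <- data.frame(
      year     = rep(c(2001, 2002, 2003), each = 3),
      month    = rep(1:3, times = 3),
      value    = c(1, 2, 3, 2, 3, 4, 0, 1, 2),
      value_yr = rep(c(2, 3, 1), each = 3)
    )

    p <- ggplot(toy, aes(month, value, group = year, colour = value_yr)) +
      geom_line()

    # Built data for the first layer: one row per point, with group id and colour
    built <- layer_data(p, 1)

    # One row per group; map the group id back to its year
    # (assumption: ids follow the sorted unique years)
    key <- unique(built[, c("group", "colour")])
    key$year <- sort(unique(toy$year))[key$group]
    key
    ```

    In the resulting key, group follows year, while colour follows value_yr, so the colors need not appear in group order.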
    

    Either you can put the animation in order of value_yr or you can put the color in ggplot in order by year. Which would you prefer?



    Update

    You won't want to use transition_reveal to group and transition by the same element. Unfortunately, I can't tell you why, but it seems to get stuck at 1958!

    To make the GIF (left) match the ggplot PNG (right):

    [image: animated GIF] [image: static PNG]

    First, I modified the calls to ggplot and geom_line:

      ggplot(aes(x = month(date, label = TRUE), y = value, 
                 group = year, color = year)) +
      geom_line(size = .6)
    

    Then I tried to use transition_reveal but noticed that subsequent years were layered underneath other years. I can't account for that odd behavior. When I ran showCols after changing temp, the colors were in order. That ruled out what I had initially thought was the problem.

    I modified the object anim, using transition_manual to force the order of the plot layers.

    anim <- temp +
      transition_manual(year, cumulative = TRUE) +
      ease_aes('linear')
    

    That's it. Now the layers match. As for whether this would have worked without changing the color assignment, here is the original plot with manual transitions by year on the left and the ggplot PNG on the right:

    [image: animated GIF] [image: static PNG]

    It looks like that would have worked as well. So my original drawn-out explanation wasn't nearly as useful as I thought, but at least you have a working solution now. (Sigh.)
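
    For reference, the transition_manual approach can be sketched end to end on a small toy dataset. The data frame toy and the file name toy.gif are hypothetical. The rendering calls are left commented out because they need a renderer backend (for example the gifski package) to be installed:

    ```r
    library(ggplot2)
    library(gganimate)

    # Toy data (hypothetical): one short line per year
    toy <- data.frame(
      year  = rep(2001:2003, each = 4),
      month = rep(1:4, times = 3),
      value = c(1, 2, 1, 2,  2, 3, 2, 3,  3, 4, 3, 4)
    )

    p <- ggplot(toy, aes(month, value, group = year, colour = year)) +
      geom_line()

    # One frame per distinct year; cumulative = TRUE keeps earlier years visible
    anim <- p + transition_manual(year, cumulative = TRUE)

    # Number of distinct years, used as the frame count when rendering
    n_years <- length(unique(toy$year))

    # Rendering (commented out, needs a renderer backend):
    # output <- animate(anim, nframes = n_years, fps = 2)
    # anim_save("toy.gif", output)
    ```

    With cumulative = TRUE, each frame keeps the data from earlier frames and adds the next year.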