I am trying to use the gganimate package in R to create an animation of a set of histograms, where each frame of the animation shows the histogram of one image. I have ~400 images, so ~400 columns. My data looks like this:
| bins.left | bins.right | hist1 | hist2 | ... | histn |
As you can see, each column should supply the Y values of the histogram for one frame. In other words, my animation should iterate over the columns.
But all the examples I have found online seem to use a single column as the identifier of the frames. For instance, in this example:
mapping <- aes(x = gdpPercap, y = lifeExp,
               size = pop, color = continent,
               frame = year)
p <- ggplot(gapminder, mapping = mapping) +
  geom_point() +
  scale_x_log10()
the year column is used as the iterator. The data looks like this:
      country continent  year lifeExp      pop gdpPercap
       <fctr>    <fctr> <int>   <dbl>    <int>     <dbl>
1 Afghanistan      Asia  1952  28.801  8425333  779.4453
2 Afghanistan      Asia  1957  30.332  9240934  820.8530
3 Afghanistan      Asia  1962  31.997 10267083  853.1007
4 Afghanistan      Asia  1967  34.020 11537966  836.1971
5 Afghanistan      Asia  1972  36.088 13079460  739.9811
6 Afghanistan      Asia  1977  38.438 14880372  786.1134
The reason I don't want to reshape my data into this pattern is that, if I keep all the histograms in one column, my data frame will be extremely long (about 16000 * 400 rows) and difficult to handle. It is also not intuitive to store my data that way. I believe there must be an easy solution to my problem. Any suggestions are highly appreciated.
As @Marius said, you can make this work if your data is in long format. Below I create some fake data and then make the animated plot.
library(tidyverse)
theme_set(theme_classic())
library(gganimate)
Here's fake data with 10 columns of values we want to turn into histograms.
set.seed(2)
dat = replicate(10, runif(100)) %>% as.data.frame
The data is in wide format, so first we'll convert it to long format with the gather function:
d = dat %>% gather(key, value)
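For readers on a recent tidyr, where gather() is superseded by pivot_longer(), the same reshape can be sketched as below. This is only an illustration under that assumption; the name d_long is hypothetical, and the row order differs from gather() (row by row rather than column by column), which should not matter for the histograms.

```r
library(tidyr)

# Same fake data as above: 10 columns (V1..V10) of 100 uniform draws each
set.seed(2)
dat <- as.data.frame(replicate(10, runif(100)))

# One row per (original column, value) pair
d_long <- pivot_longer(dat, cols = everything(),
                       names_to = "key", values_to = "value")

dim(d_long)  # rows = 10 columns * 100 values; columns = key, value
```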
In the new long format, the key column tells us which histogram column the data originally came from. We'll use this as our frame and run geom_histogram:
p = ggplot(d, aes(value, frame=key)) +
  geom_histogram()
gganimate(p)
You can see that this is not what we want. ggplot actually generated a single histogram from all the data, and the animation just shows us in succession the part of each stack that came from each value of key.
We need a way to get ggplot to create separate histograms and animate them. We can do that by pre-binning the data and using geom_rect to create the histogram bars:
d = dat %>% gather(key, value) %>%
  mutate(bins = cut(value, breaks=seq(0,1,0.1),
                    labels=seq(0,0.9,0.1) + 0.05, include.lowest=TRUE),
         bins = as.numeric(as.character(bins))) %>%
  group_by(key, bins) %>%
  tally
p = ggplot(d, aes(xmin=bins - 0.048, xmax=bins + 0.048, ymin=0, ymax=n, frame=key)) +
  geom_rect() +
  scale_y_continuous(limits=c(0, max(d$n)))
gganimate(p)
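Since the data in the question is already binned (bins.left, bins.right, one count column per image), the same geom_rect idea can be applied to it directly after reshaping. The sketch below is an assumption-laden illustration, not a tested recipe for the real data: the toy data frame wide and the names image and count are hypothetical, and it assumes gganimate 1.0 or later, where the frame aesthetic and gganimate() were replaced by transition_*() functions and animate().

```r
library(tidyr)
library(ggplot2)
library(gganimate)  # assumption: gganimate >= 1.0 (transition_*() API)

# Hypothetical pre-binned data in the question's layout
wide <- data.frame(
  bins.left  = c(0, 64, 128, 192),
  bins.right = c(64, 128, 192, 256),
  hist1 = c(10, 20, 5, 1),
  hist2 = c(3, 12, 18, 7),
  hist3 = c(0, 4, 22, 15)
)

# Long format: one row per (bin, image)
long <- pivot_longer(wide, cols = starts_with("hist"),
                     names_to = "image", values_to = "count")

# Keep frames in the original column order (otherwise "hist10" sorts before "hist2")
long$image <- factor(long$image, levels = setdiff(names(wide), c("bins.left", "bins.right")))

p <- ggplot(long, aes(xmin = bins.left, xmax = bins.right,
                      ymin = 0, ymax = count)) +
  geom_rect() +
  transition_manual(image)  # one frame per image

# animate(p)  # renders the frames; requires an installed renderer
```

With ~16000 bins and ~400 images the long table would have about 6.4 million rows, which is large but usually still workable in memory; whether rendering that many bars per frame is fast enough is a separate question.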
In response to your comment, I don't think you can use gganimate with data in wide format. gganimate requires a single frame column, which requires data in long format. However, gganimate is a wrapper around the animation package, and you can create an animated GIF file directly with a for loop and the saveGIF function:
library(animation)
saveGIF({
  for (i in names(dat)) {
    p = ggplot(dat, aes_string(i)) +
      geom_histogram(breaks=seq(0,1,0.1))
    print(p)
  }
}, movie.name="test.gif")
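The same loop idea can be adapted to data that is already binned and kept in wide format, as in the question. The following is a minimal sketch under assumptions: the toy data frame wide and the helper names hist_cols, y_max and plots are hypothetical, and the .data pronoun is used in place of aes_string to select each column by name. A fixed y-axis limit keeps the bars comparable across frames.

```r
library(ggplot2)

# Hypothetical pre-binned data in the question's layout (wide format)
wide <- data.frame(
  bins.left  = c(0, 64, 128, 192),
  bins.right = c(64, 128, 192, 256),
  hist1 = c(10, 20, 5, 1),
  hist2 = c(3, 12, 18, 7),
  hist3 = c(0, 4, 22, 15)
)

hist_cols <- setdiff(names(wide), c("bins.left", "bins.right"))
y_max <- max(wide[hist_cols])  # shared y limit for all frames

# One plot per histogram column; lapply gives each plot its own `col`
plots <- lapply(hist_cols, function(col) {
  ggplot(wide, aes(xmin = bins.left, xmax = bins.right,
                   ymin = 0, ymax = .data[[col]])) +
    geom_rect() +
    scale_y_continuous(limits = c(0, y_max)) +
    ggtitle(col)
})

# To write the GIF (needs the animation package and its external converter):
# animation::saveGIF(for (p in plots) print(p), movie.name = "test.gif")
```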