r caching plot r-markdown

In R markdown, how do I prevent plots from non-cached chunks from being saved separately?


When knitting an R markdown file, the plots produced by any chunk with cache=TRUE are saved as separate image files alongside the HTML output. This makes sense to me. However, if even a single chunk has the cache=TRUE option set, all chunks, including those with cache=FALSE, have their plots saved as separate files. For example, the following code saves image files for both chunks:

---
title: "Cache Plot Test"
output:
  html_document:
    df_print: paged
---

```{r test_plot1, cache = FALSE}
library(ggplot2)
ggplot(airquality, aes(x = Temp, y = Wind)) +
  geom_point()
```

```{r test_plot2, cache = TRUE}
library(ggplot2)
ggplot(airquality, aes(x = Month, y = Ozone)) +
  geom_point()
```

Is there any way to prevent this if someone wants to implement caching on particular chunks but doesn't want to independently save every single plot in the output? If there isn't such an option and this is by design, what's the rationale? Why would it be necessary to save the plots from chunks that don't implement caching?


Solution

  • The plots are always written out to a file. You can see that for the cached block, the image is not modified when you re-knit the document, but the image in the non-cached block is rewritten (check the modified dates). R isn't re-running the code that generates the image for the cached block. If you don't have any caching enabled, rmarkdown will "clean" up after it has run and delete all images. But because rmarkdown doesn't track side effects on a per-block level, when caching is enabled it can't clean up after itself anymore, because it doesn't know which images came from which block. So it keeps them all to be safe.
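
One way to check the modified dates is with a small helper like the one below. This is only a sketch: `figure_mtimes` is a made-up name, and the folder `cache_plot_test_files/figure-html` assumes the document was saved as `cache_plot_test.Rmd` (the question doesn't give the file name), so adjust the path to match your own file.

```r
# Hypothetical helper: list the image files in a figure folder together
# with their last-modified times.
figure_mtimes <- function(fig_dir) {
  figs <- list.files(fig_dir, full.names = TRUE)
  info <- file.info(figs)
  data.frame(
    file = basename(figs),
    modified = info$mtime,
    stringsAsFactors = FALSE
  )
}

# Assumed path: for html_document output the figures usually land in
# "<input name>_files/figure-html". Run this after each knit and compare.
figure_mtimes(file.path("cache_plot_test_files", "figure-html"))
```

If you knit twice and run this after each knit, the file for test_plot1 (not cached) should show a newer timestamp the second time, while the file for test_plot2 (cached) should keep its original one.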