
pdf_text function not releasing RAM (on Windows)


pdf_text() is not releasing RAM. Each time the function runs, it uses more RAM and doesn't free it until the R session is terminated. I am on Windows.

Minimal example

library(pdftools)

# This takes ~60 seconds and uses ~500 MB of RAM, which is then
# unavailable for other processes
for (i in 1:5) {
  print(i)
  pdf_text("https://cran.r-project.org/web/packages/spatstat/spatstat.pdf")
}

My question

Why is pdf_text() using so much memory, and how can it be freed up (without having to terminate the R session)?

What I've tried so far

  • I have tried calling gc() inside the loop (all three attempts are sketched together below)

  • I have checked that pdf_text() isn't creating any hidden objects (by inspecting ls(all = TRUE))

  • I have cleared the R session's temp files
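
Here is a sketch of those three attempts in one place, mirroring the minimal example above; none of them released the memory:

library(pdftools)

for (i in 1:5) {
  print(i)
  pdf_text("https://cran.r-project.org/web/packages/spatstat/spatstat.pdf")
  gc()  # attempt 1: force garbage collection inside the loop
}

ls(all.names = TRUE)                              # attempt 2: look for hidden objects
unlink(list.files(tempdir(), full.names = TRUE))  # attempt 3: clear the session's temp files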

Also note

Although the particular PDF in the example above is only about 5 MB, calling pdf_text() on it uses roughly 20 times that much RAM, and I am not sure why.
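
As a rough way to see this, the sketch below (base R only: download.file, file.size, object.size) compares the on-disk size of the PDF with the size of the text object that R actually gets back; the returned character vector is nowhere near the RAM the call consumes, which suggests the extra memory is not held in ordinary R objects:

library(pdftools)

pdf_file <- tempfile(fileext = ".pdf")
download.file(
  "https://cran.r-project.org/web/packages/spatstat/spatstat.pdf",
  pdf_file, mode = "wb"
)

file.size(pdf_file) / 1e6               # size of the PDF on disk, in MB (~5)
txt <- pdf_text(pdf_file)
print(object.size(txt), units = "MB")   # size of the extracted text returned to R
gc()                                    # R's own memory accounting after the call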


Solution

  • This sounds like a memory leak. However, I cannot reproduce the problem on macOS.

    I have started an issue to track this. Can you please report which versions of pdftools and libpoppler you are using that show this behavior?
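
    A sketch of how to look those up, assuming pdftools is installed; poppler_config() is the pdftools helper that reports the libpoppler build it links against:

    packageVersion("pdftools")            # version of the pdftools R package
    pdftools::poppler_config()$version    # version of the underlying libpoppler
    sessionInfo()                         # OS and R version, for the issue report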