
Data.frame header on every PDF page (R - grid & gridExtra)


I'm trying to print a 100-row data frame to a PDF that spans multiple pages. Following the approach proposed by baptiste, this can be done with the {grid} and {gridExtra} packages.

However, only the first page displays the header; it is omitted on all the others. Is there a way to print the PDF so that every page displays the header?

Thanks, Dan

Adaptation of baptiste's answer:

library(magrittr)
library(gridExtra)
library(grid)

set_multipage_pdf <- function(df, paper_size = 'a4', margin = 1.30, landscape = FALSE, page_cex = 1, table_cex = 1) {
  sizes <- list(a4 = c(larger_size = 29.70, smaller_size = 21.00),
                a3 = c(larger_size = 42.00, smaller_size = 29.70),
                letter = c(larger_size = 27.94, smaller_size = 21.59),
                executive = c(larger_size = 26.67, smaller_size = 18.41))

  if (landscape) {
    paper_height <- sizes[[paper_size]]['smaller_size'] * page_cex
    paper_width <- sizes[[paper_size]]['larger_size'] * page_cex
  } else {
    paper_height <- sizes[[paper_size]]['larger_size'] * page_cex
    paper_width <- sizes[[paper_size]]['smaller_size'] * page_cex
  }

  tg <- df %>%
    tableGrob(
      rows = seq_len(nrow(df)),
      theme = ttheme_default(
        base_size = 5 * table_cex
      )
    )

  fullheight <- convertHeight(sum(tg$heights), "cm", valueOnly = TRUE)
  page_margin <- unit(margin, "cm")
  margin_cm <- convertHeight(page_margin, "cm", valueOnly = TRUE)
  freeheight <- paper_height - margin_cm
  npages <- ceiling(fullheight / freeheight)
  nrows <- nrow(tg)

  heights <- convertHeight(tg$heights, "cm", valueOnly = TRUE)
  rows <- cut(cumsum(heights), include.lowest = FALSE,
              breaks = c(0, cumsum(rep(freeheight, npages))))

  groups <- split(seq_len(nrows), rows)

  gl <- lapply(groups, function(id) tg[id,])

  for(page in seq_len(npages)) {
    if(page > 1) grid.newpage()
    grid.draw(gl[[page]])
  }

}

# TEST:
df <- iris[sample(nrow(iris), 187, TRUE),]

pdf('test.pdf', paper = 'a4', width = 0, height = 0)
set_multipage_pdf(df, table_cex = 1.5)
dev.off()

Result: Model by baptiste


Solution

  • The trick is to build a separate table for each page from a subset of the original data frame, so that every page gets its own header. The difficulties lie in adjusting the row heights to leave room for the extra header row on each page, and in preserving the row numbering across pages.

    Anyway, here is the updated function to accomplish that:

    library(magrittr)
    library(gridExtra)
    library(grid)
    
    set_multipage_pdf <- function(df, paper_size = 'a4', margin = 1.30, landscape = FALSE, 
                                  page_cex = 1, table_cex = 1) 
    {
      sizes <- list(a4 = c(larger_size = 29.70, smaller_size = 21.00),
                    a3 = c(larger_size = 42.00, smaller_size = 29.70),
                    letter = c(larger_size = 27.94, smaller_size = 21.59),
                    executive = c(larger_size = 26.67, smaller_size = 18.41))
    
      if (landscape) {
        paper_height <- sizes[[paper_size]]['smaller_size'] * page_cex
        paper_width  <- sizes[[paper_size]]['larger_size']  * page_cex
      } else {
        paper_height <- sizes[[paper_size]]['larger_size']  * page_cex
        paper_width  <- sizes[[paper_size]]['smaller_size'] * page_cex
      }
    
      tg <- df %>% tableGrob(rows  = seq_len(nrow(df)), 
                             theme = ttheme_default(base_size = 5 * table_cex))
    
      fullheight  <- convertHeight(sum(tg$heights), "cm", valueOnly = TRUE)
      page_margin <- unit(margin, "cm")
      margin_cm   <- convertHeight(page_margin, "cm", valueOnly = TRUE)
      freeheight  <- paper_height - margin_cm
      npages      <- ceiling(fullheight / freeheight)
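      # Count data rows only: the first entry of tg$heights is the header row,
      # so drop it to keep the indices aligned with df (index nrow(df) + 1 would yield an all-NA row).
      nrows       <- nrow(df)
      # Inflate the row heights so the total grows by roughly one header row per page.
      heights     <- convertHeight(tg$heights, "cm", valueOnly = TRUE)[-1] * (nrows + npages)/nrows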
      rows        <- cut(cumsum(heights), include.lowest = FALSE,
                         breaks = c(0, cumsum(rep(freeheight, npages))))
      groups      <- split(seq_len(nrows), rows)
    
      gl <- lapply(groups, function(id) 
                   {
                     df[id,] %>% 
                     tableGrob(rows = id, theme = ttheme_default(base_size = 5 * table_cex))
                   })
    
      for(page in seq_len(npages)){
        if(page > 1) grid.newpage()
        grid.draw(gl[[page]])
      }
    
    }
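
To make the height adjustment concrete, here is a minimal sketch of just the pagination step, using made-up numbers rather than heights measured from a real table. It assumes every row, header included, is 0.5 cm tall and that each page has 20 cm of usable height; `row_height`, `n_rows` and `scaled` are names chosen for this illustration only. It needs nothing beyond base R.

```r
# Hypothetical numbers for illustration only (not measured from a real tableGrob)
row_height <- 0.5    # cm; the header is assumed to be as tall as a data row
n_rows     <- 100    # number of data rows
freeheight <- 20     # cm of usable height per page

heights    <- rep(row_height, n_rows)
fullheight <- sum(heights) + row_height           # data rows plus one header
npages     <- ceiling(fullheight / freeheight)

# Inflate every row slightly so the total grows by about one header row per page
scaled <- heights * (n_rows + npages) / n_rows

# A row goes on the page in whose interval its cumulative height falls
page   <- cut(cumsum(scaled), include.lowest = FALSE,
              breaks = c(0, cumsum(rep(freeheight, npages))))
groups <- split(seq_len(n_rows), page)

# Number of data rows per page, and the height each page needs once
# its own header row is added back
rows_per_page <- lengths(groups)
page_heights  <- (rows_per_page + 1) * row_height

print(rows_per_page)
print(page_heights)
```

Each element of `groups` holds the original row indices for one page, which is what gets passed on as `rows = id`, so the numbering carries on from page to page. The scaling is only an approximation (it relies on the header being about as tall as a data row), so it is worth checking the output when the table nearly fills its last page. With the full function defined, the test call from the question (`pdf('test.pdf', paper = 'a4', width = 0, height = 0)`, then `set_multipage_pdf(df, table_cex = 1.5)`, then `dev.off()`) can be reused unchanged.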