How to split a table across 2 pages in R Markdown using datasummary?


I am trying to create a table with datasummary in R Markdown. The table has many variables, so it is too long to fit on one page. Please see the example code below:

---
title: "R Notebook"
output:
  pdf_document: default
  html_notebook: default
  html_document:
    df_print: paged
---

Table 1 example:

```{r, warning=FALSE, message=FALSE, echo=FALSE, include=FALSE, fig.pos="H"}
library(magrittr)
library(tidyverse)
library(kableExtra)
library(readxl)
library(modelsummary)
library(scales)

tmp <- mtcars

tmp$test2 <- tmp$mpg
tmp$test3 <- tmp$cyl
tmp$test4 <- tmp$disp
tmp$test5 <- tmp$hp
tmp$test6 <- tmp$drat
tmp$test7 <- tmp$wt
tmp$test8 <- tmp$qsec
tmp$test9 <- tmp$vs
tmp$test10 <- tmp$am
tmp$test11 <- tmp$gear
tmp$test12 <- tmp$carb
tmp$test13 <- tmp$mpg
tmp$test14 <- tmp$cyl
tmp$test15 <- tmp$disp
tmp$test16 <- tmp$hp
tmp$test17 <- tmp$drat
tmp$test18 <- tmp$wt
tmp$test19 <- tmp$qsec
tmp$test20 <- tmp$vs
tmp$test21 <- tmp$am
tmp$test22 <- tmp$gear
tmp$test23 <- tmp$mpg
tmp$test24 <- tmp$cyl
tmp$test25 <- tmp$disp
tmp$test26 <- tmp$hp
tmp$test27 <- tmp$drat
tmp$test28 <- tmp$wt
tmp$test29 <- tmp$qsec
tmp$test30 <- tmp$vs
tmp$test31 <- tmp$am
tmp$test32 <- tmp$gear
tmp$test33 <- tmp$mpg
tmp$test34 <- tmp$cyl
tmp$test35 <- tmp$disp
tmp$test36 <- tmp$hp
tmp$test37 <- tmp$drat
tmp$test38 <- tmp$wt
tmp$test39 <- tmp$qsec
tmp$test40 <- tmp$vs
  
# create a list with individual variables
# remove missing and rescale
tmp_list <- lapply(tmp, na.omit)
tmp_list <- lapply(tmp_list, scale)

# create a table with `datasummary`
# add a boxplot with column_spec and spec_boxplot
# add a histogram with column_spec and spec_hist
emptycol <- function(x) " "
final_4_table <- datasummary(
  mpg + cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb +
    test2 + test3 + test4 + test5 + test6 + test7 + test8 + test9 + test10 +
    test11 + test12 + test13 + test14 + test15 + test16 + test17 + test18 + test19 + test20 +
    test21 + test22 + test23 + test24 + test25 + test26 + test27 + test28 + test29 + test30 +
    test31 + test32 + test33 + test34 + test35 + test36 + test37 + test38 + test39 + test40 ~
    N + Mean + SD + Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol,
  data = tmp,
  title = "Title of my table.",
  notes = list("note2", "note1")
) %>%
  column_spec(column = 5, image = spec_boxplot(tmp_list)) %>%
  column_spec(column = 6, image = spec_hist(tmp_list))

```


```{r finaltable, echo=FALSE}
final_4_table
```

The output does not fit on one page, and the remaining variables are cut off instead of continuing on a second page.

How can I make sure the table in this example is split in two, continuing on the following page under a title that says "Title of my table. (continued)"?

I checked the tables package vignette and added the longtable options shown below, but the output is the same:

final_4_table <- datasummary(
  mpg + cyl + disp + hp + drat + wt + qsec + vs + am + gear + carb +
    test2 + test3 + test4 + test5 + test6 + test7 + test8 + test9 + test10 +
    test11 + test12 + test13 + test14 + test15 + test16 + test17 + test18 + test19 + test20 +
    test21 + test22 + test23 + test24 + test25 + test26 + test27 + test28 + test29 + test30 +
    test31 + test32 + test33 + test34 + test35 + test36 + test37 + test38 + test39 + test40 ~
    N + Mean + SD + Heading("Boxplot") * emptycol + Heading("Histogram") * emptycol,
  data = tmp,
  title = "Title of my table.",
  notes = list("note2", "note1"),
  options = list(tabular="longtable",
          toprule="\\caption{This table crosses page boundaries.}\\\\
\\toprule", midrule="\\midrule\\\\[-2\\normalbaselineskip]\\endhead\\hline\\endfoot")) %>%
    column_spec(column = 5, image = spec_boxplot(tmp_list)) %>%
    column_spec(column = 6, image = spec_hist(tmp_list))

Solution

  • @manro put you on the right track here. The trick is to use the longtable=TRUE argument.

    Behind the scenes, modelsummary calls the kbl() function from the kableExtra package to produce the table. Any argument that datasummary() does not use itself is passed on to kbl() through the ... ellipsis. This means that you can set the longtable argument directly in the datasummary() call. For example:

    datasummary(mpg + hp ~ Mean + SD, data = mtcars, longtable = TRUE)
    

    The above is a minimal example. I passed the longtable=TRUE argument in the same way in your more complex example, and your R Markdown document compiled without problems on my machine.
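
The question also asks for a "(continued)" caption on the second page. One way to get it is to combine longtable = TRUE with the repeat_header option of kableExtra's kable_styling(). The following is a minimal sketch under stated assumptions, not a tested rewrite of the full example. It assumes the kableExtra backend, requested explicitly with output = "kableExtra" because newer modelsummary versions may default to a different backend. It also uses All() and renamed copies of the mtcars columns (the "dup" names are hypothetical, chosen only to make the table long).

```r
library(modelsummary)
library(kableExtra)

# Illustrative data: mtcars plus three renamed copies of its columns,
# giving 44 numeric variables (the "dupN_" names are made up for this sketch).
tmp <- mtcars
for (i in 1:3) {
  copy <- mtcars
  names(copy) <- paste0("dup", i, "_", names(mtcars))
  tmp <- cbind(tmp, copy)
}

final_table <- datasummary(
  All(tmp) ~ N + Mean + SD,   # All() selects every numeric column of tmp
  data = tmp,
  title = "Title of my table.",
  output = "kableExtra",      # assumption: the kableExtra backend is used
  longtable = TRUE            # forwarded to kableExtra::kbl() through ...
) %>%
  # Repeat the header row on every page; by default kableExtra appends
  # "(continued)" to the caption on the following pages.
  kable_styling(latex_options = "repeat_header")
```

Printing final_table in a chunk renders the table. The page break and the repeated header only apply to PDF output, and the column_spec() calls for the boxplot and histogram columns from the question can be appended to the pipe in the same way.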