
Controlling output while printing data.table with xtable


Here is a data.table with many rows:

library(data.table)
dt <- data.table(a = sample(letters, 1000, replace = TRUE), b = rnorm(1000))

We can view it simply by printing it:

dt

... which produces a convenient view showing only the first and last 5 rows.

However, when I print it with xtable in a knitr PDF report, xtable prints all 1000 rows:

print(xtable(dt))

Any ideas how to fix this?

I want a nicely typeset table from xtable, but truncated the way data.table prints by default.

A hack like rbind on the first and last five rows is possible, but not very elegant.
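
For reference, the rbind workaround mentioned above could look like the minimal sketch below. The names `n` and `trimmed` are illustrative, and `set.seed()` is added only to make the example reproducible. The drawback is visible in the output: the rows are labelled 1 to 10 rather than 1 to 5 and 996 to 1000, and nothing marks the gap.

```r
library(data.table)
library(xtable)

set.seed(1)  # only for reproducibility of this sketch
dt <- data.table(a = sample(letters, 1000, replace = TRUE), b = rnorm(1000))

# Keep only the first and last n rows (n is an illustrative name)
n <- 5
trimmed <- rbind(head(dt, n), tail(dt, n))

# Emits a 10-row LaTeX table; the original row numbers are lost
print(xtable(trimmed))
```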


Solution

  • Here is a modified version of print.data.table that returns the formatted object rather than printing it:

    firstLast <- function(x, ...) {
        UseMethod("firstLast")
    }
    
    firstLast.data.table  <- function (x,
                                       topn = getOption("datatable.print.topn"),
                                       nrows = getOption("datatable.print.nrows"),
                                       row.names = TRUE, ...) {
        if (!is.numeric(nrows)) 
            nrows = 100L
        if (!is.infinite(nrows)) 
            nrows = as.integer(nrows)
        if (nrows <= 0L) 
            return(invisible())
        if (!is.numeric(topn)) 
            topn = 5L
        topnmiss = missing(topn)
        topn = max(as.integer(topn), 1L)
        if (nrow(x) == 0L) {
            if (length(x) == 0L) 
                return("Null data.table (0 rows and 0 cols)\n")
            else return(paste("Empty data.table (0 rows) of ", length(x), 
                " col", if (length(x) > 1L) 
                    "s", ": ", paste(head(names(x), 6), collapse = ","), 
                if (ncol(x) > 6) 
                    "...", "\n", sep = ""))
        }
        if (topn * 2 < nrow(x) && (nrow(x) > nrows || !topnmiss)) {
            toprint = rbind(head(x, topn), tail(x, topn))
            rn = c(seq_len(topn), seq.int(to = nrow(x), length.out = topn))
            printdots = TRUE
        }
        else {
            toprint = x
            rn = seq_len(nrow(x))
            printdots = FALSE
        }
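        # internal data.table method, accessed via ':::'
        toprint = data.table:::format.data.table(toprint, ...)
        # blank row names must match nrow(toprint), which may be only 2 * topn
        if (isTRUE(row.names)) 
            rownames(toprint) = paste(format(rn, right = TRUE), ":", 
                sep = "")
        else rownames(toprint) = rep.int("", nrow(toprint))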
        if (is.null(names(x))) 
            colnames(toprint) = rep("NA", ncol(toprint))
        if (printdots) {
            toprint = rbind(head(toprint, topn), `---` = "", tail(toprint, 
                topn))
            rownames(toprint) = format(rownames(toprint), justify = "right")
            return(toprint)
        }
        if (nrow(toprint) > 20L) 
            toprint = rbind(toprint, matrix(colnames(toprint), nrow = 1))
        return(toprint)
    }
    

    This can be used to prepare a large data.table for formatting by xtable:

    library(xtable)
    xtable(firstLast(dt))
    % latex table generated in R 3.1.2 by xtable 1.7-4 package
    % Tue Feb 17 20:15:12 2015
    \begin{table}[ht]
    \centering
    \begin{tabular}{rll}
      \hline
     & a & b \\ 
      \hline
       1: & i & -0.6356429 \\ 
         2: & w & -1.1533783 \\ 
         3: & r & -0.7459959 \\ 
         4: & x &  1.5646809 \\ 
         5: & o & -1.8158744 \\ 
        --- &  &  \\ 
       996: & z & -1.0835897 \\ 
       997: & a &  0.9219506 \\ 
       998: & q &  0.3388118 \\ 
       999: & l & -1.7123250 \\ 
      1000: & l &  0.1240633 \\ 
       \hline
    \end{tabular}
    \end{table}
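
The same idea can be sketched more compactly without relying on data.table internals. `head_tail_matrix` is a hypothetical helper, not part of data.table or xtable; it formats values with base R `format()`, so the number formatting may differ from data.table's own printing, and it does not honour the `datatable.print.*` options.

```r
library(data.table)
library(xtable)

# Hypothetical helper: returns a character matrix holding the first and
# last n rows of x, with the original row numbers as row names and a
# "---" separator row in between.
head_tail_matrix <- function(x, n = 5L) {
  stopifnot(is.data.frame(x), n >= 1L)
  df <- as.data.frame(x)
  if (nrow(df) <= 2L * n) {
    # Small table: keep every row, no separator needed
    idx <- seq_len(nrow(df))
    m <- as.matrix(format(df))
    rownames(m) <- sprintf("%d:", idx)
    return(m)
  }
  idx <- c(seq_len(n), seq.int(to = nrow(df), length.out = n))
  m <- as.matrix(format(df[idx, , drop = FALSE]))
  rownames(m) <- sprintf("%d:", idx)
  rbind(m[seq_len(n), , drop = FALSE],
        `---` = "",
        m[n + seq_len(n), , drop = FALSE])
}

set.seed(42)  # only for reproducibility of this sketch
dt <- data.table(a = sample(letters, 1000, replace = TRUE), b = rnorm(1000))

tab <- head_tail_matrix(dt)
print(xtable(tab))
```

In a knitr document, the chunk that prints the table needs the `results='asis'` option so that the generated LaTeX is passed through to the report verbatim.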