I am facing an out-of-memory issue in R when I try to load multiple tables and render them with DT in Shiny.
I am wondering if it is possible to provide only the table structure (e.g. number of rows and column names) to DT, pre-load the first N rows to display in the app, and then load the next N rows whenever the user clicks to another page (with pagination enabled). I found that DT has a dataTableAjax() function that returns an Ajax URL which DT can then query (I am not sure exactly how it works; see the sketch below).
The original DataTables JS library has a similar feature (if I am not mistaken), as in https://datatables.net/examples/server_side/defer_loading.html
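From what I can tell, the documented usage of dataTableAjax() (as in DT's server-side processing examples) looks roughly like the sketch below; mtcars is just a placeholder dataset here, and I am not sure how to adapt this to data that is not fully loaded in memory:
library(shiny)
library(DT)
shinyApp(
  ui = fluidPage(DT::dataTableOutput('tbl')),
  server = function(input, output, session) {
    output$tbl = DT::renderDataTable({
      # create an Ajax URL that serves the data, then point the table at it
      url <- dataTableAjax(session, mtcars, outputId = 'tbl')
      datatable(mtcars, rownames = FALSE, options = list(
        ajax = list(url = url), serverSide = TRUE, processing = TRUE))
    }, server = FALSE)
  }
)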
Here is what I have tried so far (the Ajax part is where I am stuck):
sample_table <- data.frame(a = rnorm(1e7), b = rnorm(1e7), c = rnorm(1e7))
library(fst)
library(shiny)
library(DT)
library(data.table)
# write the large data on disk
write_fst(sample_table, "sample_table.fst")
# how to load the data on disk on demand via Ajax?
shinyApp(
  ui = fluidPage(
    title = 'Server-side processing of DataTables',
    fluidRow(
      DT::dataTableOutput('tbl')
    )
  ),
  server = function(input, output, session) {
    # create a widget using an Ajax URL created above
    tbl_ajax_url <- reactiveVal({
      dataTableAjax(
        session,
        read_fst("sample_table.fst", from = 1, to = 100, as.data.table = TRUE),
        outputId = 'tbl')
    })
    observeEvent(input$tbl_rows_current, {
      rows <- input$tbl_rows_current
      tbl_ajax_url(dataTableAjax(
        session,
        # random access with fst: only load the required rows when the user clicks a page
        read_fst("sample_table.fst", from = min(rows), to = max(rows),
                 as.data.table = TRUE),
        outputId = 'tbl'))
    })
    output$tbl = DT::renderDataTable({
      datatable(data.table(
        a = numeric(), b = numeric(), c = numeric(),
        check.names = FALSE), rownames = FALSE, options = list(
          ajax = list(
            serverSide = TRUE, processing = TRUE,
            # not sure how to do this part, where the URL only returns part of the data
            url = tbl_ajax_url()
          )
        ))
    })
  }
)
If you have any other suggestions, please let me know as well. My primary objective is to avoid loading all the tables into R at once, and instead load them partially, on demand.
PS: I am not familiar with HTML, CSS or JS, so please be patient and provide as many details as possible. Thanks in advance!
I have figured out a solution myself, and I am posting it here in case somebody is interested.
Using the funcFilter argument of renderDataTable()/renderDT(), we can create a new data source and render it with DT. I created an on-disk data source that only reads the data saved on disk when it is needed, using fst.
Code:
sample_table <- data.frame(a = rnorm(1e7), b = rnorm(1e7), c = rnorm(1e7))
library(fst)
library(shiny)
library(DT)
library(data.table)
# write large data on disk
write_fst(sample_table, "sample_table.fst")
# custom funcFilter that reads only the requested page from disk with fst;
# it must be defined before the app is launched
dataTablesFilterOnDisk <- function(data, params) {
  start <- as.integer(params$start)
  length <- as.integer(params$length)
  total_rows <- fst::metadata_fst("sample_table.fst")$nrOfRows
  # mimic DT's internal cleaning: escape HTML and strip names/dimensions
  cleanDataFrame <- function(x, escape = params$escape) {
    if (escape != "false") {
      k = seq_len(ncol(x))
      if (escape != "true") {
        k = k[as.integer(strsplit(escape, ",")[[1]])]
      }
      for (j in k) if (is.character(x[, j]) || is.factor(x[, j]))
        x[, j] = htmltools::htmlEscape(x[, j])
    }
    x = unname(x)  # remove column names
    if (!is.data.frame(x)) return(x)
    for (j in seq_len(ncol(x))) {
      xj = x[, j]
      xj = unname(xj)  # remove names
      dim(xj) = NULL   # drop dimensions
      if (is.table(xj)) xj = c(xj)  # drop the table class
      x[[j]] = xj
    }
    unname(x)
  }
  # random access: read only the rows of the current page (clamped to the table size)
  row_range <- c(start + 1L, min(start + length, total_rows))
  data <- fst::read_fst("sample_table.fst", columns = colnames(data),
                        from = row_range[1L], to = row_range[2L])
  list(draw = as.integer(params$draw),
       recordsTotal = total_rows,
       recordsFiltered = total_rows,
       data = cleanDataFrame(data),
       DT_rows_all = NULL,
       DT_rows_current = seq.int(row_range[1L], row_range[2L], by = 1L))
}
shinyApp(
  ui = fluidPage(
    title = 'Server-side processing of DataTables',
    fluidRow(
      DT::dataTableOutput('tbl')
    )
  ),
  server = function(input, output, session) {
    # render an empty table with the right structure; the actual rows
    # are supplied page by page by dataTablesFilterOnDisk
    output$tbl = DT::renderDataTable({
      datatable(data.frame(
        a = numeric(), b = numeric(), c = numeric(),
        check.names = FALSE), rownames = FALSE)
    }, funcFilter = dataTablesFilterOnDisk)
  }
)