I am building a Shiny application in which you can train a model. One feature is the ability to download the model object (in this case, a glm object) so that the user can use it later, outside of the application. The relevant part of my code looks as follows:
library(shiny)
library(car)

ui <- fluidPage(
  # What parameter do you wish to estimate
  selectInput(inputId = "dependent_variable",
              label = "Select dependent variable",
              choices = c("education",
                          "vocabulary")),
  # Download button for model
  downloadButton(outputId = "download_model", label = 'Download Model')
)

server <- function(input, output){
  strip_glm <- function(cm) {
    cm$y <- c()
    cm$model <- c()
    cm$residuals <- c()
    cm$fitted.values <- c()
    cm$effects <- c()
    cm$qr$qr <- c()
    cm$linear.predictors <- c()
    cm$weights <- c()
    cm$prior.weights <- c()
    cm$data <- c()
    cm$family$variance <- c()
    cm$family$dev.resids <- c()
    cm$family$aic <- c()
    cm$family$validmu <- c()
    cm$family$simulate <- c()
    attr(cm$terms,".Environment") <- c()
    attr(cm$formula,".Environment") <- c()
    return(cm)
  }

  reactive_glm_model <- reactive(glm(paste0(input$dependent_variable, "~."), data = Vocab))
  stripped_glm <- reactive(strip_glm(reactive_glm_model()))
  stripped_glm_summary <- reactive(summary(reactive_glm_model()))

  output$download_model <- downloadHandler(
    filename = function() {
      "report.Rd"
    },
    content = function(file) {
      glm_object <- stripped_glm()
      glm_summary <- stripped_glm_summary()
      save(glm_object, glm_summary, file = file)
    }
  )
}

shinyApp(ui, server)
I use the strip_glm() function because I don't want the glm object to be too big and carry unnecessary data. It should only be able to predict. However, after stripping the glm object, summary() no longer works on it, so I'd like to return the summary as well.
So here is my problem: when I download the objects, there still seems to be some 'hidden' content making the file too big. In this reprex the file is 16.2 MB, whereas if I load the objects back into memory, I find that their actual size is far smaller:
load("report.Rd")
object.size(glm_object) # 22 kB
object.size(glm_summary) # 2.5 MB
What is going on here? In the models I am using, my data potentially has millions of rows, which makes the saved file several GB in size, and downloading it takes ages.
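For anyone trying to reproduce the mismatch: object.size() does not count environments attached to an object, while save() and serialize() do write them out. The following is a minimal sketch of that difference, not a confirmed diagnosis of my case. The names fit_in_function and big_unrelated are made up for illustration, and the model is fitted inside a plain function rather than a Shiny reactive.

```r
# Fit a small model inside a function, so that the formula captures the
# function's environment, including a large object the model never uses.
fit_in_function <- function() {
  big_unrelated <- rnorm(1e6)  # roughly 8 MB, hypothetical unrelated data
  df <- data.frame(x = rnorm(100), y = rnorm(100))
  glm(y ~ x, data = df)
}

fit <- fit_in_function()

# object.size() ignores environments referenced by the object
in_memory <- as.numeric(object.size(fit))

# serialize() (which save() relies on) follows those environments
serialized <- length(serialize(fit, NULL))

# Remove every reference to the captured environment, then measure again
fit$model <- NULL
attr(fit$terms, ".Environment") <- NULL
attr(fit$formula, ".Environment") <- NULL
serialized_stripped <- length(serialize(fit, NULL))

print(c(in_memory = in_memory,
        serialized = serialized,
        serialized_stripped = serialized_stripped))
```

In this sketch the serialized size should drop sharply once the environment references are removed. Note that in my app the summary is computed from the unstripped model, so it may still carry such references.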
UPDATE
It seems to be related to the R version or underlying settings. In the setup above, where I do encounter the problem, I use:
platform        x86_64-redhat-linux-gnu
arch            x86_64
os              linux-gnu
system          x86_64, linux-gnu
status
major           3
minor           5.2
year            2018
month           12
day             20
svn rev         75870
language        R
version.string  R version 3.5.2 (2018-12-20)
nickname        Eggshell Igloo
Unfortunately, I am not able to update the version of R due to policy constraints.
UPDATE II
It seems the problem is not related to R or shiny, and it is not reproducible on different platforms.
Colleague here. We run this code with RStudio Server, which seems to be causing the problem. Running the reprex with R itself (but still on the same server using the same R executable), bypassing RStudio, fixes the issue and the downloaded R object is a little over 2 MB.
No idea why using RStudio is messing things up, though. The version used is RStudio Server (Pro) 1.2.5001-3
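To narrow this down, one option is to run the same measurement in both setups (RStudio Server and plain R) and compare which component of the object carries the extra weight. The helper below is only a sketch: serialized_sizes is a hypothetical name, and the mtcars model is a stand-in for the real glm_object or glm_summary.

```r
# Serialized size in bytes of each component of a list-like object,
# largest first. serialize() includes attached environments, so a
# component that drags an environment along will stand out here.
serialized_sizes <- function(obj) {
  sizes <- vapply(
    obj,
    function(component) as.numeric(length(serialize(component, NULL))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Stand-in model; replace with glm_object or glm_summary after load()
fit <- glm(mpg ~ wt, data = mtcars)
sizes <- serialized_sizes(fit)
print(head(sizes))
```

If a component is much larger in one setup than in the other, that component (and whatever environment it references) would be the place to look.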