Search code examples
sqlrfunctionr-markdownodbc

Render R Markdown HTML file inside a function issue


I am working through an ODBC connection to a SQL server to query data and build reports. Right now I have a script that works exactly as it should to render the .Rmd files and place it in the folder, but I'm working on a more streamlined way to do it through a function instead.

The basis of the function goes as followed:

RenderFxn = function(RMDfile, OutputName){

source = "path/to/scripts/"
output = "path/to/output/location/"

rmarkdown::render(input = paste0(source,RMDfile,".Rmd"),
                  output_file = paste0(output,OutputName,"_",DATE))

}

RenderFxn("CreateReport","ReportOutput")

It works if I go line by line, but as soon as I call it through the function I get an error. The error occurs at the first chunk of code that connects to the SQL server through the ODBC connection. But again, it works fine if i don't call it through a function.

Does anyone have any ideas why this is?

Thank you!


Solution

  • There are several problems here, I'm inferring most of them lacking code. Here's a stab:

    1. DATE is undefined, error.

    2. I am going to guess that the Rmarkdown document is unable to query the data you need. Note that the rendering process is in a clean environment, so if you have objects such as conn, mydata, param1, and yes, even DATE in the global (calling) environment, you need to pass almost all of them via the render(.., params=list(...)) argument. See https://bookdown.org/yihui/rmarkdown/parameterized-reports.html.

      The notable exception to that is that conn (whether from DBI::dbConnect or RODBC:: something) cannot be passed here, as well as similar objects that contain a memory <pointer>. To fix this, you'll need to pass the parameters you passed to DBI::dbConnect and recreate the connection object inside of the Rmarkdown document.

    3. (minor) Use file.path instead of paste0 when forming the directory components of a path, it's far less likely that you'll forget the trailing slash.

    4. Your function should accept more arguments, since it is breaking scope and trying to find things it has no guarantee of finding. In general, functions should be explicitly passed what it needs. There are obviously exceptions to this, but without explicit arguments, your function is not strictly reproducible, troubleshooting it is a pain, and other users/readers have no clue what is required, what should be found, what they should look like, etc. (By "users", I'm including you in six months after you've forgotten the details.)

      I suggest a function like this:

      RenderFunc <- function(rmd, outputname, ...,
                             date = Sys.Date(),
                             source = "path/to/scripts/",
                             outputpath = "path/to/output/location/") {
        params <- c(list(...), date = date)
        if (!endsWith(rmd, ".Rmd")) rmd <- paste0(rmd, ".Rmd")
        outputname <- paste(OutputName, date, sep = "_")
        rmarkdown::render(input = file.path(source, rmd),
                          output_file = file.path(output, outputname),
                          params = params)
      }
      RenderFunc("CreateReport", "ReportOutput", data=mydata,
                 dbuser="myuser", dbpass="mypass", dbhost="somehost")
      

      where in this case, data= is a parameter defined in the rmarkdown document, and the db* variables will be passed within the list(...) part of params so that the Rmd can instantiate the connection.

    A living/breathing example of the target Rmarkdown document:

    ---
    title: My Document
    output: html_document
    params:
      date: !r Sys.Date()
      data:
      dbuser:
      dbpass:
      dbhost:
    ---
    
    ```{r setup, echo = FALSE, include = FALSE}
    library(DBI)
    conn <- dbConnect(uid = params$dbuser, pwd = params$dbpass, server = params$server)
    ```
    
    # Some Header
    
    ```{r block1}
    remotedata <- dbGetQuery(conn, "select ...")
    # do something here
    ```
    

    In this example, I'm assuming you are passing data= (some form of frame, perhaps) as well as querying from the database. Whichever you want is up to you.

    I'm generally not a fan of putting DBI code in the rmarkdown document; it works and there are many use-cases when it makes absolute sense, but sometimes it can be much simpler for the Rmd to just deal with data, not with connections and queries and such. For this, you might modify the func to be:

    RenderFunc2 <- function(rmd, outputname, ...,
                            dbuser, dbpass, dbhost,
                            date = Sys.Date(),
                            source = "path/to/scripts/",
                            outputpath = "path/to/output/location/") {
      params <- c(list(...), date = date)
      if (!endsWith(rmd, ".Rmd")) rmd <- paste0(rmd, ".Rmd")
      outputname <- paste(OutputName, date, sep = "_")
      conn <- DBI::dbConnect(uid=dbuser, pwd=dbpass, server=dbhost)
      on.exit(dbDisconnect(conn))
      mydata <- DBI::dbGetQuery(conn, "select ...")
      params$data <- mydata
      rmarkdown::render(input = file.path(source, rmd),
                        output_file = file.path(output, outputname),
                        params = params)
    }
    RenderFunc("CreateReport", "ReportOutput", data=mylocaldata,
               dbuser="myuser", dbpass="mypass", dbhost="somehost")
    

    In this case, the db* arguments are used solely by RenderFunc and not passed to the Rmd document (which would not be trying to create a db connection).