r metaprogramming pipeline tidyverse variable-names

Threading the needle: finding the name of the actual argument corresponding to a formal of an outer function


The function strip() below tries to print a brief report on the result of its operation via the magrittr tee pipe (%T>%). This function is in turn handed to a wrapper function and then to purrr::pwalk, which supplies it with a set of dataframes one by one. I want a report of its operation on each dataframe along with the dataframe's name, that is, the name of the actual dataframe supplied for the formal argument tib in the function below. In the example supplied, this would be "tst_df". I don't know the names before running the function, as they are constructed from filenames read from disk and various other inputs.

Somewhat to my surprise, I have almost all of this working, except for getting the name of the supplied dataframe. In the example below, the code that is supposed to do this is enexpr(XX). I have also tried expr(XX), and both of these applied to tib or the dot (.), with and without a preceding !!. I also tried deparse(substitute()) on XX, tib, and ., without the !!.

I see that the name is stripped initially by pass-by-value, and then again, maybe, by each stage of the pipe, including the tee (%T>%), and again, maybe, by (XX = .) in the anonymous function after the tee. But I know R + tidyverse will have a way. I just hope it does not involve providing an integer to count backwards up the call stack.

library(tidyverse)
library(magrittr)  # provides the tee pipe %T>%

tst_df <- tibble(A = 1:10, B = 11:20, C = 21:30, D = 31:40)
tst_df
################################################################################
# The strip function expects a non-anonymous dataframe, from which it removes
# the rows specified in remove_rows and the columns specified in remove_cols. It
# also prints a brief report; just the df name, length and width.
strip <- function(tib, remove_rows = FALSE, remove_cols = NULL){
  remove_rows <- enquo(remove_rows)
  remove_cols <- enquo(remove_cols)
  out <- tib %>%
    filter(! (!! remove_rows))  %>%
    select(- !! remove_cols) %T>% (function(XX = .){
      # Attempt to report the dataframe's name; this is the part that does not work
      print(paste0("length of ", enexpr(XX), " = ", nrow(XX), " Width = ", ncol(XX)))
      cat("\n")
    })
  out  
}

out_tb <- strip(tib = tst_df, remove_rows = (A < 3 | D > 38),  remove_cols = c(C, D))
out_tb
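
The difficulty described above can be reproduced without the pipe. The following is a minimal base-R sketch, not the code from the question: `show_arg` and `outer_fn` are hypothetical names, and a plain data.frame stands in for the tibble. It illustrates that substitute() reports the expression written at the call site of the function it runs in, so a helper one level further in sees only the name of the formal it was handed.

```r
# Hypothetical helper: returns the expression supplied for its argument, as a string.
show_arg <- function(x) deparse(substitute(x))

# Hypothetical outer function, standing in for strip().
outer_fn <- function(tib) {
  # Captured in the function that received the argument: the caller's expression.
  captured_here <- deparse(substitute(tib))
  # Captured one level further in: only the symbol passed along.
  captured_inside <- show_arg(tib)
  c(outer = captured_here, inner = captured_inside)
}

tst_df <- data.frame(A = 1:3)
res <- outer_fn(tst_df)
print(res)
```

Here `res[["outer"]]` is "tst_df" while `res[["inner"]]` is "tib", which suggests the name has to be captured in the outermost function that receives the dataframe, before it is handed on to a pipe or helper.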

Solution

  • Save the name of tib at the beginning of your function, before the pipe. The reporter function is defined inside strip(), so it will find tib_name through lexical scoping:

    strip <- function(tib, remove_rows = FALSE, remove_cols = NULL) {
      remove_rows <- enquo(remove_rows)
      remove_cols <- enquo(remove_cols)
      tib_name <- as.character(substitute(tib))
      report <- function(out) {
        cat("output length of", tib_name, "=", nrow(out), ", width =", ncol(out), "\n")
      }
    
      tib %>%
        filter(! (!! remove_rows))  %>%
        select(- !! remove_cols) %T>%
        report
    }
    
    out_tb <- strip(tib = tst_df, remove_rows = (A < 3 | D > 38),  remove_cols = c(C, D))
    #> output length of tst_df = 6 , width = 2
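
One caveat with the approach above: the captured text is whatever expression the caller wrote, so it is only a clean name when strip() is called directly with a bare variable. The sketch below is an assumption-laden illustration, not part of the answer. `report_dims` and its `label` argument are hypothetical, base R's Map() stands in for purrr::pwalk, and a data.frame stands in for the tibble. It shows two things: deparse() keeps a compound expression as one string where as.character() splits it, and an explicit label can be passed when iterating over a named list.

```r
# Hypothetical reporter: uses an explicit label if given, otherwise the
# expression supplied for `tib` at the call site.
report_dims <- function(tib, label = deparse(substitute(tib))) {
  force(label)  # evaluate the default before anything else touches `tib`
  sprintf("output length of %s = %d, width = %d", label, nrow(tib), ncol(tib))
}

tst_df <- data.frame(A = 1:10, B = 11:20)

# Direct call with a bare name: the captured expression is the name itself.
direct <- report_dims(tst_df)

# A compound expression: as.character() splits the call into its pieces,
# while deparse() keeps it as a single string.
split_name <- as.character(quote(dfs$first_df))
whole_name <- deparse(quote(dfs$first_df))

# Iterating over a named list: pass the names explicitly instead of
# relying on whatever expression the iterator uses internally.
dfs <- list(first_df = tst_df, second_df = tst_df[1:4, ])
labelled <- unlist(Map(report_dims, dfs, names(dfs)))

print(direct)
print(labelled)
```

If the dataframes are already collected in a named list built from the filenames, supplying the name as an ordinary argument like this avoids any dependence on how the calling function spells its call.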