Search code examples
rfunctionloopsenvironmentassign

Pass element form character vector in loop to function and use in assign


I have a function that chooses, reads and formats Excel reports from Workday, and produces data.frames. I've simplified for purposes of begging for your help. :-)

In the function, I pass a parameter called df_name. I want to pass this parameter in a loop to produce many data.frames in a single call.

library(openxlsx)
read_report <- function(
     df_name=NULL
     wb= loadWorkbook(paste0(file_path))
     file_path <- choose.files(multi = FALSE, filters = cbind("Excel (*.xlsx)", "*.xlsx"), caption = "Select Excel File")
          df_data <- read.xlsx(wb,1,colNames = TRUE)
     assign(substitute(df_name), df_data, envir=.GlobalEnv)
)

I have a vector of character strings list_dfs that represent the names of different data.frames:

list_dfs <- c("df_workers", "df_headcount", "df_terms")

I have a loop that iterates list_df which passes the parameter df_name the function read_report

for (i in list_dfs) {
   read_report(df_name = i)
}

This line causes the function to break.

assign(substitute(df_name), df_data, envir=.GlobalEnv)

What's goofy is that when I do this, as I do routinely, outside of a loop, I have no problem, as in the following:

read_report(df_name = "df_workers")

Why does that line choke on the variable i, which is of character type, and not the explicit parameter df_worker?

I can't figure it out. Any help would be appreciated.


Solution

  • I'd suggest to return a dataframe from the function. Then use lapply to call that function, the outputs from the lapply will be then assigned to a list object that would hold all the dataframes.

    Something like this (not tested):

    read_report <- function(df_name = NULL){
      file_path <- choose.files(multi = FALSE,
                                filters = cbind("Excel (*.xlsx)", "*.xlsx"), 
                                caption = "Select Excel File")
      wb = loadWorkbook(paste0(file_path))
      df_data <- read.xlsx(wb, 1, colNames = TRUE)
      return(df_data)
      }
    
    list_dfs <- c("df_workers", "df_headcount", "df_terms")
    
    myList <- lapply(list_dfs, read_report)
    

    Also, note, there was a typo in your function definition. It is fixed.