
Access name of the object that "i" represents when it iterates in a for-loop through a list of objects in R


I have:

  • directories (let's say two: A and B) that contain files;
  • two character objects storing the directories (dir_A, dir_B);
  • a function that takes the directory as argument and returns the list of the names of the files found there (in a convenient way for me that is different from list.files()).

directories <- c(dir_A, dir_B)
read_names <- function(x) {foo}

Using a for-loop, I want to create one object per directory, each containing the list of files that read_names() returns for that directory. Essentially, I want the for-loop to do the equivalent of:

files_A <- read_names(dir_A)
files_B <- read_names(dir_B)

I wrote the loop as follows:

for (i in directories) {
  assign(paste("files_", sub('.*\\_', '', deparse(substitute(i))), sep = ""), read_names(i))
}

However, although deparse(substitute(dir_A)) returns "dir_A" outside the for-loop (so the sub() call written above would return "A"), inside the for-loop substitute(i) seems to see only the symbol i, not the directory object that i currently stands for.

It follows that deparse(substitute(i)) returns "i", so the for-loop above produces a single object called files_i. It contains the list of files in the last directory of the iteration, because each pass overwrites files_i.

How can I make the for-loop read the name (or, in my case, part of the name) of the object that i represents at that moment?


Solution

  • There are two issues here, I think:

    1. How to reference both the name (or index) and the value of each element within a list; and
    2. How to transfer data from a named list into the global (or any) environment.

    1. Reference name/index with data

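    Once you iterate with for (i in directories), the full context (index, name) of i within directories is lost. Some alternatives (those that use names assume directories is named, e.g. c(A = dir_A, B = dir_B); c(dir_A, dir_B) alone carries no names):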

    for (ix in seq_along(directories)) {
       directories[[ix]]             # the *value*
       names(directories)[ix]        # the *name*
       ix                            # the *index*
       # ...
    }
    
    for (nm in names(directories)) {
       directories[[nm]]             # the *value*
       nm                            # the *name*
       match(nm, names(directories)) # the *index*
       # ...
    }
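
As a minimal sketch of the name-based loop applied to the question: it assumes directories is given names, uses made-up paths, and replaces the real read_names() (which the question does not show) with a stand-in that only fabricates two file names. The results are collected in a named list rather than created with assign().

```r
# Hypothetical paths; substitute your own.
dir_A <- "path/to/A"
dir_B <- "path/to/B"

# Name the elements so the loop can see "A" and "B".
directories <- c(A = dir_A, B = dir_B)

# Stand-in for the real read_names(); it only fabricates two file names.
read_names <- function(x) file.path(x, c("file1.txt", "file2.txt"))

files <- list()
for (nm in names(directories)) {
  # nm is the name, directories[[nm]] is the value
  files[[paste0("files_", nm)]] <- read_names(directories[[nm]])
}

names(files)
# [1] "files_A" "files_B"
```

Each element is then available as files$files_A, files$files_B, and so on.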
    

    If you're amenable to Map-like functions (a more idiomatic way of dealing with lists of similar things), then

    out <- Map(function(x, nm) {
      x                              # the *value*
      nm                             # the *name*
      # ...
    }, directories, names(directories))
    
    out <- purrr::imap(directories, function(x, nm) {
      x                              # the *value*
      nm                             # the *name*
      # ...
    })
    # there are other ways to identify the function in `purrr::` functions
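
A small sketch of the base Map() form, under the same assumptions as before (named directories, hypothetical paths, and a stand-in read_names() that is not the asker's real function). It shows that both the value and the name are available inside the function, and that the result keeps the names of the first argument.

```r
# Hypothetical named vector of directories.
directories <- c(A = "path/to/A", B = "path/to/B")

# Stand-in for the real read_names(); it only fabricates two file names.
read_names <- function(x) file.path(x, c("file1.txt", "file2.txt"))

out <- Map(function(x, nm) {
  # x is the value (the path), nm is the name ("A" or "B")
  list(label = paste0("files_", nm), files = read_names(x))
}, directories, names(directories))

names(out)
# [1] "A" "B"
out$B$label
# [1] "files_B"
```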
    

    Note: while it is easy to call match() inside these last two to get the index, doing so requires the function to reach outside itself for directories, a minor scope breach that I prefer to avoid when reasonable. If you want the value, name, and index, then pass all three explicitly:

    out <- Map(function(x, nm, ix) {
      x                              # the *value*
      nm                             # the *name*
      ix                             # the *index*
      # ...
    }, directories, names(directories), seq_along(directories))
    

    2. Transfer list to env

    In your question, you're doing this to assign the elements of a list to variables in another environment. Some thoughts on that effort:

    1. If they are all similar (the same structure, different data), then don't. Keep them in a list and work on them in toto using lapply or similar. (See: How do I make a list of data frames?)

    2. If you truly need to move them from a list to the global environment, then perhaps list2env is useful here.

      # create my fake data
      directories <- list(a=1, b=2)
      # this is your renaming step, rename before storing in the global env
      # ... not required unless you have no names or want/need different names
      names(directories) <- paste0("files_", names(directories))
      # here the bulk of the work; you can safely ignore the return value
      list2env(directories, envir = .GlobalEnv)
      # <environment: R_GlobalEnv>
      ls()
      # [1] "directories" "files_a"     "files_b"    
      files_a
      # [1] 1
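
Putting both options together for the question's case, as a sketch only: the paths are hypothetical, read_names() is a stand-in for the real function, and the objects are copied into a fresh environment (target, a name invented here) rather than .GlobalEnv so the example leaves the workspace untouched.

```r
# Hypothetical named vector of directories.
directories <- c(A = "path/to/A", B = "path/to/B")

# Stand-in for the real read_names(); it only fabricates two file names.
read_names <- function(x) file.path(x, c("file1.txt", "file2.txt"))

# lapply() keeps the names of its input, so the result is a named list.
files <- lapply(directories, read_names)
names(files) <- paste0("files_", names(files))

# Option 1: stay in the list and work on every element at once.
n_files <- lengths(files)

# Option 2: copy the elements into an environment as separate objects.
target <- new.env()
list2env(files, envir = target)
ls(target)
# [1] "files_A" "files_B"
```

Replacing target with .GlobalEnv would create files_A and files_B in the global environment, as in the example above.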