Tags: r, csv, closures, scoping

Issues in R with a closure that sequentially returns .csv files


I have a model that outputs data in the form of .csv files. The output directory is full of .csv files, each named n.csv where n is the run number. So on the 0th run it creates 0.csv, on the 1st run it creates 1.csv, etc.

Now I want to analyze this data in R and compare it to the output of another model. I wrote a function that runs the analysis I want on two models, which are passed in as functions. The model I'm comparing to is a built-in sna function, and to make a function that simulates my model I wrote the following closure:

#creates a model function that returns sequentially numbered .csv files from a directory
make.model <- function(dir) {
  i <- -1 # allows for starting the .csv enumeration at 0
  model <- function() {
    i <<- i + 1
    my.data <- as.matrix(read.csv(paste0(dir, i, ".csv"), header=FALSE))
    return(my.data)
  }
  return(model)
}

The issue I am running into is that although

my.model <- make.model(directory)
spectral.analysis(my.model, other.model, observed.data, nsim = 100)

does exactly what I want and computes how well my model and the other model do in modeling the observed data, it's not reusable. The counter inside the closure is permanently incremented, so the function can only be run so many times before it tries to access non-existent .csv files.
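To illustrate why the counter keeps climbing, here is a stripped-down sketch with the file reading removed. The name make.counter is hypothetical and used only for this illustration; the point is that `<<-` updates the i stored in the closure's enclosing environment, so the value survives between calls, and only a freshly created closure starts over.

```r
# Hypothetical, file-free version of make.model: returns the counter value itself
make.counter <- function() {
  i <- -1                # lives in the environment created by this call
  function() {
    i <<- i + 1          # modifies the enclosing i, so the state persists
    i
  }
}

counter <- make.counter()
calls <- c(counter(), counter(), counter())  # the counter advances on every call

# Creating a new closure is the only way to start again from 0
fresh <- make.counter()
fresh.first <- fresh()
```

Each call to make.counter() creates its own environment, which is why redefining my.model resets the count.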

I am currently getting around this with a "reset" function that redefines my.model, which I run after each use of my.model, but this seems like a very poor solution.

Is there a cleverer way to go about doing this? Crucially, the function spectral.analysis() takes functions as its input and then runs them to obtain its values, and rewriting that function isn't on the table right now. I'm not passing the data directly from my model to the analysis function because my model takes hours to run, so I want to be able to prerun a lot of trials and analyze them later.


Solution

  • Self-answering to close: I figured it out with help from the comments.

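    length(list.files(dir, pattern="\\.csv$"))
    

    gives the number of .csv files in dir. Passing dir makes list.files look in the output directory rather than the working directory, and because pattern is a regular expression, "\\.csv$" matches only names that end in .csv. Changing the line where i is incremented to read

    i <<- (i + 1) %% length(list.files(dir, pattern="\\.csv$"))
    

    makes the counter wrap back to 0 after the last file, which solves the problem.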
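Putting the pieces together, a minimal sketch of the revised closure might look like the following. It assumes the directory contains only the numbered files 0.csv through (n-1).csv and does not change while the model is in use. Two details differ from the code in the question and are my own choices: the file count is taken once when the closure is created rather than on every call, and file.path() replaces paste0() so that dir does not need a trailing slash. The demo directory and its contents are made up for illustration.

```r
# Sketch: a model function that cycles through the numbered .csv files in `dir`
make.model <- function(dir) {
  # Count only files whose names end in ".csv" (pattern is a regular expression)
  n.files <- length(list.files(dir, pattern = "\\.csv$"))
  stopifnot(n.files > 0)
  i <- -1  # so that the first call reads 0.csv
  function() {
    i <<- (i + 1) %% n.files  # wrap back to 0 after the last file
    as.matrix(read.csv(file.path(dir, paste0(i, ".csv")), header = FALSE))
  }
}

# Demo with three made-up files, 0.csv to 2.csv, in a temporary directory
demo.dir <- tempfile("runs")
dir.create(demo.dir)
for (k in 0:2) {
  write.table(matrix(k, 2, 2), file.path(demo.dir, paste0(k, ".csv")),
              sep = ",", row.names = FALSE, col.names = FALSE)
}

my.model <- make.model(demo.dir)
first.pass  <- replicate(3, my.model()[1, 1])  # reads 0.csv, 1.csv, 2.csv
second.pass <- replicate(3, my.model()[1, 1])  # wraps around and reads them again
```

Note that the wraparound only lines up with the start of the data if each analysis consumes a whole number of passes through the files; otherwise the next analysis begins wherever the previous one stopped.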