Automate analysis over multiple .txt files


I have many copies of two types (a and b) of .txt file, i.e.:

a1.txt a2.txt a3.txt... and b1.txt b2.txt b3.txt

My aim is to run an R script that does the following:

read.table a1.txt
#run a bunch of code that chops and changes the data and then stores some vectors and data frames.
w<-results
x<-results
detach a1.txt
read.table b1.txt
#run a bunch of code that chops and changes the data and then stores some vectors and data frames.
y<-results
z<-results
model1<-lm(w~y)
model2<-lm(x~z)

Each time, I want to extract coefficients, e.g. one slope from model1 and two slopes from model2. I want to run this analysis in an automated way across all pairs of a and b text files and collect the coefficients as vectors in one separate file for later analysis.

So far I have only been able to find bits and bobs covering simpler analyses than this. Does anyone have a good idea of how to run this more complex iteration over many files?

EDIT: What I have tried so far, without success:

your<-function(x) 
{
files <- list.files(pattern=paste('.', x, '\\.txt', sep=''))
a <- read.table(files[1],header=FALSE)
attach(a)
w <- V1-V2
detach(a)
b <- read.table(files[2],header=FALSE)
z <- V1-V2
model <- lm(w~z)
detach(b)
return(model$coefficients[2])
}

slopes <- lapply(1:2, your)
Error in your(1) : object 'V1' not found

Solution

  • You can do something like:

    files <- list.files(pattern='^[ab]1\\.txt$') # get a1.txt and b1.txt (anchored, so a11.txt etc. are not matched)
    

    If you know how many pairs of files you have (let's say 10), you can wrap your code above in a function and use one of the apply family, depending on your desired output (lapply returns a list, sapply simplifies to a vector or matrix where it can):

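    your.function <- function(i) {
      # i is the pair number; the anchored pattern stops i = 1 from also matching a11.txt, b21.txt, ...
      files <- list.files(pattern=paste0('^[ab]', i, '\\.txt$'))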
      a <- read.table(files[1])
      b <- read.table(files[2])
    
      w <- ...
      x <- ...
    
      y <- ...
      z <- ...
    
      model1 <- lm(w~y)
      model2 <- lm(x~z)
    
      return(c(model1$coefficients[2], model2$coefficients[2]))
    }
    
    slopes <- lapply(1:10, your.function)
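The error in the question's EDIT most likely comes from `b` never being attached: once `detach(a)` has run, `V1` is no longer on the search path when `z <- V1-V2` is evaluated. Below is a minimal sketch of the whole loop that avoids `attach()` altogether by indexing columns with `$`. The helper name `fit_pair`, the four placeholder transformations, the generated demo files and the output file `slopes.csv` are all assumptions made for illustration; swap in the real analysis and directory.

```r
# Hypothetical helper: fit both models for pair number i (a<i>.txt and b<i>.txt).
fit_pair <- function(i, dir = ".") {
  a <- read.table(file.path(dir, paste0("a", i, ".txt")), header = FALSE)
  b <- read.table(file.path(dir, paste0("b", i, ".txt")), header = FALSE)

  # Placeholder transformations (assumed here, not the asker's real analysis).
  # Columns are referenced as a$V1, b$V1, ... so no attach()/detach() is needed.
  w <- a$V1 - a$V2
  x <- a$V1 + a$V2
  y <- b$V1 - b$V2
  z <- b$V1 + b$V2

  model1 <- lm(w ~ y)
  model2 <- lm(x ~ z)

  # One named numeric vector per pair: the pair number and the two slopes.
  c(pair   = i,
    slope1 = unname(coef(model1)[2]),
    slope2 = unname(coef(model2)[2]))
}

# Demo data so the sketch runs on its own: three pairs of two-column files.
demo_dir <- tempfile("pairs")
dir.create(demo_dir)
set.seed(1)
for (i in 1:3) {
  for (type in c("a", "b")) {
    write.table(matrix(rnorm(40), ncol = 2),
                file.path(demo_dir, paste0(type, i, ".txt")),
                row.names = FALSE, col.names = FALSE)
  }
}

# Derive the pair numbers from the a-files that are actually present.
a_files <- list.files(demo_dir, pattern = "^a[0-9]+\\.txt$")
ids <- sort(as.integer(sub("^a([0-9]+)\\.txt$", "\\1", a_files)))

# One row per pair, then save everything to a single file for later analysis.
slopes <- as.data.frame(do.call(rbind, lapply(ids, fit_pair, dir = demo_dir)))
write.csv(slopes, file.path(demo_dir, "slopes.csv"), row.names = FALSE)
```

Building the file names from the pair number, rather than relying on the order returned by `list.files()`, keeps each a file matched with its own b file even if one of a pair is missing (in that case `read.table()` stops with an error instead of silently pairing the wrong files). If a model has more than one predictor, `coef(model)[-1]` returns all of its slopes.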