I have multiple files (30, tab delimited) that look like the one below:
|target_id | length| eff_length| est_counts| tpm|
|:------------|------:|----------:|----------:|--------:|
|LmjF.27.1250 | 966| 823.427| 2932| 94.7314|
|LmjF.09.0430 | 1410| 1267.430| 3603| 75.6304|
|LmjF.13.0210 | 2001| 1858.430| 4435| 63.4897|
|LmjF.28.0530 | 4083| 3940.430| 7032| 47.4778|
|LmjF.16.1400 | 591| 448.577| 1163| 68.9761|
|LmjF.29.2570 | 1506| 1363.430| 11135| 217.2770|
I am trying to extract the fifth column from all 30 of these files with a command such as:
fifth_colum_file1 = file1.csv[ , 5]
But I want to automate the process.
All the files I want to work with have "bs_abundance" in their names, so I think a good starting point would be either to list them with a command like this:
temp = list.files(pattern="bs_abundance")  # pattern is a regular expression, so no leading "*" is needed
Or perhaps I can load all the tables I want to work with directly into the workspace:
for(i in temp) {
x <- read.table(i, header=TRUE, comment.char = "A", sep="\t")
assign(i,x)
}
Then, as explained, I want to take the fifth column of each file and later bind them all to another table with the same number of rows.
Here is a method using lapply that assumes each file in the folder has the same number of rows.
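# get file names (full paths, so reading works from any working directory)
files <- dir("temp", full.names = TRUE)
# remove one file; unlike -which(), this also works when nothing matches
files <- files[basename(files) != "removeFileName"]
# get list of vectors from 29 files (tab-delimited, so read.delim instead of read.csv)
myList <- lapply(files, function(i) {temp <- read.delim(i); temp[, 5]})
# name each vector after its file so the columns get readable names
names(myList) <- basename(files)
# get new data.frame
dfDone <- do.call(data.frame, myList)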
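To tie this back to the "bs_abundance" pattern and the final binding step from the question, here is a minimal, self-contained sketch. It is only an illustration under some assumptions: the demo files, the `.tsv` extension, the sample names `s1` to `s3`, and `other_table` are all hypothetical stand-ins, created in a temporary folder so the example runs as-is.

```r
# Hypothetical demo data: three small tab-delimited files in a temporary folder.
demo_dir <- tempfile("abundance_demo")
dir.create(demo_dir)
for (s in c("s1", "s2", "s3")) {
  demo <- data.frame(target_id  = c("LmjF.27.1250", "LmjF.09.0430"),
                     length     = c(966, 1410),
                     eff_length = c(823.427, 1267.430),
                     est_counts = c(2932, 3603),
                     tpm        = c(94.7314, 75.6304))
  write.table(demo, file.path(demo_dir, paste0(s, "_bs_abundance.tsv")),
              sep = "\t", quote = FALSE, row.names = FALSE)
}

# List only the files whose names contain "bs_abundance" (with full paths).
files <- list.files(demo_dir, pattern = "bs_abundance", full.names = TRUE)

# Read each tab-delimited file and keep only its fifth column (tpm).
tpm_list <- lapply(files, function(f) read.delim(f)[, 5])

# Name each vector after its file, dropping the shared suffix.
names(tpm_list) <- sub("_bs_abundance.*$", "", basename(files))

# One column per file; this requires every file to have the same number of rows.
tpm_df <- as.data.frame(tpm_list)

# Hypothetical table with the same number of rows to bind the columns to.
other_table <- read.delim(files[1])["target_id"]
combined <- cbind(other_table, tpm_df)
```

Keeping the columns in a named list avoids creating 30 separate objects with `assign()`, and the names carry over as column names in the final table.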