I am importing measurement data as a dataframe and want to include the experimental conditions in the data which are given in the filename. I want to add new columns to the dataframe that represent the conditions, and I want to assign the columns with the value specified by the filename. Later, this will facilitate comparisons to other experimental conditions once I merge the editted dataframes from each individual sample/file.
Here is an example of my pre-existing dataframe Measurements
:
Measurements <- data.frame(
X = 1:4,
Length = c(130, 150, 170, 140)
)
Here are the example vectors of variables and values that would be derived from the filename:
FileVars.vec <- c("Condition", "Plant")
FileInfo.vec <- c("aKG", "1")
Here is one way that I have solved how to do what I want:
for (i in 1:length(FileVars.vec)) {
Measurements[FileVars.vec[i]] <- FileInfo.vec[i]
}
Which gives the desired output:
X Length Condition Plant
1 130 aKG 1
2 150 aKG 1
3 170 aKG 1
4 140 aKG 1
But my (limited) understanding of R is that it is a vectorized language that often avoids the need for using for-loops. I feel like this simpler code should work:
Measurements[FileVars.vec] <- FileInfo.vec
But instead of assigning one value for one entire column, it recycles the values within each column:
X Length Condition Plant
1 130 aKG aKG
2 150 1 1
3 170 aKG aKG
4 140 1 1
Is there any way to do a similar simple assignment but without recycling, i.e. one value is assigned to one full column only? I imagine there's a simple formatting fix but I've searched for a solution for >6 hours and no where did I see an assignment like this. I have also thought of creating a separate dataframe of just the experimental conditions and then merging to the actual dataframe, but that seems more roundabout to me, especially with more experimental conditions and observations than these examples.
Also, if there is a more established pipeline/package for taking information from the filename and adding it to the data in a tidy fashion, that would be marvelous as well! The original filename would be something like:
"aKG_1.csv"
Thank you for helping an R noobie! May you receive good coding karma when debugging!
We can convert to a list
and then assign to avoid the recycling of values column wise. As it is a list
, each element will be treated as a unit and the assignment occurs for the respectively columns by recycling those elements
Measurements[FileVars.vec] <- as.list(FileInfo.vec)
-output
Measurements
# X Length Condition Plant
#1 1 130 aKG 1
#2 2 150 aKG 1
#3 3 170 aKG 1
#4 4 140 aKG 1
If we want to reset the type
, use type.convert
Measurements <- type.convert(Measurements, as.is = TRUE)
Note that by creating a vector
for FileInfo.vec
, it will have a single type
i.e. character
. Instead if we want to have multiple types, it can be a list
Measurements[FileVars.vec] <- list("akg", 1)
For the second part of the question, if we have a string
str1 <- "aKG_1.csv"
and wants to create two columns from that, either use, read.csv
or strsplit
Measurements[FileVars.vec] <- read.table(text = tools::file_path_sans_ext(str1),
sep="_", header = FALSE)