Using the R code below, I read data from .txt files saved in different folders. I need to add some conditions to this code.
The structure of my txt files is as follows:
Date/Time XY [XY] C1 [m2] C1c C2 [m] C2c C3 [W] C3c K PP [Pa]..
2005-03-01S01:00:00 0.98 250 0 29 0 289 0 98 289...
2005-03-01S02:00:00 0.97 240 0 28 2 279 0 98 89...
2005-03-01S03:00:00 0.98 252 -1 29 0 289 0 16 289...
..
..
I want the following conditions to be included in the code:
if C1c != 0, then C1 = NA,
if C1 < -400 or C1 > 350, then C1 = NA,
if C2c != 0, then C2 = NA,
if C2 < -250 or C2 > 450, then C2 = NA,
if C3c != 0, then C3 = NA,
if C3 < 100 or C3 > 500, then C3 = NA,
if K < 90, then K = NA,
if PP < 200, then PP = NA.
Note that not all text files contain all of these columns, so each condition should be applied only if the file has the column concerned.
Existing code:
library(data.table)

# Collect every .txt file below the root folder (the dot is escaped so that
# only a literal ".txt" extension matches)
filelist <- list.files("D:/Test2/", full.names = TRUE, recursive = TRUE,
                       pattern = "\\.txt$")

# Read each file, skipping the comment block that ends with "*/"
dt <- lapply(filelist, function(file) {
  lines <- readLines(file)
  comment_end <- match("*/", lines)
  fread(file, skip = comment_end)
})

# Tidy the column names
dt.tidied <- lapply(dt, FUN = function(x) {
  setnames(x, old = "T2 [?C]", new = "T2 [°C]", skip_absent = TRUE)
  colnames(x) <- gsub("\\[", "(", colnames(x))
  colnames(x) <- gsub("\\]", ")", colnames(x))
  return(x)
})

merged <- rbindlist(dt.tidied, fill = TRUE, use.names = TRUE)
write.csv(merged, "D:/Test2/Merged2.csv")
Could anyone please help me modify the code to include these conditions?
Test whether each column exists before any operation that relies on it, e.g.:
dt.tidied <- lapply(dt, FUN = function(x) {
  setnames(x, old = "T2 [?C]", new = "T2 [°C]", skip_absent = TRUE)
  colnames(x) <- gsub("\\[", "(", colnames(x))
  colnames(x) <- gsub("\\]", ")", colnames(x))
  # Apply conditions to the respective columns.
  # The headers carry units (e.g. "C1 [m2]"), so after the gsub() calls above
  # the value columns are named "C1 (m2)", "C2 (m)", "C3 (W)" and "PP (Pa)".
  if (all(c("C1 (m2)", "C1c") %in% colnames(x))) {
    x[C1c != 0, `C1 (m2)` := NA]
    x[`C1 (m2)` < -400 | `C1 (m2)` > 350, `C1 (m2)` := NA]
  }
  if (all(c("C2 (m)", "C2c") %in% colnames(x))) {
    x[C2c != 0, `C2 (m)` := NA]
    x[`C2 (m)` < -250 | `C2 (m)` > 450, `C2 (m)` := NA]
  }
  if (all(c("C3 (W)", "C3c") %in% colnames(x))) {
    x[C3c != 0, `C3 (W)` := NA]
    x[`C3 (W)` < 100 | `C3 (W)` > 500, `C3 (W)` := NA]
  }
  if ("K" %in% colnames(x)) {
    x[K < 90, K := NA]
  }
  if ("PP (Pa)" %in% colnames(x)) {
    x[`PP (Pa)` < 200, `PP (Pa)` := NA]
  }
  return(x)
})
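If more columns need checks later, the repeated if blocks can also be expressed as a small rule table. The following is only a sketch of that idea, not a tested drop-in replacement: the names rules, find_col and apply_rules are hypothetical, and it assumes every header is either a bare name ("K") or a name followed by a unit in parentheses or square brackets ("C1 (m2)"). Unlike the code above, it applies the range check whenever the value column exists, even if the matching flag column is absent.

```r
library(data.table)

# Hypothetical rule table: value column (base name without unit),
# optional quality-flag column, and the valid range [lo, hi].
rules <- list(
  list(col = "C1", flag = "C1c", lo = -400, hi = 350),
  list(col = "C2", flag = "C2c", lo = -250, hi = 450),
  list(col = "C3", flag = "C3c", lo = 100,  hi = 500),
  list(col = "K",  flag = NA,    lo = 90,   hi = Inf),
  list(col = "PP", flag = NA,    lo = 200,  hi = Inf)
)

# Return the full column name whose base name (the text before any unit)
# equals `base`, or NA if the table has no such column.
find_col <- function(x, base) {
  bases <- sub("\\s*[\\[(].*$", "", names(x), perl = TRUE)
  names(x)[match(base, bases)]
}

# Set out-of-range or flagged values to NA, by reference.
apply_rules <- function(x, rules) {
  for (r in rules) {
    col <- find_col(x, r$col)
    if (is.na(col)) next                       # file lacks this column
    v <- x[[col]]
    bad <- !is.na(v) & (v < r$lo | v > r$hi)   # outside the valid range
    if (!is.na(r$flag) && r$flag %in% names(x)) {
      f <- x[[r$flag]]
      bad <- bad | (!is.na(f) & f != 0)        # quality flag is non-zero
    }
    if (any(bad)) set(x, i = which(bad), j = col, value = NA)
  }
  x
}

# Toy table using the tidied header style, e.g. "C1 (m2)".
demo <- data.table(`C1 (m2)` = c(250, 240, 900),
                   C1c       = c(0, -1, 0),
                   K         = c(98, 98, 16))
apply_rules(demo, rules)
```

In the toy table, the second C1 value is dropped because its flag is non-zero, the third because it exceeds 350, and the last K value because it is below 90. With the real data this could be applied as dt.tidied <- lapply(dt.tidied, apply_rules, rules = rules).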