I am having trouble writing code that replaces specified values in multiple columns with new values. The data frame has 20+ columns and I only want to change values in 8 of them (col1, col2, col3, etc). I want to replace the values (4, 5, 6, 7) with (0, -1, -2, -3) respectively. I have very limited knowledge of R and programming, and I have only been able to find a solution that does the job for one column.
I have read many solutions to similar questions on here but I could not find one that works for me. So here is my code:
data$col1[raw_data$col1 == 4 ] <- 0
data$col1[raw_data$col1 == 5 ] <- -1
data$col1[raw_data$col1 == 6] <- -2
data$col1[raw_data$col1 == 7] <- -3
This works well for one column. Can I do it for all the columns at once?
Set up an example:
demodf <- data.frame(
  col1 = 1:10,
  col2 = 3:12,
  col3 = 5:14,
  col4 = 7:16
)
cols_to_amend <- c("col1", "col3")
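With eight columns to change out of 20+, typing every name by hand gets tedious. The sketch below shows two ways the vector of names could be built; the data frame `df` and its column names are hypothetical stand-ins, and the pattern `^col[0-9]+$` only helps if the real columns follow a similar naming scheme.

```r
# Hypothetical wide data frame standing in for the real one
df <- data.frame(id = 1:3, col1 = 4:6, col2 = 5:7, other = letters[1:3])

# Option 1: build the names programmatically when they follow a pattern
cols_by_name <- paste0("col", 1:2)

# Option 2: pick every column whose name matches a regular expression
cols_by_pattern <- grep("^col[0-9]+$", names(df), value = TRUE)

# Guard against typos: keep only names that actually exist in the data frame
cols_to_amend <- intersect(cols_by_name, names(df))
print(cols_to_amend)
```

Either vector can then be used for `cols_to_amend` in the steps that follow.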
Replace just the relevant columns:
demodf[cols_to_amend] <- apply(demodf[cols_to_amend], 2, FUN = \(x) sapply(x, \(y) if (y %in% 4:7) 4-y else y))
gives:
   col1 col2 col3 col4
1     1    3   -1    7
2     2    4   -2    8
3     3    5   -3    9
4     0    6    8   10
5    -1    7    9   11
6    -2    8   10   12
7    -3    9   11   13
8     8   10   12   14
9     9   11   13   15
10   10   12   14   16
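The `4 - y` trick works because the replacements here happen to follow a simple formula. If the mapping were arbitrary, one base R option is a lookup with `match()`. This is a sketch of a different technique, not the approach used above; `old_vals`, `new_vals`, and `recode_column` are names made up for illustration.

```r
# Example data, mirroring the demo data frame
demodf <- data.frame(col1 = 1:10, col2 = 3:12, col3 = 5:14, col4 = 7:16)
cols_to_amend <- c("col1", "col3")

# Hypothetical lookup table: old values and their replacements, matched by position
old_vals <- c(4, 5, 6, 7)
new_vals <- c(0, -1, -2, -3)

# Hypothetical helper: recode one column using the lookup table
recode_column <- function(x) {
  idx <- match(x, old_vals)             # position in old_vals, NA where there is no match
  ifelse(is.na(idx), x, new_vals[idx])  # keep unmatched values, swap in the rest
}

# lapply() returns a list of columns, which assigns cleanly back into the data frame
demodf[cols_to_amend] <- lapply(demodf[cols_to_amend], recode_column)
print(demodf)
```

Because `match()` works on the whole column at once, this avoids the element-by-element `sapply()` call, and changing the mapping only means editing the two lookup vectors.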
The same line, annotated:
# we can use the list of column names to choose where we are replacing:
demodf[cols_to_amend] <-
  # we then use `apply` with `MARGIN = 2` to apply a function to each column of this data frame:
  apply(demodf[cols_to_amend], 2,
    # the function we apply is an anonymous function (`\( )`) taking one column at a time as its input:
    FUN = \(x)
      # it uses `sapply` to go down that column, performing the following on each item:
      sapply(x,
        # values in 4:7 are replaced by 4 minus the value; everything else is returned unchanged:
        \(y) if (y %in% 4:7) 4-y else y))
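To see what the inner function passed to `sapply` does, it can help to run it on a single small vector. The sketch below uses a made-up vector `x` and also shows a vectorised `ifelse()` form that should give the same result without looping over elements.

```r
# A small made-up vector covering values inside and outside 4:7
x <- c(3, 4, 5, 6, 7, 8)

# Element-by-element version, as in the annotated code
elementwise <- sapply(x, \(y) if (y %in% 4:7) 4 - y else y)

# Vectorised version: ifelse() tests the whole vector in one call
vectorised <- ifelse(x %in% 4:7, 4 - x, x)

print(elementwise)
print(identical(elementwise, vectorised))
```

Both versions map 4, 5, 6, 7 to 0, -1, -2, -3 and leave 3 and 8 untouched.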
Alternatively, with dplyr:
library(dplyr)
demodf |>
  mutate(
    across(all_of(cols_to_amend),
      ~ ifelse(.x %in% 4:7, 4-.x, .x)
    )
  )
This is overkill for the toy example, but `case_when` allows replacements that cannot be expressed as simple arithmetic:
demodf |>
  mutate(
    across(all_of(cols_to_amend),
      # the `.default` argument requires dplyr 1.1.0 or later
      ~ case_when(.x == 4 ~ 0,
                  .x == 5 ~ -1,
                  .x == 6 ~ -2,
                  .x == 7 ~ -3,
                  .default = .x)
    )
  )