I am trying to recode a variable using data.table. I have googled for almost 2 hours but couldn't find an answer.
Assume I have a data.table as the following:
DT <- data.table(V1=c(0L,1L,2L),
V2=LETTERS[1:3],
V4=1:12)
I want to recode V1 and V2. For V1, I want to recode 1s to 0 and 2s to 1. For V2, I want to recode A to T, B to K, C to D.
If I use dplyr
, it is simple.
library(dplyr)
DT %>%
mutate(V1 = recode(V1, `1` = 0L, `2` = 1L)) %>%
mutate(V2 = recode(V2, A = "T", B = "K", C = "D"))
But I have no idea how to do this in data.table
DT[V1==1, V1 := 0]
DT[V1==2, V1 := 1]
DT[V2=="A", V2 := "T"]
DT[V2=="B", V2 := "K"]
DT[V2=="C", V2 := "D"]
Above is the code that I can think as my best. But there must be a better and a more efficient way to do this.
Edit
I changed how I want to recode V2 to make my example more general.
I think this might be what you're looking for. On the left hand side of :=
we name the variables we want to update and on the right hand side we have the expressions we want to update the corresponding variables with.
DT[, c("V1","V2") := .(as.numeric(V1==2), sapply(V2, function(x) {if(x=="A") "T"
else if (x=="B") "K"
else if (x=="C") "D" }))]
# V1 V2 V4
#1: 0 T 1
#2: 0 K 2
#3: 1 D 3
#4: 0 T 4
#5: 0 K 5
#6: 1 D 6
#7: 0 T 7
#8: 0 K 8
#9: 1 D 9
#10: 0 T 10
#11: 0 K 11
#12: 1 D 12
Alternatively, just use recode
within data.table
:
library(dplyr)
DT[, c("V1","V2") := .(as.numeric(V1==2), recode(V2, "A" = "T", "B" = "K", "C" = "D"))]