I have two datasets that refer to the same data. However, one stores the answers to questions as strings, and the other stores the corresponding codes.
library(data.table)
dat_string <- fread("str_col1 str_col2 numerical_col
One Alot 1
Two Alittle 0")
dat_codes <- fread("code_col1 code_col2 numerical_col
0 3 1
1 5 0")
I would like to combine both datasets so that the strings get attached to the corresponding codes as labels (see this example), for all string columns in dat_string.
Please note that the column names can have any format and do not necessarily follow the format in the example.
What would be the easiest way to do this?
Desired outcome:
dat_codes$code_col1 <- factor(dat_codes$code_col1, levels=c("0", "1"),
labels=c("One", "Two"))
attributes(dat_codes$code_col1)$levels
[1] "One" "Two"
If I understand your edit correctly, both tables have the same shape and the same row order; one just holds the labels and the other holds the levels (the codes). If that is the case, this should be even more straightforward than my original response:
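# Positions of the string columns; both tables share the same column order
code_cols <- which(sapply(dat_string, is.character))
for (j in code_cols) {
  # Rows line up, so (given a one-to-one code/string mapping) the k-th
  # unique code pairs with the k-th unique string
  set(dat_codes, j = j, value = factor(
    dat_codes[[j]],
    levels = unique(dat_codes[[j]]),
    labels = unique(dat_string[[j]])
  ))
}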
dat_codes
#    code_col1 code_col2 numerical_col
# 1:       One      Alot             1
# 2:       Two   Alittle             0
dat_codes$code_col1
# [1] One Two
# Levels: One Two
sapply(dat_codes, class)
#     code_col1     code_col2 numerical_col
#      "factor"      "factor"     "integer"
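The loop above relies on every code pairing with exactly one string (and vice versa) once the rows are aligned. If you are not sure that holds for your real data, a quick check along these lines might help before relabelling. This is only a sketch: `is_one_to_one` is a hypothetical helper, not part of data.table, and the third row in each table is made up here to show repeated values.

```r
library(data.table)

# Example data with an extra (invented) third row so that values repeat
dat_string <- fread("str_col1 str_col2 numerical_col
One Alot 1
Two Alittle 0
One Alittle 1")
dat_codes <- fread("code_col1 code_col2 numerical_col
0 3 1
1 5 0
0 5 1")

# Hypothetical helper: TRUE if each code maps to exactly one label
# and each label maps to exactly one code
is_one_to_one <- function(codes, labels) {
  pairs <- unique(data.frame(code = codes, label = labels))
  anyDuplicated(pairs$code) == 0 && anyDuplicated(pairs$label) == 0
}

# Check every string column by position, as in the loop above
str_cols <- which(sapply(dat_string, is.character))
ok <- sapply(str_cols, function(j) is_one_to_one(dat_codes[[j]], dat_string[[j]]))

# Stop early if any column has an ambiguous mapping
stopifnot(all(ok))
```

If the check fails for a column, `unique()` on the codes and on the strings would no longer line up, so that column would need to be inspected by hand first.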