Let's say I have the following tibble made with the tibble and haven packages:
library(tibble)
library(haven)
library(labelled) # provides val_labels(), used further down
# Create numerical values
values <- c(1:5)
# Combine values and colors into a named vector
color_choices <- setNames(values, c("Don't know", "Red", "Blue", "Green", "Yellow"))
name_choices <- setNames(values, c("Don't know", "John", "Paul", "Ringo", "George"))
# Create a tibble with the labelled column
data <- tibble(
  respondent_ID = 1:10,
  colour_choice = labelled(sample(1:5, 10, replace = TRUE), labels = color_choices),
  name_choice = labelled(sample(1:5, 10, replace = TRUE), labels = name_choices)
)
data
Now I want to change the haven labels of some of the variables. Specifically, I only want to change the label for value = 1 from "Don't know" to "Not sure", but I want to do this across multiple variables.
This achieves the result I want for the colour_choice variable:
data_replaced <- data
color_choices2 <- color_choices
names(color_choices2)[1] <- "Not sure"
val_labels(data_replaced$colour_choice) <- color_choices2
However, this is tedious for two reasons. First, even for one variable it is inefficient, as it involves making a new named vector of all the labels in the variable when only one needs replacing (val_labels(data_replaced$colour_choice) <- c("Not sure" = 1) results in the other labels being removed). Second, it is absolutely not scalable.
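To make the first problem concrete, here is a minimal sketch using a small hypothetical vector x (not part of the data above). It assumes the labelled package is loaded, since that is where val_labels() comes from:

```r
library(haven)
library(labelled)

# Hypothetical labelled vector with three value labels
x <- labelled(c(1L, 2L, 3L),
              labels = c("Don't know" = 1L, "Red" = 2L, "Blue" = 3L))

# Assigning a one-element vector replaces the whole label set
val_labels(x) <- c("Not sure" = 1)

# Only the "Not sure" label is left; "Red" and "Blue" are gone
val_labels(x)
```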
I have been experimenting with a dplyr approach (which would be preferable anyway) using the memisc::relabel function, but I keep hitting walls and was wondering if anyone can make a suggestion. Here is where I am at:
data_replaced <- data %>%
  mutate_at(vars(colour_choice, name_choice), ~ memisc::relabel(., "Don't know" = "Not sure"))
You can make a helper function using labelled::val_label<-, then call it in mutate(across()):
library(dplyr)
library(labelled)
set.seed(13) # for OP's sample data
# Rename the label attached to a single value, leaving the other labels alone
change_value_label <- function(x, value, new_label) {
  val_label(x, value) <- new_label
  x
}
data %>%
  mutate(across(
    colour_choice:name_choice,
    \(x) change_value_label(x, 1, "Not sure")
  ))
# # A tibble: 10 × 3
#    respondent_ID colour_choice name_choice
#            <int> <int+lbl>     <int+lbl>
#  1             1 3 [Blue]      5 [George]
#  2             2 5 [Yellow]    4 [Ringo]
#  3             3 2 [Red]       1 [Not sure]
#  4             4 5 [Yellow]    5 [George]
#  5             5 4 [Green]     1 [Not sure]
#  6             6 5 [Yellow]    4 [Ringo]
#  7             7 4 [Green]     1 [Not sure]
#  8             8 3 [Blue]      3 [Paul]
#  9             9 1 [Not sure]  4 [Ringo]
# 10            10 2 [Red]       5 [George]
You could tweak the function to take the label to be changed rather than the value (e.g., swap_value_label(x, "Don't know", "Not sure")). This has the added benefit of working even if the label to be changed doesn't have a consistent value across different variables.
swap_value_label <- function(x, old_label, new_label) {
  # Look up the value currently attached to old_label, then relabel that value
  value <- val_labels(x)[[old_label]]
  val_label(x, value) <- new_label
  x
}
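As a quick check of that last point, here is a small sketch with two hypothetical vectors, a and b, in which "Don't know" is coded 1 and 9 respectively. The helper is repeated so the block runs on its own; the vectors are made up for illustration and are not from the question's data.

```r
library(haven)
library(labelled)

# Helper from above, repeated so this block is self-contained
swap_value_label <- function(x, old_label, new_label) {
  value <- val_labels(x)[[old_label]]
  val_label(x, value) <- new_label
  x
}

# Hypothetical vectors: "Don't know" is coded 1 in a and 9 in b
a <- labelled(c(1L, 2L, 1L), labels = c("Don't know" = 1L, "Red" = 2L))
b <- labelled(c(9L, 3L, 9L), labels = c("Paul" = 3L, "Don't know" = 9L))

a2 <- swap_value_label(a, "Don't know", "Not sure")
b2 <- swap_value_label(b, "Don't know", "Not sure")

# "Not sure" now sits on value 1 in a2 and on value 9 in b2
val_labels(a2)
val_labels(b2)
```

One caveat, as far as I can tell: if a variable has no label called old_label, the [[ lookup fails with a subscript error, so this version assumes every selected column actually carries that label.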