I am trying to automatically change the labels of a factor column using tidyverse code, and I am having trouble changing the labels based on a simple function.
Some example data would look like:
  subjectid Parameter   value
  <chr>     <fct>       <dbl>
1 13        alpha_IST  0.0751
2 13        alpha_IEX 15.7
3 13        alpha_CB   0.236
4 15        alpha_IST  0.0680
5 15        alpha_IEX 16.5
6 15        alpha_CB   0.282
7 17        alpha_IST  0.0793
(To reproduce, the output from dput on the first 6 rows is listed below)
structure(
  list(
    subjectid = c("13", "13", "13", "15", "15", "15"),
    Parameter = structure(c(3L, 2L, 1L, 3L, 2L, 1L), .Label = c("alpha_CB", "alpha_IEX", "alpha_IST"), class = "factor"),
    value = c(0.0751, 15.7, 0.236, 0.0680, 16.5, 0.282)
  ),
  row.names = c(NA, -6L),
  class = c("tbl_df", "tbl", "data.frame")
)
I am trying to strip out the redundant first half of the Parameter labels (i.e., remove alpha_).
Given that the above object is called medians, I can do this using:
par_labels <- sapply(
  strsplit(levels(medians$Parameter), "_"),
  function(x) {
    x[2]
  }
)
medians %>% mutate(Parameter = factor(Parameter, labels = par_labels))
It seems I should be able to build this same functionality using the fct_relabel function, but I cannot get it to work.
I have tried:
medians %>%
  mutate(Parameter = fct_relabel(Parameter, function(x) {
    strsplit(x, "_")[2]
  }))
which gives the error: Error: Problem with mutate() input Parameter. ✖ new_levels must be a character vector.
I also tried:
medians %>%
  mutate(Parameter = fct_relabel(Parameter, function(x) {
    strsplit(x, "_")[[1]][2]
  }))
which has an error message as follows: Error: Problem with mutate() input Parameter. ✖ new_levels must be the same length as levels(f): expected 3 new levels, got 1.
I have tried other combinations with a similar lack of success. I can see that converting to a character vector, using tidyr to separate, and then converting back to a factor would work, but I feel it should be possible in a way similar to what I have tried. Is this possible?
You can use fct_relabel as follows:
library(dplyr)
library(forcats)
medians %>%
  mutate(Parameter = fct_relabel(Parameter,
    function(x) sapply(strsplit(x, "_"), `[`, 2)))
# subjectid Parameter value
# <chr> <fct> <dbl>
#1 13 IST 0.0751
#2 13 IEX 15.7
#3 13 CB 0.236
#4 15 IST 0.068
#5 15 IEX 16.5
#6 15 CB 0.282
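The reason the two attempts in the question fail is that fct_relabel passes the whole vector of levels to the function and expects a character vector of the same length back, while strsplit returns a list with one element per level. A minimal base-R sketch of the difference, using a hypothetical vector lv that stands in for levels(medians$Parameter) (vapply is used here as a stricter stand-in for sapply):

```r
# Hypothetical stand-in for levels(medians$Parameter)
lv <- c("alpha_CB", "alpha_IEX", "alpha_IST")

# strsplit returns a list with one character vector per level
parts <- strsplit(lv, "_")

# First attempt: `[2]` on a list returns a list of length 1,
# hence "new_levels must be a character vector"
is.list(parts[2])

# Second attempt: a single string taken from the first level only,
# hence "expected 3 new levels, got 1"
parts[[1]][2]

# Working version: take the second piece of every element,
# giving one string per level
new_lv <- vapply(parts, `[`, character(1), 2)
new_lv
```

Here new_lv has the same length as lv, which is what fct_relabel requires.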
However, for this problem, this is what I would use in base R:
levels(medians$Parameter) <- sub('.*_', '', levels(medians$Parameter))
Or with fct_relabel:
medians %>%
  mutate(Parameter = fct_relabel(Parameter, ~ sub('.*_', '', .x)))
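One caveat, in case some labels contain more than one underscore: the pattern '.*_' is greedy, so it removes everything up to the last underscore, whereas the strsplit approach keeps only the second piece. A small sketch with a hypothetical extra level "alpha_IST_x" (not part of the example data) shows the difference, along with an anchored pattern that removes only the literal alpha_ prefix:

```r
# "alpha_IST_x" is a hypothetical level added to illustrate the edge case
lv <- c("alpha_CB", "alpha_IEX", "alpha_IST", "alpha_IST_x")

# Greedy match: strips through the LAST underscore
greedy <- sub(".*_", "", lv)

# Anchored match: strips only the leading "alpha_"
prefix <- sub("^alpha_", "", lv)

greedy
prefix
```

For the data in the question every label has exactly one underscore, so all of these approaches should give the same result.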