I have many factor vectors in a tibble. It's a survey, so the levels are letter codes.
The survey tool (a clicker) records the order in which the letters were chosen, which may or may not be useful depending on the question.
I am seeking a tidy function or a process by which to collapse the factor levels with matching letters. I.e., "B,A" = "A,B" and this collapses to just "A,B".
Or "B,C,A" = "C,A,B" = "A,B,C" or any combination of the letters A,B,C. I can have up to 5 letters max in a factor level, so it can get complicated quickly.
Should I convert each factor to a character vector and then use stringi or grepl to break it into multiple columns? I have numerous columns, so I am looking for a slick solution. Any ideas?
Here is an example of a simple string in my data:
library(magrittr)  # provides %>%
string <- c("E", "C", "A", "A,B", "A,B,C", "B,A", "C,A,B") %>% as.factor()
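Convert the factor to character (strsplit() only accepts character vectors), split each value on the commas, sort the pieces, paste them back together, and turn the result into a factor again:
string %>% as.character() %>%
  strsplit(split = ",", fixed = TRUE) %>%
  lapply(sort) %>%
  sapply(paste, collapse = ",") %>%
  factor()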
# [1] E C A A,B A,B,C A,B A,B,C
# Levels: A A,B A,B,C C E
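Since the question mentions numerous columns, here is a rough sketch of how the same split/sort/paste idea might be wrapped in a function and applied to every factor column of a tibble. The helper name `collapse_levels`, the tibble `survey`, and its columns `q1` and `q2` are made up for illustration; only the first column mirrors the example data above. The sketch assumes dplyr 1.0.0 or later for `across()`. It rewrites the levels rather than every value, relying on the fact that assigning duplicate labels to `levels()` merges those levels.

```r
library(dplyr)

# Hypothetical helper: put the letters inside each factor level in sorted order.
collapse_levels <- function(f, sep = ",") {
  f <- as.factor(f)
  # Work on the (few) levels rather than on every observation
  parts <- strsplit(levels(f), split = sep, fixed = TRUE)
  new_levels <- vapply(
    parts,
    function(p) paste(sort(p), collapse = sep),
    character(1)
  )
  # Assigning duplicated labels merges the corresponding levels
  levels(f) <- new_levels
  f
}

# Made-up example data; q1 matches the vector from the question
survey <- tibble(
  q1 = factor(c("E", "C", "A", "A,B", "A,B,C", "B,A", "C,A,B")),
  q2 = factor(c("B,A", "A,B", "D", "D,C", "C,D", "A", "A"))
)

# Apply the helper to every factor column at once
survey_clean <- survey %>%
  mutate(across(where(is.factor), collapse_levels))

levels(survey_clean$q1)
levels(survey_clean$q2)
```

Working on the levels keeps the cost proportional to the number of distinct answer codes instead of the number of rows, which may matter for a large survey. If some columns are stored as character rather than factor, the `where(is.factor)` selection would need to be widened accordingly.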