Split a column of character vectors and return a list


I have the following dataframe:

df <- data.frame(Sl.No = c(1:6),
                 Variable = c('a', 'a,b', 'a,b,c', 'b', 'c', 'b,c'))


   Sl.No   Variable
   1         a
   2         a,b
   3         a,b,c
   4         b
   5         c
   6         b,c

I want to extract the unique values in the Variable column as a list:

myList <- c("a", "b", "c")

I have tried the following code:

separator <- function(x) strsplit(x, ",")[[1]][[1]]
unique(sapply(df$Variable, separator))

This however gives me the following output:

"a"

I request some help. I have searched but seem unable to find an answer to this.


Solution

  • Split the Variable column at ",", flatten the resulting list with unlist(), and keep only the unique values.

    unique(unlist(strsplit(df$Variable, ",")))
    #[1] "a" "b" "c"
    

    If the Variable column is a factor, convert it to character with as.character() before calling strsplit(), because strsplit() accepts only character vectors.
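
The factor case can be sketched as follows. This is a minimal illustration, not part of the original answer: `df_factor` and `values` are hypothetical names, and `stringsAsFactors = TRUE` is set explicitly to reproduce the factor column that older R versions created by default.

```r
# Hypothetical variant of df in which Variable is stored as a factor
df_factor <- data.frame(Sl.No = 1:6,
                        Variable = c('a', 'a,b', 'a,b,c', 'b', 'c', 'b,c'),
                        stringsAsFactors = TRUE)

# strsplit() needs a character vector, so convert the factor first,
# then flatten the list of pieces and drop duplicates
values <- unique(unlist(strsplit(as.character(df_factor$Variable), ",")))
values
#[1] "a" "b" "c"
```

The result is a character vector. If an actual R list is needed, wrapping it as as.list(values) should work.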