
R: Create a variable with the levels of grouped data


I have a data frame called data, such as:

data = data.frame(ID = as.factor(c("A", "A", "B","B","C","C")),
                  var.color= as.factor(c("red", "blue", "green", "red", "green", "yellow")))

Is it possible to get the levels of var.color within each ID group (A, B, C) and create a variable that pastes them together? I tried the following:

data  %>% group_by(ID) %>%
  mutate(ex = paste(droplevels(var.color), sep = "_"))

That yields:

Source: local data frame [6 x 3]
Groups: ID [3]

      ID var.color     ex
  <fctr>    <fctr>  <chr>
1      A       red    red
2      A      blue   blue
3      B     green  green
4      B       red    red
5      C     green  green
6      C    yellow yellow

However, my desired data.frame should be something like:

      ID var.color           ex
  <fctr>    <fctr>        <chr>
1      A       red     red_blue
2      A      blue     red_blue
3      B     green    green_red
4      B       red    green_red
5      C     green green_yellow
6      C    yellow green_yellow

Solution

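  • Basically, you need collapse instead of sep. sep is placed between the separate arguments passed to paste(), whereas collapse joins the elements of a single vector into one string.

    Instead of dropping levels, you can just paste the values together within each ID group: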

    library(dplyr)
    data %>%
      group_by(ID) %>%
      mutate(ex = paste(var.color, collapse = "_"))
    
    #       ID var.color           ex
    #   <fctr>    <fctr>        <chr>
    # 1      A       red     red_blue
    # 2      A      blue     red_blue
    # 3      B     green    green_red
    # 4      B       red    green_red
    # 5      C     green green_yellow
    # 6      C    yellow green_yellow
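
To see why sep had no effect in the original attempt, it may help to compare the two arguments on a single group's values in base R. This is a minimal sketch: the vectors colors and repeated are made up for illustration, and the unique() step is an assumption about what you might want if a color can occur more than once within a group (it is not needed for the example data).

```r
# One group's colors, as in group A of the example data
colors <- factor(c("red", "blue"))

# sep goes between the *arguments* of paste(). With a single vector
# argument there is nothing to separate, so each element comes back
# unchanged (converted to character).
sep_result <- paste(colors, sep = "_")
print(sep_result)
# [1] "red"  "blue"

# collapse joins the *elements* of the result into one string.
collapse_result <- paste(colors, collapse = "_")
print(collapse_result)
# [1] "red_blue"

# Hypothetical case: a value repeats within the group.
# unique() keeps each value once before collapsing.
repeated <- factor(c("red", "blue", "red"))
unique_result <- paste(unique(repeated), collapse = "_")
print(unique_result)
# [1] "red_blue"
```

Inside the grouped pipeline, the same idea would read mutate(ex = paste(unique(var.color), collapse = "_")), should duplicates within an ID need to be dropped.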