r dplyr distinct data-manipulation

R: "stack" columns on top of each other


Suppose I have some data like this:

library(dplyr)

a = data.frame( "col" = c("red", "red", "green"), "coll" = c("blue", "blue", "yellow"))

I am trying to take all unique values from "a" and put them into a new frame:

final = data.frame("col" = c("red", "green", "blue", "yellow"))

I tried the following approach:

first_col = a %>% distinct(col)
second_col = a %>% distinct(coll)

final = cbind(first_col, second_col)

But this does not seem to be correct.

Could someone please show me what I am doing wrong?

Thanks


Solution

  • You could unlist the data frame into a vector and take the unique values from it.

    final <- data.frame(col = unique(unlist(a)))
    final
    #     col
    #1    red
    #2  green
    #3   blue
    #4 yellow
    
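    As for what went wrong in the attempt: cbind() binds columns side by side, so it cannot stack them. The sketch below uses base R unique() as a stand-in for distinct() and assumes R >= 4.0, where strings are not converted to factors. The names side_by_side and stacked are only illustrative.

    ```r
    a <- data.frame(col  = c("red", "red", "green"),
                    coll = c("blue", "blue", "yellow"))

    # cbind() puts the two de-duplicated columns next to each other.
    # It only lines up here because both columns happen to have two distinct values.
    side_by_side <- cbind(unique(a["col"]), unique(a["coll"]))
    dim(side_by_side)
    # [1] 2 2

    # c() joins the columns end to end, which is the "stacking" asked for.
    stacked <- data.frame(col = unique(c(a$col, a$coll)))
    stacked$col
    # [1] "red"    "green"  "blue"   "yellow"
    ```

    unlist(a) does the same end-to-end concatenation for any number of columns, which is why it is used above.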

    A more general tidyverse solution is to reshape the data into long format and keep the distinct values.

    library(dplyr)
    library(tidyr)
    
    a %>%
      pivot_longer(cols = everything()) %>%
      distinct(value)
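
    Two details of the tidyverse version may matter. The result column is called value by default, and pivot_longer() walks the data row by row, so the distinct values come out in a different order than in the unlist() version. A minimal sketch, assuming the same data frame a and using the values_to argument to name the column col as in the question:

    ```r
    library(dplyr)
    library(tidyr)

    a <- data.frame(col  = c("red", "red", "green"),
                    coll = c("blue", "blue", "yellow"))

    final <- a %>%
      pivot_longer(cols = everything(), values_to = "col") %>%  # long format: name, col
      distinct(col)                                             # keep first occurrence of each value

    final$col
    # [1] "red"    "blue"   "green"  "yellow"
    ```

    The set of values is the same as before; only the order differs. If the column-by-column order is needed, the unlist() approach is probably the simpler choice.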