Search code examples
rdplyrtidyr

Opposite of tidyr::separate, concatenating multiple columns into one


I have a data frame:

df <- data.frame(
    id = c(1, 2, 3),
    `1` = c("W4", "W5", 49),
    `2` = c("L", "O1", "P6"),
    `3` = c(1, 2, 10),
    `4` = c("H7", NA, "K"),
    `5` = c("J8", NA, NA)
)

How can I concatenate/paste the columns together with sep = ","

(The opposite of tidyr::separate(), I guess?)

Desired output:

id  string
1   W4, L, 1, H7, J8
2   W5, O1, 2
3   49, P6, 10, K

I'm wary of using paste because in my real dataset I have 1000 columns.


Solution

  • You can use the unite function from tidyr:

    library(tidyr)
    unite(df, string, X1:X5, sep = ", ")
    #  id            string
    #1  1  W4, L, 1, H7, J8
    #2  2 W5, O1, 2, NA, NA
    #3  3 49, P6, 10, K, NA
    

    Note that it also has a remove argument that is TRUE by default. If you set it to FALSE, the original columns are kept in the data.

    For the column specification (which columns to unite) you can use the colon operator (:) as I did above or use the special functions described in ?dplyr::select.