
Concatenating two text columns in dplyr


My data looks like this:

round <- c(rep("A", 3), rep("B", 3))
experiment <- rep(c("V1", "V2", "V3"), 2)
results <- rnorm(mean = 10, n = 6)

df <- data.frame(round, experiment, results)

> df
  round experiment   results
1     A         V1  9.782025
2     A         V2  8.973996
3     A         V3  9.271109
4     B         V1  9.374961
5     B         V2  8.313307
6     B         V3 10.837787

I have a different dataset that will be merged with this one, in which each combination of round and experiment is a unique row value, e.g. "A_V1". So what I really want is a new column that concatenates the two columns. However, this is harder to do in dplyr than I expected. I tried:

name_mix <- paste0(df$round, "_", df$experiment)
new_df <- df %>%
  mutate(name = name_mix) %>%
  select(name, results)

But I got the error "Column name must be length 1 (the group size), not 6". I also tried the simple base R approach, cbind(df, name_mix), but received a similar error telling me that df and name_mix were of different sizes. What am I doing wrong?


Solution

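  • You can use the unite() function from tidyr:

    library(tidyverse)  # attaches tidyr (unite) and dplyr (%>%)
    
    # Combine round and experiment into one column; the default separator is "_"
    df %>% 
      unite(round_experiment, c("round", "experiment"))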
    
      round_experiment   results
    1             A_V1  8.797624
    2             A_V2  9.721078
    3             A_V3 10.519000
    4             B_V1  9.714066
    5             B_V2  9.952211
    6             B_V3  9.642900
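
As an alternative to unite(), the concatenation can also be done inside mutate() itself. The sketch below rests on an assumption: the error text mentions "the group size", which suggests that df carried a grouping (from group_by() or rowwise()) when mutate() ran, although the code in the question does not create one. The group_by() call here only simulates that situation, and the second data frame `other` with its `label` column is hypothetical, standing in for the dataset to be merged.

```r
library(dplyr)

set.seed(1)  # arbitrary seed so the illustrative values are reproducible
df <- data.frame(
  round      = rep(c("A", "B"), each = 3),
  experiment = rep(c("V1", "V2", "V3"), times = 2),
  results    = rnorm(n = 6, mean = 10)
)

# Assumption: simulate a grouping left over from an earlier step.
grouped <- group_by(df, round, experiment)

new_df <- grouped %>%
  ungroup() %>%                                      # drop the grouping first
  mutate(name = paste0(round, "_", experiment)) %>%  # build the key from the columns
  select(name, results)

# Hypothetical second dataset keyed by the combined name.
other <- data.frame(
  name  = c("A_V1", "A_V2", "A_V3", "B_V1", "B_V2", "B_V3"),
  label = c("a1", "a2", "a3", "b1", "b2", "b3")
)

# Join the two datasets on the shared key column.
merged <- left_join(new_df, other, by = "name")
```

Referring to the columns by name inside mutate() avoids passing in an external vector whose length may not match the group size. If the original columns should be kept alongside the combined one, unite() also accepts remove = FALSE.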