
dplyr concatenate column by variable value


I can concatenate one column of a data.frame with the code below when the column name is known in advance.

  • However, what if the column name is stored in a variable?
  • A further question: how can I specify the columns by the value of a variable? (!!sym() ?)

Here is the test code:

> library(dplyr)
> packageVersion("dplyr")
[1] ‘1.0.7’

> df <- data.frame(x = 1:3, y = c("A", "B", "A"))
> df %>%
  group_by(y) %>%
  summarise(z = paste(x, collapse = ","))

# A tibble: 2 x 2
  y     z    
  <chr> <chr>
1 A     1,3  
2 B     2  

I have a variable a with the value "x". How can I run the summarise above using that variable? A direct attempt pastes the string itself rather than the column:

> a <- "x"
> df %>%
  group_by(y) %>%
  summarise(z = paste(a, collapse = ","))

# A tibble: 2 x 2
  y     z    
  <chr> <chr>
1 A     x    
2 B     x
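
The result above follows from how paste() treats its argument: a is a length-one character vector holding "x", so paste() collapses that string rather than looking up a column. A minimal base-R sketch of the difference, independent of dplyr (the names literal and column are introduced here only for illustration):

```r
df <- data.frame(x = 1:3, y = c("A", "B", "A"))
a <- "x"

# paste() receives the string "x" itself, not the column named x
literal <- paste(a, collapse = ",")

# Indexing the data frame by name retrieves the column values instead
column <- paste(df[[a]], collapse = ",")
```

The solutions that follow all do the same thing inside summarise(): they turn the string into a reference to the column.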

Solution-1: use !!sym()

> a <- "x"
> df %>%
  group_by(y) %>%
  summarise(z = paste(!!sym(a), collapse = ","))

# A tibble: 2 x 2
  y     z    
  <chr> <chr>
1 A     1,3  
2 B     2 

Solution-2: Assign the column to new variable

> df %>%
  group_by(y) %>%
  rename(new_col = all_of(a)) %>%
  summarise(z = paste(new_col, collapse = ","))

# A tibble: 2 x 2
  y     z    
  <chr> <chr>
1 A     1,3  
2 B     2  

Are there any other ways to do the job?

Similar questions: https://stackoverflow.com/a/15935166/2530783 and https://stackoverflow.com/a/50537209/2530783


Solution

  • Here are some other options -

    1. Use .data -
    library(dplyr)
    
    a <- "x"
    df %>% group_by(y) %>% summarise(z = toString(.data[[a]]))
    
    #   y     z    
    #  <chr> <chr>
    #1 A     1, 3 
    #2 B     2    
    
    2. Use get -
    df %>% group_by(y) %>% summarise(z = toString(get(a)))
    
    3. Use as.name -
    df %>% group_by(y) %>% summarise(z = toString(!!as.name(a)))
    

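    toString(x) is shorthand for paste(x, collapse = ", "). Note that the separator is a comma followed by a space, so the result is "1, 3" rather than "1,3"; use paste(..., collapse = ",") if you need the exact output from the question.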
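
For reuse, the .data approach can be wrapped in a function, and across() with all_of() may help when several columns are named in a character vector. The following is a minimal sketch and not part of the original answer; collapse_by, df2, w, cols, res and res2 are hypothetical names introduced for illustration.

```r
library(dplyr)

df <- data.frame(x = 1:3, y = c("A", "B", "A"))

# Hypothetical helper: collapse the column named by `col` within each group of `by`
collapse_by <- function(data, col, by, sep = ",") {
  data %>%
    group_by(across(all_of(by))) %>%
    summarise(z = paste(.data[[col]], collapse = sep), .groups = "drop")
}

res <- collapse_by(df, col = "x", by = "y")

# Several columns at once: `w` is an extra column added only for illustration
df2 <- data.frame(x = 1:3, w = c("p", "q", "r"), y = c("A", "B", "A"))
cols <- c("x", "w")

res2 <- df2 %>%
  group_by(y) %>%
  summarise(across(all_of(cols), ~ paste(.x, collapse = ",")), .groups = "drop")
```

Here res should have one row per value of y with the collapsed values in z, and res2 keeps the original column names x and w, each collapsed per group. all_of() raises an error if a named column is missing, which is usually preferable to silently falling back to another object.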