Search code examples
rdplyrtidyracross

Use ~separate after mutate and across


The aim is to transform all Species "setosa" rows to one row "setosa":(This is a minimal example(in real more columns and more groups):

I have this dataframe:

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species)

  Sepal.Length Sepal.Width Species
         <dbl>       <dbl> <fct>  
1          5.1         3.5 setosa 
2          4.9         3   setosa 

I use summarise with toString to get:

  Species Sepal.Length Sepal.Width
  <fct>   <chr>        <chr>      
1 setosa  5.1, 4.9     3.5, 3  

Expected output: I want this dataframe:

  Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
  <fct>           <dbl>         <dbl>        <dbl>        <int>
1 setosa            5.1           4.9          3.5            3

I achieve this with this working code:

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.))) %>% 
  ungroup() %>% 
  separate(Sepal.Length, c("Sepal.Length1", "Sepal.Length2"),  sep = ", ", convert = TRUE) %>% 
  separate(Sepal.Width, c("Sepal.Width1", "Sepal.Width2"),  sep = ", ", convert = TRUE)

However I would like to be able to use separate after mutate across with anonymous function with this not working code:

head(iris, 2) %>%
  select(1,2,5) %>% 
  group_by(Species) %>% 
  summarise(across(everything(), ~toString(.))) %>% 
  ungroup() %>% 
  mutate(across(-1, ~separate(., into = paste0(., 1:2), sep = ", ", convert = TRUE)))

Error: Problem with `mutate()` input `..1`.
i `..1 = across(-1, ~separate(., into = paste0(., 1:2), sep = ", ", convert = TRUE))`.
x no applicable method for 'separate' applied to an object of class "character"

I want to learn how to apply separate function after mutate and across.


Solution

  • Main issue is that separate requires input as a data.frame. We may wrap in a tibble and then do separate if we want this within across and finally unnest the list output

    library(dplyr)
    library(tidyr)
    library(stringr)
    head(iris, 2) %>%
      select(1,2,5) %>% 
      group_by(Species) %>% 
      summarise(across(everything(), ~toString(.)), .groups = 'drop') %>%
      mutate(across(-1, ~ list(tibble(col1 = .) %>% 
            separate(col1, into = str_c(cur_column(), 1:2), sep = ",\\s+")))) %>% 
      unnest(cols = c(Sepal.Length, Sepal.Width))
    

    -output

    # A tibble: 1 × 5
      Species Sepal.Length1 Sepal.Length2 Sepal.Width1 Sepal.Width2
      <fct>   <chr>         <chr>         <chr>        <chr>       
    1 setosa  5.1           4.9           3.5          3