Search code examples
rdplyrrlangrowwise

R {dplyr}: `rename` or `mutate` data.frames in `rowwise` list-column with different column names on LHS


I'm playing around with list-columns of data.frames with {dplyr} 1.0.0 and I'm wondering whether it is possible to rename() and mutate() columns in each data.frame without leaving the pipe when the nested data.frame is grouped rowwise.

Why do I want to know / do this? As far I understand the philosophy of {dplyr} 1.0.0 it is recommending rowwise() instead of using {purrr}'s map-family on columns. Below I first show what I did before {dplyr} 1.0.0 and then show a couple of examples (most of them not working) for {dplyr} 1.0.0.

While {rlang} supports glue strings on the left hand side (LHS) which can be used when writing {dplyr} custom functions, the LHS of {dplyr} functions in a rowwise tibble seems not to be supported yet (at least my examples below are not working).

For rename I found a way using rename_with(), but I have no idea how to get it working with mutate.

I also do not understand most of the error message I get. They more or less say that I'm not using a string on the LHS before := but in rowwise mode my referenced column (new) is actually a character vector of length == 1.

library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(purrr)

myiris <- iris %>% 
  nest_by(Species, .key = "mydat") %>% 
  ungroup %>% 
  mutate(new = letters[1:3])

# our data looks like this
# we want to use the strings in column `new` on the LHS of `rename` and `mutate`
myiris
#> # A tibble: 3 x 3
#>   Species                 mydat new  
#>   <fct>      <list<tbl_df[,4]>> <chr>
#> 1 setosa               [50 x 4] a    
#> 2 versicolor           [50 x 4] b    
#> 3 virginica            [50 x 4] c

# For reference: under dplyr < 1.0 I did the following:

# rename in pipe
# working
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ rename_at(.x, "Sepal.Length", function(z) paste(.y)))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows

# mutate in pipe
# was never working even under dplyr < 1.0.0
myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mutate(.x, eval(.y) := .y))) %>% 
  pull(mydat)
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `map2(mydat, new, ~mutate(.x, `:=`(eval(.y), .y)))`.

# mutate with custom function
# working
mymutate <- function(df, y) {
  mutate(df, !! y := y)
}

myiris %>% 
  mutate(mydat = map2(mydat, new,
                      ~ mymutate(.x, .y))) %>% 
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width a    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          5.1         3.5          1.4         0.2 a    
#> 2          4.9         3            1.4         0.2 a    
#> 3          4.7         3.2          1.3         0.2 a    
#> 4          4.6         3.1          1.5         0.2 a    
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width b    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width c    
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows





# dplyr > 1.0.0
# objective: `rename()` or `mutate()` in pipe on list-column of data.frames 
#            while using different column names on LHS coming from another
#            column (here `new`)

myiris_row <- myiris %>% rowwise

# rename --------
# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename({{new}} := "Sepal.Length"))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! new := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(!! sym(new) := "Sepal.Length")))  
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename(all_of(new) := "Sepal.Length")))  
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% rename(`:=`(all_of(new), "Sepal.Length")))`.
#> i The error occured in row 1.

# working, but only with `rename_with()`
myiris_row %>% 
  mutate(mydat = list(mydat %>% rename_with(~ new, "Sepal.Length")))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#>       a Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   5.1         3.5          1.4         0.2
#> 2   4.9         3            1.4         0.2
#> 3   4.7         3.2          1.3         0.2
#> 4   4.6         3.1          1.5         0.2
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 4
#>       b Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   7           3.2          4.7         1.4
#> 2   6.4         3.2          4.5         1.5
#> 3   6.9         3.1          4.9         1.5
#> 4   5.5         2.3          4           1.3
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 4
#>       c Sepal.Width Petal.Length Petal.Width
#>   <dbl>       <dbl>        <dbl>       <dbl>
#> 1   6.3         3.3          6           2.5
#> 2   5.8         2.7          5.1         1.9
#> 3   7.1         3            5.9         2.1
#> 4   6.3         2.9          5.6         1.8
#> # ... with 46 more rows


# mutate ------
# the values of the new column don't matter
# here we just use the same input as the name, to show that RHS evaluation is easier.

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(!! new := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.

# not working
myiris %>% 
  mutate(mydat = list(mydat %>% mutate(!! sym(new) := new))) 
#> Error: Only strings can be converted to symbols

# not working
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate(all_of(new) := new))) 
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% mutate(`:=`(all_of(new), new)))`.
#> i The error occured in row 1.

# almost working (what's going on in the data[[1]] btw!)
myiris_row %>% 
  mutate(mydat = list(mydat %>% mutate("{{new}}" := new)))  %>%
  pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `promise_fn(3L)`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>           
#> 1          5.1         3.5          1.4         0.2 a               
#> 2          4.9         3            1.4         0.2 a               
#> 3          4.7         3.2          1.3         0.2 a               
#> 4          4.6         3.1          1.5         0.2 a               
#> # ... with 46 more rows
#> 
#> [[2]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"b"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          7           3.2          4.7         1.4 b    
#> 2          6.4         3.2          4.5         1.5 b    
#> 3          6.9         3.1          4.9         1.5 b    
#> 4          5.5         2.3          4           1.3 b    
#> # ... with 46 more rows
#> 
#> [[3]]
#> # A tibble: 50 x 5
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width `"c"`
#>          <dbl>       <dbl>        <dbl>       <dbl> <chr>
#> 1          6.3         3.3          6           2.5 c    
#> 2          5.8         2.7          5.1         1.9 c    
#> 3          7.1         3            5.9         2.1 c    
#> 4          6.3         2.9          5.6         1.8 c    
#> # ... with 46 more rows

Created on 2020-12-22 by the reprex package (v0.3.0)


Solution

  • You can protect your !! from the outside call by using quote(), and then use !! again in your nested call to unquote it :

    myiris_row %>% 
      mutate(mydat = list(mydat %>% mutate(!! quote(!!new) := new))) %>%
      pull(mydat)
    #> [[1]]
    #> # A tibble: 50 x 5
    #>    Sepal.Length Sepal.Width Petal.Length Petal.Width a    
    #>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
    #>  1          5.1         3.5          1.4         0.2 a    
    #>  2          4.9         3            1.4         0.2 a    
    #>  3          4.7         3.2          1.3         0.2 a    
    #>  4          4.6         3.1          1.5         0.2 a    
    #>  5          5           3.6          1.4         0.2 a    
    #>  6          5.4         3.9          1.7         0.4 a    
    #>  7          4.6         3.4          1.4         0.3 a    
    #>  8          5           3.4          1.5         0.2 a    
    #>  9          4.4         2.9          1.4         0.2 a    
    #> 10          4.9         3.1          1.5         0.1 a    
    #> # ... with 40 more rows
    #> 
    #> [[2]]
    #> # A tibble: 50 x 5
    #>    Sepal.Length Sepal.Width Petal.Length Petal.Width b    
    #>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
    #>  1          7           3.2          4.7         1.4 b    
    #>  2          6.4         3.2          4.5         1.5 b    
    #>  3          6.9         3.1          4.9         1.5 b    
    #>  4          5.5         2.3          4           1.3 b    
    #>  5          6.5         2.8          4.6         1.5 b    
    #>  6          5.7         2.8          4.5         1.3 b    
    #>  7          6.3         3.3          4.7         1.6 b    
    #>  8          4.9         2.4          3.3         1   b    
    #>  9          6.6         2.9          4.6         1.3 b    
    #> 10          5.2         2.7          3.9         1.4 b    
    #> # ... with 40 more rows
    #> 
    #> [[3]]
    #> # A tibble: 50 x 5
    #>    Sepal.Length Sepal.Width Petal.Length Petal.Width c    
    #>           <dbl>       <dbl>        <dbl>       <dbl> <chr>
    #>  1          6.3         3.3          6           2.5 c    
    #>  2          5.8         2.7          5.1         1.9 c    
    #>  3          7.1         3            5.9         2.1 c    
    #>  4          6.3         2.9          5.6         1.8 c    
    #>  5          6.5         3            5.8         2.2 c    
    #>  6          7.6         3            6.6         2.1 c    
    #>  7          4.9         2.5          4.5         1.7 c    
    #>  8          7.3         2.9          6.3         1.8 c    
    #>  9          6.7         2.5          5.8         1.8 c    
    #> 10          7.2         3.6          6.1         2.5 c    
    #> # ... with 40 more rows
    
    myiris_row %>% 
      mutate(mydat = list(mydat %>% rename(!! quote(!!new) := "Sepal.Length"))) %>%
      pull(mydat)
    #> [[1]]
    #> # A tibble: 50 x 4
    #>        a Sepal.Width Petal.Length Petal.Width
    #>    <dbl>       <dbl>        <dbl>       <dbl>
    #>  1   5.1         3.5          1.4         0.2
    #>  2   4.9         3            1.4         0.2
    #>  3   4.7         3.2          1.3         0.2
    #>  4   4.6         3.1          1.5         0.2
    #>  5   5           3.6          1.4         0.2
    #>  6   5.4         3.9          1.7         0.4
    #>  7   4.6         3.4          1.4         0.3
    #>  8   5           3.4          1.5         0.2
    #>  9   4.4         2.9          1.4         0.2
    #> 10   4.9         3.1          1.5         0.1
    #> # ... with 40 more rows
    #> 
    #> [[2]]
    #> # A tibble: 50 x 4
    #>        b Sepal.Width Petal.Length Petal.Width
    #>    <dbl>       <dbl>        <dbl>       <dbl>
    #>  1   7           3.2          4.7         1.4
    #>  2   6.4         3.2          4.5         1.5
    #>  3   6.9         3.1          4.9         1.5
    #>  4   5.5         2.3          4           1.3
    #>  5   6.5         2.8          4.6         1.5
    #>  6   5.7         2.8          4.5         1.3
    #>  7   6.3         3.3          4.7         1.6
    #>  8   4.9         2.4          3.3         1  
    #>  9   6.6         2.9          4.6         1.3
    #> 10   5.2         2.7          3.9         1.4
    #> # ... with 40 more rows
    #> 
    #> [[3]]
    #> # A tibble: 50 x 4
    #>        c Sepal.Width Petal.Length Petal.Width
    #>    <dbl>       <dbl>        <dbl>       <dbl>
    #>  1   6.3         3.3          6           2.5
    #>  2   5.8         2.7          5.1         1.9
    #>  3   7.1         3            5.9         2.1
    #>  4   6.3         2.9          5.6         1.8
    #>  5   6.5         3            5.8         2.2
    #>  6   7.6         3            6.6         2.1
    #>  7   4.9         2.5          4.5         1.7
    #>  8   7.3         2.9          6.3         1.8
    #>  9   6.7         2.5          5.8         1.8
    #> 10   7.2         3.6          6.1         2.5
    #> # ... with 40 more rows