I'm playing around with list-columns
of data.frames
with {dplyr} 1.0.0 and I'm wondering whether it is possible to rename()
and mutate()
columns in each data.frame
without leaving the pipe when the nested data.frame is grouped rowwise
.
Why do I want to know / do this? As far I understand the philosophy of {dplyr} 1.0.0 it is recommending rowwise()
instead of using {purrr}'s map
-family on columns. Below I first show what I did before {dplyr} 1.0.0 and then show a couple of examples (most of them not working) for {dplyr} 1.0.0.
While {rlang} supports glue strings on the left hand side (LHS) which can be used when writing {dplyr} custom functions, the LHS of {dplyr} functions in a rowwise
tibble
seems not to be supported yet (at least my examples below are not working).
For rename
I found a way using rename_with()
, but I have no idea how to get it working with mutate
.
I also do not understand most of the error message I get. They more or less say that I'm not using a string on the LHS before :=
but in rowwise
mode my referenced column (new
) is actually a character vector of length == 1
.
library(dplyr, quietly = TRUE, warn.conflicts = FALSE)
library(purrr)
myiris <- iris %>%
nest_by(Species, .key = "mydat") %>%
ungroup %>%
mutate(new = letters[1:3])
# our data looks like this
# we want to use the strings in column `new` on the LHS of `rename` and `mutate`
myiris
#> # A tibble: 3 x 3
#> Species mydat new
#> <fct> <list<tbl_df[,4]>> <chr>
#> 1 setosa [50 x 4] a
#> 2 versicolor [50 x 4] b
#> 3 virginica [50 x 4] c
# For reference: under dplyr < 1.0 I did the following:
# rename in pipe
# working
myiris %>%
mutate(mydat = map2(mydat, new,
~ rename_at(.x, "Sepal.Length", function(z) paste(.y)))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> # ... with 46 more rows
# mutate in pipe
# was never working even under dplyr < 1.0.0
myiris %>%
mutate(mydat = map2(mydat, new,
~ mutate(.x, eval(.y) := .y))) %>%
pull(mydat)
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `map2(mydat, new, ~mutate(.x, `:=`(eval(.y), .y)))`.
# mutate with custom function
# working
mymutate <- function(df, y) {
mutate(df, !! y := y)
}
myiris %>%
mutate(mydat = map2(mydat, new,
~ mymutate(.x, .y))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width a
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width b
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width c
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> # ... with 46 more rows
# dplyr > 1.0.0
# objective: `rename()` or `mutate()` in pipe on list-column of data.frames
# while using different column names on LHS coming from another
# column (here `new`)
myiris_row <- myiris %>% rowwise
# rename --------
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% rename({{new}} := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% rename(!! new := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% rename(!! sym(new) := "Sepal.Length")))
#> Error: Only strings can be converted to symbols
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% rename(all_of(new) := "Sepal.Length")))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% rename(`:=`(all_of(new), "Sepal.Length")))`.
#> i The error occured in row 1.
# working, but only with `rename_with()`
myiris_row %>%
mutate(mydat = list(mydat %>% rename_with(~ new, "Sepal.Length"))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> # ... with 46 more rows
# mutate ------
# the values of the new column don't matter
# here we just use the same input as the name, to show that RHS evaluation is easier.
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% mutate(!! new := new)))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(...)`.
#> i The error occured in row 1.
# not working
myiris %>%
mutate(mydat = list(mydat %>% mutate(!! sym(new) := new)))
#> Error: Only strings can be converted to symbols
# not working
myiris_row %>%
mutate(mydat = list(mydat %>% mutate(all_of(new) := new)))
#> Error: Problem with `mutate()` input `mydat`.
#> x The LHS of `:=` must be a string or a symbol
#> i Input `mydat` is `list(mydat %>% mutate(`:=`(all_of(new), new)))`.
#> i The error occured in row 1.
# almost working (what's going on in the data[[1]] btw!)
myiris_row %>%
mutate(mydat = list(mydat %>% mutate("{{new}}" := new))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `promise_fn(3L)`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> # ... with 46 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `"b"`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> # ... with 46 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width `"c"`
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> # ... with 46 more rows
Created on 2020-12-22 by the reprex package (v0.3.0)
You can protect your !!
from the outside call by using quote()
, and then use !!
again in your nested call to unquote it :
myiris_row %>%
mutate(mydat = list(mydat %>% mutate(!! quote(!!new) := new))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width a
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.1 3.5 1.4 0.2 a
#> 2 4.9 3 1.4 0.2 a
#> 3 4.7 3.2 1.3 0.2 a
#> 4 4.6 3.1 1.5 0.2 a
#> 5 5 3.6 1.4 0.2 a
#> 6 5.4 3.9 1.7 0.4 a
#> 7 4.6 3.4 1.4 0.3 a
#> 8 5 3.4 1.5 0.2 a
#> 9 4.4 2.9 1.4 0.2 a
#> 10 4.9 3.1 1.5 0.1 a
#> # ... with 40 more rows
#>
#> [[2]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width b
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 7 3.2 4.7 1.4 b
#> 2 6.4 3.2 4.5 1.5 b
#> 3 6.9 3.1 4.9 1.5 b
#> 4 5.5 2.3 4 1.3 b
#> 5 6.5 2.8 4.6 1.5 b
#> 6 5.7 2.8 4.5 1.3 b
#> 7 6.3 3.3 4.7 1.6 b
#> 8 4.9 2.4 3.3 1 b
#> 9 6.6 2.9 4.6 1.3 b
#> 10 5.2 2.7 3.9 1.4 b
#> # ... with 40 more rows
#>
#> [[3]]
#> # A tibble: 50 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width c
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 6.3 3.3 6 2.5 c
#> 2 5.8 2.7 5.1 1.9 c
#> 3 7.1 3 5.9 2.1 c
#> 4 6.3 2.9 5.6 1.8 c
#> 5 6.5 3 5.8 2.2 c
#> 6 7.6 3 6.6 2.1 c
#> 7 4.9 2.5 4.5 1.7 c
#> 8 7.3 2.9 6.3 1.8 c
#> 9 6.7 2.5 5.8 1.8 c
#> 10 7.2 3.6 6.1 2.5 c
#> # ... with 40 more rows
myiris_row %>%
mutate(mydat = list(mydat %>% rename(!! quote(!!new) := "Sepal.Length"))) %>%
pull(mydat)
#> [[1]]
#> # A tibble: 50 x 4
#> a Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5.1 3.5 1.4 0.2
#> 2 4.9 3 1.4 0.2
#> 3 4.7 3.2 1.3 0.2
#> 4 4.6 3.1 1.5 0.2
#> 5 5 3.6 1.4 0.2
#> 6 5.4 3.9 1.7 0.4
#> 7 4.6 3.4 1.4 0.3
#> 8 5 3.4 1.5 0.2
#> 9 4.4 2.9 1.4 0.2
#> 10 4.9 3.1 1.5 0.1
#> # ... with 40 more rows
#>
#> [[2]]
#> # A tibble: 50 x 4
#> b Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 7 3.2 4.7 1.4
#> 2 6.4 3.2 4.5 1.5
#> 3 6.9 3.1 4.9 1.5
#> 4 5.5 2.3 4 1.3
#> 5 6.5 2.8 4.6 1.5
#> 6 5.7 2.8 4.5 1.3
#> 7 6.3 3.3 4.7 1.6
#> 8 4.9 2.4 3.3 1
#> 9 6.6 2.9 4.6 1.3
#> 10 5.2 2.7 3.9 1.4
#> # ... with 40 more rows
#>
#> [[3]]
#> # A tibble: 50 x 4
#> c Sepal.Width Petal.Length Petal.Width
#> <dbl> <dbl> <dbl> <dbl>
#> 1 6.3 3.3 6 2.5
#> 2 5.8 2.7 5.1 1.9
#> 3 7.1 3 5.9 2.1
#> 4 6.3 2.9 5.6 1.8
#> 5 6.5 3 5.8 2.2
#> 6 7.6 3 6.6 2.1
#> 7 4.9 2.5 4.5 1.7
#> 8 7.3 2.9 6.3 1.8
#> 9 6.7 2.5 5.8 1.8
#> 10 7.2 3.6 6.1 2.5
#> # ... with 40 more rows