Finally! TidyEval is getting easier, which led me to do a pronoun test between the magrittr `.` pronoun and the rlang pronoun `.data`.
``` r
library(tidyverse)
identical(head(iris, 2) %>% mutate(col = .$Species),
          head(iris, 2) %>% mutate(col = .data$Species))
#> [1] TRUE
```
Look at that. They are exactly the same. Except they're probably not. From the article linked above:
> The `.` pronoun from magrittr is not appropriate here because it represents the whole data frame, whereas `.data` represents the subset for the current group.
What are the differences? You're probably thinking, "Just read the sentence you pasted above." Unfortunately, I need a little more explanation if you can provide it. Some examples would be nice. The first thing I thought of trying (the code above) shows the two pronouns as "identical". I'm sensing a contradiction here. Thank you.
Hopefully this will illustrate the quote in your question:
``` r
library(dplyr)
iris[48:52,] %>%
  group_by(Species) %>%
  transmute(
    Sepal.Length,
    col0 = mean(Sepal.Length),
    col1 = mean(.$Sepal.Length),
    col2 = mean(.data$Sepal.Length))
#> # A tibble: 5 x 5
#> # Groups:   Species [2]
#>   Species    Sepal.Length  col0  col1  col2
#>   <fct>             <dbl> <dbl> <dbl> <dbl>
#> 1 setosa              4.6  4.97  5.66  4.97
#> 2 setosa              5.3  4.97  5.66  4.97
#> 3 setosa              5    4.97  5.66  4.97
#> 4 versicolor          7    6.7   5.66  6.7
#> 5 versicolor          6.4  6.7   5.66  6.7
```
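To tie this back to the `identical()` test in the question, here is a small sketch that runs the same comparison with and without `group_by()`. The variable names (`rows`, `same_ungrouped`, `same_grouped`) are made up for the illustration. Without grouping, the whole data frame is the only "group", so the two pronouns see the same rows and agree; with grouping they no longer do.

```r
library(dplyr)

rows <- iris[48:52, ]

# Ungrouped: `.` and `.data` both cover all five rows.
ungrouped <- rows %>%
  mutate(a = mean(.$Sepal.Length),
         b = mean(.data$Sepal.Length))
same_ungrouped <- isTRUE(all.equal(ungrouped$a, ungrouped$b))
same_ungrouped
#> [1] TRUE

# Grouped: `.` is still the whole data frame, `.data` is the current group.
grouped <- rows %>%
  group_by(Species) %>%
  mutate(a = mean(.$Sepal.Length),
         b = mean(.data$Sepal.Length))
same_grouped <- isTRUE(all.equal(grouped$a, grouped$b))
same_grouped
#> [1] FALSE
```

So the test in the question came out `TRUE` only because there was no grouping involved.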
I think some like to use it to pass arguments as strings without the `!!sym(foo)` gymnastics:
``` r
col <- "Species"
iris[48:52,] %>%
  mutate(
    SPECIES1 = toupper(!!sym(col)),
    SPECIES2 = toupper(.data[[col]]))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width    Species   SPECIES1
#> 1          4.6         3.2          1.4         0.2     setosa     SETOSA
#> 2          5.3         3.7          1.5         0.2     setosa     SETOSA
#> 3          5.0         3.3          1.4         0.2     setosa     SETOSA
#> 4          7.0         3.2          4.7         1.4 versicolor VERSICOLOR
#> 5          6.4         3.2          4.5         1.5 versicolor VERSICOLOR
#>     SPECIES2
#> 1     SETOSA
#> 2     SETOSA
#> 3     SETOSA
#> 4 VERSICOLOR
#> 5 VERSICOLOR
```
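The string-passing use tends to show up when the column name arrives as a function argument. Below is a minimal sketch of that, assuming a helper I'm calling `mean_by_species()` with a `value_col` parameter (both names are hypothetical, not from any package):

```r
library(dplyr)

# Hypothetical helper: `value_col` is a column name given as a string.
mean_by_species <- function(df, value_col) {
  df %>%
    group_by(Species) %>%
    # .data[[value_col]] looks the column up by name within the current group
    summarise(mean_value = mean(.data[[value_col]]))
}

res <- mean_by_species(iris[48:52, ], "Sepal.Length")
```

With the five rows used above, `res` should have one row per species, with the same group means as `col0` and `col2` in the first example.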
For what it's worth, I've had to use `.data` maybe 3 times all in all, and when I did there was probably a better way to go about it. I think one or two of those were with `ggplot2`.
You can mostly ignore the existence of `.data` and still become a very decent tidyverse ninja.