I have been looking at a lot of answers and I still can't completely understand them. For example, the clearest one (here), among others (1, 2, 3), gives specific examples of the various uses of the dot, but I cannot understand, for example, its application here:
car_data <-
mtcars %>%
subset(hp > 100) %>%
aggregate(. ~ cyl, data = ., FUN = . %>% mean %>% round(2)) %>%
transform(kpl = mpg %>% multiply_by(0.4251)) %>%
print
#result:
cyl mpg disp hp drat wt qsec vs am gear carb kpl
1 4 25.90 108.0 111.0 3.94 2.15 17.75 1.00 1.00 4.50 2.00 11.010
2 6 19.74 183.3 122.3 3.59 3.12 17.98 0.57 0.43 3.86 3.43 8.391
3 8 15.10 353.1 209.2 3.23 4.00 16.77 0.00 0.14 3.29 3.50 6.419
The code above is from an explanation of %>% in magrittr, where I'm also trying to understand the pipe operator. I know that it gives you the result of the previous computation, but I get lost in the aggregate line, where it mixes . and %>% inside the same function.
So I can't understand what the code above does. I have the result (shown above), but I don't get how it reaches that result, especially in the aggregate line, where it uses the dot and the ~ sign. I know that ~ means "all other variables", but what does it mean together with the dot? Does the dot have another meaning or application? And what does the pipe operator do inside a specific function?
That line uses the . in three different ways.
         [1]             [2]      [3]
aggregate(. ~ cyl, data = ., FUN = . %>% mean %>% round(2))
Generally speaking, you pass the value from the pipe into your function at a specific location with ., but there are some exceptions. One exception is when the . is in a formula. The ~ is used to create formulas in R. The pipe won't change the meaning of the formula, so the formula behaves just as it would in a call without the pipe. For example:
aggregate(. ~ cyl, data=mydata)
The . is there only because aggregate requires a formula with both a left-hand and a right-hand side. So the . at [1] just means "all the other columns in the dataset." This use is not at all related to magrittr.
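To see that the formula dot has nothing to do with the pipe, here is a small sketch that uses only base R. The built-in mtcars data and the pair of columns (mpg, hp) are my own choices for illustration.

```r
# Base R only: magrittr is not loaded, so this dot cannot be a pipe placeholder.
by_cyl <- aggregate(. ~ cyl, data = mtcars, FUN = mean)

# Writing out part of the left-hand side by hand gives the same values
# for those columns as the dot version does.
explicit <- aggregate(cbind(mpg, hp) ~ cyl, data = mtcars, FUN = mean)

names(by_cyl)                          # cyl first, then every other column
all.equal(by_cyl$mpg, explicit$mpg)
all.equal(by_cyl$hp, explicit$hp)
```

The dot on the left of ~ is simply shorthand for "every column not already named in the formula", which aggregate resolves against data.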
The . at [2] is the value that's being passed in by the pipe. If you have a plain . as a parameter to the function, that's where the value will be placed. So the result of the subset() will go to the data= parameter.
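A minimal way to check this yourself, assuming magrittr is installed. The name powerful is just something I made up for the intermediate data frame, and I use plain mean to keep the comparison simple.

```r
library(magrittr)  # provides %>%

# Hypothetical intermediate name for the subset() result.
powerful <- subset(mtcars, hp > 100)

# Piped: the dot after data = is replaced by the left-hand side, `powerful`.
piped <- powerful %>% aggregate(. ~ cyl, data = ., FUN = mean)

# Unpiped: the same call with the data frame written in by hand.
direct <- aggregate(. ~ cyl, data = powerful, FUN = mean)

all.equal(piped, direct)
```

Note that the dot inside the formula is left alone in both versions; only the plain dot given to data= is filled in by the pipe.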
The magrittr library also allows you to define anonymous functions with the . variable. If you have a chain that starts with a ., it's treated like a function, so
. %>% mean %>% round(2)
is the same as
function(x) round(mean(x), 2)
so you're just creating a custom function with the . at [3].
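Putting the three pieces together, here is a rough sketch of the function-building dot on its own, followed by what I believe is an equivalent unpiped version of the whole pipeline. The names rounded_mean and x and the sample numbers are invented for illustration, and I have replaced multiply_by(0.4251) with a plain multiplication.

```r
library(magrittr)

# A chain that starts with a bare dot is not evaluated; it builds a function.
rounded_mean <- . %>% mean %>% round(2)
is.function(rounded_mean)

x <- c(1.234, 5.678, 9.1011)   # made-up values, only for illustration
rounded_mean(x)
round(mean(x), 2)              # the same computation without the pipe

# The full pipeline from the question, written without any pipes.
car_data <- transform(
  aggregate(. ~ cyl,                               # [1] formula dot
            data = subset(mtcars, hp > 100),       # [2] what the pipe passed in
            FUN = function(x) round(mean(x), 2)),  # [3] the anonymous function
  kpl = mpg * 0.4251
)
print(car_data)
```

Reading it inside out gives the order in which the pipe runs the steps: subset, then aggregate, then transform, then print.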