I am using the tidyverse to filter a data frame and would like to print the dimensions (or number of rows) of the intermediate objects at each step. I thought I could simply use the tee pipe operator (%T>%) from magrittr, but it doesn't work. I think I understand the concept behind the tee pipe but can't figure out what is wrong. I searched extensively but didn't find many resources about the tee pipe.
I built a simple example with the mtcars dataset. Printing the intermediate objects works, but not if I replace print() with dim() or nrow().
library(tidyverse)
library(magrittr)
mtcars %>%
  filter(cyl > 4) %T>% dim() %>%
  filter(am == 0) %T>% dim() %>%
  filter(disp >= 200) %>% dim()
I can of course write that in base R but would like to stick to the tidyverse spirit. I have probably overlooked something about the tee pipe concept, and any comments/solutions will be greatly appreciated.
EDIT: Following @hrbrmstr's and @akrun's nice and quick answers, I tried again to stick to the tee pipe operator without writing a function. I don't know why I didn't find the answer earlier myself, but here is the syntax I was looking for:
mtcars %>%
  filter(cyl > 4) %T>% {print(dim(.))} %>%
  filter(am == 0) %T>% {print(dim(.))} %>%
  filter(disp >= 200) %>% {print(dim(.))}
Despite the need for a function, @hrbrmstr's solution is indeed easier to "clean up".
@akrun's idea works, but it's not idiomatic tidyverse. Other functions in the tidyverse, like print() and glimpse(), return the data parameter invisibly so they can be piped without resorting to {}. Those {} make it difficult to clean up pipes after you're done exploring what's going on.
Try:
library(tidyverse)
tidydim <- function(x) {
  print(dim(x))
  invisible(x)
}
mtcars %>%
  filter(cyl > 4) %>%
  tidydim() %>%
  filter(., am == 0) %>%
  tidydim() %>%
  filter(., disp >= 200) %>%
  tidydim()
That way your "cleanup" (i.e. not producing interim console output) can be to quickly/easily remove the tidydim() lines or remove the print(…) from the function.
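As for why the original %T>% dim() attempt shows nothing: the tee pipe evaluates its right-hand side but returns the left-hand side, so the value of dim() is computed and then discarded, and nothing is auto-printed in the middle of a pipeline. The right-hand side needs a visible side effect such as print(). Here is a minimal sketch of that behaviour; the names tee_result and helper_result are only illustrative, and it assumes nothing beyond magrittr and the built-in mtcars data:

```r
library(magrittr)

# Same helper as in the answer: print the dimensions as a side effect,
# then return the input invisibly so the pipe can continue.
tidydim <- function(x) {
  print(dim(x))
  invisible(x)
}

# %T>% runs dim() but returns its left-hand side, so the result of dim()
# is thrown away and nothing is printed by this line.
tee_result <- mtcars %T>% dim()

# tidydim() prints the dimensions and hands the data on unchanged.
helper_result <- mtcars %>% tidydim()

# Both pipelines pass the data frame through untouched.
identical(tee_result, mtcars)
identical(helper_result, mtcars)
```

Both approaches leave the data unchanged; only the helper (or an explicit print() inside {}) produces console output.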