I often apply filters in R, e.g. when selecting a particular sample or feature, or excluding implausible values. Is there a fast and easy way to keep track of how many units were dropped after every filter? I would like to save the number of observations in a .csv or .txt file. I believe Stata reports the number of observations used after every step in a log file. What can I do in R?
data(mtcars)
library(tidyverse)
sample <- mtcars %>% # 32 obs
  filter(mpg > 20) %>% # 14 obs
  filter(cyl == 4) %>% # 11 obs
  filter(am == 0) # 3 obs
You can use tidylog (a package built around the tidyverse) to print information to the console when you perform filter(s) and other tidyverse functions:
library(tidylog) # load after the tidyverse so that tidylog's filter() masks dplyr's
sample <- mtcars %>% # 32 obs
  filter(mpg > 20) %>% # 14 obs
  filter(cyl == 4) %>% # 11 obs
  filter(am == 0) # 3 obs
#filter: removed 18 rows (56%), 14 rows remaining
#filter: removed 3 rows (21%), 11 rows remaining
#filter: removed 8 rows (73%), 3 rows remaining
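tidylog prints these counts to the console, whereas the question asks for them in a .csv or .txt file. One possible way to get that is to count the rows yourself, sketched below. Note that `counted_filter()`, `row_log` and the file name `filter_log.csv` are hypothetical names made up for this illustration; they are not part of dplyr or tidylog.

```r
library(dplyr)

# Hypothetical log store: an environment, so the helper can append to it.
row_log <- new.env()
row_log$steps <- list()

# Hypothetical helper: filters like dplyr::filter() and records the row
# counts before and after the step.
counted_filter <- function(.data, ...) {
  out <- dplyr::filter(.data, ...)
  # Capture the filter condition(s) as text, e.g. "mpg > 20"
  condition <- paste(as.character(substitute(list(...)))[-1], collapse = " & ")
  row_log$steps[[length(row_log$steps) + 1L]] <- data.frame(
    step      = length(row_log$steps) + 1L,
    condition = condition,
    rows_in   = nrow(.data),
    rows_out  = nrow(out),
    removed   = nrow(.data) - nrow(out)
  )
  out
}

selected <- mtcars %>%
  counted_filter(mpg > 20) %>%
  counted_filter(cyl == 4) %>%
  counted_filter(am == 0)

# Combine the recorded steps into one data frame and save it.
log_df <- do.call(rbind, row_log$steps)
write.csv(log_df, "filter_log.csv", row.names = FALSE)
```

For the mtcars example this gives one row per filter step, with 14, 11 and 3 rows remaining, matching the tidylog messages above. If you would rather redirect tidylog's own messages than count rows yourself, the tidylog README describes a `tidylog.display` option that takes a list of functions to receive the message text; check the README for the details before relying on it.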