
tidyverse VS dplyr in R - Processing Power / Performance


I'm relatively new to R programming and I've been doing research, but I can't find an answer to this question.

Does it take more processing power to load the full tidyverse at the beginning of the code than to load just the dplyr package? For example, I might only need functions that can be found in dplyr. Am I reducing the speed/performance of my code by loading the full tidyverse, which must be larger considering that it contains several other packages? Or would the processing speed be the same regardless of which one I choose to load? For ease of coding, I'd rather use tidyverse since it's more comprehensive, but if it uses more processing power, then perhaps loading the less comprehensive package is more efficient.


Solution

  • As NelsonGon commented, loading packages does not reduce the speed of the code you run afterwards. The packages themselves take some time to load, but that is usually negligible, especially if you already want to load dplyr, tidyr, and purrr, for example.

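    Attaching more packages to the search path (using library(dplyr), for example) might not hurt your speed, but it may cause name conflicts down the road, where a function from one package masks a function of the same name from another (dplyr::filter() masks stats::filter(), for instance).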

    Now, there are some benchmarks out there comparing dplyr, data.table, and base R, and dplyr tends to be slower, but YMMV. Here's one I found: https://www.r-bloggers.com/2018/01/tidyverse-and-data-table-sitting-side-by-side-and-then-base-r-walks-in/. So, if you are doing operations that take a long time, it might be worthwhile to use data.table instead.
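If you want to check the load-time and name-conflict points above on your own machine, here is a minimal sketch. The helper name time_attach is made up for this example, and the numbers it prints depend on your machine and on what is already loaded, so run it in a fresh R session. Note that tidyverse attaches dplyr itself, so timing dplyr first means the tidyverse figure only reflects the remaining packages.

```r
# Hypothetical helper (not part of any package): time how long it takes to
# attach a package in the current session. Returns elapsed seconds, or NA
# if the package is not installed.
time_attach <- function(pkg) {
  # system.file() returns "" for a missing package and does not load it.
  if (!nzchar(system.file(package = pkg))) {
    return(NA_real_)
  }
  timing <- system.time(
    suppressPackageStartupMessages(
      library(pkg, character.only = TRUE)
    )
  )
  timing[["elapsed"]]
}

# Attach cost is paid once per session; it does not affect later function calls.
elapsed <- vapply(c("dplyr", "tidyverse"), time_attach, numeric(1))
print(elapsed)

# If dplyr is attached, list the names it shares with other attached packages.
if ("package:dplyr" %in% search()) {
  print(conflicts(detail = TRUE)[["package:dplyr"]])

  # Using the pkg::fun form avoids depending on the attach order.
  print(dplyr::filter(mtcars, cyl == 4))
}
```

Calling functions as dplyr::filter() costs a little typing but makes it obvious which package a function comes from, whichever packages happen to be attached.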