I've found good tips about fast ways to import files into R, but I'm wondering if it is possible to import only a subset of a given file into a variable.
In my case, I have a file with 16 million rows saved as .rds (and also as .feather, as I was playing with the speed of both formats) and I'd like to import a subset of it (say, a few rows or a few columns) for initial analysis.
Is it possible? readRDS() does not seem to accept any subsetting arguments, while read_feather() lets you specify columns but does not seem to allow row selection. Should I consider another data format?
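The short answer is 'no': readRDS() always loads the whole object into memory, so there is no way to pull out just a few rows or columns. A nice alternative is the fst file format, which does allow the retrieval of a selection of columns and rows from a large dataset. More information is available in the fst package documentation.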
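As a rough sketch of how reading a subset with fst might look: the data frame, its column names, and the temporary file path below are made up for illustration and stand in for your 16-million-row file. Only write_fst() and read_fst(), with its columns, from and to arguments, come from the fst package.

```r
library(fst)

# Hypothetical example data standing in for the large data set
df <- data.frame(
  id = 1:10000,
  x  = rnorm(10000),
  g  = sample(letters, 10000, replace = TRUE)
)

# Write it once in fst format (the path is a temporary file, for illustration only)
path <- tempfile(fileext = ".fst")
write_fst(df, path)

# Read back only the columns "id" and "x", and only rows 1 to 100
subset_df <- read_fst(path, columns = c("id", "x"), from = 1, to = 100)

str(subset_df)
```

Note that from and to select a contiguous range of row numbers, so filtering on a condition still has to happen after the rows have been read.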