I get a strange error when I use the id argument of read_csv together with the col_select argument, but only if I refer to the columns by their numeric position in a non-contiguous way (e.g. c(1:2,4)).
Specifically, the error occurs when you import multiple csv files from a file path using readr::read_csv, set the id argument, and set the col_select argument by referencing the columns by position in a non-contiguous way.
However, it works if we delete the id argument, reference the columns by position without "gaps", or reference the columns by name.
Is there a way to fix this, or is this a bug?
Please see below:
# pretend this file path has at least two csv files in it
files <- "insert_file_path"
# this will not work; notice there is a "gap" in the reference (no column 3)
# the error suggests it can't subset columns that don't exist
read_csv(files,
  id = "source",
  col_select = c(1:2, 4) # notice that I do not pull in the 3rd column
)
# this will work now that I reference the columns without any "gaps"
read_csv(files,
  id = "source",
  col_select = c(1:3, 4) # it works when I pull in columns in a contiguous way
)
# however, if we remove the "id" argument, the code below will work (even though there is a "gap")
read_csv(files,
  # id = "source",
  col_select = c(1:2, 4) # this will work (pulls in different columns though)
)
I have submitted this as an issue on readr and it has been labeled as a bug.
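
In the meantime, one possible workaround is to avoid combining id and col_select in a single call: read each file on its own with the positional col_select (which works without id, as shown above), add the source column by hand, and bind the results. This is only a minimal sketch under assumptions. The temporary folder, the two demo files, the column names a to d, and the helper read_one are all made up for illustration, and it assumes readr 2.x and dplyr 1.x are installed.

```r
library(readr)
library(dplyr)

# Hypothetical demo data: two small csv files in a temporary folder
demo_dir <- tempfile("csv_demo_")
dir.create(demo_dir)
write_csv(data.frame(a = 1:2, b = 3:4, c = 5:6, d = 7:8),
          file.path(demo_dir, "one.csv"))
write_csv(data.frame(a = 9:10, b = 11:12, c = 13:14, d = 15:16),
          file.path(demo_dir, "two.csv"))
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file with the "gapped" positional selection
# (no id argument), then add the file path as the first column
read_one <- function(path) {
  read_csv(path, col_select = c(1:2, 4), show_col_types = FALSE) %>%
    mutate(source = path, .before = 1)
}

# Read every file separately and stack the results
combined <- bind_rows(lapply(files, read_one))
combined
```

Because the source column is added after the columns are selected, the positions in col_select always refer to the columns of the csv file itself.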