fread from the data.table package can usually determine the column separator (sep) automatically when reading a file.
For example, here fread automatically detects | as the column delimiter:
library(data.table)
fread(paste(c("A|1", "B|2", "C|3"), collapse = "\n"))
# V1 V2
# 1: A 1
# 2: B 2
# 3: C 3
But how can I retrieve the column separator that fread actually ended up using (here, |)?
As Henrik mentions, this info is printed to the console if verbose = TRUE is set. You can capture the line that reports the separator with
library(magrittr)
example <- paste(c("A|1", "B|2", "C|3"), collapse = "\n")
capture.output(fread(example, verbose = TRUE) %>% {NULL}) %>%
.[grepl('Detecting sep', .)]
#[1] "Detecting sep ... '|'"
You could also just implement your own delimiter finder based on the documentation's description of how fread finds the delimiter:
Defaults to the first character in the set [,\t |;:] that exists on line autostart outside quoted ("") regions
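The rule quoted above could be sketched roughly as follows. This is only an illustration of that one sentence, not fread's actual algorithm: guess_sep is a hypothetical helper name, it looks at a single line that you pass in, it treats anything between two double quotes as a quoted region (escaped quotes are not handled), and it does not check that the separator gives a consistent number of fields across lines.

```r
# Hypothetical helper (not part of data.table): return the first candidate,
# in the order of the documented set [,\t |;:], that occurs in `line`
# outside double-quoted regions.
guess_sep <- function(line, candidates = c(",", "\t", " ", "|", ";", ":")) {
  # Remove quoted regions so separators inside "..." are ignored
  unquoted <- gsub('"[^"]*"', "", line)
  # Flag the candidates that still occur in what is left of the line
  found <- vapply(candidates,
                  function(s) grepl(s, unquoted, fixed = TRUE),
                  logical(1))
  # First match in candidate order, or NA if none of them occurs
  if (any(found)) candidates[found][[1]] else NA_character_
}

guess_sep("A|1")
# [1] "|"
```

To apply it to the example string, pass its first line, e.g. guess_sep(strsplit(example, "\n")[[1]][1]). Capturing the verbose output is probably the safer choice if you need exactly what fread decided, since the sketch only approximates the documented default.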