regex, r, strsplit

Splitting a string by space except when contained within quotes


I've been trying for some time, without success, to split a space-delimited string in R while keeping double-quoted phrases together. An example of such a string:

rainfall snowfall "Channel storage" "Rivulet storage"

It's important for us because these are column headings that must match the subsequent data. There are other suggestions on this site as to how to go about this but they don't seem to work with R. One example:

Regex for splitting a string using space when not surrounded by single or double quotes

Here is some code I've been trying:

str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'
regex <- "[^\\s\"']+|\"([^\"]*)\""
split <- strsplit(str, regex, perl=T)

What I would like is:

[1] "rainfall" "snowfall" "Channel storage" "Rivulet storage"

but what I get is:

[1] ""  " " " " " "

The vector is the right length (which is encouraging) but of course the strings are empty or contain a single space. Any suggestions?

Thanks in advance!


Solution

  • scan will do this for you: by default it splits on whitespace and treats quoted text as a single field

    scan(text=str, what='character', quiet=TRUE)
    [1] "rainfall"        "snowfall"        "Channel storage" "Rivulet storage"
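
For anyone who still wants a regex route: the attempt in the question returns spaces because strsplit treats its pattern as the delimiter, so the tokens themselves are what get thrown away. A minimal sketch of the alternative is to extract the matches instead of splitting on them, using regmatches with gregexpr. The pattern and the variable names token_pattern and tokens below are illustrative choices, not part of the original answer, and the sketch assumes only double quotes need handling.

```r
str <- 'rainfall snowfall "Channel storage" "Rivulet storage"'

# The pattern describes the tokens themselves:
# either a double-quoted run, or a run of characters that are
# neither whitespace nor a double quote.
token_pattern <- '"[^"]*"|[^\\s"]+'

# gregexpr() locates every match; regmatches() extracts the matched text.
tokens <- regmatches(str, gregexpr(token_pattern, str, perl = TRUE))[[1]]

# Quoted tokens still carry their surrounding quotes, so strip them.
tokens <- gsub('^"|"$', '', tokens)
tokens
```

This should give the same four elements as the scan call above. scan remains the simpler choice when the input is plain whitespace-delimited text with quotes.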