Reading blocks of text into R with parsing errors


I am reading the functions of a package of mine into R, replacing some of the variable names, and then saving the files again. This is to standardise the naming throughout the package.

The code produces no errors or warnings for any of the functions except one (I haven't yet tested whether the rewritten functions actually work!). The issue is a parsing problem, which I have narrowed down to the following lines of code.

I have used read_delim as readLines does not appear to work with quotation marks.

test <- '
print(paste("Time taken for simulation", i, "is",
                     SimulationRoundTime,
                     "minutes. Est time to completion",
                     round(as.numeric(TimeToCompletion, units = TimeUnit)), #
                     TimeUnit,
                     "Est completion time is",
                     ExpectedCompletionTime))'

library(readr)

read_delim(test, delim = "[\n]",
           col_names = FALSE, col_types = "c")

Why does read_delim say there are 3 lines not 7?

The warning message I get is:

Warning: 4 parsing failures.
row col                     expected actual         file
  3  X1 delimiter or quote                , literal data
  3  X1 delimiter or quote                E literal data
  3  X1 delimiter or quote                , literal data
  3  X1 closing quote at end of file        literal data

Solution

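  • It's probably getting confused because test isn't really delimited text. read_delim treats double quotes as a quoting character by default, so the quoted strings and the newlines around them are the likely cause of the parsing failures. The delim argument is also documented as a single character, so "[\n]" is not interpreted as a regular expression. In any case, you would be better off using stringr::str_split(test, "\\n")[[1]] or readr::read_lines(test) (or the base R equivalents, such as strsplit). Both will return this: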

    [1] ""                                                                             
    [2] "print(paste(\"Time taken for simulation\", i, \"is\","                        
    [3] "                     SimulationRoundTime,"                                    
    [4] "                     \"minutes. Est time to completion\","                    
    [5] "                     round(as.numeric(TimeToCompletion, units = TimeUnit)), #"
    [6] "                     TimeUnit,"                                               
    [7] "                     \"Est completion time is\","                             
    [8] "                     ExpectedCompletionTime))"
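
For the base R route mentioned above, a minimal sketch might look like the following. It assumes the text is held in a single string as in the question. The rename from TimeUnit to time_unit and the temporary output file are hypothetical choices made only for illustration, not part of the original workflow.

```r
# Same text as in the question, held in a single string.
test <- '
print(paste("Time taken for simulation", i, "is",
                     SimulationRoundTime,
                     "minutes. Est time to completion",
                     round(as.numeric(TimeToCompletion, units = TimeUnit)), #
                     TimeUnit,
                     "Est completion time is",
                     ExpectedCompletionTime))'

# Split on literal newlines; double quotes are treated as ordinary characters.
code_lines <- strsplit(test, "\n", fixed = TRUE)[[1]]

# Hypothetical rename (assumption): TimeUnit -> time_unit, whole words only,
# so that names such as TimeToCompletion are left untouched.
renamed <- gsub("\\bTimeUnit\\b", "time_unit", code_lines, perl = TRUE)

# Save the modified lines (a temporary file here) and read them back to check.
out_file <- tempfile(fileext = ".R")
writeLines(renamed, out_file)
round_trip <- readLines(out_file)
```

Because the text is only ever split on newlines, the quotation marks never reach a delimited-file parser. Matching on word boundaries is one way to avoid renaming longer identifiers that merely contain the old name.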