R readr: How do I define multiple comment characters in read_csv()?


I am trying to read a csv file with R readr::read_csv().

The csv file has comment lines that I would like to ignore, some of which start with "#" and others start with "Subject".

I can get R to ignore one of them, e.g. with

read_csv("data.csv", comment = "#") or read_csv("data.csv", comment = "Subject")

But how do I define both as comments? This was my idea, but it produced an error message:

read_csv("data.csv", comment = c("#", "Subject"))

Can anyone help me out? This is my first question here, I hope the format is alright. Thank you for your help!


Solution

  • The comment argument of read_csv() takes a single string, which is why passing a vector of two fails. Assuming no legitimate data line starts with the same text as a comment line, I would read the file with one comment marker, e.g. "#", and then drop the rows whose first column starts with the other. Something like:

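    library(readr)
    # Let readr drop the "#" comment lines while parsing
    df <- read_csv("./comments.csv", comment = "#")

    # Flag rows whose first column begins with "Subject"; df[[1]] is a plain vector
    starts_with_subject <- startsWith(as.character(df[[1]]), "Subject")
    # %in% TRUE treats NA (missing first column) as "keep the row"
    df <- df[!(starts_with_subject %in% TRUE), ]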
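
Filtering rows after parsing only works if the header is the first line readr sees, and the "Subject" rows can still influence column type guessing. If that is a problem, another option is to drop the comment lines before parsing. The following is a minimal sketch, not a built-in readr feature: it assumes readr 2.0 or later (for I() and show_col_types), and the sample file contents and the name comment_prefixes are made up for illustration.

```r
library(readr)

# Hypothetical sample file standing in for the asker's data.csv
path <- tempfile(fileext = ".csv")
writeLines(c(
  "# exported by some tool",
  "Subject: pilot study",
  "id,score",
  "1,10",
  "# a mid-file comment",
  "2,20"
), path)

# Prefixes that mark a comment line (taken from the question)
comment_prefixes <- c("#", "Subject")

lines <- readLines(path)
# TRUE for every line that starts with any of the prefixes
is_comment <- Reduce(`|`, lapply(comment_prefixes, function(p) startsWith(lines, p)))

# I() tells readr (assumed >= 2.0) to treat the string as literal data, not a path
df <- read_csv(I(paste(lines[!is_comment], collapse = "\n")), show_col_types = FALSE)
```

Because the comment lines never reach the parser, the header is picked up correctly even when a "Subject" line comes first. The trade-off is that the whole file is read into memory as text before parsing, which may matter for very large files.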