read.csv skip lines until string found in first column


I have 10 CSV files, each with a different number of lines to skip before the data table starts. In every file, the table's first column is named 'Date Time'.

I'm reading the files in a for loop and adding each data frame to a list. Is there a way to skip lines until the string 'Date Time' is found and then read the table starting from that line?

Current Code

# Read troll data in
trollFiles <- list.files(pattern="_2019\\.csv$")  # escape the dot and anchor at the end of the name

dataList <- list()
baroList <- list()

for (var in trollFiles) {
  filepath <- file.path(var)
  fileName <- substr(var, 1, nchar(var) - 4)  # strip the ".csv" extension
  # skip=25 is hard-coded, but the number of preamble lines differs per file
  file <- read.csv(filepath, sep="", header=TRUE, col.names=c('DateTime', 'Pressure', 'Temperature'), check.names=FALSE, skip=25)
  file$Pressure <- as.numeric(file$Pressure)
  file$Temperature <- as.numeric(file$Temperature)
  # save baro separately
  if (fileName == 'baro_2019') {
    baroList[[fileName]] <- file
  } else {
    dataList[[fileName]] <- file
  }
}

Solution

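  • We can use data.table::fread, whose skip argument also accepts a string. Reading then starts at the first line that contains that string, so we can pass the name of the first column: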

    library(data.table)
    
    fread("
    sometext
    ignore
    c1,c2,c3
    11,22,33
    44,55,66", 
          skip = "c1")
    #   c1 c2 c3
    #1: 11 22 33
    #2: 44 55 66
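
Applied to the loop from the question, this could look roughly like the sketch below. It is only an illustration under some assumptions: the two demo files and their contents are made up so the example runs on its own, the files are taken to be comma-separated (adjust sep if yours are not), and the header line is assumed to be the first line in each file that contains 'Date Time'.

```r
library(data.table)

# Hypothetical demo files (not from the question): each one has a different
# number of preamble lines before the header row starting with "Date Time".
demoDir <- tempfile("troll_demo_")
dir.create(demoDir)
writeLines(c("Report generated by logger", "Serial: 123", "",
             "Date Time,Pressure,Temperature",
             "2019-01-01 00:00:00,14.7,5.1",
             "2019-01-01 00:15:00,14.8,5.0"),
           file.path(demoDir, "site1_2019.csv"))
writeLines(c("Barometric logger",
             "Date Time,Pressure,Temperature",
             "2019-01-01 00:00:00,14.6,4.9"),
           file.path(demoDir, "baro_2019.csv"))

trollFiles <- list.files(demoDir, pattern = "_2019\\.csv$", full.names = TRUE)

dataList <- list()
baroList <- list()

for (filepath in trollFiles) {
  fileName <- tools::file_path_sans_ext(basename(filepath))
  # skip = "Date Time" makes fread start at the first line containing that
  # string; sep = "," is an assumption about the file format.
  dt <- fread(filepath, skip = "Date Time", sep = ",", header = TRUE,
              col.names = c("DateTime", "Pressure", "Temperature"))
  # save baro separately
  if (fileName == "baro_2019") {
    baroList[[fileName]] <- dt
  } else {
    dataList[[fileName]] <- dt
  }
}
```

fread returns a data.table; if plain data frames are needed downstream, passing data.table = FALSE to fread should give a data.frame instead.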