
R: How to read a fixed-width data file that contains two datasets stacked one on top of the other


Hopefully the title makes sense.

In essence, there are two datasets in one file.

Row 1 has the headings, by loc, for dataset1. Then lines 2-1500 are the entries for those locs.

Row 1501 has the headings, by loc, for dataset2. Then lines 1502-3001 are the entries for those locs.

How can I read in a fixed-width file with these properties, providing the header spacings for each dataset (and the point at which dataset2 starts)?


Solution

  • Here are two methods:

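    Using the skip and nrows arguments (read.table splits on whitespace; read.fwf takes skip and n in the same way, plus a widths vector):

    # header = TRUE consumes line 1; nrows counts data rows only (lines 2-1500)
    first <- read.table("file", header = TRUE, nrows = 1499)
    # skip the first 1500 lines so that line 1501 is read as the header
    second <- read.table("file", header = TRUE, skip = 1500, nrows = 1500)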
    

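    Reading in the entire file then splitting it up (this only works if both datasets have the same number of columns):

    # Read everything as text, because the second header row sits among the data
    allLines <- read.table("file", header = TRUE, colClasses = "character")
    # Row i of allLines is line i + 1 of the file, since line 1 became the header
    first <- allLines[1:1499, ]
    second <- allLines[1501:3000, ]
    names(second) <- unlist(allLines[1500, ]) ## or colnames if working with a matrix
    # Columns are still character here; convert afterwards, e.g. with type.convert()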
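
Since the question is about fixed-width data with different header spacings per dataset, the same skip-and-count idea can be sketched with read.fwf. This is a minimal sketch, not a definitive implementation: the helper read_block, the column names, the widths, and the tiny temporary file are all made up for illustration, and the real widths have to come from the actual file layout. The header line is read on its own with the same widths, because read.fwf with header = TRUE expects the header fields to be separated by sep rather than laid out in fixed widths.

```r
# Build a small stand-in file: two fixed-width blocks, each with its own header.
# All names, widths, and values here are illustrative, not from the real file.
block1 <- c(
  sprintf("%-5s%3s", "loc", "val"),                  # header, widths 5 and 3
  sprintf("%-5s%3d", c("A01", "A02"), c(10L, 20L))   # 2 data rows
)
block2 <- c(
  sprintf("%-4s%4s%4s", "loc", "x", "y"),            # header, widths 4, 4, 4
  sprintf("%-4s%4.1f%4.1f",
          c("B01", "B02", "B03"), c(1.5, 3.0, 5.5), c(2.5, 4.0, 6.5))
)
path <- tempfile(fileext = ".txt")
writeLines(c(block1, block2), path)

# Hypothetical helper: read one block given its widths, the line number of its
# header, and the number of data rows that follow the header.
read_block <- function(path, widths, header_line, n_rows) {
  # Read the header line as fixed-width text and trim the padding
  hdr <- read.fwf(path, widths = widths, skip = header_line - 1, n = 1,
                  colClasses = "character", strip.white = TRUE)
  # Skip up to and including the header, then read exactly n_rows records
  read.fwf(path, widths = widths, skip = header_line, n = n_rows,
           col.names = unlist(hdr[1, ], use.names = FALSE),
           strip.white = TRUE)
}

first  <- read_block(path, widths = c(5, 3),    header_line = 1, n_rows = 2)
second <- read_block(path, widths = c(4, 4, 4), header_line = 4, n_rows = 3)
```

For the file described in the question, the calls would presumably use header_line = 1 with n_rows = 1499 for dataset1 and header_line = 1501 with n_rows = 1500 for dataset2, each with its own widths vector.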