r readr

How to read a fixed width file knowing column names but not the widths?


I recently ran into a problem with a fixed-width file. For example:

Name   Income
John   $10,000
Mary   $15,000
Walter $25,000

How can I read a fixed-width file using just the column names, without specifying the widths?


Solution

  • To solve this I used the readr function read_fwf(). Its first argument is the file name and its second argument describes the column positions. Passing fwf_empty() as that second argument makes readr guess the positions from the empty (whitespace-only) columns in the file, so only the column names need to be supplied.

    Say my file is named fixed_width_file.csv and has a million rows. I can read it using just the column names:

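    library(readr)

    # fwf_empty() guesses the column boundaries from the blank columns in the file.
    # col_names is an argument of fwf_empty(); skip = 1 makes read_fwf() drop the header row.
    read_fwf("fixed_width_file.csv",
             fwf_empty("fixed_width_file.csv",
                       col_names = c("Name", "Income")),
             skip = 1)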
    

    Check that the columns were split correctly by looking at the head() of the result.

    I will update the answer as I learn more.
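To make the check above concrete, here is a minimal sketch that rebuilds the sample data from the question in a temporary file, asks fwf_empty() what it guessed, and then reads the file. The temporary path and the variable names `path`, `positions`, and `df` are only illustrative and not part of the original answer.

```r
library(readr)

# Recreate the sample data from the question in a temporary file
path <- tempfile(fileext = ".txt")
writeLines(c("Name   Income",
             "John   $10,000",
             "Mary   $15,000",
             "Walter $25,000"), path)

# Guess the column boundaries from the whitespace-only columns
positions <- fwf_empty(path, col_names = c("Name", "Income"))

# Inspect the guess before reading the whole file
positions$begin
positions$end
positions$col_names

# Read the data, skipping the header row
df <- read_fwf(path, positions, skip = 1)
head(df)
```

Note that fwf_empty() only samples the start of the file when guessing (controlled by its n argument), so with a million rows it may be worth raising n or spot-checking later rows if values there could be wider than the first ones.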