
reading in tab delimited text file with tidyverse readr - columns not parsing


I am trying to read in a tab-delimited text file that consists of a header row of column names followed by rows of decimal numbers. However, readr is not parsing the data correctly: it treats the entire header as one long character string instead of separate columns, so I end up with single-column data, and I am not sure how to resolve this. Below is what it looks like when I read in the data:

data <- read_tsv("data.txt")

Rows: 62893 Columns: 1
── Column specification ────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): Lon Lat Col1 Col2 Col3...

ℹ Use spec() to retrieve the full column specification for this data.

ℹ Specify the column types or set show_col_types = FALSE to quiet this message.

When I print out the first few lines, this is what I get:

head(data)

# A tibble: 3 × 1

Lon Lat Col1 Col2 Col3

<chr>

1 "-123.25 38.25 0.000002 0.030371 0.000002"

2 "96.25 67.75 0.308539 0.063202 0.000019"

3 "-73.75 -15.75 0.018868 0.048092 0.013427"

# ℹ abbreviated name:

# ` Lon Lat Col1 Col2 Col3 `

Any ideas on how to get readr to parse this table correctly, so that it sees the data as 3 rows and 5 columns instead of 3 rows and 1 column, and recognizes that the first row is a header row? (I tried the col_names = TRUE argument of read_tsv() and that did not work.) It clearly sees each row, including the first (header) row, as a single long character string, but it shouldn't, because the data is tab delimited.


Solution

  • I just copied and pasted the data from above:

    data.txt

    Lon Lat Col1 Col2 Col3
    -123.25 38.25 0.000002 0.030371 0.000002
    96.25 67.75 0.308539 0.063202 0.000019
    -73.75 -15.75 0.018868 0.048092 0.013427
    

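    As pasted, the fields are separated by single spaces rather than tabs, which would explain why read_tsv() returns a single column. You can use read.csv and set the sep parameter to a space, " ":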

    read.csv("data/data.txt", sep=" ")
    #      Lon    Lat     Col1     Col2     Col3
    #1 -123.25  38.25 0.000002 0.030371 0.000002
    #2   96.25  67.75 0.308539 0.063202 0.000019
    #3  -73.75 -15.75 0.018868 0.048092 0.013427
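
If you would rather stay within readr, the same idea can be sketched with read_delim(). This is a minimal example, assuming the file really is separated by single spaces as in the pasted sample; the temporary file and the has_tab check are only there for illustration and are not part of the original question.

```r
library(readr)

# Recreate the sample from the question as a space-separated file
# (the temporary path is hypothetical, used only for this sketch).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Lon Lat Col1 Col2 Col3",
  "-123.25 38.25 0.000002 0.030371 0.000002",
  "96.25 67.75 0.308539 0.063202 0.000019",
  "-73.75 -15.75 0.018868 0.048092 0.013427"
), path)

# Inspect the raw first line: a real tab would be printed as "\t".
first_line <- readLines(path, n = 1)
print(first_line)
has_tab <- grepl("\t", first_line, fixed = TRUE)

# Pick the delimiter based on that check and let readr parse the columns.
delim <- if (has_tab) "\t" else " "
data <- read_delim(path, delim = delim, show_col_types = FALSE)
print(data)
```

Checking the raw line first shows which separator the file actually uses before committing to read_tsv() or read_delim(). If the columns turn out to be padded with a varying number of spaces, readr's read_table() is probably the better fit, since it splits on runs of whitespace.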