
read.table vs. read_table in the readr package: extra columns added with readr


I am trying to read a CSV file stored inside a zip archive using the readr package. The original CSV file has 170 columns.

When I read the file with the base R read.table function, as below, no extra columns are added:

data1 <- read.table(unz(zip_file, csv_file), skip = 10, header = TRUE, quote = "\"", sep = ",")

When I try to reproduce this with readr's read_table, as below:

data2 <- read_table(unz(zip_file,csv_file), skip = 10)

the result has many extra columns.

I get 170 columns with read.table and 1461 with read_table.

Below is a list of some of the column names as they appear in Excel (to give an idea of what the original looks like). How can I use read_table to read everything without extra columns being added?

Column Names: 
A
B
C
D (A)
D (B)
E F
G
A B C : 2017 D E - F G: H I
J.org - B : L -- K.org: F C
2016 TEST TESTING : Baltimore TEST TESt: H B

The column names contain a lot of spaces, dashes, colons, etc., which I think are causing read_table to add the extra columns.

How do I avoid the extra columns while keeping the column names in their original format?

Thanks!


Solution

  • Use readr::read_csv instead. read_table treats whitespace as the field separator, so every space in your column names starts a new column. read_csv splits on commas, which matches your file, so it should work without adding extra columns.

    data2 <- read_csv(unz(zip_file,csv_file), skip = 10)
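To see the difference without the zip file, here is a minimal sketch on a made-up two-column CSV. The data and the variable names csv_text, by_comma and by_space are hypothetical, and it assumes readr 2.0 or later, where literal text can be passed with I() and show_col_types is available.

```r
library(readr)

# Hypothetical CSV text: two columns whose names contain spaces,
# similar in spirit to the column names in the question.
csv_text <- "D (A),E F\n1,2\n3,4\n"

# read_csv splits fields on commas, so the spaces stay inside the names.
by_comma <- read_csv(I(csv_text), show_col_types = FALSE)

# read_table splits fields on whitespace, so the header is broken up at
# each space. Parsing warnings are expected here and are suppressed.
by_space <- suppressWarnings(read_table(I(csv_text), show_col_types = FALSE))

ncol(by_comma)   # number of columns when splitting on commas
names(by_comma)  # names are kept as written in the file
ncol(by_space)   # number of columns when splitting on whitespace
```

Comparing ncol(by_comma) with ncol(by_space) should show the same kind of mismatch as 170 versus 1461. Note also that read_csv keeps names such as "D (A)" as they are, whereas read.table with its default check.names = TRUE rewrites them into syntactic R names.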