I am trying to read a .gz file into R, but I get an error message: line 1 did not have 9 elements


Here is my code:

imdb <- read.table(gzfile("/imdb_dataset/title.basics.tsv.gz"), sep = " ")

The error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 9 elements

Here is how the file looks and how the columns are separated (originally a screenshot of the TSV; sample rows follow below).

In fact, the first line has 9 elements, so what could be the issue?

tt0000010   short   Exiting the Factory La sortie de l'usine Lumière à Lyon 0   1895    \N  1   Documentary,Short
tt0000011   short   Akrobatisches Potpourri Akrobatisches Potpourri 0   1895    \N  1   Documentary,Short
tt0000012   short   The Arrival of a Train  L'arrivée d'un train à La Ciotat    0   1896    \N  1   Action,Documentary,Short

Solution

  • I see two potential problems with your import:

    1. You pass a space (" ") as the delimiter instead of a tab ("\t"), even though you say the file is a TSV.
    2. The file contains a lot of \N entries that could throw the parser off. Try replacing them, for example by treating them as missing values.
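
A minimal sketch of both fixes, under some assumptions: a small temporary file stands in for /imdb_dataset/title.basics.tsv.gz, the header names are illustrative, and header = TRUE assumes the real file starts with a header row. The quote = "" and comment.char = "" arguments are additions that the two points above do not mention. They are included because read.table treats ' as a quote character by default, so apostrophes in titles such as "La sortie de l'usine" can merge fields and produce the same "did not have 9 elements" error.

```r
# Build a tiny gzipped TSV that mimics the rows from the question.
# Assumption: the column names below are illustrative stand-ins.
header <- c("tconst", "titleType", "primaryTitle", "originalTitle",
            "isAdult", "startYear", "endYear", "runtimeMinutes", "genres")
rows <- list(
  c("tt0000010", "short", "Exiting the Factory",
    "La sortie de l'usine Lumière à Lyon", "0", "1895", "\\N", "1",
    "Documentary,Short"),
  c("tt0000011", "short", "Akrobatisches Potpourri",
    "Akrobatisches Potpourri", "0", "1895", "\\N", "1",
    "Documentary,Short"),
  c("tt0000012", "short", "The Arrival of a Train",
    "L'arrivée d'un train à La Ciotat", "0", "1896", "\\N", "1",
    "Action,Documentary,Short")
)
lines <- c(paste(header, collapse = "\t"),
           vapply(rows, paste, character(1), collapse = "\t"))

# Hypothetical temporary path standing in for the real .tsv.gz file.
tmp <- tempfile(fileext = ".tsv.gz")
con <- gzfile(tmp, "w")
writeLines(lines, con)
close(con)

imdb <- read.table(
  gzfile(tmp),
  sep = "\t",              # fix 1: tab, not space, as the delimiter
  header = TRUE,           # assumption: first line holds column names
  na.strings = "\\N",      # fix 2: read the literal \N marker as NA
  quote = "",              # addition: do not treat ' or " as quotes
  comment.char = "",       # addition: do not treat # as a comment
  stringsAsFactors = FALSE
)

str(imdb)
```

For the real file, replacing gzfile(tmp) with gzfile("/imdb_dataset/title.basics.tsv.gz") should be the only change needed, provided the assumptions above hold.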