
[Python] ParserError: Too many columns specified


I just want to read a simple .csv file and supply the column names myself. Here is the code:

import pandas as pd

url = "https://www.dropbox.com/s/n6yt908tgetuq63/LasVegasTripAdvisorReviews-Dataset.csv?dl=0"
names = ['User country', 'Nr. reviews', 'Nr. hotel reviews', 'Helpful votes',
         'Score', 'Period of stay', 'Traveler Type', 'Pool', 'Gym',
         'Tennis court', 'Spa', 'Casino', 'Free internet', 'Hotel name',
         'Hotel stars', 'Nr. rooms', 'User continent', 'Member years',
         'Review month', 'Review weekday']
data = pd.read_csv(url, names=names, header=0, delimiter=';',
                   error_bad_lines=False)
print(data.shape)

Output:

ParserError: Too many columns specified: expected 20 and found 2

P.S.: The URL is public and can be accessed.


Solution

  • The problem is that the URL doesn't lead directly to the .csv file. It leads to an HTML page that wraps the file.

    You can see this by removing the names argument:

    pd.read_csv(url, header=0, delimiter=';', error_bad_lines=False)

    This executes successfully, but when you inspect the returned values you'll see HTML markup and JavaScript rather than the dataset.

    What you need to do is make sure you pass actual CSV data as input, either from a URL that serves the raw file or from another source for the .csv file.
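
    One way to act on this is sketched below, under a few assumptions. Dropbox share links generally serve the raw file when the dl query parameter is 1 instead of 0, so the hypothetical helper to_direct_download rewrites that parameter. looks_like_html is a rough heuristic (also hypothetical) that catches the "got a web page instead of data" case before pandas sees it. The UTF-8 decoding and the ';' separator are assumptions carried over from the question, not verified against the file.

```python
import io
import urllib.request
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_direct_download(url):
    """Return the share URL with its dl query parameter set to 1."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["dl"] = "1"  # assumption: dl=1 makes Dropbox serve the raw file
    return urlunsplit(parts._replace(query=urlencode(query)))


def looks_like_html(text):
    """Heuristic: does the payload start like an HTML page rather than CSV?"""
    head = text.lstrip()[:200].lower()
    return head.startswith("<!doctype html") or head.startswith("<html")


def read_reviews(url, names):
    """Download the file, refuse HTML payloads, then parse it with pandas."""
    # Imported here so the two helpers above can be used without pandas.
    import pandas as pd

    with urllib.request.urlopen(to_direct_download(url)) as response:
        # Assumption: the file is UTF-8; undecodable bytes are replaced.
        text = response.read().decode("utf-8", errors="replace")

    if looks_like_html(text):
        raise ValueError("Received an HTML page instead of CSV data; check the URL.")

    # header=0 discards the file's own header row in favour of `names`.
    return pd.read_csv(io.StringIO(text), names=names, header=0, sep=";")
```

    With the url and names from the question, data = read_reviews(url, names) followed by print(data.shape) would replace the original read_csv call. If the link still returns a web page, the ValueError points at the real cause instead of a column-count error.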