I just want to read a simple .csv file, passing a list of column names to use as the header. The following is the code:
import pandas as pd
url="https://www.dropbox.com/s/n6yt908tgetuq63/LasVegasTripAdvisorReviews-Dataset.csv?dl=0"
names=['User country','Nr. reviews','Nr. hotel reviews','Helpful votes',
       'Score','Period of stay','Traveler Type','Pool','Gym','Tennis court',
       'Spa','Casino','Free internet','Hotel name','Hotel stars','Nr. rooms',
       'User continent','Member years','Review month','Review weekday']
data=pd.read_csv(url, names=names, header=0, delimiter=';',
                 error_bad_lines=False)
print(data.shape)
Output:
ParserError: Too many columns specified: expected 20 and found 2
P.S.: The URL is public and can be accessed.
The problem is that the URL doesn't lead directly to the .csv file. It leads to an entire HTML page, which pandas then tries to parse as CSV.
You can see that by removing the names argument:
pd.read_csv(url, header=0, delimiter=';', error_bad_lines=False)
This executes successfully, but when you inspect the returned values, you'll see HTML and JavaScript code instead of the review data.
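A quicker way to confirm this is to look at the start of the response before handing it to pandas. The following is only a minimal sketch: `looks_like_html` and `fetch_head` are hypothetical helper names, not part of pandas, and the check is a simple heuristic that assumes an HTML page starts with a doctype or an `<html>` tag.

```python
from urllib.request import urlopen


def looks_like_html(text):
    """Heuristic: True if the text starts like an HTML page rather than CSV."""
    head = text.lstrip().lower()[:200]
    return head.startswith("<!doctype html") or head.startswith("<html")


def fetch_head(url, n=512):
    """Download the first n bytes of the response and decode them as text."""
    with urlopen(url) as response:
        return response.read(n).decode("utf-8", errors="replace")


# Usage (needs network access):
# if looks_like_html(fetch_head(url)):
#     print("The URL returns a web page, not a .csv file")
```

If the check returns True, the link serves a web page and `read_csv` will not find the expected columns, no matter which `names` or `delimiter` you pass.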
What you need to do is make sure the URL returns actual CSV content as input (try another source for the .csv file, or a link that serves the raw file).
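For Dropbox share links in particular, a commonly used approach is to change the `dl=0` query parameter to `dl=1`, which asks Dropbox for the file itself rather than the preview page. The sketch below assumes that behaviour applies to this link; `force_download` is a hypothetical helper name, and the `;` delimiter is taken from the question rather than verified against the file.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_download(url):
    """Return the same URL with its dl query parameter set to 1."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["dl"] = "1"  # assumption: dl=1 requests the raw file from Dropbox
    return urlunsplit(parts._replace(query=urlencode(query)))


# Usage (needs network access and pandas):
# import pandas as pd
# data = pd.read_csv(force_download(url), names=names, header=0, delimiter=';')
# print(data.shape)
```

Rewriting the query string with `urllib.parse` instead of a plain string replace keeps any other parameters in the link intact.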