Tags: python, csv, newline, urllib, urlopen

Error with urlopen: new-line character seen in unquoted field


I am using urllib.urlopen with Python 2.7 to read csv files located on an external webserver:

# Try & Except statements removed for clarity
import urllib
import csv
url = ...
csv_file = urllib.urlopen(url)
for row in csv.reader(csv_file):
    do_something()

All 100+ files can be read fine, except one that has been updated recently and that returns:

Error: new-line character seen in unquoted field - do you need to open the file in universal-newline mode?

The file is accessible here. According to my text editor, it uses Mac (CR) line endings, whereas the other files use Windows (CRLF).

Based on this thread, I understood that Python's urlopen handles all newline formats correctly, so the problem probably lies elsewhere, but I have no idea where. The file opens fine in all my text editors and spreadsheet editors.

Does anyone have any idea how to diagnose the problem?
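
One way to check the line-ending observation independently of a text editor is to count the terminators in the raw bytes of the download. The following is only a minimal sketch: `count_line_endings` and the `sample` bytes are hypothetical names and data introduced for illustration, and in practice the bytes would come from `csv_file.read()`.

```python
def count_line_endings(raw):
    """Return counts of CRLF, bare CR, and bare LF terminators in raw bytes."""
    crlf = raw.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs
        # to leave only the terminators that stand alone.
        "CR": raw.count(b"\r") - crlf,
        "LF": raw.count(b"\n") - crlf,
    }


# Hypothetical sample standing in for the downloaded file contents.
sample = b"a,b,c\r1,2,3\r4,5,6\r"
print(count_line_endings(sample))
```

A file that reports only bare CR terminators would be consistent with the "Mac (CR)" mode shown by the editor, while a mix of styles would hint at a damaged or hand-edited file.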

* EDIT *

The creator of the file informed me by email that I was not the only one experiencing such issues, so he regenerated the file. The code above now works again. Unfortunately, replacing the file also means that the issue can no longer be reproduced and the proposed solutions can no longer be tested properly.

Before closing the question, I want to thank all the stackers who dedicated some of their time to figure out a solution and post it here.


Solution

  • Could it be a corrupt .csv file? The following code runs perfectly for me:

    #!/usr/bin/python
    
    import urllib
    import csv
    
    url = "http://www.football-data.co.uk/mmz4281/1213/I1.csv"
    csv_file = urllib.urlopen(url)
    
    for row in csv.reader(csv_file):
        print row
    

    Credit to J.F. Sebastian for the .csv file.

    That said, could you share the specific .csv file with us so that we can try to reproduce the error?
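
If the file really does use bare CR line endings, one possible workaround is to normalize the terminators before handing the data to `csv.reader`. This is a sketch under assumptions rather than a confirmed fix, since the original file can no longer be tested: `read_csv_any_newline` is a hypothetical helper name, the UTF-8 default encoding is a guess, and the code is written for Python 3, so the Python 2.7 code in the question would need adapting.

```python
import csv
import io


def read_csv_any_newline(raw, encoding="utf-8"):
    """Parse CSV bytes whose lines may end in CR, LF, or CRLF."""
    # Rewrite CRLF first so that its CR is not converted a second time.
    normalized = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # The encoding is an assumption; adjust it to match the actual file.
    text = normalized.decode(encoding)
    # newline="" leaves the remaining "\n" characters for csv to interpret.
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical CR-terminated data standing in for the downloaded file.
rows = read_csv_any_newline(b"a,b\r1,2\r")
print(rows)
```

In Python 3 the `raw` bytes could come from `urllib.request.urlopen(url).read()`. Because the whole payload is normalized before parsing, line breaks inside quoted fields survive as `\n` instead of splitting the row.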