I am trying to read a gzipped CSV file from a URL. The file is large, with more than 50,000 lines. When I run the code below, I get an error: _csv.Error: line contains NULL byte
import csv
import urllib2
url = '[my-url-to-csv-file].gz'
response = urllib2.urlopen(url)
cr = csv.reader(response)
for row in cr:
    if len(row) <= 1: continue
    print row
If I print the content of the file before parsing it, I get something like this:
?M}?7?M==??7M???z?YJ?????5{Ci?jK??3b??p?
?[?=?j&=????=?0u'???}mwBt??-E?m??Ծ??????WM??wj??Z??ėe?D?VF????4=Y?Y?tA???
How can I read the gzipped csv file from this URL properly?
How to Open a .gz (gzip) csv File from a URL with urllib2.urlopen
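urlopen returns the raw, still-compressed bytes, which is why csv.reader complains about NULL bytes. Wrap the downloaded data in a file-like object with StringIO.StringIO(), then decompress it with gzip.GzipFile(). To use your example: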
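from StringIO import StringIO
import csv
import gzip
import urllib2
url = '[my-url-to-csv-file].gz'
# Download the compressed bytes and wrap them in an in-memory file object.
mem = StringIO(urllib2.urlopen(url).read())
# GzipFile decompresses transparently as the data is read.
f = gzip.GzipFile(fileobj=mem, mode='rb')
# Iterate over parsed rows; looping over f.read() would yield single characters.
for row in csv.reader(f):
    print row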
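For readers on Python 3, where urllib2 and the StringIO module no longer exist, the same approach might look like the sketch below. The helper names rows_from_gzip_bytes and rows_from_url are hypothetical, and UTF-8 is an assumed encoding; adjust it to match the actual file.

```python
import csv
import gzip
import io
import urllib.request


def rows_from_gzip_bytes(raw, encoding='utf-8'):
    """Decompress gzip-compressed CSV bytes and return the parsed rows.

    The encoding default is an assumption, not something the source states.
    """
    # io.BytesIO plays the role that StringIO plays in the Python 2 code.
    buffer = io.BytesIO(raw)
    # Text mode ('rt') is needed because csv.reader expects str in Python 3.
    with gzip.open(buffer, mode='rt', encoding=encoding, newline='') as text:
        return list(csv.reader(text))


def rows_from_url(url):
    """Download a gzipped CSV from url (hypothetical helper) and parse it."""
    with urllib.request.urlopen(url) as response:
        return rows_from_gzip_bytes(response.read())
```

Calling rows_from_url('[my-url-to-csv-file].gz') would then return a list of rows. Keeping the decompression step separate from the download makes it possible to check the parsing on in-memory data without touching the network.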