These two approaches yield different results in Python 3.7.3:
res = urllib.request.urlopen(url, timeout=timeout)
content = res.read().decode('utf-8')
reader = csv.reader(StringIO(content))
lines = list(reader)
And
res = urllib.request.urlopen(url, timeout=timeout)
content = res.read().decode('utf-8')
reader = csv.reader(content)
lines = list(reader)
The former gives me what I want: a list of the rows from the CSV. The latter gives me a list of single-element lists, each holding one character of the text, so:
Year,PID
2019,1
2018,2
And
Y
e
a
r,
P
I
D
(etc)
What's the difference?
StringIO is an in-memory stream for text I/O. For strings, StringIO can be used like a file opened in text mode, so csv.reader treats StringIO(content) as an open file. In the first case, reader is therefore a reader object which will iterate over lines in the given csvfile, and
lines = list(reader)
returns a list of the rows in content.
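To make the first case concrete, here is a minimal sketch. It assumes a small literal string in place of the downloaded response (the sample data is taken from the question, the variable names stream_lines and rows are only for illustration), so it runs without a network call.

```python
import csv
from io import StringIO

# Assumed stand-in for res.read().decode('utf-8') from the question.
content = "Year,PID\n2019,1\n2018,2\n"

# Iterating a StringIO yields one line at a time, like a text-mode file.
stream_lines = list(StringIO(content))

# csv.reader receives one full line per iteration step and parses it as a row.
rows = list(csv.reader(StringIO(content)))

print(stream_lines)
print(rows)
```

Because each iteration step hands csv.reader a whole line, every row comes back as a list of fields.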
In the second case, content is of type str, and csv.reader(content) iterates over the string itself. This is because:
csv.reader(csvfile, dialect='excel', **fmtparams): csvfile can be any object which supports the iterator protocol and returns a string each time its __next__() method is called
Iterating over a string yields one character at a time, so
lines = list(reader)
returns a list of single-character rows: each character in content is treated as its own line.
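The second case can be reproduced with a short sketch, again assuming a literal string instead of the downloaded content. It also shows one possible alternative to StringIO: passing a list of lines from str.splitlines(), which satisfies the same iterator requirement. The names char_rows and line_rows are only for illustration.

```python
import csv

# Assumed stand-in for the decoded response body.
content = "Year,PID\n2019,1\n"

# Iterating a str yields single characters, so csv.reader parses
# each character as if it were a complete line.
char_rows = list(csv.reader(content))

# A list of line strings is also a valid csvfile argument.
line_rows = list(csv.reader(content.splitlines()))

print(char_rows[:5])
print(line_rows)
```

Note that splitlines() cuts at every line break, so StringIO is probably the safer choice if quoted fields might contain embedded newlines.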