
Python - how to read/parse a CSV-like line?


I have searched, but most answers are about reading a complete CSV file, and none of them match the problem I'm facing.

I'm trying to read a file from the network using urllib2:

request = urllib2.Request('http://.../tv.txt')
response = urllib2.urlopen(request)
lines = response.readlines()
for line in lines:
    ...

Each line looks like this:

"ABC", "XYZ,MNO", "KLM"
"ABC", "MN"
"ABC", "123", "10", "OPPA GANGNAM STYLE", "LADY"

As seen above, these lines are not actually CSV lines. The number of columns keeps changing.

Is there a way to split each line into a list? The desired result is:

["ABC", "XYZ,MNO", "KLM"]
["ABC", "MN"]
["ABC", "123", "10", "OPPA GANGNAM STYLE", "LADY"]

I've tried using line.split(",") but it does not split correctly because there are commas inside some of the double-quoted fields.
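To make the failure concrete, here is a minimal reproduction using the first sample line. It is written in Python 3 syntax, and the variable names are only illustrative.

```python
line = '"ABC", "XYZ,MNO", "KLM"'

# A plain split treats every comma as a separator, including the one
# inside the quoted field, and leaves the quotes and spaces in place.
parts = line.split(",")
print(len(parts))
print(parts)
```

The line has three fields, but the split yields four pieces because "XYZ,MNO" is cut in two.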

Please help me if you know how to. Thank you very much.

Cheers,

PHP-Python-Java-MySQL-newbie.


Solution

  • Use the csv module; it does what you need.

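    import csv
    import io

    # Sample input: three quoted, comma-separated lines with varying column counts.
    yourstring = '"ABC", "XYZ,MNO", "KLM"\n"ABC", "MN"\n"ABC", "123", "10", "OPPA GANGNAM STYLE", "LADY"'

    class MyDialect(csv.Dialect):
        strict = True              # raise csv.Error on malformed input
        skipinitialspace = True    # ignore the space after each comma
        quoting = csv.QUOTE_ALL
        delimiter = ','
        quotechar = '"'
        lineterminator = '\n'

    # Wrap the string in a file-like object so csv.reader can iterate over its lines.
    b = io.StringIO(yourstring)
    r = csv.reader(b, MyDialect())

    # Print the field count and the fields of each row (Python 3 print function).
    for i in r:
        print(len(i), ':', ' @ '.join(i))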
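The same idea can probably be written without a custom dialect, since csv.reader accepts any iterable of text lines and takes formatting options as keyword arguments. The sketch below assumes Python 3; parse_quoted_lines is a hypothetical helper name, and the sample list stands in for the lines of the downloaded file.

```python
import csv


def parse_quoted_lines(lines):
    """Parse an iterable of text lines into a list of field lists."""
    # skipinitialspace=True drops the space after each comma, so the
    # quote that follows is recognised as the start of a quoted field.
    reader = csv.reader(lines, skipinitialspace=True)
    return [row for row in reader]


# Stand-in for the lines read from the remote file.
sample = [
    '"ABC", "XYZ,MNO", "KLM"\n',
    '"ABC", "MN"\n',
    '"ABC", "123", "10", "OPPA GANGNAM STYLE", "LADY"\n',
]

rows = parse_quoted_lines(sample)
for row in rows:
    print(row)
```

Each row comes back as a list of strings with the quotes removed and the embedded comma preserved. When the lines come from a network response in Python 3 they arrive as bytes, so they would need to be decoded first; which encoding to use depends on the file and is not stated in the question.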