
How do I make csv.reader() index by field/string instead of by character?


I am trying to use csv.reader() to read values from a CSV database file for later comparison. I want each row to be a list whose elements are the comma-separated values, not the individual characters of the line.

My code:

with open(sys.argv[1]) as str_db:
    str_reader = csv.reader(str_db)
    line_count = 0
    fields = []
    for row in str_db:
        if line_count == 0:
            fields = re.split(",", row)
            line_count += 1
        else:
            print(f"{fields[0]}: {row[0]}, {fields[1]}: {row[1]}, {fields[2]}: {row[2]}, {fields[3]}: {row[3]}")

The input file (passed as sys.argv[1]):

name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5

Where I expect to see:

name: Alice, AGATC: 2, AATG: 8, TATC: 3
name: Bob, AGATC: 4, AATG: 1, TATC: 5
name: Charlie, AGATC: 3, AATG: 2, TATC: 5

Instead this is my output:

name: A, AGATC: l, AATG: i, TATC
: c
name: B, AGATC: o, AATG: b, TATC
: ,
name: C, AGATC: h, AATG: a, TATC
: r

Bonus thanks if you can tell me why a new line starts at the end of TATC.

I've also tried passing the delimiter explicitly:

with open(sys.argv[1]) as str_db:
    str_reader = csv.reader(str_db, delimiter = ',')
    line_count = 0
    fields = []
    for row in str_db:
        if line_count == 0:
            fields = re.split(",", row)
            line_count += 1
        else:
            print(f"{fields[0]}: {row[0]}, {fields[1]}: {row[1]}, {fields[2]}: {row[2]}, {fields[3]}: {row[3]}")

but there is no change.


Solution

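  • The loop iterates over the file object (str_db) rather than the reader (str_reader), so each row is a raw line of text and row[0] is its first character. That is also where the stray line break comes from: the header line still ends with "\n" when it is split by hand, so fields[3] is "TATC\n". Iterating over the reader itself produces the correct results. Note that newline='' is recommended when opening files for csv.reader and csv.writer: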

    import csv
    
    with open('input.csv', newline='') as str_db:
        reader = csv.reader(str_db)
        fields = next(reader)  # the first row holds the field names
        for row in reader:     # iterate over the reader, not the file object
            print(f"{fields[0]}: {row[0]}, {fields[1]}: {row[1]}, {fields[2]}: {row[2]}, {fields[3]}: {row[3]}")
    

    Output:

    name: Alice, AGATC: 2, AATG: 8, TATC: 3
    name: Bob, AGATC: 4, AATG: 1, TATC: 5
    name: Charlie, AGATC: 3, AATG: 2, TATC: 5
    

    csv.DictReader simplifies this further: it reads the header row itself and yields each row as a dict keyed by field name:

    import csv
    
    with open('input.csv', newline='') as str_db:
        reader = csv.DictReader(str_db)
        for row in reader:
            print(', '.join([f'{key}: {value}' for key,value in row.items()]))
    

    (same output)
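
To see why the original loop printed single characters and a stray line break, it may help to compare what the file object and the reader each yield. The following is a minimal sketch, not the asker's actual program: it assumes an in-memory io.StringIO buffer as a stand-in for the real file, and the variable names (data, raw_lines, hand_split, rows) are made up for illustration.

```python
import csv
import io

# Hypothetical in-memory stand-in for the first two lines of the question's file.
data = "name,AGATC,AATG,TATC\nAlice,2,8,3\n"

# Iterating over the file object yields raw text lines, newline included.
raw_lines = list(io.StringIO(data))

# raw_lines[1] is the string "Alice,2,8,3\n", so indexing it gives one character.
first_char = raw_lines[1][0]

# Splitting the header line by hand keeps the trailing newline on the last field.
hand_split = raw_lines[0].split(",")

# Iterating over the csv.reader yields lists of fields with line endings removed.
rows = list(csv.reader(io.StringIO(data, newline="")))

print(repr(first_char))      # 'A'
print(repr(hand_split[-1]))  # 'TATC\n'
print(rows[1])               # ['Alice', '2', '8', '3']
```

Creating the reader has no effect unless the loop actually iterates over it, which is why adding delimiter=',' changed nothing in the second attempt.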