python csv namedtuple

What is the pythonic way to read CSV file data as rows of namedtuples?


What is the best way to take a data file that contains a header row and read this row into a named tuple so that the data rows can be accessed by header name?

I was attempting something like this:

import csv
from collections import namedtuple

with open('data_file.txt', mode="r") as infile:
    reader = csv.reader(infile)
    Data = namedtuple("Data", ", ".join(i for i in reader[0]))
    next(reader)
    for row in reader:
        data = Data(*row)

The reader object is not subscriptable, so the above code throws a TypeError. What is the pythonic way to read a file header into a namedtuple?


Solution

  • Use:

    Data = namedtuple("Data", next(reader))
    

    and omit the line:

    next(reader)
    

    Combining this with an iterative version based on martineau's comment below, the example becomes, for Python 2:

    import csv
    from collections import namedtuple
    from itertools import imap
    
    with open("data_file.txt", mode="rb") as infile:
        reader = csv.reader(infile)
        Data = namedtuple("Data", next(reader))  # get names from column headers
        for data in imap(Data._make, reader):
            print data.foo  # assumes the header row has a column named "foo"
            # ...further processing of a line...
    

    and for Python 3:

    import csv
    from collections import namedtuple
    
    with open("data_file.txt", newline="") as infile:
        reader = csv.reader(infile)
        Data = namedtuple("Data", next(reader))  # get names from column headers
        for data in map(Data._make, reader):
            print(data.foo)  # assumes the header row has a column named "foo"
            # ...further processing of a line...
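
One thing the snippets above do not cover is a header that contains names which are not valid Python identifiers (spaces, duplicates, keywords), in which case namedtuple raises a ValueError. The following is only a sketch of one way to handle that in Python 3, using namedtuple's rename=True option. The helper name read_rows and the in-memory sample data are made up for illustration and stand in for data_file.txt.

```python
import csv
import io
from collections import namedtuple


def read_rows(infile, typename="Data"):
    """Yield each data row of an open CSV file as a namedtuple.

    Field names are taken from the first (header) row.
    read_rows is a hypothetical helper, not part of the csv module.
    """
    reader = csv.reader(infile)
    try:
        header = next(reader)
    except StopIteration:
        return  # empty file: nothing to yield
    # rename=True replaces invalid or duplicate field names with
    # positional names such as _0, _1, ... instead of raising ValueError.
    Row = namedtuple(typename, header, rename=True)
    for row in reader:
        yield Row._make(row)


# Hypothetical sample data; "first name" is not a valid identifier.
sample = io.StringIO("foo,bar,first name\n1,2,Ada\n3,4,Grace\n")
rows = list(read_rows(sample))
print(rows[0].foo)
print(rows[0]._fields)
```

With a real file, the same helper would be called as read_rows(infile) inside the with open("data_file.txt", newline="") block. Note that all values are still strings, and a data row whose length differs from the header makes _make raise a TypeError.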