How to create a class to define a CSV file layout including a header?

I would like to create an approach where I can define the structure of a CSV file (obvious extension to excel should follow) where there is a row definition as well has the header. In this approach a simple re-ordering of the definition will move the columns in the output.

My first attempt was to use a namedtuple. Actually handled most of my needs but I can't create an empty row the populate it as needed. I tried to use a recordclass but have much the same problem.

My output file might have > 30 columns so it gets very sloppy to have to create a new instance with a bunch of Nones. I also want to be able to add a column to the structure without having to update the __init__, etc.

My idea pseudo-code (using namedtuples for illustration) would be:

class TableRow(namedtuple(TableRow, "id name password hostip"))
    __slots__ = ()


class TableRowHeader:
    def __init__(self):
        header = TableRow()
        header.id = 'ID'
        header.name = "Name"
        header.password = "Password"
        header.hostip = "Host IP"


class OutputTable():
    def __init__(self):
        self.header = TableRowHeader()
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)  


outtable = OutputTable()
row = TableRow()
row.id = 1
row.name = 'Matt'
row.hostip = '10.0.0.1'
row.password = 'obvious'      
outtable.add(row)

outtable.to_csv('./example.csv')

I like the pattern but can't figure out a clean way to handle this in Python.

Solution

Do you want something like that?

import csv
from collections import namedtuple

TableRowShort = namedtuple('TableRow', "id name password hostip")
TableRowFull = namedtuple('TableRowFull', "id name password hostip description source admin_name")


class TableRowOptional:
    def __init__(self, id, name, password=None, hostip=None, description=None, source=None, admin_name=None):
        super().__init__()

        self.id = id
        self.name = name
        self.password = password
        self.hostip = hostip
        self.description = description
        self.source = source
        self.admin_name = admin_name


class OutputTable():
    def __init__(self):
        self.headers = []
        self.rows = list()

    def add(self, row):
        if hasattr(row, '_asdict'):
            value = row._asdict()
        elif hasattr(row, '__dict__'):
            value = row.__dict__
        elif isinstance(row, dict):
            value = row
        else:
            raise ValueError('Not supported row type: {}'.format(type(row)))

        for header in value.keys():
            if header not in self.headers:
                self.headers.append(header)

        self.rows.append(value)

    def to_csv(self, file_name):
        with open(file_name, 'w') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.headers)

            for row in self.rows:
                csv_writer.writerow([row.get(header, None) for header in self.headers])


outtable = OutputTable()
outtable.add(TableRowShort(1, 'Matt', 'obvious', '10.0.0.1'))
outtable.add(TableRowFull(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
outtable.add(TableRowOptional(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))
outtable.add({
    'id': 1337,
    'name': 'hacker',
    'hostip': '127.0.0.1',
    'extra': "I've hacked you guys lol!",
})

outtable.to_csv('./example.csv')

This solution provides you interface to store some "prepared namedtuples, normal objects (using __dict__ interface) and raw dict objects as rows. It manages CSV headers automatically based on provided rows structures :)

Looks pretty clear & useful to me. What do you think?

Output CSV

# > cat example.csv
id,name,password,hostip,description,source,admin_name,extra
1,Matt,obvious,10.0.0.1,,,,
2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin,
3,Maria,,10.1.0.1,some description,localnet,,
1337,hacker,,127.0.0.1,,,,I've hacked you guys lol!