Search code examples
pythonexport-to-csvnamedtuple

How to create a class to define a CSV file layout including a header?


I would like to create an approach where I can define the structure of a CSV file (obvious extension to excel should follow) where there is a row definition as well has the header. In this approach a simple re-ordering of the definition will move the columns in the output.

My first attempt was to use a namedtuple. Actually handled most of my needs but I can't create an empty row the populate it as needed. I tried to use a recordclass but have much the same problem.

My output file might have > 30 columns so it gets very sloppy to have to create a new instance with a bunch of Nones. I also want to be able to add a column to the structure without having to update the __init__, etc.

My idea pseudo-code (using namedtuples for illustration) would be:

class TableRow(namedtuple(TableRow, "id name password hostip"))
    __slots__ = ()


class TableRowHeader:
    def __init__(self):
        header = TableRow()
        header.id = 'ID'
        header.name = "Name"
        header.password = "Password"
        header.hostip = "Host IP"


class OutputTable():
    def __init__(self):
        self.header = TableRowHeader()
        self.rows = list()

    def add(self, new_row):
        # Example assumes new_row is an instance of TableRow
        self.rows.append(new_row)

    def to_csv(self, file_name):
        with open(file_name, 'w') as csv_file:
            # creating a csv writer object
            csv_writer = csv.writer(csv_file)

            # writing the fields
            csv_writer.writerow(self.header)

            for row in sorted(self.rows):
                csv_writer.writerow(row)  


outtable = OutputTable()
row = TableRow()
row.id = 1
row.name = 'Matt'
row.hostip = '10.0.0.1'
row.password = 'obvious'      
outtable.add(row)

outtable.to_csv('./example.csv') 

I like the pattern but can't figure out a clean way to handle this in Python.


Solution

  • Do you want something like that?

    import csv
    from collections import namedtuple
    
    TableRowShort = namedtuple('TableRow', "id name password hostip")
    TableRowFull = namedtuple('TableRowFull', "id name password hostip description source admin_name")
    
    
    class TableRowOptional:
        def __init__(self, id, name, password=None, hostip=None, description=None, source=None, admin_name=None):
            super().__init__()
    
            self.id = id
            self.name = name
            self.password = password
            self.hostip = hostip
            self.description = description
            self.source = source
            self.admin_name = admin_name
    
    
    class OutputTable():
        def __init__(self):
            self.headers = []
            self.rows = list()
    
        def add(self, row):
            if hasattr(row, '_asdict'):
                value = row._asdict()
            elif hasattr(row, '__dict__'):
                value = row.__dict__
            elif isinstance(row, dict):
                value = row
            else:
                raise ValueError('Not supported row type: {}'.format(type(row)))
    
            for header in value.keys():
                if header not in self.headers:
                    self.headers.append(header)
    
            self.rows.append(value)
    
        def to_csv(self, file_name):
            with open(file_name, 'w') as csv_file:
                # creating a csv writer object
                csv_writer = csv.writer(csv_file)
    
                # writing the fields
                csv_writer.writerow(self.headers)
    
                for row in self.rows:
                    csv_writer.writerow([row.get(header, None) for header in self.headers])
    
    
    outtable = OutputTable()
    outtable.add(TableRowShort(1, 'Matt', 'obvious', '10.0.0.1'))
    outtable.add(TableRowFull(2, 'Maria', 'obvious as usual', '10.1.0.1', 'some description', 'localnet', 'super_admin'))
    outtable.add(TableRowOptional(3, 'Maria', hostip='10.1.0.1', description='some description', source='localnet'))
    outtable.add({
        'id': 1337,
        'name': 'hacker',
        'hostip': '127.0.0.1',
        'extra': "I've hacked you guys lol!",
    })
    
    outtable.to_csv('./example.csv')
    
    
    

    This solution provides you interface to store some "prepared namedtuples, normal objects (using __dict__ interface) and raw dict objects as rows. It manages CSV headers automatically based on provided rows structures :)

    Looks pretty clear & useful to me. What do you think?

    Output CSV

    # > cat example.csv
    id,name,password,hostip,description,source,admin_name,extra
    1,Matt,obvious,10.0.0.1,,,,
    2,Maria,obvious as usual,10.1.0.1,some description,localnet,super_admin,
    3,Maria,,10.1.0.1,some description,localnet,,
    1337,hacker,,127.0.0.1,,,,I've hacked you guys lol!