How do I access the row index in an agate Formula?


Using agate, Python's tabular data library, I want to define a Formula computation that accesses the row index. I tried

agate.Formula(agate.Text(), lambda r: r.index())

but this doesn't work, because the Row object does not provide a (row) index (unlike the Column object!). Is there a way to access the row index inside the formula?

(I need this in order to create a new column with values unique for each row.)


Solution

  • From my research I concluded that there is no way to access the row number in the function of a standard Formula. (Of course I'm happy to be proven wrong!)

    However, to achieve what the question asks, I can subclass Formula and change the signature of the called function so that it receives the row number as an additional parameter:

    import agate

    class EnumeratedFormula(agate.Formula):
        """
        An agate Formula which provides a row index to its compute function.
        The function used now has the signature f(i, r).
        """
        def run(self, table):
            new_column = []

            # enumerate() supplies the zero-based row index alongside each row
            for i, row in enumerate(table.rows):
                v = self._func(i, row)

                # Cast to the declared data type unless casting was disabled
                if self._cast:
                    v = self._data_type.cast(v)

                new_column.append(v)

            return new_column
    

    With this I can write a compute expression that creates a new column with values unique to each row:

    EnumeratedFormula(agate.Text(), lambda i, r: str(i))