Using agate, Python's tabular data library, I want to define a Formula computation that accesses the row index. I tried
agate.Formula(agate.Text(), lambda r: r.index())
but this doesn't work, because the Row object does not provide a (row) index (unlike the Column object!). Is there a way to access the row index inside the formula?
(I need this in order to create a new column with values unique for each row.)
From my research I concluded that there is no way to access the row number in the function of a standard Formula. (Of course I'm happy to be proven wrong!)
However, to achieve what's asked in the question, I can subclass Formula and override run so that the called function receives the row number as an additional first parameter:
class EnumeratedFormula(agate.Formula):
    """
    An agate formula which provides a row index to its compute function.
    The function used now has the signature f(i, r).
    """
    def run(self, table):
        new_column = []
        for i, row in enumerate(table.rows):
            # Pass the row index along with the row itself
            v = self._func(i, row)
            if self._cast:
                v = self._data_type.cast(v)
            new_column.append(v)
        return new_column
With this I can write a compute expression which creates a new column with a value unique to each row:
EnumeratedFormula(agate.Text(), lambda i, r: str(i))