python hadoop impala ibis

Is there a way to iterate over table rows using Ibis (Impala)?


I have a fairly large Ibis TableExpr whose rows I would like to iterate over to produce a specialized file output (FASTA nucleotide sequences). Is there any way to do this with Ibis, or should I just call execute to create a pandas DataFrame and then call iterrows on it?

I cannot find anything in the API or tutorials.


Solution

  • You should iterate over the pandas DataFrame as you say.

    Alternatively, you should be able to get the Impyla cursor that the backend generates by calling lower-level functions than .execute(). Those functions are likely to change when we release Ibis 2.0, though, so code that relies on them is likely to break.

    We'd be happy to receive feedback if that's something you'd be interested in. You can open an issue on the project's GitHub.
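
    A minimal sketch of the DataFrame route, under some assumptions: the helper write_fasta, the column names seq_id and sequence, and the line width are all hypothetical and not part of Ibis. The helper takes any iterable of (id, sequence) pairs, so the rows of the executed DataFrame can be fed to it directly; a small in-memory list stands in for the query result here so the snippet runs without an Impala connection.

    ```python
    import io


    def write_fasta(records, handle, width=60):
        """Write (seq_id, sequence) pairs to `handle` in FASTA format.

        Hypothetical helper: returns the number of records written.
        """
        count = 0
        for seq_id, sequence in records:
            # Header line: '>' followed by the record identifier.
            handle.write(f">{seq_id}\n")
            # Split the sequence into lines of at most `width` characters.
            for start in range(0, len(sequence), width):
                handle.write(sequence[start:start + width] + "\n")
            count += 1
        return count


    # With a real Ibis expression the rows would come from pandas, e.g.
    # (column names are assumptions):
    #   df = expr.execute()
    #   rows = df[["seq_id", "sequence"]].itertuples(index=False, name=None)
    # Here an in-memory list stands in for those rows.
    rows = [("seq1", "ACGTACGTAC"), ("seq2", "GGCC")]

    buffer = io.StringIO()
    written = write_fasta(rows, buffer, width=4)
    fasta_text = buffer.getvalue()
    print(fasta_text, end="")
    ```

    itertuples is used in the commented lines rather than iterrows because it is generally faster and does not build a Series for every row. To write to disk, pass an open file object instead of the StringIO buffer.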